Hello.

Brief summary: in a recent svn -> bzr conversion, I wanted the resulting
bzr repo to contain timezone information for each commit, based on the
known timezone of each committer (since Subversion only stores a UTC
timestamp). For this, code was needed in BzrWorkingDir._commit to handle
the case when date.tzinfo is not None, plus a `before-commit` function in
the config file to make changeset.date tz-aware ("non-naive") by
setting its tzinfo member to an appropriate value. Without this, the
resulting bzr repo records all commits as if they had happened in the
current local timezone.

However, right before starting to code the above, I found that the
unmodified bzr.py was producing repositories with wrong timestamps, which
I traced to its use of mktime(). As said above, the timestamps were
in the local timezone, but they did not match the ones in the svn source
repository (!).
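
A minimal sketch of the effect in isolation, not tailor's code: the svn
date arrives as a naive datetime holding UTC fields, and mktime() reads
those fields as local time, so the result moves with TZ. The helper name
`epochs_under` and the POSIX-style TZ strings (stand-ins for UTC and
Europe/Madrid) are assumptions for illustration; time.tzset() is Unix-only.

```python
import calendar
import os
import time
from datetime import datetime


def epochs_under(tz, date):
    """Return (mktime result, true epoch) for a naive UTC datetime under TZ=tz."""
    os.environ["TZ"] = tz
    time.tzset()  # Unix only: make the time module re-read TZ
    wrong = int(time.mktime(date.timetuple()))  # fields read as *local* time
    right = calendar.timegm(date.timetuple())   # fields read as UTC
    return wrong, right


if __name__ == "__main__":
    winter = datetime(2006, 1, 2, 13, 14, 15)
    summer = datetime(2006, 5, 4, 13, 12, 11)
    # Assumed POSIX equivalents of TZ=UTC and TZ=Europe/Madrid
    for tz in ("UTC0", "CET-1CEST,M3.5.0,M10.5.0/3"):
        for date in (winter, summer):
            wrong, right = epochs_under(tz, date)
            print(tz, date, "error in seconds:", wrong - right)
```

Under the Madrid rule the error is one hour for the January date and two
hours for the May one, which is the shift visible in the Madrid output of
the reproduction.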

This e-mail contains two patches: one to fix the bug in bzr.py that
makes timestamps wrong, and a second one to implement my wishlist
feature above, timezone support. Apologies for the verbosity.

                                 * * *

The bug is easily reproduced as follows: take an svn repository with
only 2 commits, the first dated "2006-01-02 13:14:15 +0000" and the
second "2006-05-04 13:12:11 +0000" (note that the first falls in standard
time and the second in summer time). To see the problem, just execute:

  % cd /tmp/tailor
  % wget http://people.debian.org/~adeodato/tmp/2006-06-19/tailor-mktime/dummy.svndump.gz
  % gunzip dummy.svndump.gz
  % mkdir dummy
  % svnadmin create dummy
  % svnadmin load dummy <dummy.svndump
  
  % wget http://people.debian.org/~adeodato/tmp/2006-06-19/tailor-mktime/dummy.config
  % doit() {
      env TZ=$1 tailor --configfile dummy.config
      mv bzr bzr.$2
      rm  -rf svn *.log *.state*
   }
  % doit UTC utc
  % doit Europe/Madrid madrid
  % doit US/Eastern eastern
  
  % for t in utc madrid eastern; do echo $t; bzr log bzr.$t | grep timestamp; echo; done
  utc
  timestamp: Thu 2006-05-04 13:12:11 +0000
  timestamp: Mon 2006-01-02 13:14:15 +0000
  
  madrid
  timestamp: Thu 2006-05-04 13:12:11 +0200
  timestamp: Mon 2006-01-02 14:14:15 +0200
  
  eastern
  timestamp: Thu 2006-05-04 13:12:11 -0400
  timestamp: Mon 2006-01-02 14:14:15 -0400

The Madrid and Eastern repositories have clearly wrong timestamps. With
01a_mktime-utc.diff (attached) applied, the output for the same steps is:

  utc
  timestamp: Thu 2006-05-04 13:12:11 +0000
  timestamp: Mon 2006-01-02 13:14:15 +0000

  madrid
  timestamp: Thu 2006-05-04 15:12:11 +0200
  timestamp: Mon 2006-01-02 15:14:15 +0200

  eastern
  timestamp: Thu 2006-05-04 09:12:11 -0400
  timestamp: Mon 2006-01-02 09:14:15 -0400

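These timestamps denote the correct instants, but every commit carries the
offset in effect when tailor was run, so DST on the commit date is not
reflected. That is fixed by an improved version of the patch,
01b_mktime-utc-final.diff: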

  utc
  timestamp: Thu 2006-05-04 13:12:11 +0000
  timestamp: Mon 2006-01-02 13:14:15 +0000

  madrid
  timestamp: Thu 2006-05-04 15:12:11 +0200
  timestamp: Mon 2006-01-02 14:14:15 +0100

  eastern
  timestamp: Thu 2006-05-04 09:12:11 -0400
  timestamp: Mon 2006-01-02 08:14:15 -0500
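
For comparison, the same two values (timestamp and offset) can be obtained
without mktime(). This is a sketch of an alternative, not what the attached
patches do: calendar.timegm() is the inverse of gmtime(), so it yields the
epoch directly, and comparing localtime() against that epoch gives the
offset in force at that instant. The function name `timestamp_and_offset`
and the POSIX TZ rule standing in for US/Eastern are assumptions.

```python
import calendar
import os
import time
from datetime import datetime


def timestamp_and_offset(date):
    """For a naive datetime holding UTC fields, return (epoch, local offset in s)."""
    timestamp = calendar.timegm(date.utctimetuple())
    # localtime() gives the local wall-clock fields for that instant; reading
    # them back as if they were UTC and subtracting yields the UTC offset.
    timezone = calendar.timegm(time.localtime(timestamp)) - timestamp
    return timestamp, timezone


if __name__ == "__main__":
    os.environ["TZ"] = "EST5EDT,M4.1.0,M10.5.0"  # assumed 2006 rule for US/Eastern
    time.tzset()  # Unix only
    for date in (datetime(2006, 1, 2, 13, 14, 15), datetime(2006, 5, 4, 13, 12, 11)):
        print(date, timestamp_and_offset(date))
```

With that TZ setting the offsets come out as -18000 and -14400 seconds,
matching the -0500 and -0400 shown for the eastern repository.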

One last comment about this: I'm not at all familiar with the tailor
codebase, nor particularly knowledgeable about times and timezones.
I've just been scratching my own itch, which came down to vcpx/bzr.py,
until I thought I had it right. I don't know whether this problem
affects other backends and should instead be handled at a higher level
of the code.

                                     * * *

As for supporting committers' timezone for bzr targets, it's a small
patch on top of 01b_mktime-utc-final.diff (diff -b output):

  @@ -200,8 +200,14 @@
           t_l = int(mktime(date.timetuple()))
           t_u = int(mktime(date.utctimetuple()))

  +        if not date.tzinfo:
           timezone  = t_u - mktime(gmtime(t_l))
           timestamp = t_l + timezone
  +        else:
  +            timezone = abs(date.utcoffset()).seconds
  +            if date.utcoffset().days == -1:
  +                timezone *= -1
  +            timestamp = 2*t_u - mktime(gmtime(t_u))

           # Guess sane email address
           email = search("<([EMAIL PROTECTED])>", author)

In other words, if the date is tz-aware, use the information provided.
The attached 02_bzr-with-timezones-and-mktime-utc.diff combines this
with 01b above, and it's the one I'd love to see accepted into Tailor.
The functionality it provides can be seen by executing the steps above,
but with a different config file:

  http://people.debian.org/~adeodato/tmp/2006-06-19/tailor-mktime/dummy.config.tz

It just uses a `before-commit` function that maps each committer to a
timezone, and changes changeset.date to be in that timezone. The results
are, of course, independent of the TZ tailor is run in:

  TZ=UTC
  committer: Guy in America/Los_Angeles
  timestamp: Thu 2006-05-04 06:12:11 -0700
  committer: Guy in Europe/Helsinki
  timestamp: Mon 2006-01-02 15:14:15 +0200
  
  TZ=Europe/Madrid
  committer: Guy in America/Los_Angeles
  timestamp: Thu 2006-05-04 06:12:11 -0700
  committer: Guy in Europe/Helsinki
  timestamp: Mon 2006-01-02 15:14:15 +0200
  
  TZ=US/Eastern
  committer: Guy in America/Los_Angeles
  timestamp: Thu 2006-05-04 06:12:11 -0700
  committer: Guy in Europe/Helsinki
  timestamp: Mon 2006-01-02 15:14:15 +0200

With the author -> tz mapping reversed:

  committer: Guy in Europe/Helsinki
  timestamp: Thu 2006-05-04 16:12:11 +0300
  committer: Guy in America/Los_Angeles
  timestamp: Mon 2006-01-02 05:14:15 -0800
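
A rough sketch of what such a `before-commit` function could look like. It
is not the contents of dummy.config.tz: the hook signature (context,
changeset), the True return value, the `author` attribute and the mapping
below are all assumptions, and it uses the standard zoneinfo module
(Python 3.9+) where the original setup used pytz.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # Python 3.9+; stands in for pytz here

# Hypothetical committer -> timezone-name mapping
COMMITTER_TZ = {
    "Guy in America/Los_Angeles": "America/Los_Angeles",
    "Guy in Europe/Helsinki": "Europe/Helsinki",
}


def to_committer_time(naive_utc, tz):
    """Turn a naive datetime holding UTC fields into an aware one in tz."""
    return naive_utc.replace(tzinfo=timezone.utc).astimezone(tz)


def set_committer_timezone(context, changeset):
    """before-commit hook sketch: make changeset.date tz-aware when possible."""
    name = COMMITTER_TZ.get(changeset.author)
    if name is not None:
        changeset.date = to_committer_time(changeset.date, ZoneInfo(name))
    return True  # assumed: a true value lets the commit proceed
```

Commits by authors missing from the mapping keep their naive date, so they
would go through the `if not date.tzinfo` branch of the patch.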

I don't know whether anybody has ever expressed interest in this feature,
but well, here it is.
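
A note on the sign handling in the patch: timedelta normalizes negative
values so that only `days` is negative, which is why utcoffset().seconds
on its own gives 68400 for an offset of minus five hours. A sketch of an
equivalent computation that needs no special case (the helper name
`offset_seconds` is hypothetical; total_seconds() requires Python 2.7 or
later):

```python
from datetime import timedelta


def offset_seconds(offset):
    """Signed number of whole seconds in a timedelta such as utcoffset()."""
    return offset.days * 86400 + offset.seconds


if __name__ == "__main__":
    minus_five = timedelta(hours=-5)
    print(minus_five.days, minus_five.seconds)  # the normalized components
    print(offset_seconds(minus_five))           # signed seconds
    print(int(minus_five.total_seconds()))      # same value via the stdlib
```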

                                 * * *

And that was all. Please let me know if there are any concerns I should
address.

Cheers,

-- 
Adeodato Simó                                     dato at net.com.org.es
Debian Developer                                  adeodato at debian.org
 
                              Listening to: Boards of Canada - Oirectine
--- vcpx/bzr.py.orig
+++ vcpx/bzr.py
@@ -180,7 +180,7 @@
         """
         Commit the changeset.
         """
-        from time import mktime
+        from time import mktime, gmtime
         from binascii import hexlify
         from re import search
         from bzrlib.osutils import compact_date, rand_bytes
@@ -196,7 +196,9 @@
         else:
             self.log.info('Committing...')
             logmessage = "Empty changelog"
-        timestamp = int(mktime(date.timetuple()))
+
+        t = int(mktime(date.utctimetuple()))
+        timestamp = 2*t - mktime(gmtime(t))
 
         # Guess sane email address
         email = search("<([EMAIL PROTECTED])>", author)
--- vcpx/bzr.py.orig
+++ vcpx/bzr.py
@@ -180,7 +180,7 @@
         """
         Commit the changeset.
         """
-        from time import mktime
+        from time import mktime, gmtime
         from binascii import hexlify
         from re import search
         from bzrlib.osutils import compact_date, rand_bytes
@@ -196,7 +196,12 @@
         else:
             self.log.info('Committing...')
             logmessage = "Empty changelog"
-        timestamp = int(mktime(date.timetuple()))
+
+        t_l = int(mktime(date.timetuple()))
+        t_u = int(mktime(date.utctimetuple()))
+
+        timezone  = t_u - mktime(gmtime(t_l))
+        timestamp = t_l + timezone
 
         # Guess sane email address
         email = search("<([EMAIL PROTECTED])>", author)
@@ -216,7 +221,7 @@
         self._working_tree.commit(logmessage, committer=author,
                                   specific_files=entries, rev_id=revision_id,
                                   verbose=self.repository.projectref().verbose,
-                                  timestamp=timestamp)
+                                  timestamp=timestamp, timezone=timezone)
 
     def _removePathnames(self, names):
         """
--- vcpx/bzr.py.orig
+++ vcpx/bzr.py
@@ -180,7 +180,7 @@
         """
         Commit the changeset.
         """
-        from time import mktime
+        from time import mktime, gmtime
         from binascii import hexlify
         from re import search
         from bzrlib.osutils import compact_date, rand_bytes
@@ -196,7 +196,23 @@
         else:
             self.log.info('Committing...')
             logmessage = "Empty changelog"
-        timestamp = int(mktime(date.timetuple()))
+
+        t_l = int(mktime(date.timetuple()))
+        t_u = int(mktime(date.utctimetuple()))
+
+        if not date.tzinfo:
+            timezone  = t_u - mktime(gmtime(t_l))
+            timestamp = t_l + timezone
+        else:
+            # XXX I would expect date.utcoffset().seconds to just work, as
+            # in, return a signed number of seconds <= 12*3600. However, the
+            # utcoffset() method of some tzinfo objects as defined by pytz seem
+            # to return eg. timedelta(-1, 68400) instead of timedelta(0, -18000),
+            # thus making the abs() and .days play necessary.  --dato
+            timezone = abs(date.utcoffset()).seconds
+            if date.utcoffset().days == -1:
+                timezone *= -1
+            timestamp = 2*t_u - mktime(gmtime(t_u))
 
         # Guess sane email address
         email = search("<([EMAIL PROTECTED])>", author)
@@ -216,7 +232,7 @@
         self._working_tree.commit(logmessage, committer=author,
                                   specific_files=entries, rev_id=revision_id,
                                   verbose=self.repository.projectref().verbose,
-                                  timestamp=timestamp)
+                                  timestamp=timestamp, timezone=timezone)
 
     def _removePathnames(self, names):
         """
