On 10/11/2016 06:25 PM, Stanislau Hlebik wrote:
# HG changeset patch
# User Stanislau Hlebik <st...@fb.com>
# Date 1476195835 25200
#      Tue Oct 11 07:23:55 2016 -0700
# Node ID 718ed86a3698631077a087efaf668d70513056f5
# Parent  6f5a3300a5457c92eb09170a30c98328ebe3bcce
bookmarks: introduce binary encoding

Bookmarks binary encoding will be used for `bookmarks` bundle2 part.
The format is: <4 bytes - bookmark size, big-endian><bookmark>
               <1 byte - 0 if node is empty 1 otherwise><20 bytes node>

CCing greg as our binary format overlord,

I'm not sure how useful it is to have the node to be optional. using nullid should be fine (and most stream/bundle will be compressed anyway.

This would give us the option to have the node+size first (as fixed width) then the bookmark name. Though ?

I do not thing it is necessary to point that the value is big-endian, we use big endian for all binary encoding.

Beside that, the change seems good. I've been tempted to advise for moving the encoding/decoding directly into the bundle2 part to benefit from the bundle2 utility to pack/unpack but it eventually felt more at home here.

I've some small comment inline for your consideration.

BookmarksEncodeError and BookmarksDecodeError maybe thrown if input is
incorrect.

diff --git a/mercurial/bookmarks.py b/mercurial/bookmarks.py
--- a/mercurial/bookmarks.py
+++ b/mercurial/bookmarks.py
@@ -7,8 +7,10 @@

 from __future__ import absolute_import

+import StringIO
 import errno
 import os
+import struct

 from .i18n import _
 from .node import (
@@ -23,6 +25,77 @@
     util,
 )

+_NONEMPTYNODE = chr(1)
+_EMPTYNODE = chr(0)

I would use 'pack' directly here. Who know what awe-full thing around char and str python might be preparing for pytnon3.

+
+def _packbookmarksize(size):
+    return struct.pack('>i', size)

Any reason we don't just inline this?

+
+def _unpackbookmarksize(stream):
+    """Returns 0 if stream is empty.
+    """
+
+    expectedsize = struct.calcsize('>i')
+    encodedbookmarksize = stream.read(expectedsize)
+    if len(encodedbookmarksize) == 0:
+        return 0
+    if len(encodedbookmarksize) != expectedsize:
+        raise error.BookmarksDecodeError(
+            _('cannot decode bookmark size: '
+              'expected size: %d, actual size: %d') %
+            (expectedsize, len(encodedbookmarksize)))
+    return struct.unpack('>i', encodedbookmarksize)[0]

small nits, The bundle2 code have a lot of utility around reading and unpacking, it seems like it could be worthwhile to move this unpacking code directly in the part.

+def encodebookmarks(bookmarks):
+    """Encodes bookmark to node mapping to the byte string.
+
+    Format: <4 bytes - bookmark size, big-endian><bookmark>
+            <1 byte - 0 if node is empty 1 otherwise><20 bytes node>
+    Node may be 20 byte binary string, 40 byte hex string or empty

The part about 'Hex' is surprising is it still up to date.

+    parts = []
+    for bookmark, node in bookmarks.iteritems():
+        encodedbookmarksize = _packbookmarksize(len(bookmark))
+        parts.append(encodedbookmarksize)
+        bookmark = encoding.fromlocal(bookmark)
+        parts.append(bookmark)
+        if node:
+            if len(node) != 20 and len(node) != 40:
+                raise error.BookmarksEncodeError()
+            if len(node) == 40:
+                node = bin(node)
+            parts.append(_NONEMPTYNODE)
+            parts.append(node)
+        else:
+            parts.append(_EMPTYNODE)
+    return ''.join(parts)

We should probably have something a bit more subtle than 1 unique byte blob. bundle2 part can be fed an iterator yielding the lines one by one would be a good way to have this generation smoother.

+
+def decodebookmarks(buf):
+    """Decodes buffer into bookmark to node mapping.
+
+    Node is either 20 bytes or empty.
+    """
+
+    stream = StringIO.StringIO(buf)
+    bookmarks = {}
+    while True:
+        bookmarksize = _unpackbookmarksize(stream)
+        if not bookmarksize:
+            break

Could we use 'while bookmark size' as an iteration value. A bare 'while True:' is a bit scary.

+        bookmark = stream.read(bookmarksize)
+        if len(bookmark) != bookmarksize:
+            raise error.BookmarksDecodeError(
+                'cannot decode bookmark: expected size: %d, '
+                'actual size: %d' % (bookmarksize, len(bookmark)))
+        bookmark = encoding.tolocal(bookmark)
+        emptynode = stream.read(1)
+        node = ''
+        if emptynode != _EMPTYNODE:
+            node = stream.read(20)
+        bookmarks[bookmark] = node
+    return bookmarks
+
 def _getbkfile(repo):
     """Hook so that extensions that mess with the store can hook bm storage.

_______________________________________________
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


--
Pierre-Yves David
_______________________________________________
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

Reply via email to