Re: Exact format of tree objects

2013-06-18 Thread Chico Sokol
Thanks!

By the way, where can I find this kind of specification? I couldn't
find the spec of tree objects here:
https://github.com/git/git/tree/master/Documentation


--
Chico Sokol


On Wed, Jun 12, 2013 at 11:06 AM, Jakub Narebski jna...@gmail.com wrote:
 Junio C Hamano gitster at pobox.com writes:
 Chico Sokol chico.sokol at gmail.com writes:

  Is there any official documentation of the tree object format? Are tree
  objects encoded specially in some way? How can I parse the inflated
  contents of a tree object?
 
  We suspect that there is some kind of special format or
  encoding, because the command git cat-file -p <sha> shows me ...
  While git cat-file tree <sha> generates ...

 cat-file -p is meant to be human-readable form.  The latter gives
 the exact byte contents read_sha1_file() sees, which is a binary
 format.  Essentially, it is a sequence of:

  - mode of the entry encoded in octal, without any leading '0' pad;
  - pathname component of the entry, terminated with NUL;
  - 20-byte SHA-1 object name.

 I always wondered why this is the sole object format where SHA-1 is in
 20-byte binary format and not 40-character hexadecimal string format...

 --
 Jakub Narębski




 --
 To unsubscribe from this list: send the line unsubscribe git in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Exact format of tree objects

2013-06-18 Thread Chico Sokol
What is the encoding of the filename?


--
Chico Sokol


On Tue, Jun 11, 2013 at 3:26 PM, Ilari Liusvaara
ilari.liusva...@elisanet.fi wrote:
 On Tue, Jun 11, 2013 at 01:25:14PM -0300, Chico Sokol wrote:
 Is there any official documentation of the tree object format? Are tree
 objects encoded specially in some way? How can I parse the inflated
 contents of a tree object?

 A tree object consists of entries, each a concatenation of:
 - Octal mode (using ASCII digits 0-7).
 - Single SPACE (0x20)
 - Filename
 - Single NUL (0x00)
 - 20-byte binary SHA-1 of referenced object.

 At least the following octal modes are known:
 40000: Directory (tree).
 100644: Regular file (blob).
 100755: Executable file (blob).
 120000: Symbolic link (blob).
 160000: Submodule (commit).

 The entries are always sorted in (bytewise) lexicographical order,
 except that directories sort as if there were an implied '/' at the end.

 So e.g.:
 !  0  9  a  a-  a- (directory)  a (directory)  a0  ab  b  z.


 The idea of sorting directories specially is that if one recurses
 upon hitting a directory and uses '/' as path separator, then the
 full filenames are in bytewise lexicographical order.

 -Ilari


Re: Exact format of tree objects

2013-06-12 Thread Jakub Narebski
Junio C Hamano gitster at pobox.com writes:
 Chico Sokol chico.sokol at gmail.com writes:
 
  Is there any official documentation of the tree object format? Are tree
  objects encoded specially in some way? How can I parse the inflated
  contents of a tree object?
 
  We suspect that there is some kind of special format or
  encoding, because the command git cat-file -p <sha> shows me ...
  While git cat-file tree <sha> generates ...
 
 cat-file -p is meant to be human-readable form.  The latter gives
 the exact byte contents read_sha1_file() sees, which is a binary
 format.  Essentially, it is a sequence of:
 
  - mode of the entry encoded in octal, without any leading '0' pad;
  - pathname component of the entry, terminated with NUL;
  - 20-byte SHA-1 object name.

I always wondered why this is the sole object format where SHA-1 is in
20-byte binary format and not 40-character hexadecimal string format...
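
To make the two representations concrete, here is a small Python sketch
(not taken from Git's sources) that converts the first object name from the
original question between its 40-character hexadecimal form and the 20 raw
bytes stored in a tree entry. The variable names are illustrative only.

```python
# Object name of .gitignore as printed by "git cat-file -p" in the original question.
hex_name = "2beae51a0e14b3167fd7e81119972caef95779f4"

# A tree entry stores the same name as 20 raw bytes instead of 40 hex characters.
raw_name = bytes.fromhex(hex_name)

# Lengths of the two forms.
print(len(hex_name), len(raw_name))

# The first raw byte is 0x2b, which is ASCII '+'; most of the rest is not printable.
print(raw_name[:1])

# Converting back gives the hexadecimal form again.
print(raw_name.hex() == hex_name)
```

This is consistent with the raw output quoted in the question, where
".gitignore" is immediately followed by "+" and then unprintable bytes.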

-- 
Jakub Narębski






Exact format of tree objects

2013-06-11 Thread Chico Sokol
Is there any official documentation of the tree object format? Are tree
objects encoded specially in some way? How can I parse the inflated
contents of a tree object?

We suspect that there is some kind of special format or
encoding, because the command git cat-file -p <sha> shows me the
expected output, something like:

100644 blob 2beae51a0e14b3167fd7e81119972caef95779f4    .gitignore
100644 blob 7c817960e954f0278a6eee8d58611f61445167e8    LICENSE.txt
100644 blob 30e849cba985d74bfd29696f6dee5a40abaacb03    README
...


While git cat-file tree <sha> generates a strange output, which
indicates some kind of encoding problem. Something like:

100644 .gitignore+��▒,��Wy�100644
LICENSE.txt|�y`�T�'�n��XaaDQg�100644 README0�I˩��K�)


Thanks,







--
Chico Sokol


Re: Exact format of tree objects

2013-06-11 Thread Ilari Liusvaara
On Tue, Jun 11, 2013 at 01:25:14PM -0300, Chico Sokol wrote:
 Is there any official documentation of the tree object format? Are tree
 objects encoded specially in some way? How can I parse the inflated
 contents of a tree object?

A tree object consists of entries, each a concatenation of:
- Octal mode (using ASCII digits 0-7).
- Single SPACE (0x20)
- Filename
- Single NUL (0x00)
- 20-byte binary SHA-1 of referenced object.

At least the following octal modes are known:
40000: Directory (tree).
100644: Regular file (blob).
100755: Executable file (blob).
120000: Symbolic link (blob).
160000: Submodule (commit).
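
The entry layout above can be walked with a short loop. The following is a
minimal Python sketch, not Git's own code; parse_tree is a hypothetical
helper name, and it assumes the input holds only the entries (as printed by
git cat-file tree <sha>), with any object header already removed. Filenames
are kept as bytes because no encoding is specified here.

```python
def parse_tree(data: bytes):
    """Split the body of a tree object into (mode, name, sha_hex) tuples."""
    entries = []
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)         # the mode ends at the first space
        nul = data.index(b"\x00", space + 1)  # the filename ends at the NUL
        mode = data[pos:space].decode("ascii")
        name = data[space + 1:nul]            # raw bytes, no encoding assumed
        sha = data[nul + 1:nul + 21]          # 20-byte binary SHA-1
        if len(sha) != 20:
            raise ValueError("truncated tree entry")
        entries.append((mode, name, sha.hex()))
        pos = nul + 21
    return entries


# Hand-built sample entry using the first object name from the original question.
sample = b"100644 .gitignore\x00" + bytes.fromhex(
    "2beae51a0e14b3167fd7e81119972caef95779f4"
)
print(parse_tree(sample))
```

Searching for the first space is safe because the mode consists only of
digits; the filename itself may contain spaces but never a NUL.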

The entries are always sorted in (bytewise) lexicographical order,
except that directories sort as if there were an implied '/' at the end.

So e.g.:
!  0  9  a  a-  a- (directory)  a (directory)  a0  ab  b  z.


The idea of sorting directories specially is that if one recurses
upon hitting a directory and uses '/' as path separator, then the
full filenames are in bytewise lexicographical order.
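
The ordering rule can be sketched as a sort key in Python. This is an
illustration of the rule as described above, not Git's actual comparison
routine; tree_sort_key is a hypothetical name, and the entries are the ones
from the example list.

```python
def tree_sort_key(name: bytes, is_dir: bool) -> bytes:
    """Directories compare as if their name ended in '/'."""
    return name + b"/" if is_dir else name


# (name, is_dir) pairs from the example above, deliberately shuffled.
entries = [
    (b"ab", False), (b"a", True), (b"a0", False), (b"a-", True),
    (b"z", False), (b"a", False), (b"!", False), (b"a-", False),
    (b"0", False), (b"b", False), (b"9", False),
]

ordered = sorted(entries, key=lambda e: tree_sort_key(*e))

# Show directories with a trailing '/' to make the implied separator visible.
print([name.decode() + ("/" if is_dir else "") for name, is_dir in ordered])
```

The result places the directory "a-" before the directory "a" because '-'
(0x2d) sorts before '/' (0x2f), which in turn sorts before '0' (0x30).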

-Ilari


Re: Exact format of tree objects

2013-06-11 Thread Junio C Hamano
Chico Sokol chico.so...@gmail.com writes:

 Is there any official documentation of the tree object format? Are tree
 objects encoded specially in some way? How can I parse the inflated
 contents of a tree object?

 We suspect that there is some kind of special format or
 encoding, because the command git cat-file -p <sha> shows me ...
 While git cat-file tree <sha> generates ...

cat-file -p is meant to be human-readable form.  The latter gives
the exact byte contents read_sha1_file() sees, which is a binary
format.  Essentially, it is a sequence of:

 - mode of the entry encoded in octal, without any leading '0' pad;
 - pathname component of the entry, terminated with NUL;
 - 20-byte SHA-1 object name.

sorted in a particular order.
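
As a rough illustration of that sequence, the following Python sketch
encodes a single entry. It is not Git's implementation; encode_entry is a
hypothetical helper, and the single space between the mode and the pathname
follows the description in Ilari Liusvaara's reply in this thread.

```python
def encode_entry(mode: int, name: bytes, sha_hex: str) -> bytes:
    """Encode one tree entry: octal mode, space, pathname, NUL, 20-byte SHA-1."""
    raw = bytes.fromhex(sha_hex)
    if len(raw) != 20:
        raise ValueError("expected a 40-character hexadecimal SHA-1")
    # format(mode, "o") writes octal digits without any leading '0' pad.
    return format(mode, "o").encode("ascii") + b" " + name + b"\x00" + raw


# README entry from the original question.
entry = encode_entry(0o100644, b"README", "30e849cba985d74bfd29696f6dee5a40abaacb03")
print(len(entry))
print(entry[:14])
```

Taking the mode as an integer makes the "no leading zero" rule automatic:
a directory mode written as 0o040000 in source is still emitted as 40000.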

