Re: [PATCH/RFC v2 09/16] Read index-v5

2012-08-08 Thread Thomas Gummerer
On 08/05, Junio C Hamano wrote:
 Thomas Gummerer t.gumme...@gmail.com writes:
 
  +static struct directory_entry *read_directories_v5(unsigned int 
  *dir_offset,
  +   unsigned int *dir_table_offset,
  +   void *mmap,
  +   int mmap_size)
  +{
  +   int i, ondisk_directory_size;
  +   uint32_t *filecrc, *beginning, *end;
  +   struct directory_entry *current = NULL;
  +   struct ondisk_directory_entry *disk_de;
  +   struct directory_entry *de;
  +   unsigned int data_len, len;
  +   char *name;
  +
  +   ondisk_directory_size = sizeof(disk_de-flags)
  +   + sizeof(disk_de-foffset)
  +   + sizeof(disk_de-cr)
  +   + sizeof(disk_de-ncr)
  +   + sizeof(disk_de-nsubtrees)
  +   + sizeof(disk_de-nfiles)
  +   + sizeof(disk_de-nentries)
  +   + sizeof(disk_de-sha1);
  +   name = (char *)mmap + *dir_offset;
  +   beginning = mmap + *dir_table_offset;
 
 Notice how you computed name with pointer arithmetic by first
 casting mmap (which is void *) and when computing beginning, you
 forgot to cast mmap and attempted pointer arithmetic with void *.
 The latter does not work and breaks compilation.
 
 The pointer-arith with void * is not limited to this function.

Sorry for not noticing this, it always compiled fine for me. Guess
I should use -pedantic more often ;-)

 Please check the a band-aid (I wouldn't call it a fix-up) patch I
 added on top of the series before queuing the topic to 'pu'; it is
 primarily to illustrate the places I noticed that have this issue.
 
 I do not necessarily suggest that the way the band-aid patch makes
 it compile is the best approach.  It might be cleaner to use a saner
 type like char * (or perhaps const char *) as the type to point
 at a piece of memory you read from the disk.  I haven't formed an
 opinion.
 
 Thanks.

I've used the type of the respective assignment for now. e.g. i have
struct cache_header *hdr, so I'm using
hdr = (struct cache_header *)mmap + x;

read-cache-v5.c compiles with -pedantic without warnings.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC v2 09/16] Read index-v5

2012-08-08 Thread Junio C Hamano
Thomas Gummerer t.gumme...@gmail.com writes:

  +  name = (char *)mmap + *dir_offset;
  +  beginning = mmap + *dir_table_offset;
 
 Notice how you computed name with pointer arithmetic by first
 casting mmap (which is void *) and when computing beginning, you
 forgot to cast mmap and attempted pointer arithmetic with void *.
 The latter does not work and breaks compilation.
 
 The pointer-arith with void * is not limited to this function.
 ...
 I've used the type of the respective assignment for now. e.g. i have
 struct cache_header *hdr, so I'm using
 hdr = (struct cache_header *)mmap + x;

You need to be careful when rewriting the above to choose the right
value for 'x' if you go that route (which I wouldn't recommend).

With

hdr = ptr_add(mmap, x);

you are making hdr point at x BYTES beyond mmap, but

hdr = (struct cache_header *)mmap + x;

means something entirely different, no?  hdr points at x entries
of struct cache_header beyond mmap (in other words, if mmap[] were
defined as struct cache_header mmap[], the above is saying the
same as hdr = mmap[x]).

I think the way you casted to compute the value for the name
pointer is the (second) right thing to do.  The cast (char *)
applied to mmap is about mmap is a typeless blob of memory I want
to count bytes in.  Give me *dir_offset bytes into that blob.  It
is not tied to the type of LHS (i.e. name) at all.  The result
then needs to be casted to the type of LHS (i.e. name), and in
this case the types happen to be the same, so you do not have to
cast the result of the addition but that is mere luck.

The next line is not so lucky and you would need to say something
like:

beginning = (uint32_t *)((char *)mmap + *dir_table_offset);

Again, inner cast is about mmap is a blob counted in bytes, the
outer cast is about type mismatch between a byte-address and LHS of
the assignment.

If mmap variable in this function were not void * but something
more sane like const char *, you wouldn't have to have the inner
cast to begin with, and that is why I said the way you did name is
the second right thing.  Then you can write them like

name = mmap + *dir_offset;
beginning = (uint32_t *)(mmap + *dir_offset);

After thinking about this, the ptr_add() macro might be the best
solution, even though I originally called it as a band-aid.  We know
mmap is a blob of memory, byte-offset of each component of which we
know about, so we can say

name = ptr_add(mmap, *dir_offset);
beginning = ptr_add(mmap, *dir_offset);

Hmmm..


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC v2 09/16] Read index-v5

2012-08-08 Thread Thomas Gummerer


On 08/08, Junio C Hamano wrote:
 Thomas Gummerer t.gumme...@gmail.com writes:
 
   +name = (char *)mmap + *dir_offset;
   +beginning = mmap + *dir_table_offset;
  
  Notice how you computed name with pointer arithmetic by first
  casting mmap (which is void *) and when computing beginning, you
  forgot to cast mmap and attempted pointer arithmetic with void *.
  The latter does not work and breaks compilation.
  
  The pointer-arith with void * is not limited to this function.
  ...
  I've used the type of the respective assignment for now. e.g. i have
  struct cache_header *hdr, so I'm using
  hdr = (struct cache_header *)mmap + x;
 
 You need to be careful when rewriting the above to choose the right
 value for 'x' if you go that route (which I wouldn't recommend).
 
 With
 
 hdr = ptr_add(mmap, x);
 
 you are making hdr point at x BYTES beyond mmap, but
 
 hdr = (struct cache_header *)mmap + x;
 
 means something entirely different, no?  hdr points at x entries
 of struct cache_header beyond mmap (in other words, if mmap[] were
 defined as struct cache_header mmap[], the above is saying the
 same as hdr = mmap[x]).
 
 I think the way you casted to compute the value for the name
 pointer is the (second) right thing to do.  The cast (char *)
 applied to mmap is about mmap is a typeless blob of memory I want
 to count bytes in.  Give me *dir_offset bytes into that blob.  It
 is not tied to the type of LHS (i.e. name) at all.  The result
 then needs to be casted to the type of LHS (i.e. name), and in
 this case the types happen to be the same, so you do not have to
 cast the result of the addition but that is mere luck.
 
 The next line is not so lucky and you would need to say something
 like:
 
 beginning = (uint32_t *)((char *)mmap + *dir_table_offset);
 
 Again, inner cast is about mmap is a blob counted in bytes, the
 outer cast is about type mismatch between a byte-address and LHS of
 the assignment.

This is what I tried in v3 of the series, but it didn't seem quiet
right.

 If mmap variable in this function were not void * but something
 more sane like const char *, you wouldn't have to have the inner
 cast to begin with, and that is why I said the way you did name is
 the second right thing.  Then you can write them like
 
 name = mmap + *dir_offset;
 beginning = (uint32_t *)(mmap + *dir_offset);
 
 After thinking about this, the ptr_add() macro might be the best
 solution, even though I originally called it as a band-aid.  We know
 mmap is a blob of memory, byte-offset of each component of which we
 know about, so we can say
 
 name = ptr_add(mmap, *dir_offset);
 beginning = ptr_add(mmap, *dir_offset);
 
 Hmmm..

I start to think so too. Casting the mmap variable to const char *
in the method call doesn't feel right to me, even though it would work.
Unless there are any objections I'll use ptr_add in the next version.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH/RFC v2 09/16] Read index-v5

2012-08-05 Thread Thomas Gummerer
Make git read the index file version 5 without complaining.

This version of the reader doesn't read neither the cache-tree
nor the resolve undo data, but doesn't choke on an index that
includes such data.

Helped-by: Thomas Rast tr...@student.ethz.ch
Signed-off-by: Thomas Gummerer t.gumme...@gmail.com
---
 cache.h  |   72 +++
 read-cache.c |  590 +-
 2 files changed, 657 insertions(+), 5 deletions(-)

diff --git a/cache.h b/cache.h
index 076d6af..98adcd9 100644
--- a/cache.h
+++ b/cache.h
@@ -110,6 +110,15 @@ struct cache_time {
unsigned int nsec;
 };
 
+/*
+ * The *next pointer is used in read_entries_v5 for holding
+ * all the elements of a directory, and points to the next
+ * cache_entry in a directory.
+ *
+ * It is reset by the add_name_hash call in set_index_entry
+ * to set it to point to the next cache_entry in the
+ * correct in-memory format ordering.
+ */
 struct cache_entry {
struct cache_time ce_ctime;
struct cache_time ce_mtime;
@@ -128,11 +137,58 @@ struct cache_entry {
char name[FLEX_ARRAY]; /* more */
 };
 
+struct directory_entry {
+   struct directory_entry *next;
+   struct directory_entry *next_hash;
+   struct cache_entry *ce;
+   struct cache_entry *ce_last;
+   struct conflict_entry *conflict;
+   struct conflict_entry *conflict_last;
+   unsigned int conflict_size;
+   unsigned int de_foffset;
+   unsigned int de_cr;
+   unsigned int de_ncr;
+   unsigned int de_nsubtrees;
+   unsigned int de_nfiles;
+   unsigned int de_nentries;
+   unsigned char sha1[20];
+   unsigned short de_flags;
+   unsigned int de_pathlen;
+   char pathname[FLEX_ARRAY];
+};
+
+struct conflict_part {
+   struct conflict_part *next;
+   unsigned short flags;
+   unsigned short entry_mode;
+   unsigned char sha1[20];
+};
+
+struct conflict_entry {
+   struct conflict_entry *next;
+   unsigned int nfileconflicts;
+   struct conflict_part *entries;
+   unsigned int namelen;
+   unsigned int pathlen;
+   char name[FLEX_ARRAY];
+};
+
+struct ondisk_conflict_part {
+   unsigned short flags;
+   unsigned short entry_mode;
+   unsigned char sha1[20];
+};
+
+#define CE_NAMEMASK  (0x0fff)
 #define CE_STAGEMASK (0x3000)
 #define CE_EXTENDED  (0x4000)
 #define CE_VALID (0x8000)
 #define CE_STAGESHIFT 12
 
+#define CONFLICT_CONFLICTED (0x8000)
+#define CONFLICT_STAGESHIFT 13
+#define CONFLICT_STAGEMASK (0x6000)
+
 /*
  * Range 0x in ce_flags is divided into
  * two parts: in-memory flags and on-disk ones.
@@ -166,6 +222,18 @@ struct cache_entry {
 #define CE_EXTENDED_FLAGS (CE_INTENT_TO_ADD | CE_SKIP_WORKTREE)
 
 /*
+ * Representation of the extended on-disk flags in the v5 format.
+ * They must not collide with the ordinary on-disk flags, and need to
+ * fit in 16 bits.  Note however that v5 does not save the name
+ * length.
+ */
+#define CE_INTENT_TO_ADD_V5  (0x4000)
+#define CE_SKIP_WORKTREE_V5  (0x0800)
+#if (CE_VALID|CE_STAGEMASK)  (CE_INTENTTOADD_V5|CE_SKIPWORKTREE_V5)
+#error v5 on-disk flags collide with ordinary on-disk flags
+#endif
+
+/*
  * Safeguard to avoid saving wrong flags:
  *  - CE_EXTENDED2 won't get saved until its semantic is known
  *  - Bits in 0x have been saved in ce_flags already
@@ -203,6 +271,8 @@ static inline unsigned create_ce_flags(unsigned stage)
 #define ce_skip_worktree(ce) ((ce)-ce_flags  CE_SKIP_WORKTREE)
 #define ce_mark_uptodate(ce) ((ce)-ce_flags |= CE_UPTODATE)
 
+#define conflict_stage(c) ((CONFLICT_STAGEMASK  (c)-flags)  
CONFLICT_STAGESHIFT)
+
 #define ce_permissions(mode) (((mode)  0100) ? 0755 : 0644)
 static inline unsigned int create_ce_mode(unsigned int mode)
 {
@@ -249,6 +319,8 @@ static inline unsigned int canon_mode(unsigned int mode)
 }
 
 #define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1)
+#define directory_entry_size(len) (offsetof(struct directory_entry,pathname) + 
(len) + 1)
+#define conflict_entry_size(len) (offsetof(struct conflict_entry,name) + (len) 
+ 1)
 
 struct index_state {
struct cache_entry **cache;
diff --git a/read-cache.c b/read-cache.c
index 4243606..70334f9 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -234,6 +234,55 @@ static int ce_match_stat_basic_v2(struct cache_entry *ce,
return changed;
 }
 
+static int match_stat_crc(struct stat *st, uint32_t expected_crc)
+{
+   uint32_t data, stat_crc = 0;
+   unsigned int ctimens = 0;
+
+   data = htonl(st-st_ctime);
+   stat_crc = crc32(0, (Bytef*)data, 4);
+#ifdef USE_NSEC
+   ctimens = ST_MTIME_NSEC(*st);
+#endif
+   data = htonl(ctimens);
+   stat_crc = crc32(stat_crc, (Bytef*)data, 4);
+   data = htonl(st-st_ino);
+   stat_crc = crc32(stat_crc, (Bytef*)data, 4);
+   data = htonl(st-st_size);
+   stat_crc = crc32(stat_crc, (Bytef*)data, 4);
+   data = htonl(st-st_dev);
+