Re: [PATCH/RFC v2 15/16] update-index.c: add a force-rewrite option

2012-08-08 Thread Thomas Gummerer
On 08/05, Junio C Hamano wrote:
 Thomas Gummerer t.gumme...@gmail.com writes:
 
  Add a force-rewrite option to update-index, which allows the user
  to rewrite the index, even if there are no changes. This can be used
  to do performance tests of both the reader and the writer.
 
  Signed-off-by: Thomas Gummerer t.gumme...@gmail.com
  ---
   builtin/update-index.c |5 -
   1 file changed, 4 insertions(+), 1 deletion(-)
 
 I do not think this is wrong per-se, but is a new command that needs
 to be documented?  If it is only for benchmarking and debugging, it
 might be sufficient to make --index-version n always rewrite the
 index.

The command is only for benchmarking, I don't see another case where
it makes sense for anyone to rewrite the whole index, without changing
anything. I've made --index-version rewrite the index for the re-roll.

  diff --git a/builtin/update-index.c b/builtin/update-index.c
  index 4ce341c..7fedc8f 100644
  --- a/builtin/update-index.c
  +++ b/builtin/update-index.c
  @@ -24,6 +24,7 @@ static int allow_remove;
   static int allow_replace;
   static int info_only;
   static int force_remove;
  +static int force_rewrite;
   static int verbose;
   static int mark_valid_only;
   static int mark_skip_worktree_only;
  @@ -728,6 +729,8 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
   		OPT_BIT(0, "unmerged", &refresh_args.flags,
   			"refresh even if index contains unmerged entries",
   			REFRESH_UNMERGED),
  +		OPT_SET_INT(0, "force-rewrite", &force_rewrite,
  +			"force a index rewrite even if there is no change", 1),
   		{OPTION_CALLBACK, 0, "refresh", &refresh_args, NULL,
   			"refresh stat information",
   			PARSE_OPT_NOARG | PARSE_OPT_NONEG,
  @@ -886,7 +889,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
   		strbuf_release(&buf);
   	}
   
  -	if (active_cache_changed) {
  +	if (active_cache_changed || force_rewrite) {
   		if (newfd < 0) {
   			if (refresh_args.flags & REFRESH_QUIET)
   				exit(128);
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC v2 09/16] Read index-v5

2012-08-08 Thread Thomas Gummerer
On 08/05, Junio C Hamano wrote:
 Thomas Gummerer t.gumme...@gmail.com writes:
 
  +static struct directory_entry *read_directories_v5(unsigned int *dir_offset,
  +				unsigned int *dir_table_offset,
  +				void *mmap,
  +				int mmap_size)
  +{
  +	int i, ondisk_directory_size;
  +	uint32_t *filecrc, *beginning, *end;
  +	struct directory_entry *current = NULL;
  +	struct ondisk_directory_entry *disk_de;
  +	struct directory_entry *de;
  +	unsigned int data_len, len;
  +	char *name;
  +
  +	ondisk_directory_size = sizeof(disk_de->flags)
  +		+ sizeof(disk_de->foffset)
  +		+ sizeof(disk_de->cr)
  +		+ sizeof(disk_de->ncr)
  +		+ sizeof(disk_de->nsubtrees)
  +		+ sizeof(disk_de->nfiles)
  +		+ sizeof(disk_de->nentries)
  +		+ sizeof(disk_de->sha1);
  +	name = (char *)mmap + *dir_offset;
  +	beginning = mmap + *dir_table_offset;
 
 Notice how you computed name with pointer arithmetic by first
 casting mmap (which is void *) and when computing beginning, you
 forgot to cast mmap and attempted pointer arithmetic with void *.
 The latter does not work and breaks compilation.
 
 The pointer-arith with void * is not limited to this function.

Sorry for not noticing this, it always compiled fine for me. Guess
I should use -pedantic more often ;-)

 Please check the band-aid (I wouldn't call it a fix-up) patch I
 added on top of the series before queuing the topic to 'pu'; it is
 primarily to illustrate the places I noticed that have this issue.
 
 I do not necessarily suggest that the way the band-aid patch makes
 it compile is the best approach.  It might be cleaner to use a saner
 type like char * (or perhaps const char *) as the type to point
 at a piece of memory you read from the disk.  I haven't formed an
 opinion.
 
 Thanks.

I've used the type of the respective assignment for now. For example, I
have struct cache_header *hdr, so I'm using
hdr = (struct cache_header *)mmap + x;

read-cache-v5.c compiles with -pedantic without warnings.
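
One point worth checking about the cast quoted above: a cast binds tighter than `+`, so `(struct cache_header *)mmap + x` advances by `x` elements of `struct cache_header` rather than by `x` bytes. The sketch below is only an illustration under stated assumptions: `struct hdr`, `ptr_add`, `scaled_step` and `byte_step` are hypothetical names that do not appear in the series. It shows the difference and one way to do byte-offset arithmetic on a `void *` buffer that is valid ISO C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record, standing in for an on-disk header. */
struct hdr {
	uint32_t a;
	uint32_t b;
};

/* Byte-offset arithmetic that is valid ISO C: go through char *. */
static const void *ptr_add(const void *base, size_t off)
{
	return (const char *)base + off;
}

/* Distance in bytes between two pointers into the same buffer. */
static ptrdiff_t byte_distance(const void *from, const void *to)
{
	return (const char *)to - (const char *)from;
}

/* Bytes advanced when the cast is applied before the addition. */
static ptrdiff_t scaled_step(const void *base, size_t x)
{
	const struct hdr *h = (const struct hdr *)base + x;

	return byte_distance(base, h);
}

/* Bytes advanced when the offset is applied before the cast. */
static ptrdiff_t byte_step(const void *base, size_t x)
{
	const struct hdr *h = (const struct hdr *)ptr_add(base, x);

	return byte_distance(base, h);
}
```

If `x` is meant as a byte offset into the mapped file, the second form is the one that matches that intent.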


[PATCH] Documentation: list git-credential in plumbing commands

2012-08-08 Thread Matthieu Moy
Commit e30b2feb1b (Jun 24 2012, add 'git credential' plumbing command)
forgot to add git-credential to command-list.txt, hence the command was
not appearing in the documentation, making it hard for users to discover
it.

While we're there, capitalize the description line for git-credential
for consistency with other commands.

Signed-off-by: Matthieu Moy matthieu@imag.fr
---
 Documentation/git-credential.txt | 2 +-
 command-list.txt | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-credential.txt b/Documentation/git-credential.txt
index 53adee3..810e957 100644
--- a/Documentation/git-credential.txt
+++ b/Documentation/git-credential.txt
@@ -3,7 +3,7 @@ git-credential(1)
 
 NAME
 
-git-credential - retrieve and store user credentials
+git-credential - Retrieve and store user credentials
 
 SYNOPSIS
 
diff --git a/command-list.txt b/command-list.txt
index 14ea67a..ec64cac 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -25,6 +25,7 @@ git-commit  mainporcelain common
 git-commit-tree plumbingmanipulators
 git-config  ancillarymanipulators
 git-count-objects   ancillaryinterrogators
+git-credential  purehelpers
 git-cvsexportcommit foreignscminterface
 git-cvsimport   foreignscminterface
 git-cvsserver   foreignscminterface
-- 
1.7.12.rc1.183.gb94da76



Re: [PATCH/RFC v2 0/16] Introduce index file format version 5

2012-08-08 Thread Thomas Rast
Junio C Hamano gits...@pobox.com writes:

 Thomas Rast tr...@student.ethz.ch writes:

 I like the general idea, too, but I think there is a long way ahead, and
 we shouldn't hold up v5 on this.

 We shouldn't rush, only to keep some deadline, and regret it later
 that we butchered the index format without thinking things through.
 When this was added to the GSoC idea page, I already said upfront
 that this was way too big a topic to be a GSoC project, didn't I?

Let me spell out my concern.  There are two v5s here:

* The extent of the GSoC task.

* The eventual implementation of index-v5 that goes into Git mainline.

IMHO this thread is mixing up the two.  There indeed must not be any
rush in the final implementation of index-v5.  However, the GSoC ends in
less than two weeks, and I have to evaluate Thomas on whatever is
finished until then.

AFAIK Thomas is now cleaning up the existing code to be in readable
shape, using your feedback, which is great.  However, the above
suggestion is such a fuzzily-specified task that there is no way to even
find out what needs to be done within the next two weeks.  Perhaps it
makes sense, at this point, to wrap anything that ended up having _v[25]
suffixes in an index_ops like Duy did.  That's a long way from actually
following through on the idea, though.

 [...] The new on-disk format is different from
 the current one, and as it is different from the current one, we can
 easily enhance it even more by hooking anything interesting to it!
 does not sound like a valid argument.  

 For example, for v5 it
 would be far better if conflicted and resolve-undo entries were a
 property of the normal index entry, instead of something that so happens
 to be consecutive entries and in a completely different place,
 respectively.

 I am not sure I am convinced.  Conflicts are already expressed by an
 attribute on a normal index entry (it is called stage), and
 because we check for is the index fully merged fairly often, it
 makes sense to have it in each entry.  Actually having an unmerged
 entry is a rare event (happens only during a mergy operation that
 gave control back to you), so we do not lose much by expressing them
 as consecutive entries.  Resolve-undo is far less often used, and is
 not an essential feature, so it makes perfect sense to have it as an
 optional index extension to allow versions of Git that are unaware
 of it to still use an index file that has it.

I picked this example because in the big picture, the current code goes
to silly contortions to shuffle data around.  Conflicts and resolve-undo
entries are two faces of the same coin, but the code does not express
this at all.  Whenever the user resolves a conflict, it removes the
existing index entries (consecutive in a flat table) and inserts them in
the resolve-undo tree (tree-shaped where every entry has all stages
embedded).  When using 'checkout -m' to recover the conflict, it goes
the other way.

v5 would simplify this: the difference between a conflict and a
resolve-undo entry is only one bit.  But because it needs to maintain v2
compatibility, it first untangles the mixed conflict/resolve-undo data
and puts them in the right format, then later reassembles them.

So "v5 could do it faster if all the code were written for it" is only
half of it.  v5's data layout would also result in simpler data flow,
but as long as it is not allowed to exploit this, it's actually *more*
layers of complexity.

I think the part you snipped

 the loops that iterate over the index [...] either
 skip unmerged entries or specifically look for them.  There are subtle
 differences between the loops on many points: what do they do when they
 hit an unmerged entry?  Or a CE_REMOVED or CE_VALID one?

is a symptom of the same general problem: the data structures are sound,
but they are leaking all over the code and now we have lots of
complexity to do even simple operations like "for each unmerged entry".
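
To make the "for each unmerged entry" idea concrete, here is a minimal sketch of what a single shared iterator could look like. It is an assumption-laden illustration, not code from git: `struct entry`, `entry_stage` and `for_each_unmerged` are hypothetical names, and only the stage mask and shift values are taken from cache.h.

```c
#include <assert.h>
#include <stddef.h>

/* Stage bits as laid out in the on-disk flags (values from cache.h). */
#define CE_STAGEMASK  0x3000
#define CE_STAGESHIFT 12

/* Hypothetical, stripped-down stand-in for struct cache_entry. */
struct entry {
	const char *name;
	unsigned int flags;
};

/* Stage 0 means merged; stages 1 to 3 mark an unmerged entry. */
static int entry_stage(const struct entry *e)
{
	return (e->flags & CE_STAGEMASK) >> CE_STAGESHIFT;
}

typedef void (*unmerged_fn)(const struct entry *e, void *data);

/*
 * Call fn once per entry whose stage is non-zero and return how
 * many such entries were seen. fn may be NULL to only count.
 */
static size_t for_each_unmerged(const struct entry *entries, size_t nr,
				unmerged_fn fn, void *data)
{
	size_t i, seen = 0;

	for (i = 0; i < nr; i++) {
		if (!entry_stage(&entries[i]))
			continue;
		seen++;
		if (fn)
			fn(&entries[i], data);
	}
	return seen;
}
```

With one helper like this, the policy questions raised above (what to do on an unmerged entry) would live in the callback instead of being repeated in every loop.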

-- 
Thomas Rast
trast@{inf,student}.ethz.ch


Re: [PATCH/RFC v2 0/16] Introduce index file format version 5

2012-08-08 Thread Nguyen Thai Ngoc Duy
On Wed, Aug 8, 2012 at 5:31 AM, Thomas Rast tr...@student.ethz.ch wrote:
 Thomas and me -- it was mostly my bad idea -- spent some time going
 through all the loops that iterate over the index.  You can get some
 taste of it with 'git grep ce_stage', mostly because many of them either
 skip unmerged entries or specifically look for them.  There are subtle
 differences between the loops on many points: what do they do when they
 hit an unmerged entry?

Most of them ignore unmerged entries, git-add and git-update-index can
remove unmerged entries, unpack-trees (reset, merge, checkout...) can
generate them. What's the problem with it?

 Or a CE_REMOVED or CE_VALID one?

CE_VALID is assume-unchanged feature. I don't think we have problems with it.

CE_REMOVED is to say we are going to remove this entry both in index
and worktree, but if we remove it now we would have no way to know
which file in worktree to be removed later on, so we just mark it here
as a ghost entry in index. It's only used by unpack-trees, I think.
From the index pov, CE_REMOVED entries never get written to file. It
may complicate tree building for v5.
-- 
Duy


Re: [PATCH 2/2] Add index-v5

2012-08-08 Thread Thomas Gummerer


On 08/07, Robin Rosenberg wrote:
 Nguyễn Thái Ngọc Duy skrev 2012-08-06 16.36:
 
 +++ b/read-cache-v5.c
 @@ -0,0 +1,1170 @@
 +#include "cache.h"
 +#include "read-cache.h"
 +#include "resolve-undo.h"
 +#include "cache-tree.h"
 +
 +struct cache_header_v5 {
 +unsigned int hdr_ndir;
 +unsigned int hdr_nfile;
 +unsigned int hdr_fblockoffset;
 +unsigned int hdr_nextension;
 +};
 +
 +struct ondisk_cache_entry_v5 {
 +unsigned short flags;
 +unsigned short mode;
 +struct cache_time mtime;
 +int stat_crc;
 +unsigned char sha1[20];
 +};
 
 I mentioned this before in another thread, but for JGit I'd like
 to see size as a separate attribute. The rest of stat_crc is not
 available to Java so when this index gets its way into JGit,
 stat_crc will be zero and will never be checked.
 

I'm sorry for forgetting to add this, it will be included in the
re-roll.  The stat_crc will be ignored if it is 0 in the ondisk
index.
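
A rough sketch of the "ignore if 0" rule, assuming a hypothetical helper name (`stat_crc_unchanged` is not from the series):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 when the entry should be treated as
 * unchanged as far as stat_crc is concerned. A stored value of 0 is
 * taken to mean "not recorded" and is never compared.
 */
static int stat_crc_unchanged(uint32_t ondisk_crc, uint32_t computed_crc)
{
	if (!ondisk_crc)
		return 1;
	return ondisk_crc == computed_crc;
}
```

One consequence of this convention is that an entry whose real CRC happens to be 0 is also skipped, so the check is slightly weaker for that one value.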


[PATCH/RFC v3 05/13] Make in-memory format aware of stat_crc

2012-08-08 Thread Thomas Gummerer
Make the in-memory format aware of the stat_crc used by index-v5.
It is simply ignored by index version prior to v5.

Signed-off-by: Thomas Gummerer t.gumme...@gmail.com
---
 cache.h  |1 +
 read-cache.c |   25 +
 2 files changed, 26 insertions(+)

diff --git a/cache.h b/cache.h
index c77cdbe..bfe3099 100644
--- a/cache.h
+++ b/cache.h
@@ -122,6 +122,7 @@ struct cache_entry {
unsigned int ce_flags;
unsigned int ce_namelen;
unsigned char sha1[20];
+   uint32_t ce_stat_crc;
struct cache_entry *next;
struct cache_entry *dir_next;
char name[FLEX_ARRAY]; /* more */
diff --git a/read-cache.c b/read-cache.c
index 125e6a0..d8f8b74 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -51,6 +51,29 @@ void rename_index_entry_at(struct index_state *istate, int nr, const char *new_n
 	add_index_entry(istate, new, ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE);
 }
 
+static uint32_t calculate_stat_crc(struct cache_entry *ce)
+{
+	unsigned int ctimens = 0;
+	uint32_t stat, stat_crc;
+
+	stat = htonl(ce->ce_ctime.sec);
+	stat_crc = crc32(0, (Bytef*)&stat, 4);
+#ifdef USE_NSEC
+	ctimens = ce->ce_ctime.nsec;
+#endif
+	stat = htonl(ctimens);
+	stat_crc = crc32(stat_crc, (Bytef*)&stat, 4);
+	stat = htonl(ce->ce_ino);
+	stat_crc = crc32(stat_crc, (Bytef*)&stat, 4);
+	stat = htonl(ce->ce_dev);
+	stat_crc = crc32(stat_crc, (Bytef*)&stat, 4);
+	stat = htonl(ce->ce_uid);
+	stat_crc = crc32(stat_crc, (Bytef*)&stat, 4);
+	stat = htonl(ce->ce_gid);
+	stat_crc = crc32(stat_crc, (Bytef*)&stat, 4);
+	return stat_crc;
+}
+
 /*
  * This only updates the non-critical parts of the directory
  * cache, ie the parts that aren't tracked by GIT, and only used
@@ -73,6 +96,8 @@ void fill_stat_cache_info(struct cache_entry *ce, struct stat *st)
 
 	if (S_ISREG(st->st_mode))
 		ce_mark_uptodate(ce);
+
+	ce->ce_stat_crc = calculate_stat_crc(ce);
 }
 
 static int ce_compare_data(struct cache_entry *ce, struct stat *st)
-- 
1.7.10.GIT
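
For readers who want to experiment with the stat_crc idea outside of git, here is a self-contained sketch. It makes several assumptions: it uses a small bitwise CRC-32 (`crc32_update`) instead of zlib's `crc32()`, and `struct stat_fields`, `crc32_be32` and `stat_crc` are hypothetical names. The field order follows the patch (ctime seconds, ctime nanoseconds, ino, dev, uid, gid), each fed in network byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), chainable like zlib's crc32(). */
static uint32_t crc32_update(uint32_t crc, const unsigned char *buf, size_t len)
{
	int k;

	crc = ~crc;
	while (len--) {
		crc ^= *buf++;
		for (k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Feed one 32-bit value into the CRC in network (big-endian) byte order. */
static uint32_t crc32_be32(uint32_t crc, uint32_t value)
{
	unsigned char be[4];

	be[0] = (unsigned char)(value >> 24);
	be[1] = (unsigned char)(value >> 16);
	be[2] = (unsigned char)(value >> 8);
	be[3] = (unsigned char)value;
	return crc32_update(crc, be, sizeof(be));
}

/* Hypothetical subset of the stat fields that the patch folds in. */
struct stat_fields {
	uint32_t ctime_sec, ctime_nsec, ino, dev, uid, gid;
};

/* Combine the six fields into one checksum, in the order used by the patch. */
static uint32_t stat_crc(const struct stat_fields *s)
{
	uint32_t crc = 0;

	crc = crc32_be32(crc, s->ctime_sec);
	crc = crc32_be32(crc, s->ctime_nsec);
	crc = crc32_be32(crc, s->ino);
	crc = crc32_be32(crc, s->dev);
	crc = crc32_be32(crc, s->uid);
	crc = crc32_be32(crc, s->gid);
	return crc;
}
```

Serializing each field explicitly, rather than calling `htonl()` and taking the address of the result, keeps the sketch independent of host byte order and of any networking headers.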



[PATCH/RFC v3 02/13] t2104: Don't fail for index versions other than [23]

2012-08-08 Thread Thomas Gummerer
t2104 currently checks for the exact index version 2 or 3,
depending if there is a skip-worktree flag or not. Other
index versions do not use extended flags and thus cannot
be tested for version changes.

Make this test update the index to version 2 at the beginning
of the test. Testing the skip-worktree flags for the default
index format is still covered by t7011 and t7012.

Signed-off-by: Thomas Gummerer t.gumme...@gmail.com
---
 t/t2104-update-index-skip-worktree.sh |1 +
 1 file changed, 1 insertion(+)

diff --git a/t/t2104-update-index-skip-worktree.sh b/t/t2104-update-index-skip-worktree.sh
index 1d0879b..bd9644f 100755
--- a/t/t2104-update-index-skip-worktree.sh
+++ b/t/t2104-update-index-skip-worktree.sh
@@ -22,6 +22,7 @@ H sub/2
 EOF
 
 test_expect_success 'setup' '
+	git update-index --index-version=2 &&
 	mkdir sub &&
 	touch ./1 ./2 sub/1 sub/2 &&
 	git add 1 2 sub/1 sub/2 &&
-- 
1.7.10.GIT



[PATCH/RFC v3 13/13] p0002-index.sh: add perf test for the index formats

2012-08-08 Thread Thomas Gummerer
From: Thomas Rast tr...@student.ethz.ch

Add a performance test for index version [23]/4/5 by using
git update-index --index-version=[345], thus testing both the reader
and the writer speed of all index formats.

Signed-off-by: Thomas Rast tr...@student.ethz.ch
Signed-off-by: Thomas Gummerer t.gumme...@gmail.com
---
 t/perf/p0002-index.sh |   33 +
 1 file changed, 33 insertions(+)
 create mode 100755 t/perf/p0002-index.sh

diff --git a/t/perf/p0002-index.sh b/t/perf/p0002-index.sh
new file mode 100755
index 000..140c7a0
--- /dev/null
+++ b/t/perf/p0002-index.sh
@@ -0,0 +1,33 @@
+#!/bin/sh
+
+test_description="Tests index versions [23]/4/5"
+
+. ./perf-lib.sh
+
+test_perf_large_repo
+
+test_expect_success 'convert to v3' '
+	git update-index --index-version=3
+'
+
+test_perf 'v[23]: update-index' '
+	git update-index --index-version=3 >/dev/null
+'
+
+test_expect_success 'convert to v4' '
+	git update-index --index-version=4
+'
+
+test_perf 'v4: update-index' '
+	git update-index --index-version=4 >/dev/null
+'
+
+test_expect_success 'convert to v5' '
+	git update-index --index-version=5
+'
+
+test_perf 'v5: update-index' '
+	git update-index --index-version=5 >/dev/null
+'
+
+test_done
-- 
1.7.10.GIT



[PATCH/RFC v3 06/13] Read index-v5

2012-08-08 Thread Thomas Gummerer
Make git read the index file version 5 without complaining.

This version of the reader reads neither the cache-tree
nor the resolve-undo data, but doesn't choke on an index that
includes such data.

Helped-by: Nguyen Thai Ngoc Duy pclo...@gmail.com
Helped-by: Thomas Rast tr...@student.ethz.ch
Signed-off-by: Thomas Gummerer t.gumme...@gmail.com
---
 Makefile|1 +
 cache.h |   72 +++
 read-cache-v5.c |  589 +++
 read-cache.c|1 -
 4 files changed, 662 insertions(+), 1 deletion(-)
 create mode 100644 read-cache-v5.c

diff --git a/Makefile b/Makefile
index b4a7c73..77be175 100644
--- a/Makefile
+++ b/Makefile
@@ -770,6 +770,7 @@ LIB_OBJS += quote.o
 LIB_OBJS += reachable.o
 LIB_OBJS += read-cache.o
 LIB_OBJS += read-cache-v2.o
+LIB_OBJS += read-cache-v5.o
 LIB_OBJS += reflog-walk.o
 LIB_OBJS += refs.o
 LIB_OBJS += remote.o
diff --git a/cache.h b/cache.h
index bfe3099..a0a1781 100644
--- a/cache.h
+++ b/cache.h
@@ -110,6 +110,15 @@ struct cache_time {
unsigned int nsec;
 };
 
+/*
+ * The *next pointer is used in read_entries_v5 for holding
+ * all the elements of a directory, and points to the next
+ * cache_entry in a directory.
+ *
+ * It is reset by the add_name_hash call in set_index_entry
+ * to set it to point to the next cache_entry in the
+ * correct in-memory format ordering.
+ */
 struct cache_entry {
struct cache_time ce_ctime;
struct cache_time ce_mtime;
@@ -128,11 +137,58 @@ struct cache_entry {
char name[FLEX_ARRAY]; /* more */
 };
 
+struct directory_entry {
+   struct directory_entry *next;
+   struct directory_entry *next_hash;
+   struct cache_entry *ce;
+   struct cache_entry *ce_last;
+   struct conflict_entry *conflict;
+   struct conflict_entry *conflict_last;
+   unsigned int conflict_size;
+   unsigned int de_foffset;
+   unsigned int de_cr;
+   unsigned int de_ncr;
+   unsigned int de_nsubtrees;
+   unsigned int de_nfiles;
+   unsigned int de_nentries;
+   unsigned char sha1[20];
+   unsigned short de_flags;
+   unsigned int de_pathlen;
+   char pathname[FLEX_ARRAY];
+};
+
+struct conflict_part {
+   struct conflict_part *next;
+   unsigned short flags;
+   unsigned short entry_mode;
+   unsigned char sha1[20];
+};
+
+struct conflict_entry {
+   struct conflict_entry *next;
+   unsigned int nfileconflicts;
+   struct conflict_part *entries;
+   unsigned int namelen;
+   unsigned int pathlen;
+   char name[FLEX_ARRAY];
+};
+
+struct ondisk_conflict_part {
+   unsigned short flags;
+   unsigned short entry_mode;
+   unsigned char sha1[20];
+};
+
+#define CE_NAMEMASK  (0x0fff)
 #define CE_STAGEMASK (0x3000)
 #define CE_EXTENDED  (0x4000)
 #define CE_VALID (0x8000)
 #define CE_STAGESHIFT 12
 
+#define CONFLICT_CONFLICTED (0x8000)
+#define CONFLICT_STAGESHIFT 13
+#define CONFLICT_STAGEMASK (0x6000)
+
 /*
  * Range 0xFFFF0000 in ce_flags is divided into
  * two parts: in-memory flags and on-disk ones.
@@ -166,6 +222,18 @@ struct cache_entry {
 #define CE_EXTENDED_FLAGS (CE_INTENT_TO_ADD | CE_SKIP_WORKTREE)
 
 /*
+ * Representation of the extended on-disk flags in the v5 format.
+ * They must not collide with the ordinary on-disk flags, and need to
+ * fit in 16 bits.  Note however that v5 does not save the name
+ * length.
+ */
+#define CE_INTENT_TO_ADD_V5  (0x4000)
+#define CE_SKIP_WORKTREE_V5  (0x0800)
+#if (CE_VALID|CE_STAGEMASK) & (CE_INTENT_TO_ADD_V5|CE_SKIP_WORKTREE_V5)
+#error "v5 on-disk flags collide with ordinary on-disk flags"
+#endif
+
+/*
  * Safeguard to avoid saving wrong flags:
  *  - CE_EXTENDED2 won't get saved until its semantic is known
  *  - Bits in 0x0000FFFF have been saved in ce_flags already
@@ -203,6 +271,8 @@ static inline unsigned create_ce_flags(unsigned stage)
 #define ce_skip_worktree(ce) ((ce)->ce_flags & CE_SKIP_WORKTREE)
 #define ce_mark_uptodate(ce) ((ce)->ce_flags |= CE_UPTODATE)
 
+#define conflict_stage(c) ((CONFLICT_STAGEMASK & (c)->flags) >> CONFLICT_STAGESHIFT)
+
 #define ce_permissions(mode) (((mode) & 0100) ? 0755 : 0644)
 static inline unsigned int create_ce_mode(unsigned int mode)
 {
@@ -249,6 +319,8 @@ static inline unsigned int canon_mode(unsigned int mode)
 }
 
 #define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1)
+#define directory_entry_size(len) (offsetof(struct directory_entry,pathname) + (len) + 1)
+#define conflict_entry_size(len) (offsetof(struct conflict_entry,name) + (len) + 1)
 
 struct index_state {
struct cache_entry **cache;
diff --git a/read-cache-v5.c b/read-cache-v5.c
new file mode 100644
index 000..ec1201d
--- /dev/null
+++ b/read-cache-v5.c
@@ -0,0 +1,589 @@
+#include "cache.h"
+#include "read-cache.h"
+#include "resolve-undo.h"
+#include "cache-tree.h"
+
+struct cache_header {
+   unsigned int hdr_ndir;
+   unsigned int hdr_nfile;
+   unsigned int 

[PATCH/RFC v3 08/13] Read cache-tree in index-v5

2012-08-08 Thread Thomas Gummerer
Since the cache-tree data is saved as part of the directory data,
we already read it at the beginning of the index. The cache-tree
is only converted from this directory data.

The cache-tree data is arranged in a tree, with the children sorted by
pathlen at each node, while the ondisk format is sorted lexically.
So we have to rebuild this format from the on-disk directory list.
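
The difference between the two orderings can be seen in a small standalone sketch. It assumes a length-first comparison in the spirit of `subtree_name_cmp` (shorter names first, then byte order); the function names below are hypothetical and the code is not taken from cache-tree.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Shorter names sort first; equal lengths fall back to byte order. */
static int name_cmp_by_len(const char *one, size_t onelen,
			   const char *two, size_t twolen)
{
	if (onelen < twolen)
		return -1;
	if (twolen < onelen)
		return 1;
	return memcmp(one, two, onelen);
}

/* qsort adapter for the length-first order. */
static int cmp_len_first(const void *a, const void *b)
{
	const char *x = *(const char *const *)a;
	const char *y = *(const char *const *)b;

	return name_cmp_by_len(x, strlen(x), y, strlen(y));
}

/* qsort adapter for plain lexical order. */
static int cmp_lexical(const void *a, const void *b)
{
	return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Sort directory names into cache-tree (length-first) order. */
static void sort_len_first(const char **names, size_t nr)
{
	qsort(names, nr, sizeof(*names), cmp_len_first);
}

/* Sort directory names into plain lexical order. */
static void sort_lexical(const char **names, size_t nr)
{
	qsort(names, nr, sizeof(*names), cmp_lexical);
}
```

For the names "lib", "t", "docs" and "a", lexical order gives a, docs, lib, t, while length-first order gives a, t, lib, docs, which is why the directory list cannot be reused as-is.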

Signed-off-by: Thomas Gummerer t.gumme...@gmail.com
---
 cache-tree.c|   93 +++
 cache-tree.h|   10 ++
 read-cache-v5.c |1 +
 3 files changed, 104 insertions(+)

diff --git a/cache-tree.c b/cache-tree.c
index 28ed657..440cd04 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -519,6 +519,99 @@ struct cache_tree *cache_tree_read(const char *buffer, unsigned long size)
return read_one(buffer, size);
 }
 
+static struct cache_tree *convert_one(struct directory_queue *queue, int dirnr)
+{
+	int i, subtree_nr;
+	struct cache_tree *it;
+	struct directory_queue *down;
+
+	it = cache_tree();
+	it->entry_count = queue[dirnr].de->de_nentries;
+	subtree_nr = queue[dirnr].de->de_nsubtrees;
+	if (0 <= it->entry_count)
+		hashcpy(it->sha1, queue[dirnr].de->sha1);
+
+	/*
+	 * Just a heuristic -- we do not add directories that often but
+	 * we do not want to have to extend it immediately when we do,
+	 * hence +2.
+	 */
+	it->subtree_alloc = subtree_nr + 2;
+	it->down = xcalloc(it->subtree_alloc, sizeof(struct cache_tree_sub *));
+	down = queue[dirnr].down;
+	for (i = 0; i < subtree_nr; i++) {
+		struct cache_tree *sub;
+		struct cache_tree_sub *subtree;
+		char *buf, *name;
+
+		name = "";
+		buf = strtok(down[i].de->pathname, "/");
+		while (buf) {
+			name = buf;
+			buf = strtok(NULL, "/");
+		}
+		sub = convert_one(down, i);
+		if(!sub)
+			goto free_return;
+		subtree = cache_tree_sub(it, name);
+		subtree->cache_tree = sub;
+	}
+	if (subtree_nr != it->subtree_nr)
+		die("cache-tree: internal error");
+	return it;
+ free_return:
+	cache_tree_free(&it);
+	return NULL;
+}
+
+static int compare_cache_tree_elements(const void *a, const void *b)
+{
+	const struct directory_entry *de1, *de2;
+
+	de1 = ((const struct directory_queue *)a)->de;
+	de2 = ((const struct directory_queue *)b)->de;
+	return subtree_name_cmp(de1->pathname, de1->de_pathlen,
+				de2->pathname, de2->de_pathlen);
+}
+
+static struct directory_entry *sort_directories(struct directory_entry *de,
+						struct directory_queue *queue)
+{
+	int i, nsubtrees;
+
+	nsubtrees = de->de_nsubtrees;
+	for (i = 0; i < nsubtrees; i++) {
+		struct directory_entry *new_de;
+		de = de->next;
+		new_de = xmalloc(directory_entry_size(de->de_pathlen));
+		memcpy(new_de, de, directory_entry_size(de->de_pathlen));
+		queue[i].de = new_de;
+		if (de->de_nsubtrees) {
+			queue[i].down = xcalloc(de->de_nsubtrees,
+					sizeof(struct directory_queue));
+			de = sort_directories(de,
+					queue[i].down);
+		}
+	}
+	qsort(queue, nsubtrees, sizeof(struct directory_queue),
+			compare_cache_tree_elements);
+	return de;
+}
+
+struct cache_tree *cache_tree_convert_v5(struct directory_entry *de)
+{
+	struct directory_queue *queue;
+
+	if (!de->de_nentries)
+		return NULL;
+	queue = xcalloc(1, sizeof(struct directory_queue));
+	queue[0].de = de;
+	queue[0].down = xcalloc(de->de_nsubtrees, sizeof(struct directory_queue));
+
+	sort_directories(de, queue[0].down);
+	return convert_one(queue, 0);
+}
+
 static struct cache_tree *cache_tree_find(struct cache_tree *it, const char *path)
 {
if (!it)
diff --git a/cache-tree.h b/cache-tree.h
index d8cb2e9..7f29d26 100644
--- a/cache-tree.h
+++ b/cache-tree.h
@@ -20,6 +20,11 @@ struct cache_tree {
struct cache_tree_sub **down;
 };
 
+struct directory_queue {
+   struct directory_queue *down;
+   struct directory_entry *de;
+};
+
 struct cache_tree *cache_tree(void);
 void cache_tree_free(struct cache_tree **);
 void cache_tree_invalidate_path(struct cache_tree *, const char *);
@@ -27,6 +32,11 @@ struct cache_tree_sub *cache_tree_sub(struct cache_tree *, const char *);
 
 void cache_tree_write(struct strbuf *, struct cache_tree *root);
 struct cache_tree *cache_tree_read(const char *buffer, unsigned long size);
+/*
+ * This function modifies the directory argument that 

[PATCH/RFC v3 10/13] Write index-v5 cache-tree data

2012-08-08 Thread Thomas Gummerer
Write the cache-tree data for the index version 5 file format. The
in-memory cache-tree data is converted to the ondisk format, by adding
it to the directory entries, that were compiled from the cache-entries
in the step before.
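
The writer in this patch looks directories up by a CRC that is extended one path component at a time, adding a "/" between components. A property worth spelling out is that such a chained CRC equals the CRC of the whole joined path, so the lookup key for a nested directory is simply the CRC of its full pathname. The sketch below checks that property with a small bitwise CRC-32 instead of zlib; `crc32_update` and `path_key` are hypothetical names, not functions from the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), chainable like zlib's crc32(). */
static uint32_t crc32_update(uint32_t crc, const unsigned char *buf, size_t len)
{
	int k;

	crc = ~crc;
	while (len--) {
		crc ^= *buf++;
		for (k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/*
 * Build the key for a directory from its path components, extending
 * the running CRC with "/" before every component except the first.
 */
static uint32_t path_key(const char *const *parts, size_t nr)
{
	uint32_t crc = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (i)
			crc = crc32_update(crc, (const unsigned char *)"/", 1);
		crc = crc32_update(crc, (const unsigned char *)parts[i],
				   strlen(parts[i]));
	}
	return crc;
}
```

Because different paths can share a CRC, a lookup keyed this way still needs a string comparison on each candidate, which is what the `next_hash` walk in the patch does.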

Signed-off-by: Thomas Gummerer t.gumme...@gmail.com
---
 cache-tree.c |   52 
 cache-tree.h |1 +
 read-cache.c |1 +
 3 files changed, 54 insertions(+)

diff --git a/cache-tree.c b/cache-tree.c
index 440cd04..e167b61 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -612,6 +612,58 @@ struct cache_tree *cache_tree_convert_v5(struct directory_entry *de)
return convert_one(queue, 0);
 }
 
+
+static void convert_one_to_ondisk_v5(struct hash_table *table, struct cache_tree *it,
+				const char *path, int pathlen, uint32_t crc)
+{
+	int i;
+	struct directory_entry *found, *search;
+
+	crc = crc32(crc, (Bytef*)path, pathlen);
+	found = lookup_hash(crc, table);
+	search = found;
+	while (search && strcmp(path, search->pathname + search->de_pathlen - strlen(path)) != 0)
+		search = search->next_hash;
+	if (!search)
+		return;
+	/*
+	 * The number of subtrees is already calculated by
+	 * compile_directory_data, therefore we only need to
+	 * add the entry_count
+	 */
+	search->de_nentries = it->entry_count;
+	if (0 <= it->entry_count)
+		hashcpy(search->sha1, it->sha1);
+	if (strcmp(path, "") != 0)
+		crc = crc32(crc, (Bytef*)"/", 1);
+
+#if DEBUG
+	if (0 <= it->entry_count)
+		fprintf(stderr, "cache-tree <%.*s> (%d ent, %d subtree) %s\n",
+			pathlen, path, it->entry_count, it->subtree_nr,
+			sha1_to_hex(it->sha1));
+	else
+		fprintf(stderr, "cache-tree <%.*s> (%d subtree) invalid\n",
+			pathlen, path, it->subtree_nr);
+#endif
+
+	for (i = 0; i < it->subtree_nr; i++) {
+		struct cache_tree_sub *down = it->down[i];
+		if (i) {
+			struct cache_tree_sub *prev = it->down[i-1];
+			if (subtree_name_cmp(down->name, down->namelen,
+					     prev->name, prev->namelen) <= 0)
+				die("fatal - unsorted cache subtree");
+		}
+		convert_one_to_ondisk_v5(table, down->cache_tree, down->name, down->namelen, crc);
+	}
+}
+
+void cache_tree_to_ondisk_v5(struct hash_table *table, struct cache_tree *root)
+{
+	convert_one_to_ondisk_v5(table, root, "", 0, 0);
+}
+
 static struct cache_tree *cache_tree_find(struct cache_tree *it, const char *path)
 {
if (!it)
diff --git a/cache-tree.h b/cache-tree.h
index 7f29d26..e08bc31 100644
--- a/cache-tree.h
+++ b/cache-tree.h
@@ -37,6 +37,7 @@ struct cache_tree *cache_tree_read(const char *buffer, unsigned long size);
  * Don't use it if the directory entries are still needed after.
  */
 struct cache_tree *cache_tree_convert_v5(struct directory_entry *de);
+void cache_tree_to_ondisk_v5(struct hash_table *table, struct cache_tree *root);
 
 int cache_tree_fully_valid(struct cache_tree *);
 int cache_tree_update(struct cache_tree *, struct cache_entry **, int, int);
diff --git a/read-cache.c b/read-cache.c
index 199ba75..962d6a2 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1310,6 +1310,7 @@ void update_index_if_able(struct index_state *istate, struct lock_file *lockfile
else
rollback_lock_file(lockfile);
 }
+
 int write_index(struct index_state *istate, int newfd)
 {
set_istate_ops(istate);
-- 
1.7.10.GIT



[PATCH/RFC v3 04/13] Add documentation of the index-v5 file format

2012-08-08 Thread Thomas Gummerer
Add documentation of the index file format version 5 to
Documentation/technical.

Helped-by: Michael Haggerty mhag...@alum.mit.edu
Helped-by: Junio C Hamano gits...@pobox.com
Helped-by: Thomas Rast tr...@student.ethz.ch
Helped-by: Nguyen Thai Ngoc Duy pclo...@gmail.com
Helped-by: Robin Rosenberg robin.rosenb...@dewire.com
Signed-off-by: Thomas Gummerer t.gumme...@gmail.com
---
 Documentation/technical/index-file-format-v5.txt |  285 ++
 1 file changed, 285 insertions(+)
 create mode 100644 Documentation/technical/index-file-format-v5.txt

diff --git a/Documentation/technical/index-file-format-v5.txt b/Documentation/technical/index-file-format-v5.txt
new file mode 100644
index 000..6707f06
--- /dev/null
+++ b/Documentation/technical/index-file-format-v5.txt
@@ -0,0 +1,285 @@
+GIT index format
+
+
+== The git index file format
+
+   The git index file (.git/index) documents the status of the files
+ in the git staging area.
+
+   The staging area is used for preparing commits, merging, etc.
+
+   All binary numbers are in network byte order. Version 5 is described
+ here.
+
+   - A 20-byte header consisting of
+
+ sig (32-bits): Signature:
+   The signature is { 'D', 'I', 'R', 'C' } (stands for dircache)
+
+ vnr (32-bits): Version number:
+   The current supported versions are 2, 3, 4 and 5.
+
+ ndir (32-bits): number of directories in the index.
+
+ nfile (32-bits): number of file entries in the index.
+
+ fblockoffset (32-bits): offset to the file block, relative to the
+   beginning of the file.
+
+   - Offset to the extensions.
+
+ nextensions (32-bits): number of extensions.
+
+ extoffset (32-bits): offset to the extension. (Possibly none, as
+   many as indicated in the 4-byte number of extensions)
+
+ headercrc (32-bits): crc checksum for the header and extension
+   offsets
+
+   - diroffsets (ndir * directory offsets): A directory offset for each
+   of the ndir directories in the index, sorted by pathname (of the
+   directory it's pointing to) (see below). The diroffsets are relative
+   to the beginning of the direntries block. [1]
+
+   - direntries (ndir * directory entries): A directory entry for each
+   of the ndir directories in the index, sorted by pathname (see
+   below). [2]
+
+   - fileoffsets (nfile * file offsets): A file offset for each of the
+   nfile files in the index (see below). The file offsets are relative
+   to the beginning of the fileentries block. [1]
+
+   - fileentries (nfile * file entries): A file entry for each of the
+   nfile files in the index (see below).
+
+   - crdata: A number of entries for conflicted data/resolved conflicts
+   (see below).
+
+   - Extensions (Currently none, see below in the future)
+
+ Extensions are identified by signature. Optional extensions can
+ be ignored if GIT does not understand them.
+
+ GIT supports an arbitrary number of extension, but currently none
+ is implemented. [3]
+
+ extsig (32-bits): extension signature. If the first byte is 'A'..'Z'
+ the extension is optional and can be ignored.
+
+ extsize (32-bits): size of the extension, excluding the header
+   (extsig, extsize, extchecksum).
+
+ extchecksum (32-bits): crc32 checksum of the extension signature
+   and size.
+
+- Extension data.
+
+
+== Directory offsets (diroffsets)
+
+  diroffset (32-bits): offset to the directory relative to the beginning
+of the index file. There are ndir + 1 offsets in the diroffset table,
+the last is pointing to the end of the last direntry. With this last
+entry, we can replace the strlen when reading each filename, by
+calculating its length with the offsets.
+
+  This part is needed for making the directory entries bisectable and
+thus allowing a binary search.
+
+== Directory entry (direntries)
+  
+  Directory entries are sorted in lexicographic order by the name 
+of their path starting with the root.
+  
+  pathname (variable length, nul terminated): relative to top level
+directory (without the leading slash). '/' is used as path
+separator. A string of length 0 ('') indicates the root directory.
+The special path components ., and .. (without quotes) are
+disallowed. The path also includes a trailing slash. [9]
+
+  foffset (32-bits): offset to the lexicographically first file in 
+the file offsets (fileoffsets), relative to the beginning of
+the fileoffset block.
+
+  cr (32-bits): offset to conflicted/resolved data at the end of the
+index. 0 if there is no such data. [4]
+
+  ncr (32-bits): number of conflicted/resolved data entries at the
+end of the index if the offset is non 0. If cr is 0, ncr is
+also 0.
+
+  nsubtrees (32-bits): number of subtrees this tree has in the index.
+
+  nfiles (32-bits): number of files in the directory, that are in
+the index.
+
+  nentries 

[PATCH/RFC v3 03/13] t3700: Avoid interfering with the racy code

2012-08-08 Thread Thomas Gummerer
The new git racy code uses the mtime of cache-entries as smudge
marker for racily clean entries. The work of checking the file-system
if the entry really changed is offloaded to the reader. This interferes
with this test, because the entry is racily smudged and thus has
mtime 0.

To avoid interfering with the racy code, we use a time relative
to the time returned by time(3), instead of a time relative to
the mtime of the cache entries.
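
The smudging scheme described above can be sketched roughly as follows. This is only an illustration: the struct and function names are hypothetical, simplified stand-ins and not the ones used in the series, and the racy check is assumed to compare the entry's mtime against the time the index was written.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for git's cache_entry and index_state. */
struct sketch_entry {
	uint32_t mtime_sec;
};

struct sketch_index {
	uint32_t timestamp_sec;	/* when the index file was last written */
};

/* Assumed rule: an entry is racy if modified at or after the index write. */
static int sketch_is_racy(const struct sketch_index *idx,
			  const struct sketch_entry *e)
{
	return idx->timestamp_sec != 0 && idx->timestamp_sec <= e->mtime_sec;
}

/* Writer side: mark the entry as racily clean by zeroing its mtime. */
static void sketch_smudge(struct sketch_entry *e)
{
	e->mtime_sec = 0;
}

/* Reader side: a zero mtime means the cached stat data cannot be trusted. */
static int sketch_needs_content_check(const struct sketch_entry *e)
{
	return e->mtime_sec == 0;
}
```

With an mtime of 0 stored in the entry, any test that computes times relative to the cached mtime no longer gets a meaningful value, which is why the test switches to a time relative to "now".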

Signed-off-by: Thomas Gummerer t.gumme...@gmail.com
---
 t/t3700-add.sh |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t3700-add.sh b/t/t3700-add.sh
index 874b3a6..829d36d 100755
--- a/t/t3700-add.sh
+++ b/t/t3700-add.sh
@@ -184,7 +184,7 @@ test_expect_success 'git add --refresh with pathspec' '
 	echo >foo && echo >bar && echo >baz &&
 	git add foo bar baz && H=$(git rev-parse :foo) && git rm -f foo &&
 	echo "100644 $H 3	foo" | git update-index --index-info &&
-	test-chmtime -60 bar baz &&
+	test-chmtime =-60 bar baz &&
 	>expect &&
 	git add --refresh bar >actual &&
 	test_cmp expect actual &&
-- 
1.7.10.GIT



[PATCH/RFC v3 01/13] Move index v2 specific functions to their own file

2012-08-08 Thread Thomas Gummerer
Move index version 2 specific functions to their own file,
to prepare for the addition of a new index file format. With
the split into two files we have the non-index specific
functions in read-cache.c and the index-v2 specific functions
in read-cache-v2.c

Helped-by: Nguyen Thai Ngoc Duy pclo...@gmail.com
Signed-off-by: Thomas Gummerer t.gumme...@gmail.com
---
 Makefile |2 +
 cache.h  |   13 +-
 read-cache-v2.c  |  581 +++
 read-cache.c |  613 +++---
 read-cache.h |   57 +
 test-index-version.c |7 +-
 6 files changed, 683 insertions(+), 590 deletions(-)
 create mode 100644 read-cache-v2.c
 create mode 100644 read-cache.h

diff --git a/Makefile b/Makefile
index 4b58b91..b4a7c73 100644
--- a/Makefile
+++ b/Makefile
@@ -645,6 +645,7 @@ LIB_H += progress.h
 LIB_H += prompt.h
 LIB_H += quote.h
 LIB_H += reachable.h
+LIB_H += read-cache.h
 LIB_H += reflog-walk.h
 LIB_H += refs.h
 LIB_H += remote.h
@@ -768,6 +769,7 @@ LIB_OBJS += prompt.o
 LIB_OBJS += quote.o
 LIB_OBJS += reachable.o
 LIB_OBJS += read-cache.o
+LIB_OBJS += read-cache-v2.o
 LIB_OBJS += reflog-walk.o
 LIB_OBJS += refs.o
 LIB_OBJS += remote.o
diff --git a/cache.h b/cache.h
index 67f28b4..c77cdbe 100644
--- a/cache.h
+++ b/cache.h
@@ -94,16 +94,8 @@ unsigned long git_deflate_bound(git_zstream *, unsigned 
long);
  */
 #define DEFAULT_GIT_PORT 9418
 
-/*
- * Basic data structures for the directory cache
- */
 
 #define CACHE_SIGNATURE 0x44495243 /* DIRC */
-struct cache_header {
-   unsigned int hdr_signature;
-   unsigned int hdr_version;
-   unsigned int hdr_entries;
-};
 
 #define INDEX_FORMAT_LB 2
 #define INDEX_FORMAT_UB 4
@@ -267,6 +259,7 @@ struct index_state {
unsigned name_hash_initialized : 1,
 initialized : 1;
struct hash_table name_hash;
+   struct index_ops *ops;
 };
 
 extern struct index_state the_index;
@@ -471,8 +464,8 @@ extern int index_name_is_other(const struct index_state *, 
const char *, int);
 #define CE_MATCH_RACY_IS_DIRTY 02
 /* do stat comparison even if CE_SKIP_WORKTREE is true */
 #define CE_MATCH_IGNORE_SKIP_WORKTREE  04
-extern int ie_match_stat(const struct index_state *, struct cache_entry *, 
struct stat *, unsigned int);
-extern int ie_modified(const struct index_state *, struct cache_entry *, 
struct stat *, unsigned int);
+extern int ie_match_stat(struct index_state *, struct cache_entry *, struct 
stat *, unsigned int);
+extern int ie_modified(struct index_state *, struct cache_entry *, struct stat 
*, unsigned int);
 
 struct pathspec {
const char **raw; /* get_pathspec() result, not freed by 
free_pathspec() */
diff --git a/read-cache-v2.c b/read-cache-v2.c
new file mode 100644
index 000..38f1791
--- /dev/null
+++ b/read-cache-v2.c
@@ -0,0 +1,581 @@
+#include "cache.h"
+#include "read-cache.h"
+#include "resolve-undo.h"
+#include "cache-tree.h"
+#include "varint.h"
+
+/* Mask for the name length in ce_flags in the on-disk index */
+#define CE_NAMEMASK  (0x0fff)
+
+struct cache_header {
+   unsigned int hdr_entries;
+};
+
+/*
+ * Index File I/O
+ */
+
+/*
+ * dev/ino/uid/gid/size are also just tracked to the low 32 bits
+ * Again - this is just a (very strong in practice) heuristic that
+ * the inode hasn't changed.
+ *
+ * We save the fields in big-endian order to allow using the
+ * index file over NFS transparently.
+ */
+struct ondisk_cache_entry {
+   struct cache_time ctime;
+   struct cache_time mtime;
+   unsigned int dev;
+   unsigned int ino;
+   unsigned int mode;
+   unsigned int uid;
+   unsigned int gid;
+   unsigned int size;
+   unsigned char sha1[20];
+   unsigned short flags;
+   char name[FLEX_ARRAY]; /* more */
+};
+
+/*
+ * This struct is used when CE_EXTENDED bit is 1
+ * The struct must match ondisk_cache_entry exactly from
+ * ctime till flags
+ */
+struct ondisk_cache_entry_extended {
+   struct cache_time ctime;
+   struct cache_time mtime;
+   unsigned int dev;
+   unsigned int ino;
+   unsigned int mode;
+   unsigned int uid;
+   unsigned int gid;
+   unsigned int size;
+   unsigned char sha1[20];
+   unsigned short flags;
+   unsigned short flags2;
+   char name[FLEX_ARRAY]; /* more */
+};
+
+/* These are only used for v3 or lower */
+#define align_flex_name(STRUCT,len) ((offsetof(struct STRUCT,name) + (len) + 8) & ~7)
+#define ondisk_cache_entry_size(len) align_flex_name(ondisk_cache_entry,len)
+#define ondisk_cache_entry_extended_size(len) align_flex_name(ondisk_cache_entry_extended,len)
+#define ondisk_ce_size(ce) (((ce)->ce_flags & CE_EXTENDED) ? \
+   ondisk_cache_entry_extended_size(ce_namelen(ce)) : \
+   

[PATCH/RFC v3 07/13] Read resolve-undo data

2012-08-08 Thread Thomas Gummerer
Make git read the resolve-undo data from the index.

Since the resolve-undo data is joined with the conflicts in
the ondisk format of the index file version 5, conflicts and
resolved data are read at the same time, and the resolve-undo
data is then converted to the in-memory format.

Helped-by: Thomas Rast tr...@student.ethz.ch
Signed-off-by: Thomas Gummerer t.gumme...@gmail.com
---
 read-cache-v5.c |1 +
 resolve-undo.c  |   36 
 resolve-undo.h  |2 ++
 3 files changed, 39 insertions(+)

diff --git a/read-cache-v5.c b/read-cache-v5.c
index ec1201d..b47398d 100644
--- a/read-cache-v5.c
+++ b/read-cache-v5.c
@@ -494,6 +494,7 @@ static struct directory_entry *read_entries(struct 
index_state *istate,
int i;
 
conflict_queue = read_conflicts(de, mmap, mmap_size, fd);
+   resolve_undo_convert_v5(istate, conflict_queue);
for (i = 0; i < de->de_nfiles; i++) {
ce = read_entry(de,
entry_offset,
diff --git a/resolve-undo.c b/resolve-undo.c
index 72b4612..f96c6ba 100644
--- a/resolve-undo.c
+++ b/resolve-undo.c
@@ -170,3 +170,39 @@ void unmerge_index(struct index_state *istate, const char 
**pathspec)
i = unmerge_index_entry_at(istate, i);
}
 }
+
+void resolve_undo_convert_v5(struct index_state *istate,
+					struct conflict_entry *ce)
+{
+	int i;
+
+	while (ce) {
+		struct string_list_item *lost;
+		struct resolve_undo_info *ui;
+		struct conflict_part *cp;
+
+		if (ce->entries && (ce->entries->flags & CONFLICT_CONFLICTED) != 0) {
+			ce = ce->next;
+			continue;
+		}
+		if (!istate->resolve_undo) {
+			istate->resolve_undo = xcalloc(1, sizeof(struct string_list));
+			istate->resolve_undo->strdup_strings = 1;
+		}
+
+		lost = string_list_insert(istate->resolve_undo, ce->name);
+		if (!lost->util)
+			lost->util = xcalloc(1, sizeof(*ui));
+		ui = lost->util;
+
+		cp = ce->entries;
+		for (i = 0; i < 3; i++)
+			ui->mode[i] = 0;
+		while (cp) {
+			ui->mode[conflict_stage(cp) - 1] = cp->entry_mode;
+			hashcpy(ui->sha1[conflict_stage(cp) - 1], cp->sha1);
+			cp = cp->next;
+		}
+		ce = ce->next;
+	}
+}
diff --git a/resolve-undo.h b/resolve-undo.h
index 8458769..ab660a6 100644
--- a/resolve-undo.h
+++ b/resolve-undo.h
@@ -13,4 +13,6 @@ extern void resolve_undo_clear_index(struct index_state *);
 extern int unmerge_index_entry_at(struct index_state *, int);
 extern void unmerge_index(struct index_state *, const char **);
 
+extern void resolve_undo_convert_v5(struct index_state *, struct 
conflict_entry *);
+
 #endif
-- 
1.7.10.GIT



[PATCH/RFC v3 11/13] Write resolve-undo data for index-v5

2012-08-08 Thread Thomas Gummerer
Write the resolve-undo data to the ondisk format by joining the data
in the resolve-undo string-list with the conflicts that were already
compiled while searching the directories, and adding them to the
corresponding directory entries.

Helped-by: Thomas Rast tr...@student.ethz.ch
Signed-off-by: Thomas Gummerer t.gumme...@gmail.com
---
 read-cache-v5.c |3 ++
 resolve-undo.c  |   93 +++
 resolve-undo.h  |1 +
 3 files changed, 97 insertions(+)

diff --git a/read-cache-v5.c b/read-cache-v5.c
index 45f7acd..3d03111 100644
--- a/read-cache-v5.c
+++ b/read-cache-v5.c
@@ -861,6 +861,9 @@ static struct directory_entry 
*compile_directory_data(struct index_state *istate
previous_entry->next = no_subtrees;
}
}
+   if (istate->cache_tree)
+   cache_tree_to_ondisk_v5(table, istate->cache_tree);
+   resolve_undo_to_ondisk_v5(table, istate->resolve_undo, ndir, total_dir_len, de);
return de;
 }
 
diff --git a/resolve-undo.c b/resolve-undo.c
index f96c6ba..4568dcc 100644
--- a/resolve-undo.c
+++ b/resolve-undo.c
@@ -206,3 +206,96 @@ void resolve_undo_convert_v5(struct index_state *istate,
ce = ce->next;
}
 }
+
+void resolve_undo_to_ondisk_v5(struct hash_table *table,
+				struct string_list *resolve_undo,
+				unsigned int *ndir, int *total_dir_len,
+				struct directory_entry *de)
+{
+	struct string_list_item *item;
+	struct directory_entry *search;
+
+	if (!resolve_undo)
+		return;
+	for_each_string_list_item(item, resolve_undo) {
+		struct conflict_entry *conflict_entry;
+		struct resolve_undo_info *ui = item->util;
+		char *super;
+		int i, dir_len, len;
+		uint32_t crc;
+		struct directory_entry *found, *current, *new_tree;
+
+		if (!ui)
+			continue;
+
+		super = super_directory(item->string);
+		if (!super)
+			dir_len = 0;
+		else
+			dir_len = strlen(super);
+		crc = crc32(0, (Bytef*)super, dir_len);
+		found = lookup_hash(crc, table);
+		current = NULL;
+		new_tree = NULL;
+
+		while (!found) {
+			struct directory_entry *new;
+
+			new = init_directory_entry(super, dir_len);
+			if (!current)
+				current = new;
+			insert_directory_entry(new, table, total_dir_len, ndir, crc);
+			if (new_tree != NULL)
+				new->de_nsubtrees = 1;
+			new->next = new_tree;
+			new_tree = new;
+			super = super_directory(super);
+			if (!super)
+				dir_len = 0;
+			else
+				dir_len = strlen(super);
+			crc = crc32(0, (Bytef*)super, dir_len);
+			found = lookup_hash(crc, table);
+		}
+		search = found;
+		while (search->next_hash && strcmp(super, search->pathname) != 0)
+			search = search->next_hash;
+		if (search && !current)
+			current = search;
+		if (!search && !current)
+			current = new_tree;
+		if (!super && new_tree) {
+			new_tree->next = de->next;
+			de->next = new_tree;
+			de->de_nsubtrees++;
+		} else if (new_tree) {
+			struct directory_entry *temp;
+
+			search = de->next;
+			while (strcmp(super, search->pathname))
+				search = search->next;
+			temp = new_tree;
+			while (temp->next)
+				temp = temp->next;
+			search->de_nsubtrees++;
+			temp->next = search->next;
+			search->next = new_tree;
+		}
+
+		len = strlen(item->string);
+		conflict_entry = create_new_conflict(item->string, len, current->de_pathlen);
+		add_conflict_to_directory_entry(current, conflict_entry);
+		for (i = 0; i < 3; i++) {
+			if (ui->mode[i]) {
+				struct conflict_part *cp;
+
+				cp = xmalloc(sizeof(struct conflict_part));
+				cp->flags = (i + 1) << CONFLICT_STAGESHIFT;
+				cp->entry_mode = ui->mode[i];
+				cp->next = NULL;
+ 

[PATCH/RFC v3 12/13] update-index.c: always rewrite the index when index-version is given

2012-08-08 Thread Thomas Gummerer
Make git update-index always rewrite the index, if an index-version
is given. This is used for performance testing, to have a reader
and writer for the whole index.

Signed-off-by: Thomas Gummerer t.gumme...@gmail.com
---
 builtin/update-index.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 4ce341c..c31d176 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -6,6 +6,7 @@
 #include "cache.h"
 #include "quote.h"
 #include "cache-tree.h"
+#include "read-cache.h"
 #include "tree-walk.h"
 #include "builtin.h"
 #include "refs.h"
@@ -861,6 +862,7 @@ int cmd_update_index(int argc, const char **argv, const 
char *prefix)
if (the_index.version != preferred_index_format)
active_cache_changed = 1;
the_index.version = preferred_index_format;
+   set_istate_ops(&the_index);
}
 
if (read_from_stdin) {
@@ -886,7 +888,7 @@ int cmd_update_index(int argc, const char **argv, const 
char *prefix)
strbuf_release(buf);
}
 
-   if (active_cache_changed) {
+   if (active_cache_changed || preferred_index_format) {
if (newfd < 0) {
if (refresh_args.flags & REFRESH_QUIET)
exit(128);
-- 
1.7.10.GIT



[PATCH/RFC v3 09/13] Write index-v5

2012-08-08 Thread Thomas Gummerer
Write the index version 5 file format to disk. This version doesn't
write the cache-tree data and resolve-undo data to the file.

The main work is done when filtering out the directories from the
current in-memory format; in the same pass the conflicts and the
file data are also calculated.

Helped-by: Nguyen Thai Ngoc Duy pclo...@gmail.com
Helped-by: Thomas Rast tr...@student.ethz.ch
Signed-off-by: Thomas Gummerer t.gumme...@gmail.com
---
 cache.h |   10 +-
 read-cache-v5.c |  589 ++-
 read-cache.c|   19 +-
 read-cache.h|3 +
 4 files changed, 611 insertions(+), 10 deletions(-)

diff --git a/cache.h b/cache.h
index a0a1781..3fa348d 100644
--- a/cache.h
+++ b/cache.h
@@ -98,7 +98,7 @@ unsigned long git_deflate_bound(git_zstream *, unsigned long);
 #define CACHE_SIGNATURE 0x44495243 /* DIRC */
 
 #define INDEX_FORMAT_LB 2
-#define INDEX_FORMAT_UB 4
+#define INDEX_FORMAT_UB 5
 
 /*
  * The cache_time is just the low 32 bits of the
@@ -510,6 +510,7 @@ extern int verify_path(const char *path);
 extern struct cache_entry *index_name_exists(struct index_state *istate, const 
char *name, int namelen, int igncase);
 extern int index_name_stage_pos(const struct index_state *, const char *name, 
int namelen, int stage);
 extern int index_name_pos(const struct index_state *, const char *name, int 
namelen);
+extern struct directory_entry *init_directory_entry(char *pathname, int len);
 #define ADD_CACHE_OK_TO_ADD 1  /* Ok to add */
 #define ADD_CACHE_OK_TO_REPLACE 2  /* Ok to replace file/directory */
 #define ADD_CACHE_SKIP_DFCHECK 4   /* Ok to skip DF conflict checks */
@@ -1244,6 +1245,13 @@ static inline ssize_t write_str_in_full(int fd, const 
char *str)
return write_in_full(fd, str, strlen(str));
 }
 
+/* index-v5 helper functions */
+extern char *super_directory(const char *filename);
+extern void insert_directory_entry(struct directory_entry *, struct hash_table 
*, int *, unsigned int *, uint32_t);
+extern void add_conflict_to_directory_entry(struct directory_entry *, struct 
conflict_entry *);
+extern void add_part_to_conflict_entry(struct directory_entry *, struct 
conflict_entry *, struct conflict_part *);
+extern struct conflict_entry *create_new_conflict(char *, int, int);
+
 /* pager.c */
 extern void setup_pager(void);
 extern const char *pager_program;
diff --git a/read-cache-v5.c b/read-cache-v5.c
index 57d0fb5..45f7acd 100644
--- a/read-cache-v5.c
+++ b/read-cache-v5.c
@@ -583,9 +583,596 @@ static void read_index_v5(struct index_state *istate, 
void *mmap, int mmap_size,
istate->cache_tree = cache_tree_convert_v5(root_directory);
 }
 
+#define WRITE_BUFFER_SIZE 8192
+static unsigned char write_buffer[WRITE_BUFFER_SIZE];
+static unsigned long write_buffer_len;
+
+static int ce_write_flush(int fd)
+{
+   unsigned int buffered = write_buffer_len;
+   if (buffered) {
+   if (write_in_full(fd, write_buffer, buffered) != buffered)
+   return -1;
+   write_buffer_len = 0;
+   }
+   return 0;
+}
+
+static int ce_write(uint32_t *crc, int fd, void *data, unsigned int len)
+{
+   if (crc)
+   *crc = crc32(*crc, (Bytef*)data, len);
+   while (len) {
+   unsigned int buffered = write_buffer_len;
+   unsigned int partial = WRITE_BUFFER_SIZE - buffered;
+   if (partial > len)
+   partial = len;
+   memcpy(write_buffer + buffered, data, partial);
+   buffered += partial;
+   if (buffered == WRITE_BUFFER_SIZE) {
+   write_buffer_len = buffered;
+   if (ce_write_flush(fd))
+   return -1;
+   buffered = 0;
+   }
+   write_buffer_len = buffered;
+   len -= partial;
+   data = (char *) data + partial;
+   }
+   return 0;
+}
+
+static int ce_flush(int fd)
+{
+   unsigned int left = write_buffer_len;
+
+   if (left)
+   write_buffer_len = 0;
+
+   if (write_in_full(fd, write_buffer, left) != left)
+   return -1;
+
+   return 0;
+}
+
+static void ce_smudge_racily_clean_entry(struct cache_entry *ce)
+{
+   /*
+* This method shall only be called if the timestamp of ce
+* is racy (check with is_racy_timestamp). If the timestamp
+* is racy, the writer will just set the time to 0.
+*
+* The reader (match_stat_basic) will then take care
+* of checking if the entry is really changed or not, by
+* taking into account the stat_crc and if that hasn't changed
+* checking the sha1.
+*/
+   ce->ce_mtime.sec = 0;
+   ce->ce_mtime.nsec = 0;
+}
+
+char *super_directory(const char *filename)
+{
+   char *slash;
+
+   slash = strrchr(filename, '/');
+   if (slash)
+   return 

Re: [PATCH/RFC v3 01/13] Move index v2 specific functions to their own file

2012-08-08 Thread Nguyen Thai Ngoc Duy
On Wed, Aug 8, 2012 at 6:17 PM, Thomas Gummerer t.gumme...@gmail.com wrote:
 Move index version 2 specific functions to their own file,
 to prepare for the addition of a new index file format. With
 the split into two files we have the non-index specific
 functions in read-cache.c and the index-v2 specific functions
 in read-cache-v2.c

You still mix code changes and code move in one patch, but we can skip
it for now.

 --- a/cache.h
 +++ b/cache.h
 @@ -267,6 +259,7 @@ struct index_state {
 unsigned name_hash_initialized : 1,
  initialized : 1;
 struct hash_table name_hash;
 +   struct index_ops *ops;
  };

Do we really need to modify ops content? If not, make it "const
struct index_ops *ops;" which makes..

 @@ -471,8 +464,8 @@ extern int index_name_is_other(const struct index_state 
 *, const char *, int);
  #define CE_MATCH_RACY_IS_DIRTY 02
  /* do stat comparison even if CE_SKIP_WORKTREE is true */
  #define CE_MATCH_IGNORE_SKIP_WORKTREE  04
 -extern int ie_match_stat(const struct index_state *, struct cache_entry *, 
 struct stat *, unsigned int);
 -extern int ie_modified(const struct index_state *, struct cache_entry *, 
 struct stat *, unsigned int);
 +extern int ie_match_stat(struct index_state *, struct cache_entry *, struct 
 stat *, unsigned int);
 +extern int ie_modified(struct index_state *, struct cache_entry *, struct 
 stat *, unsigned int);

..this hunk go away
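
A rough sketch of what this suggestion might look like, assuming the ops table is only ever read after it has been selected. All names here are hypothetical simplifications, not the declarations from the patch.

```c
#include <assert.h>

struct sketch_index_state;

/* Hypothetical ops table holding a single version-specific hook. */
struct sketch_index_ops {
	int (*match_stat_basic)(const struct sketch_index_state *istate);
};

struct sketch_index_state {
	int version;
	const struct sketch_index_ops *ops;	/* the table itself is never modified */
};

static int sketch_match_v2(const struct sketch_index_state *istate)
{
	return istate->version == 2;
}

static const struct sketch_index_ops sketch_v2_ops = { sketch_match_v2 };

/* Still takes a const index state, as ie_match_stat() did before the patch. */
static int sketch_ie_match_stat(const struct sketch_index_state *istate)
{
	return istate->ops->match_stat_basic(istate);
}
```

This only works if the ops pointer is set once when the index is loaded; a function that has to assign the pointer cannot take a const index state.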
-- 
Duy


Re: [PATCH/RFC v3 06/13] Read index-v5

2012-08-08 Thread Nguyen Thai Ngoc Duy
On Wed, Aug 8, 2012 at 6:17 PM, Thomas Gummerer t.gumme...@gmail.com wrote:
 +static struct cache_entry *read_entry(struct directory_entry *de,
 +   unsigned long *entry_offset,
 +   void **mmap,
 +   unsigned long mmap_size,
 +   unsigned int *foffsetblock,
 +   int fd)
 +{
 
 +   if (crc_wrong) {
 +   /* wait for 10 milliseconds */
 +   usleep(10*1000);
 +   munmap(*mmap, mmap_size);
 +   *mmap = xmmap(NULL, mmap_size, PROT_READ | 
 PROT_WRITE, MAP_PRIVATE, fd, 0);
 +   }

Do we really need to munmap and mmap again? I don't see mmap man page
mention anything about refreshing the mmap'd memory with file
changes, not sure how it works. msync() seems for writing only.

If remapping is necessary, how about mremap? What I want to see is
whether we could avoid passing fd down to here.

 +struct index_ops v5_ops = {
 +   match_stat_basic,
 +   verify_hdr,
 +   read_index_v5,
 +   NULL
 +};

If you do it right, putting write_index_v2 here should work because
in-core structure is not changed (except that write_index_v2 is static
function, well..). Maybe putting write_index to this struct is a wrong
decision. We should be able to read_index_v5+write_index_v2 and
read_index_v2+write_index_v5.
-- 
Duy


Re: [PATCH/RFC v3 06/13] Read index-v5

2012-08-08 Thread Johannes Sixt
Am 8/8/2012 14:05, schrieb Nguyen Thai Ngoc Duy:
 On Wed, Aug 8, 2012 at 6:17 PM, Thomas Gummerer t.gumme...@gmail.com wrote:
 +static struct cache_entry *read_entry(struct directory_entry *de,
 +   unsigned long *entry_offset,
 +   void **mmap,
 +   unsigned long mmap_size,
 +   unsigned int *foffsetblock,
 +   int fd)
 +{
 
 +   if (crc_wrong) {
 +   /* wait for 10 milliseconds */
 +   usleep(10*1000);
 +   munmap(*mmap, mmap_size);
 +   *mmap = xmmap(NULL, mmap_size, PROT_READ | 
 PROT_WRITE, MAP_PRIVATE, fd, 0);
 +   }
 
 Do we really need to munmap and mmap again?

Yes. mmap may be the pread()-based implementation from compat/mmap.c.
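
To illustrate: the emulation in compat/mmap.c reads the file into allocated memory once, so later changes to the file only become visible after mapping it again. A minimal sketch of the unmap-and-map-again step using plain POSIX calls follows; the function names and the temporary file path are made up for the example, and this is not the code from the patch.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Drop the old mapping and map the file again so the caller sees the
 * file's current contents. Returns 0 on success, -1 on failure.
 */
static int sketch_remap(void **map, size_t size, int fd)
{
	if (munmap(*map, size))
		return -1;
	*map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
	if (*map == MAP_FAILED) {
		*map = NULL;
		return -1;
	}
	return 0;
}

/* Hypothetical self-check: returns 1 if the fresh mapping shows new data. */
static int sketch_remap_sees_update(void)
{
	char path[] = "/tmp/remap-sketch-XXXXXX";
	void *map;
	int ok = 0;
	int fd = mkstemp(path);

	if (fd < 0)
		return 0;
	unlink(path);
	if (write(fd, "old!", 4) != 4)
		goto out;
	map = mmap(NULL, 4, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
	if (map == MAP_FAILED)
		goto out;
	/* Simulate another writer changing the file behind our back. */
	if (pwrite(fd, "new!", 4, 0) == 4 && !sketch_remap(&map, 4, fd))
		ok = !memcmp(map, "new!", 4);
	if (map)
		munmap(map, 4);
out:
	close(fd);
	return ok;
}
```

The same sequence works whether mmap is the real system call or a read-based replacement, which is the point of doing a full munmap and mmap rather than relying on the mapping to track the file.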

-- Hannes


Bug Report: Git sometimes locks file when running git difftool

2012-08-08 Thread Ben Blamey
Hi,

I'm using git for windows, PortableGit-1.7.11-preview20120710 (I've
just upgraded from PortableGit-1.7.6-preview20110709, which also had
the problem), on windows 7 (SP1). I find it handy when running git
difftool to make changes to the file in the difftool (I personally
use diffmerge http://www.sourcegear.com/diffmerge/ ) - especially if
files are very long, and you can remove spurious new lines, etc. etc.
with ease minimizing the number of changes you are going to make in
your commit. This is usually no problem.

The problem is that, I have found that sometimes files seem to be
locked, and whilst the diff program is open, I cannot write to the
file - neither from the difftool nor from, say, another text editor.
When I close diffmerge, the file is writable again. Logging off and on
again and the problem persists.

Diffmerge says:
Could not open this file for writing. Try to override file
permissions? [I click 'Yes'] Error! Cannot open file.

Notepad++ says:
Save failed
Please check whether if this file is opened in another program

Other programs give similar errors.

I've tried running the command both as an administrator and as a
normal user, and there seems to be nothing wrong with
permissions/ownership on the file. Using the Process Explorer
Sysinternals tool, I can see that git.exe has a file handle open - and
I guess the mode it has opened the file is preventing writes - thing
is - it doesn't always seem to do this. Makes me wonder if the file
handle isn't being closed properly in these circumstances? If this is
by-design, it is not happening consistently, because 99% of the time I
can edit no problem.

Thanks,
Ben Blamey


Sync production with Git

2012-08-08 Thread kiranpyati
I am new to GitHub.

Earlier we used to manually upload files to production through FTP,
although git was present on production. Because of this, git status now
shows many modified and untracked files.

To sync that with git, we have downloaded all files from production and
committed them to git. Now git has all files the same as production.

We have not pulled on production for the last 6 months, and because of
this it shows modified and untracked files.

Now if we pull on production, there is a 100% chance of conflicts on
all modified files, as there are hundreds of files modified over the
last six months. Git pull will show conflicts for all those files. In
that case the site will go down, and we cannot afford this.

We want a way to seamlessly sync production and Git.

Can anybody please help me on this?

Thanks in advance..!!



--
View this message in context: 
http://git.661346.n2.nabble.com/Sync-production-with-Git-tp7564617.html
Sent from the git mailing list archive at Nabble.com.


Re: [PATCH/RFC v2 0/16] Introduce index file format version 5

2012-08-08 Thread Nguyen Thai Ngoc Duy
On Wed, Aug 8, 2012 at 8:38 AM, Junio C Hamano gits...@pobox.com wrote:
 If the workload we _care_ about is served better by using an API
 that works over an in-core tree-shaped index data structure, I do
 not think it is unreasonable to read the v2 on-disk format and
 represent it as a tree-shaped index while we read it.  Of course,
 there are things that are not as effective when reading from the
 flat v2 on-disk format (e.g. path limited reading will have to at
 least _scan_ the whole thing, even though it may process only the
 entries that match the pathspec) compared to reading from a
 tree-shaped on-disk format, but I doubt that the difference between
 the cost of reading into a flat array and the cost of reading and
 forming whatever non-flat data structure you seem to think is better
 is so big that it would negate the benefit of using a better in-core
 structure.

OK, how about this. The general idea is to preserve/extend the current
flat index API and add a new (tree-based) one. Index users can use
either. They can even mix them up (which they do because we can't just
flip the API in one day for about 200 source files).

The day that unpack_trees() is converted to tree API, I will declare
v5 victory ;)

= Cleanup =

struct cache_entry becomes partly opaque. ce_ctime..ce_gid are hidden
in -v2.c and -v5.c. We only expose ce_size, ce_flags, ce_namelen, name
and sha1 to index users. Extra v5 fields like ce_stat_crc, next and
dir_next are also hidden. These fields can be put in a real struct in
read-cache.h, which is supposedly included by -v2.c and -v5.c

= Updating =

All index update API (add_index_entry, add_to_index,
remove_index_entry_at, remove_marked_cached_entries) are hooked by v5
when the loaded index is v5. v5 can update internal data when these
are called (e.g. conflict resolution), or just mark them dirty to be
worked on later in flush_index().

Anybody who updates a cache_entry is supposed to call
cache_entry_updated() function, which is no-op for v2 but v5 may want
to watch this activity.

Refreshing index is a special operation. Of course it's hooked by v5.
v5 may need its own implementation because it could walk working tree
and index tree at the same time. Of course v5 impl must also update
flat API data structure along the way.

A new function flush_index() is introduced, where v5 can update all
internal data and keep it in sync with index_state. When flat/tree
APIs are mixed, flush_index() must be called when switching from flat
API to tree API.

To help v5 deal with index rewrite in unpack_trees(),
index_bulk_update() may be introduced, which tells v5 "we are going to
do a lot of adding/removing/shuffling, keep your actions to a minimum,
you most likely have to rebuild the trees at flush_index() anyway".

New API may be introduced for some big operations if it proves
v5-beneficial. I'm thinking of adding/removing a bunch of files by
pathspec, where v5 can walk working directory at the same time it
walks index directory tables.
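
The update hooks described in this section could be sketched like this. It is only an illustration of the "mark dirty now, do the work in flush_index()" idea; the struct and the sketch_ functions are hypothetical and do not exist in git.

```c
#include <assert.h>

/* Hypothetical, heavily simplified index state. */
struct sketch_index {
	int version;		/* 2 or 5 */
	int tree_dirty;		/* v5 only: tree data is out of date */
	int tree_rebuilds;	/* how often the tree data was rebuilt */
};

/* Called by anybody who changes a cache entry through the flat API. */
static void sketch_cache_entry_updated(struct sketch_index *idx)
{
	if (idx->version == 5)
		idx->tree_dirty = 1;	/* defer the real work */
	/* no-op for v2 */
}

/* Must be called before switching from the flat API to the tree API. */
static void sketch_flush_index(struct sketch_index *idx)
{
	if (idx->version == 5 && idx->tree_dirty) {
		idx->tree_rebuilds++;	/* stand-in for rebuilding the trees */
		idx->tree_dirty = 0;
	}
}
```

Many updates followed by one flush cost a single rebuild, which is the behaviour a bulk-update hint would be aiming for.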

= Tree traversal =

I don't see big problems here. We support opendir/readdir-like API for
tree traversing (with pathspec filtering). We also support
lookup_cache_entry to get cache_entry* of a certain path.

When tree traversal gets to a conflict entry, it lets the caller know
there's a conflict entry; it does not traverse through stages 1-3
during traversal. The caller is expected to use the conflict lookup API
for that.

We also support reading partial index, filtered by pathspec. On v2, it
reads full index.

= Tree update =

At some point we may want to work on trees exclusively. Any operations
here must keep flat API data structure in sync.

We may want to postpone the sync if it's a lot of work, by doing all
the work in flush_index() before caller switches from tree API to flat
API again.

= Flat API deprecation =

At some point, tree update API will not update flat API any more
unless explicitly asked by caller. I don't expect cache in struct
index_state to be removed, unless we do really good merges using tree
API.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Enable HAVE_DEV_TTY for Solaris

2012-08-08 Thread Erik Faye-Lund
On Tue, Aug 7, 2012 at 6:10 AM, Jeff King p...@peff.net wrote:
 Subject: [PATCH] terminal: seek when switching between reading and writing

 When a stdio stream is opened in update mode (e.g., "w+"),
 the C standard forbids switching between reading or writing
 without an intervening positioning function. Many
 implementations are lenient about this, but Solaris libc
 will flush the recently-read contents to the output buffer.
 In this instance, that meant writing the non-echoed password
 that the user just typed to the terminal.

 Fix it by inserting a no-op fseek between the read and
 write.
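
The rule described in the commit message above can be illustrated with a minimal sketch. It assumes an ordinary update-mode stream instead of /dev/tty, and the helper name read_then_write is hypothetical; it is not the code from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: read one line from an update-mode stream, then
 * write a reply to the same stream. The C standard requires a file
 * positioning call (or a read that hits end-of-file) between the input
 * and the following output.
 * Returns 0 on success, -1 on failure.
 */
static int read_then_write(FILE *fp, char *buf, size_t len, const char *reply)
{
	assert(fp && buf && reply);
	if (!fgets(buf, (int)len, fp))
		return -1;
	/* No-op seek: the position is unchanged, but writing is now allowed. */
	if (fseek(fp, 0, SEEK_CUR) != 0)
		return -1;
	if (fputs(reply, fp) == EOF)
		return -1;
	return fflush(fp) == 0 ? 0 : -1;
}
```

Without the fseek() call the behavior is undefined, which is why lenient implementations appear to work while others do not.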

My Windows-patches for git_terminal_prompt would probably also solve
this problem. Instead of opening a read-write handle to /dev/tty, they
open two handles to the terminal instead; one for reading and one for
writing. This is because the terminal cannot be opened in read-write
mode on Windows (we need to open CONIN$ and CONOUT$ separately).

You can have a look at the series here if you're interested:
https://github.com/kusma/git/tree/work/terminal-cleanup

That last patch is the reason why I haven't submitted the series yet,
but perhaps some of the preparatory patches could be worthwhile for
other platforms in the meantime?


Re: Sync production with Git

2012-08-08 Thread Matthieu Moy
kiranpyati kiran.py...@infobeans.com writes:

 We want a way to seamlessly sync production and Git.

You should be aware that Git was not designed for this scenario. The
usual flow with Git (and actually with most revision control systems),
is to do the development with Git, then use your build system to
generate a package that can be used in production (e.g. generate a
.tar.gz, or a .jar, or whatever your platform needs), and then install
this package on your production server.

It can be tempting, however, to use your revision control system as a
deployment tool, so that an update on the production server is as simple
as "git pull". But in real-life applications, it usually has to be more
complicated: do you need to generate some files after you fetch the
latest version of the source? Do you need to update your database? Isn't
the .git/ directory harmful here (e.g. do I want the full source
history of my project to be visible worldwide if this is a web
application?) ...

If you insist on using Git for deployment, then you should absolutely
stick to it. Whether for deployment or for anything else, trying to send
changes using both Git and another mechanism (e.g. uploading files
directly to a working tree as you did) puts you in trouble in 99.9% of
cases.

In your case, the damage is already done. If I were you, I'd do
something like

<do some backup>
<make sure the backup is OK>
<think twice: "will I be able to restore the backup if it goes wrong?">
$ git fetch origin
$ git reset --hard origin/master

(actually, if I were you, I'd try reproducing the situation on a
non-production server first)

"git fetch" will download the revisions from the remote server, which
should be the repository where the version you want to run is located.
"git reset --hard" will discard any local change (committed or not) you
may have, and set your local working tree to the latest version in the
master branch of the remote repository. You may need a "git clean" to
remove untracked files too.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Git does not handle changing inode numbers well

2012-08-08 Thread Matthijs Kooijman
(Please CC me, I'm not on the list)

Hi folks,

I've spent some time debugging an issue and I'd like to share the
results. The conclusion of my debugging is that git does not currently
handle changing inode numbers on files well.

I have a custom FUSE filesystem, and FUSE dynamically allocates inode
numbers to paths, but keeps a limited cache of inode -> name mappings,
causing the inodes to change over time.

Now of course, you'll probably say, it's the filesystem's fault, git
can't be expected to cope with that. You'll be right of course, but
since I already spent the time digging into this and figuring out what
goes on inside git in this case, I thought I might as well share the
analysis, just in case someone sees an easy fix in here, or in case
someone else stumbles upon this problem as well.

So, the actual problem I was seeing is that running git status showed
all symlinks as modified, even though they really were identical
between the working copy, index and HEAD. Interestingly enough this only
happened when running git status without further arguments; when
run on a subdirectory, it would show no changes as expected.

I compared the output of stat to a hexdump of the index file and found
that everything matched, except for the inode numbers. I originally
thought I was misinterpreting what I saw, but gdb confirmed that it was
indeed the inode numbers that git observed as different.

Now, I could have stopped here and started trying to fix my filesystem
instead. But it was still weird that this problem only existed for
symlinks and that normal files acted as expected. So I dug in a bit
deeper, hoping to find some way to make this work for symlinks as well.

So, here's what happens (IIUC):
 - cmd_status calls refresh_index, which calls refresh_cache_ent for
   every entry in the index.
 - refresh_cache_ent notices that the inode number has changed (for both
   symlinks and regular files) and compares the file / symlink contents.
 - refresh_cache_ent sees the content hasn't changed, so it calls
   fill_stat_cache_info to update the stat info.
 - fill_stat_cache_info sets the CE_UPTODATE flag on the entry, but only
   if it is a regular file.
 - cmd_status calls wt_status_collect which calls
   wt_status_collect_changes_worktree which calls run_diff_files.
 - run_diff_files skips regular files, because of the CE_UPTODATE flag.
   For symlinks, however, it checks the stat info and notices that the
   inode number has changed (again). It does not do a content check at
   this point, but instead just outputs the file as modified.


It turned out that the reason running git status on a subdirectory
appeared to work was that the number of files in the subdir wasn't big
enough to overflow the inode number cache fuse keeps, so the numbers
didn't change in this case (the problem _did_ occur when trying a bigger
subdirectory).

So, it seems that git just doesn't cope well with changing inode numbers
because it checks the content in a first pass in refresh_index, but only
checks the stat info in the second pass in run_diff_files. The reason it
does work for regular files is the CE_UPTODATE optimization introduced in
eadb5831: Avoid running lstat(2) on the same cache entry.
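
The two-pass behaviour described above can be modelled in a few lines. This is a deliberately simplified sketch, not git code: struct toy_ce and both functions are hypothetical, and the content comparison of the first pass is assumed to have found no change.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, stripped-down index entry. */
struct toy_ce {
	bool is_symlink;
	unsigned int ino;   /* cached inode number */
	bool uptodate;      /* "already verified" flag */
};

/*
 * Pass 1 (refresh): the content was compared and is unchanged, so the
 * cached stat info is updated. Only regular files get the flag.
 */
static void toy_refresh(struct toy_ce *ce, unsigned int ino_now)
{
	assert(ce);
	ce->ino = ino_now;
	if (!ce->is_symlink)
		ce->uptodate = true;
}

/*
 * Pass 2 (diff): flagged entries are skipped; everything else is judged
 * by stat info alone. Returns true if the entry is reported as modified.
 */
static bool toy_diff_files(const struct toy_ce *ce, unsigned int ino_now)
{
	assert(ce);
	if (ce->uptodate)
		return false;
	return ce->ino != ino_now;
}
```

If the filesystem hands out a new inode number between the two passes, only the symlink is reported as modified, which matches the symptom.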

So, let's see if I can fix my filesystem now ;-)

Gr.

Matthijs



Re: [PATCH/RFC v2 0/16] Introduce index file format version 5

2012-08-08 Thread Junio C Hamano
Nguyen Thai Ngoc Duy pclo...@gmail.com writes:

 OK how about this. The general idea is preserve/extend current flat
 index API and add a new (tree-based) one. Index users can use either.
 They can even mix them up (which they do because we can't just flip
 the API in one day for about 200 source files).

 The day that unpack_trees() is converted to tree API, I will declare
 v5 victory ;)

s/API, /API and benchmark says tree-shaped index is an overall win, /;

 = Cleanup =

 struct cache_entry becomes partly opaque. ce_ctime..ce_gid are hidden
 in -v2.c and -v5.c. We only expose ce_size, ce_flags, ce_namelen, name
 and sha1 to index users. Extra v5 fields like ce_stat_crc, next and
 dir_next are also hidden. These fields can be put in a real struct in
 read-cache.h, which is supposedly included by -v2.c and -v5.c

I do not particularly see a reason to keep different in-core
cache_entry representations even in an early round of the API
updates.  If v2 needs ctime and gid and v5 needs crc, keep both
fields for simplicity.  When coming from the filesystem, ctime, gid
and friends are immediately available and crc needs to be computed
only immediately before it is written out or it is compared with an
existing entry.

I also do not see a reason to keep two representations of in-core
index_state representations for that matter.

The current code that accesses the nth entry via index->cache[nth]
would need to be updated to use an accessor function, whether the
nth comes from index_name_pos() or from the for-loop that iterates
over the entire index.  For the latter, you would need to give the
users a function that returns a cursor into the in-core index to
allow iterating over it.

When you use an in-core representation that is not a flat array, the
type of nth, which is essentially a cursor, may have to change to
something that is richer than a simple integer, in order to give the
implementation of the in-core index a more efficient way to access
the entry than traversing the leaves of the tree depth first, and
you would need to update index_name_pos() to return such a cursor.
That design and development cost is part of updating the in-core
data structure. In the end result, the runtime cost to manipulate an
index entry that the cursor refers to should be minimum, as that
would be the cost paid by all the users of the API anyway, even if
we _were_ starting from an ideal world where there weren't any flat
in-core index in the first place.
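
An accessor-plus-cursor interface of the kind discussed above could look roughly like this. It is only a sketch under stated assumptions: all toy_* names are hypothetical, and the cursor wraps a plain integer here precisely so that a richer representation could replace it later without touching callers.

```c
#include <assert.h>
#include <stddef.h>

struct toy_entry { const char *name; };

/* Flat in-core index, as today. */
struct toy_index {
	struct toy_entry *cache;
	size_t nr;
};

/* Opaque to callers; a tree-shaped index could store node + slot here. */
struct toy_cursor { size_t pos; };

static struct toy_cursor toy_index_first(const struct toy_index *idx)
{
	struct toy_cursor c = { 0 };
	(void)idx;
	return c;
}

static int toy_cursor_valid(const struct toy_index *idx, struct toy_cursor c)
{
	return c.pos < idx->nr;
}

static struct toy_cursor toy_cursor_next(struct toy_cursor c)
{
	c.pos++;
	return c;
}

/* Accessor replacing direct index->cache[nth] access. */
static struct toy_entry *toy_entry_at(const struct toy_index *idx,
				      struct toy_cursor c)
{
	assert(toy_cursor_valid(idx, c));
	return &idx->cache[c.pos];
}

/* Example of a full iteration written only against the cursor API. */
static size_t toy_count_entries(const struct toy_index *idx)
{
	size_t n = 0;
	struct toy_cursor c;

	for (c = toy_index_first(idx); toy_cursor_valid(idx, c);
	     c = toy_cursor_next(c))
		n++;
	return n;
}
```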

Because the v2 on-disk format forces us to scan the whole thing at
least once, with a properly designed in-core representation, the
overall system would not suffer performance penalty when reading
from v2, as both the current code and the updated code have to read
everything, and accesses based on the cursor given by either
index_name_pos() or the index iterator has to be fast anyway (if the
latter does not hold true, your updated in-core representation that
is not a flat array needs to be rethought).

On top of such a solid foundation, we can map the updated in-core
representation to an on-disk representation with confidence, as any
performance improvement or degradation from that point on must be
solely attributable to the on-disk format difference.

Without such a foundation, it is hard to justify a different on-disk
format without handwaving, no?


[PATCH] add test for 'git rebase --keep-empty'

2012-08-08 Thread Martin von Zweigbergk
Signed-off-by: Martin von Zweigbergk martin.von.zweigbe...@gmail.com
---

While trying to use patch-id instead of
--ignore-if-in-upstream/--cherry-pick/cherry/etc, I noticed that
patch-id ignores empty patches and I was surprised that tests still
pass. This test case would be useful to protect --keep-empty.

 t/t3401-rebase-partial.sh | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/t/t3401-rebase-partial.sh b/t/t3401-rebase-partial.sh
index 7f8693b..b89b512 100755
--- a/t/t3401-rebase-partial.sh
+++ b/t/t3401-rebase-partial.sh
@@ -47,7 +47,14 @@ test_expect_success 'rebase ignores empty commit' '
 	git commit --allow-empty -m empty &&
 	test_commit D &&
 	git rebase C &&
-	test $(git log --format=%s C..) = "D"
+	test "$(git log --format=%s C..)" = "D"
+'
+
+test_expect_success 'rebase --keep-empty' '
+	git reset --hard D &&
+	git rebase --keep-empty C &&
+	test "$(git log --format=%s C..)" = "D
+empty"
 '
 
 test_done
-- 
1.7.11.1.104.ge7b44f1



Re: [PATCH/RFC v2 09/16] Read index-v5

2012-08-08 Thread Junio C Hamano
Thomas Gummerer t.gumme...@gmail.com writes:

  +  name = (char *)mmap + *dir_offset;
  +  beginning = mmap + *dir_table_offset;
 
 Notice how you computed name with pointer arithmetic by first
 casting mmap (which is void *) and when computing beginning, you
 forgot to cast mmap and attempted pointer arithmetic with void *.
 The latter does not work and breaks compilation.
 
 The pointer-arith with void * is not limited to this function.
 ...
 I've used the type of the respective assignment for now. e.g. i have
 struct cache_header *hdr, so I'm using
 hdr = (struct cache_header *)mmap + x;

You need to be careful when rewriting the above to choose the right
value for 'x' if you go that route (which I wouldn't recommend).

With

hdr = ptr_add(mmap, x);

you are making hdr point at x BYTES beyond mmap, but

hdr = (struct cache_header *)mmap + x;

means something entirely different, no?  hdr points at x entries
of "struct cache_header" beyond mmap (in other words, if mmap[] were
defined as "struct cache_header mmap[]", the above is saying the
same as "hdr = &mmap[x]").

I think the way you cast to compute the value for the name
pointer is the (second) right thing to do.  The cast (char *)
applied to mmap says "mmap is a typeless blob of memory I want
to count bytes in.  Give me *dir_offset bytes into that blob".  It
is not tied to the type of LHS (i.e. name) at all.  The result
then needs to be cast to the type of LHS (i.e. name), and in
this case the types happen to be the same, so you do not have to
cast the result of the addition, but that is mere luck.

The next line is not so lucky and you would need to say something
like:

beginning = (uint32_t *)((char *)mmap + *dir_table_offset);

Again, the inner cast says "mmap is a blob counted in bytes", and the
outer cast is about the type mismatch between a byte-address and the
LHS of the assignment.

If the mmap variable in this function were not "void *" but something
more sane like "const char *", you wouldn't have to have the inner
cast to begin with, and that is why I said the way you did name is
the second right thing.  Then you can write them like

name = mmap + *dir_offset;
beginning = (uint32_t *)(mmap + *dir_table_offset);

After thinking about this, the ptr_add() macro might be the best
solution, even though I originally called it a band-aid.  We know
mmap is a blob of memory, byte-offset of each component of which we
know about, so we can say

name = ptr_add(mmap, *dir_offset);
beginning = ptr_add(mmap, *dir_table_offset);
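
The difference between the two forms can be checked with a small sketch. The definition of ptr_add() below is an assumption (the thread names the macro but does not show its body), and struct toy_header and both helper functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed definition: advance a pointer by a number of BYTES. */
#define ptr_add(ptr, off) ((void *)((char *)(ptr) + (off)))

/* Hypothetical 8-byte header, standing in for struct cache_header. */
struct toy_header {
	uint32_t signature;
	uint32_t version;
};

/* Distance in bytes covered by ptr_add(base, x). */
static size_t offset_by_bytes(void *base, size_t x)
{
	return (size_t)((char *)ptr_add(base, x) - (char *)base);
}

/* Distance in bytes covered by (struct toy_header *)base + x. */
static size_t offset_by_elements(void *base, size_t x)
{
	struct toy_header *h = (struct toy_header *)base + x;

	return (size_t)((char *)h - (char *)base);
}
```

For the same x, the second form moves x * sizeof(struct toy_header) bytes, which is the pitfall pointed out above.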

Hmmm..




Re: [PATCH] Documentation: list git-credential in plumbing commands

2012-08-08 Thread Junio C Hamano
Matthieu Moy matthieu@imag.fr writes:

 Commit e30b2feb1b (Jun 24 2012, add 'git credential' plumbing command)
 forgot to add git-credential to command-list.txt, hence the command was
 not appearing in the documentation, making it hard for users to discover
 it.

 While we're there, capitalize the description line for git-crendential
 for consistancy with other commands.

consistency?


 Signed-off-by: Matthieu Moy matthieu@imag.fr

Thanks.

There really should be an easier way for the maintainer to notice
this kind of glitch without being told (better yet, the submitter of
a new command to notice it).  Perhaps the check-docs target in the
Makefile needs some updating?


Re: [PATCH] add test for 'git rebase --keep-empty'

2012-08-08 Thread Neil Horman
On Wed, Aug 08, 2012 at 09:48:18AM -0700, Martin von Zweigbergk wrote:
 Signed-off-by: Martin von Zweigbergk martin.von.zweigbe...@gmail.com
 ---
 
 While trying to use patch-id instead of
 --ignore-if-in-upstream/--cherry-pick/cherry/etc, I noticed that
 patch-id ignores empty patches and I was surprised that tests still
 pass. This test case would be useful to protect --keep-empty.
 
  t/t3401-rebase-partial.sh | 9 -
  1 file changed, 8 insertions(+), 1 deletion(-)
 
 diff --git a/t/t3401-rebase-partial.sh b/t/t3401-rebase-partial.sh
 index 7f8693b..b89b512 100755
 --- a/t/t3401-rebase-partial.sh
 +++ b/t/t3401-rebase-partial.sh
 @@ -47,7 +47,14 @@ test_expect_success 'rebase ignores empty commit' '
   git commit --allow-empty -m empty &&
   test_commit D &&
   git rebase C &&
 - test $(git log --format=%s C..) = "D"
 + test "$(git log --format=%s C..)" = "D"
 +'
 +
 +test_expect_success 'rebase --keep-empty' '
 + git reset --hard D &&
 + git rebase --keep-empty C &&
 + test "$(git log --format=%s C..)" = "D
 +empty"
  '
  
  test_done
 -- 
 1.7.11.1.104.ge7b44f1
 
 

Acked-by: Neil Horman nhor...@tuxdriver.com



Re: [PATCH/RFC v3 06/13] Read index-v5

2012-08-08 Thread Junio C Hamano
Nguyen Thai Ngoc Duy pclo...@gmail.com writes:

 +struct index_ops v5_ops = {
 +   match_stat_basic,
 +   verify_hdr,
 +   read_index_v5,
 +   NULL
 +};

 If you do it right, putting write_index_v2 here should work because
 in-core structure is not changed (except that write_index_v2 is static
 function, well..). Maybe putting write_index to this struct is a wrong
 decision. We should be able to read_index_v5+write_index_v2 and
 read_index_v2+write_index_v5.

The right way is to have a global API function

write_index_in_format(int version, int fd);

which calls the_index->index_ops.write_index[version](the_index, fd),
no?
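
A version-indexed dispatch of this kind might be sketched as below. Everything here is hypothetical (the toy_* names, the state struct, the writers that only record which format was requested); it merely shows that the writer can be chosen independently of the format the index was read from.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-core state shared by all on-disk formats. */
struct toy_index_state {
	int last_written_version;
};

typedef int (*toy_write_fn)(struct toy_index_state *istate, int fd);

static int toy_write_v2(struct toy_index_state *istate, int fd)
{
	(void)fd;   /* a real writer would serialize to fd */
	istate->last_written_version = 2;
	return 0;
}

static int toy_write_v5(struct toy_index_state *istate, int fd)
{
	(void)fd;
	istate->last_written_version = 5;
	return 0;
}

#define TOY_MAX_VERSION 5

/* Writers indexed by format version; unsupported slots stay NULL. */
static const toy_write_fn toy_writers[TOY_MAX_VERSION + 1] = {
	[2] = toy_write_v2,
	[5] = toy_write_v5,
};

/* Returns -1 for an unknown version, else the writer's result. */
static int toy_write_index_in_format(struct toy_index_state *istate,
				     int version, int fd)
{
	assert(istate);
	if (version < 0 || version > TOY_MAX_VERSION || !toy_writers[version])
		return -1;
	return toy_writers[version](istate, fd);
}
```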


[PATCH v2] Documentation: list git-credential in plumbing commands

2012-08-08 Thread Matthieu Moy
Commit e30b2feb1b (Jun 24 2012, add 'git credential' plumbing command)
forgot to add git-credential to command-list.txt, hence the command was
not appearing in the documentation, making it hard for users to discover
it.

While we're there, capitalize the description line for git-credential
for consistency with other commands.

Signed-off-by: Matthieu Moy matthieu@imag.fr
---
  for consistancy with other commands.
 
 consistency?

Yes, sorry. This one should be OK.

 Documentation/git-credential.txt | 2 +-
 command-list.txt | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-credential.txt b/Documentation/git-credential.txt
index 53adee3..810e957 100644
--- a/Documentation/git-credential.txt
+++ b/Documentation/git-credential.txt
@@ -3,7 +3,7 @@ git-credential(1)
 
 NAME
 
-git-credential - retrieve and store user credentials
+git-credential - Retrieve and store user credentials
 
 SYNOPSIS
 
diff --git a/command-list.txt b/command-list.txt
index 14ea67a..ec64cac 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -25,6 +25,7 @@ git-commit  mainporcelain common
 git-commit-tree plumbingmanipulators
 git-config  ancillarymanipulators
 git-count-objects   ancillaryinterrogators
+git-credential  purehelpers
 git-cvsexportcommit foreignscminterface
 git-cvsimport   foreignscminterface
 git-cvsserver   foreignscminterface
-- 
1.7.12.rc1.183.gb94da76



Re: merging confusion and question

2012-08-08 Thread Rich Pixley

Thank you.

I think the work flow here needs some work, but reset --hard gets me 
running again.  That should probably be mentioned in the error message.


--rich

On 8/7/12 18:43 , Junio C Hamano wrote:

Rich Pixley rich.pix...@palm.com writes:


I'm confused.

What is the intended work flow here?  Ie, aside from trashing my
repository and starting over, what does one do to recover?

rich@cobra git clone /home/rich/repos/webos webos
Cloning into 'webos'...
done.
rich@cobra cd webos
rich@cobra git remote add central g...@github.com:openwebos/webos.git
rich@cobra git co master
Already on 'master'
rich@cobra git pull central master
X11 forwarding request failed on channel 0
remote: Counting objects: 22, done.
remote: Compressing objects: 100% (19/19), done.
remote: Total 21 (delta 12), reused 11 (delta 2)
Unpacking objects: 100% (21/21), done.
 From github.com:openwebos/webos
  * branch            master     -> FETCH_HEAD
warning: Failed to merge submodule meta-webos (not checked out)
Auto-merging meta-webos
CONFLICT (submodule): Merge conflict in meta-webos
Auto-merging README.md
Automatic merge failed; fix conflicts and then commit the result.
rich@cobra git commit -a

Why isn't there any "fix conflicts and then" step between this line
and the friendly insn message on the previous line?


error: unable to index file meta-webos
fatal: updating files failed
rich@cobra git add meta-webos
error: unable to index file meta-webos
fatal: updating files failed
rich@cobra git rm meta-webos
meta-webos: needs merge
rm 'meta-webos'
fatal: git rm: 'meta-webos': Is a directory
rich@cobra git merge meta-webos
error: 'merge' is not possible because you have unmerged files.
hint: Fix them up in the work tree,
hint: and then use 'git add/rm <file>' as
hint: appropriate to mark resolution and make a commit,
hint: or use 'git commit -a'.
fatal: Exiting because of an unresolved conflict.

If you are not interested in mucking with meta-webos with this
merge, you would resolve meta-webos by taking either your (i.e. the
one that came from /home/rich/repos/webos) version or their
(i.e. the one that came from openwebos/webos.git) version.  Go back
to the state before "git pull central master" with "reset --hard",
init and update webos submodule, try the pull again and then "git
add webos" to resolve to your version, perhaps?



Re: Git does not handle changing inode numbers well

2012-08-08 Thread Junio C Hamano
Matthijs Kooijman matth...@stdin.nl writes:

 So, it seems that git just doesn't cope well with changing inode numbers
 because it checks the content in a first pass in refresh_index, but only
 checks the stat info in the second pass in run_diff_files. The reason it
 does work for regular files is the CE_UPTODATE optimization introduced in
 eadb5831: Avoid running lstat(2) on the same cache entry.

 So, let's see if I can fix my filesystem now ;-)

True.  We have knobs to cope with filesystems whose st_dev or
st_ctime are not stable, but there is no such knob to tweak for
st_ino.  Shouldn't be too hard to add such, though.  One approach is
to do something like the attached patch, and declare, define,
initialize, and set trust_inum in a way similar to how we handle
trust_ctime in the existing code.

 read-cache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index 2f8159f..6da99af 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -210,7 +210,7 @@ static int ce_match_stat_basic(struct cache_entry *ce, struct stat *st)
 	if (ce->ce_uid != (unsigned int) st->st_uid ||
 	    ce->ce_gid != (unsigned int) st->st_gid)
 		changed |= OWNER_CHANGED;
-	if (ce->ce_ino != (unsigned int) st->st_ino)
+	if (trust_inum && ce->ce_ino != (unsigned int) st->st_ino)
 		changed |= INODE_CHANGED;
 
 #ifdef USE_STDEV
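
The effect of such a knob can be tried in isolation with a toy version of the comparison. This is a sketch only: struct toy_stat, the TOY_* flags and toy_match_stat() are hypothetical stand-ins, and the default of 1 for trust_inum is an assumption modelled on trust_ctime.

```c
#include <assert.h>

#define TOY_OWNER_CHANGED 0x01u
#define TOY_INODE_CHANGED 0x02u

/* Hypothetical subset of the stat fields cached per index entry. */
struct toy_stat {
	unsigned int ino;
	unsigned int uid;
	unsigned int gid;
};

/* Assumed default: inode numbers are trusted unless configured otherwise. */
static int trust_inum = 1;

/* Returns a bitmask of the fields that differ between cache and disk. */
static unsigned int toy_match_stat(const struct toy_stat *cached,
				   const struct toy_stat *now)
{
	unsigned int changed = 0;

	assert(cached && now);
	if (cached->uid != now->uid || cached->gid != now->gid)
		changed |= TOY_OWNER_CHANGED;
	/* With the knob off, an inode mismatch is simply ignored. */
	if (trust_inum && cached->ino != now->ino)
		changed |= TOY_INODE_CHANGED;
	return changed;
}
```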


Re: fast-import error: fatal: 'refs/heads/master' - not a valid ref

2012-08-08 Thread Jeff King
On Wed, Aug 08, 2012 at 11:25:02AM +0400, Andrey Pavlenko wrote:

 I'm developing a remote helper which uses the fast-import stream for
 fetching. When I perform cloning git prints error message - fatal:
 'refs/heads/master' - not a valid ref, however the cloning completes
 normally. Each of my fast-import commit commands starts with a commit
 refs/heads/master header.
 
 What does this error message mean and how can I fix it?

What version of git are you using? The only command which produces that
exact message is git show-ref, and it is not called by current
versions of the cloning process (but it used to be in old versions).

-Peff


Bug with git-submodule and IFS

2012-08-08 Thread Andrew Dranse
Hi there,

I ran into an interesting bug with git submodules today.  It appears that if 
your IFS is not set to what git-submodule expects it to be (i.e. the standard 
IFS), it will break in a fun way.

Example:

$ git init
Initialized empty Git repository in /home/adranse/test/.git/
$ git submodule add github:/repos/perf
Cloning into 'perf'...
remote: Counting objects: 5744, done.
remote: Compressing objects: 100% (4627/4627), done.
remote: Total 5744 (delta 2400), reused 1579 (delta 343)
Receiving objects: 100% (5744/5744), 28.78 MiB | 4.56 MiB/s, done.
Resolving deltas: 100% (2400/2400), done.
$ export IFS="
"
$ git submodule update --init --recursive
No submodule mapping found in .gitmodules for path ''
$ unset IFS
$ git submodule update --init --recursive
Submodule 'perf' () registered for path 'perf'

As a solution, I would suggest setting IFS to the expected value before calling 
the git-submodule shell script.

Thanks,

Andrew Dranse
OANDA Corporation
adra...@oanda.com


[PATCH 0/4] update make check-docs

2012-08-08 Thread Jeff King
On Wed, Aug 08, 2012 at 09:58:33AM -0700, Junio C Hamano wrote:

 There really should be an easier way for the maintainer to notice
 this kind of glitch without being told (better yet, the submitter of
 a new command to notice it).  Perhaps the check-docs target in the
 Makefile needs some updating?

Hmm. We have a check-docs command? :)

This patch series at least brings that up to date. It goes on top of
Matthieu's patch.

  [1/4]: check-docs: mention gitweb specially
  [2/4]: check-docs: update non-command documentation list
  [3/4]: command-list: add git-sh-i18n
  [4/4]: command-list: mention git-credential-* helpers

-Peff


[PATCH 1/2] t9300: Add a test covering 'sub/testname' to 'sub/testname/testfile' renaming

2012-08-08 Thread Techlive Zheng
This test would fail at the moment.
---
 t/t9300-fast-import.sh | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 2fcf269..2a8368e 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -1039,6 +1039,37 @@ test_expect_success \
 	 git diff-tree -M -r M3^ M3 >actual &&
 	 compare_diff_raw expect actual'
 
+cat >input <<INPUT_END
+blob
+mark :1
+data 10
+test file
+
+reset refs/heads/M4
+commit refs/heads/M4
+mark :2
+committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+data 8
+initial
+M 100644 :1 testname
+
+commit refs/heads/M5
+mark :3
+committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+data 8
+initial
+from refs/heads/M4
+M 100644 :1 testname/testfile
+D testname
+
+INPUT_END
+
+test_expect_success \
+	'M: rename file into new subdirectory with same name' \
+	'git fast-import <input &&
+	 git checkout M5 &&
+	 test -d testname && test -f testname/testfile'
+
 ###
 ### series N
 ###
-- 
1.7.11.4



[PATCH 2/2] fast-import: Handle 'sub/testname' to 'sub/testname/testfile' renaming correctly

2012-08-08 Thread Techlive Zheng
The current git-fast-import does not correctly handle a commit stream
in which a file is deleted and, at the same time, a directory with the
same name is created. All paths under the newly created directory are
lost after the import.
---
 fast-import.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/fast-import.c b/fast-import.c
index eed97c8..8874b4b 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -1595,6 +1595,15 @@ static int tree_content_remove(
 				 * exist and need not be deleted.
 				 */
 				return 1;
+			if (!slash1 && S_ISREG(e->versions[0].mode) && S_ISDIR(e->versions[1].mode))
+				/*
+				 * If p names a file in some subdirectory and in
+				 * some commit that file got deleted, a directory
+				 * with the same name was set up in the same directory,
+				 * then there is no need to step into for further
+				 * iteration or deletion.
+				 */
+				return 0;
 			if (!slash1 || !S_ISDIR(e->versions[1].mode))
 				goto del_entry;
 			if (!e->tree)
-- 
1.7.11.4



Re: [PATCH 0/4] update make check-docs

2012-08-08 Thread Jeff King
On Wed, Aug 08, 2012 at 12:13:11PM -0700, Junio C Hamano wrote:

  Hmm. We have a check-docs command? :)
 
 Yes, and there also is a check-builtins target.  Perhaps the default
 build target should depend on them, as they are fairly lightweight?

I think they would want some refactoring. Right now the target does not
fail if there are errors. Any output it generates would typically scroll
by in the mass of other build data, so it would be easy to miss.

-Peff


Re: [PATCH/RFC v2 06/16] t3700: sleep for 1 second, to avoid interfering with the racy code

2012-08-08 Thread Junio C Hamano
Thomas Gummerer t.gumme...@gmail.com writes:

 On 08/05, Junio C Hamano wrote:
 Thomas Gummerer t.gumme...@gmail.com writes:
 
  The new git racy code uses the mtime of cache-entries to smudge
  a racy clean entry, and loads the work, of checking the file-system
 
 -ECANTPARSE.

 The git racy code for index-v5 uses the mtime of the cache-entries as
 smudge markers. The work of checking the file-system is loaded of to
 the reader.

OK, now I can parse, perhaps with either s/is loaded of/&f/ or
s/is loaded of/is offloaded/.

Thanks for clarifying the grammar.

But doesn't the current code make it the responsibility of the reader
to check the contents with ce_modified_check_fs() already?  You may
have switched st_size to st_mtime as the field to mark a racily
clean entry, but it is unclear how that change affects anything.

  if the entry has really changed, off to the reader. This interferes
  with this test, because the entry is racily smudged and thus has
  mtime 0. We wait 1 second to avoid smudging the entry and getting
  correct test results.
 
 Mild NAK, especially it is totally unclear why you even need to muck
 with racy-git check in the current format of the index in the first
 place, and even if it were necessary, it is unclear why this cannot
 be done with test-chmtime.

 The racy-git code needs to be changed, to avoid problems when implementing
 the partial writing for index-v5. Otherwise it could cause problems, when
 we have entries that should be smudged, but are not due to the different
 racy algorithms.

Hrmph.  But if racy detection and checking is now a responsibility
of the later reader, the overall end result should be the same, no?
Perhaps the existing test was checking a wrong thing?

We should not care if the index still has racily clean entries, or
how that fact is marked in the index entry.  The primary thing we
care about is that we do not mistake an actual change as no change
due to raciness.
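
For readers unfamiliar with the term, the property at stake can be sketched in C. This is only an illustration under stated assumptions: timestamps have one-second resolution, and the struct and function names are hypothetical stand-ins, not Git's actual data structures. An entry counts as racily clean when its recorded mtime is not older than the time the index was written, so matching stat data alone cannot prove the file is unchanged.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical, simplified stand-ins for an index and one of its entries. */
struct demo_index {
	time_t written_at;   /* when the index file was last written */
};

struct demo_entry {
	time_t mtime;        /* file mtime recorded when the entry was added */
};

/*
 * An entry is "racily clean" when the file may have been modified in the
 * same second the index was written: matching stat data then proves
 * nothing, and the reader has to compare file contents instead.
 */
bool demo_is_racily_clean(const struct demo_index *idx,
			  const struct demo_entry *ce)
{
	return idx->written_at != 0 && idx->written_at <= ce->mtime;
}

/* Decide whether matching stat data is enough to call the entry unchanged. */
bool demo_can_trust_stat(const struct demo_index *idx,
			 const struct demo_entry *ce)
{
	return !demo_is_racily_clean(idx, ce);
}
```

Whichever field is used to mark such an entry, the observable requirement stays the same: a real change must never be reported as "no change".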

So whether done with sleep or test-chmtime, avoiding a racily
clean situation sounds like sweeping a bug in the v5 code in racy
situation under the rug to me (unless I am misunderstanding what
you are doing with this change and in your explanation, or the test
was checking a wrong thing, that is).

Even more confused



[PATCH v2 resend] gitk: Use an external icon file on Windows

2012-08-08 Thread Sebastian Schuberth
Git for Windows now ships with the new Git icon from git-scm.com. Use that
icon file instead of the old procedurally drawn one if it exists.

Signed-off-by: Sebastian Schuberth sschube...@gmail.com
---
 gitk-git/gitk | 49 ++---
 1 file changed, 26 insertions(+), 23 deletions(-)

diff --git a/gitk-git/gitk b/gitk-git/gitk
index 59693c0..5127e55 100755
--- a/gitk-git/gitk
+++ b/gitk-git/gitk
@@ -11664,7 +11664,6 @@ if { [info exists ::env(GITK_MSGSDIR)] } {
 set gitk_prefix [file dirname [file dirname [file normalize $argv0]]]
 set gitk_libdir [file join $gitk_prefix share gitk lib]
 set gitk_msgsdir [file join $gitk_libdir msgs]
-unset gitk_prefix
 }
 
 ## Internationalization (i18n) through msgcat and gettext. See
@@ -11821,28 +11820,32 @@ if {[expr {[exec git rev-parse --is-inside-work-tree] == true}]} {
 set worktree [exec git rev-parse --show-toplevel]
 setcoords
 makewindow
-catch {
-image create photo gitlogo  -width 16 -height 16
-
-image create photo gitlogominus -width  4 -height  2
-gitlogominus put #C00000 -to 0 0 4 2
-gitlogo copy gitlogominus -to  1 5
-gitlogo copy gitlogominus -to  6 5
-gitlogo copy gitlogominus -to 11 5
-image delete gitlogominus
-
-image create photo gitlogoplus  -width  4 -height  4
-gitlogoplus  put #008000 -to 1 0 3 4
-gitlogoplus  put #008000 -to 0 1 4 3
-gitlogo copy gitlogoplus  -to  1 9
-gitlogo copy gitlogoplus  -to  6 9
-gitlogo copy gitlogoplus  -to 11 9
-image delete gitlogoplus
-
-image create photo gitlogo32    -width 32 -height 32
-gitlogo32 copy gitlogo -zoom 2 2
-
-wm iconphoto . -default gitlogo gitlogo32
+if {$::tcl_platform(platform) eq {windows} && [file exists $gitk_prefix/etc/git.ico]} {
+wm iconbitmap . -default $gitk_prefix/etc/git.ico
+} else {
+catch {
+image create photo gitlogo  -width 16 -height 16
+
+image create photo gitlogominus -width  4 -height  2
+gitlogominus put #C00000 -to 0 0 4 2
+gitlogo copy gitlogominus -to  1 5
+gitlogo copy gitlogominus -to  6 5
+gitlogo copy gitlogominus -to 11 5
+image delete gitlogominus
+
+image create photo gitlogoplus  -width  4 -height  4
+gitlogoplus  put #008000 -to 1 0 3 4
+gitlogoplus  put #008000 -to 0 1 4 3
+gitlogo copy gitlogoplus  -to  1 9
+gitlogo copy gitlogoplus  -to  6 9
+gitlogo copy gitlogoplus  -to 11 9
+image delete gitlogoplus
+
+image create photo gitlogo32    -width 32 -height 32
+gitlogo32 copy gitlogo -zoom 2 2
+
+wm iconphoto . -default gitlogo gitlogo32
+}
 }
 # wait for the window to become visible
 tkwait visibility .
-- 
1.7.11.msysgit.2






Re: [PATCH/RFC v2 09/16] Read index-v5

2012-08-08 Thread Thomas Gummerer


On 08/08, Junio C Hamano wrote:
 Thomas Gummerer t.gumme...@gmail.com writes:
 
   +name = (char *)mmap + *dir_offset;
   +beginning = mmap + *dir_table_offset;
  
  Notice how you computed name with pointer arithmetic by first
  casting mmap (which is void *) and when computing beginning, you
  forgot to cast mmap and attempted pointer arithmetic with void *.
  The latter does not work and breaks compilation.
  
  The pointer-arith with void * is not limited to this function.
  ...
  I've used the type of the respective assignment for now. e.g. i have
  struct cache_header *hdr, so I'm using
  hdr = (struct cache_header *)mmap + x;
 
 You need to be careful when rewriting the above to choose the right
 value for 'x' if you go that route (which I wouldn't recommend).
 
 With
 
 hdr = ptr_add(mmap, x);
 
 you are making hdr point at x BYTES beyond mmap, but
 
 hdr = (struct cache_header *)mmap + x;
 
 means something entirely different, no?  hdr points at x entries
 of struct cache_header beyond mmap (in other words, if mmap[] were
 defined as struct cache_header mmap[], the above is saying the
 same as hdr = &mmap[x]).
 
 I think the way you casted to compute the value for the name
 pointer is the (second) right thing to do.  The cast (char *)
 applied to mmap is about "mmap is a typeless blob of memory I want
 to count bytes in.  Give me *dir_offset bytes into that blob."  It
 is not tied to the type of LHS (i.e. name) at all.  The result
 then needs to be casted to the type of LHS (i.e. name), and in
 this case the types happen to be the same, so you do not have to
 cast the result of the addition but that is mere luck.
 
 The next line is not so lucky and you would need to say something
 like:
 
 beginning = (uint32_t *)((char *)mmap + *dir_table_offset);
 
 Again, inner cast is about "mmap is a blob counted in bytes", the
 outer cast is about type mismatch between a byte-address and LHS of
 the assignment.

This is what I tried in v3 of the series, but it didn't seem quite
right.

 If mmap variable in this function were not void * but something
 more sane like const char *, you wouldn't have to have the inner
 cast to begin with, and that is why I said the way you did name is
 the second right thing.  Then you can write them like
 
 name = mmap + *dir_offset;
 beginning = (uint32_t *)(mmap + *dir_table_offset);
 
 After thinking about this, the ptr_add() macro might be the best
 solution, even though I originally called it as a band-aid.  We know
 mmap is a blob of memory, byte-offset of each component of which we
 know about, so we can say
 
 name = ptr_add(mmap, *dir_offset);
 beginning = ptr_add(mmap, *dir_table_offset);
 
 Hmmm..

I start to think so too. Casting the mmap variable to const char *
in the method call doesn't feel right to me, even though it would work.
Unless there are any objections I'll use ptr_add in the next version.
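
The difference between the two forms of arithmetic discussed above can be sketched as follows. The thread names ptr_add() but does not show its definition, so the macro below is an assumed one, and struct demo_header and the demo_* helpers are hypothetical names used only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical definition: the thread names ptr_add() but does not show it.
 * Advance a pointer by n BYTES, whatever type it points at.
 */
#define ptr_add(ptr, n) ((void *)((char *)(ptr) + (n)))

/* Stand-in for a fixed-size on-disk header made of three 32-bit fields. */
struct demo_header {
	uint32_t signature;
	uint32_t version;
	uint32_t entries;
};

/* Distance in bytes from base to p. */
ptrdiff_t demo_byte_distance(const void *base, const void *p)
{
	return (const char *)p - (const char *)base;
}

/* Byte distance reached when x is treated as a byte offset. */
ptrdiff_t demo_offset_in_bytes(void *map, size_t x)
{
	struct demo_header *hdr = ptr_add(map, x);
	return demo_byte_distance(map, hdr);
}

/* Byte distance reached when x is scaled by sizeof(struct demo_header). */
ptrdiff_t demo_offset_in_headers(void *map, size_t x)
{
	struct demo_header *hdr = (struct demo_header *)map + x;
	return demo_byte_distance(map, hdr);
}
```

With a suitably aligned buffer and x = 4, the first helper lands 4 bytes into the buffer while the second lands 4 * sizeof(struct demo_header) bytes in, which is why casting to the type of the left-hand side before adding changes the meaning of the offset.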


[PATCH 5/4] check-docs: factor out command-list

2012-08-08 Thread Jeff King
The check-docs command list is composed from several
Makefile variables plus some special cases. Let's make the
meaning of the list more obvious and avoid repeating
ourselves by factoring it out.

Signed-off-by: Jeff King p...@peff.net
---
 Makefile | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 41d9db8..6ae868d 100644
--- a/Makefile
+++ b/Makefile
@@ -2804,8 +2804,12 @@ endif
 
 ### Check documentation
 #
+ALL_COMMANDS = $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS)
+ALL_COMMANDS += git
+ALL_COMMANDS += gitk
+ALL_COMMANDS += gitweb
 check-docs::
-   @(for v in $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) git gitk gitweb; \
+   @(for v in $(ALL_COMMANDS); \
do \
case $$v in \
git-merge-octopus | git-merge-ours | git-merge-recursive | \
@@ -2858,7 +2862,7 @@ check-docs::
documented,gitweb.conf | \
sentinel,not,matching,is,ok ) continue ;; \
esac; \
-   case " $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) git gitk gitweb " in \
+   case " $(ALL_COMMANDS) " in \
*" $$cmd "*) ;; \
*) echo "removed but $$how: $$cmd" ;; \
esac; \
-- 
1.7.12.rc2.36.gb1dc81b



[PATCH 6/4] check-docs: list git-gui as a command

2012-08-08 Thread Jeff King
git-gui is already documented and mentioned in command-list,
but adding it to the Makefile makes sure it is so. We also
add its alias git-citool (which is also documented).

As a result, we can drop them from the special case
statement that avoids them being listed as documented but
does not exist.

Signed-off-by: Jeff King p...@peff.net
---
 Makefile | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 6ae868d..4b3c366 100644
--- a/Makefile
+++ b/Makefile
@@ -2808,6 +2808,7 @@ ALL_COMMANDS = $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS)
 ALL_COMMANDS += git
 ALL_COMMANDS += gitk
 ALL_COMMANDS += gitweb
+ALL_COMMANDS += git-gui git-citool
 check-docs::
@(for v in $(ALL_COMMANDS); \
do \
@@ -2837,8 +2838,6 @@ check-docs::
) | while read how cmd; \
do \
case $$how,$$cmd in \
-   *,git-citool | \
-   *,git-gui | \
*,git-help | \
documented,gitattributes | \
documented,gitignore | \
-- 
1.7.12.rc2.36.gb1dc81b



[PATCH 7/4] check-docs: drop git-help special-case

2012-08-08 Thread Jeff King
The check-docs target special-cases git-help to avoid
mentioning it as documented but removed. This dates back
to the early implementation of git-help, when its code was
simply included inside git.c.

These days it is a full-fledged builtin (in builtin/help.c)
and does not need special-casing.

Signed-off-by: Jeff King p...@peff.net
---
 Makefile | 1 -
 1 file changed, 1 deletion(-)

diff --git a/Makefile b/Makefile
index 4b3c366..b9da511 100644
--- a/Makefile
+++ b/Makefile
@@ -2838,7 +2838,6 @@ check-docs::
) | while read how cmd; \
do \
case $$how,$$cmd in \
-   *,git-help | \
documented,gitattributes | \
documented,gitignore | \
documented,gitmodules | \
-- 
1.7.12.rc2.36.gb1dc81b



Re: [PATCH/RFC v2 06/16] t3700: sleep for 1 second, to avoid interfering with the racy code

2012-08-08 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

 So whether done with sleep or test-chmtime, avoiding a racily
 clean situation sounds like sweeping a bug in the v5 code in racy
 situation under the rug to me (unless I am misunderstanding what
 you are doing with this change and in your explanation, or the test
 was checking a wrong thing, that is).

 Even more confused

OK, after staring at this test for a long time, and going back to
3d1f148 (refresh_index: do not show unmerged path that is outside
pathspec, 2012-02-17), I give up.

Let me ask the same question in a more direct way.  Which part of
this test breaks with your series?

test_expect_success 'git add --refresh with pathspec' '
	git reset --hard &&
	echo >foo && echo >bar && echo >baz &&
	git add foo bar baz && H=$(git rev-parse :foo) && git rm -f foo &&

	echo "100644 $H 3	foo" | git update-index --index-info &&
	# "sleep 1 &&" in the update here ...
	test-chmtime -60 bar baz &&
	>expect &&
	git add --refresh bar >actual &&
	test_cmp expect actual &&

	git diff-files --name-only >actual &&
	! grep bar actual &&
	grep baz actual
'

We prepare an index with a bunch of paths, we make foo unmerged, we
smudge bar and baz stat-dirty, so that diff-files would report
them, even though their contents match what is recorded in the
index.

Then we say "git add --refresh bar".  As far as I know, the output
from "git add --refresh <pathspec>" is limited to "foo: needs merge"
if and only if foo is covered by <pathspec> and foo is unmerged.

Side note: If "--verbose" is given to the same command, we
also give "Unstaged changes after refreshing the index:"
followed by "M foo" or "U foo" if foo does not match the
index but is not unmerged, or if foo is unmerged, again if
and only if foo is covered by <pathspec>.  But that is not
how we invoke "git add --refresh" in this test.

So if you are getting a test failure from the test_cmp, wouldn't it
mean that your series broke what 3d1f148 did (namely, make sure we
report only on paths that are covered by pathspec, in this case
bar), as the contents of bar in the working tree matches what is
recorded in the index?

If the failure you are seeing is that bar appears in the output of
git diff-files --name-only, it means that diff-files noticed
that bar is stat-dirty after git add --refresh bar.  Wouldn't it
mean that the series broke git add --refresh bar in such a way
that it does not refresh what it was told to refresh?

Another test that could fail after the point you added sleep 1 is
that the output from git diff-files --name-only fails to list
baz in its output, but with test-chmtime -60 bar baz, we made
sure that bar and baz are stat-dirty, and we only refreshed
bar and not baz.  If that is the case, then would it mean that
the series broke git add --refresh bar in such a way that it
refreshes something other than what it was told to refresh?

In any case, having to change this test in any way smells like there
is some breakage in the series; it is not immediately obvious to me
that the current test is checking anything wrong as I suspected in
the earlier message.

So,... I dunno.



[PATCH 8/4] check-docs: get documented command list from Makefile

2012-08-08 Thread Jeff King
The current code tries to get a list of documented commands
by doing ls Documentation/git*txt and culling a bunch of
special cases from the result. Looking for git-*.txt would
be more accurate, but would miss a few commands like
gitweb and gitk.

Fortunately, Documentation/Makefile already knows what this
list is, so we can just ask it. Annoyingly, we still have to
post-process its output a little, since make will print
extra cruft like GIT-VERSION-FILE is up to date to stdout.

Now that our list is accurate, we can remove all of the ugly
special-cases.

Signed-off-by: Jeff King p...@peff.net
---
 Documentation/Makefile |  3 +++
 Makefile   | 26 ++
 2 files changed, 5 insertions(+), 24 deletions(-)

diff --git a/Documentation/Makefile b/Documentation/Makefile
index 063fa69..cf5916f 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -344,4 +344,7 @@ require-htmlrepo::
 quick-install-html: require-htmlrepo
'$(SHELL_PATH_SQ)' ./install-doc-quick.sh $(HTML_REPO) $(DESTDIR)$(htmldir)
 
+print-man1:
+   @for i in $(MAN1_TXT); do echo $$i; done
+
 .PHONY: FORCE
diff --git a/Makefile b/Makefile
index b9da511..51b3c6f 100644
--- a/Makefile
+++ b/Makefile
@@ -2832,34 +2832,12 @@ check-docs::
sed -e '/^#/d' \
-e 's/[ ].*//' \
-e 's/^/listed /' command-list.txt; \
-   ls -1 Documentation/git*txt | \
+   $(MAKE) -C Documentation print-man1 | \
+   grep '\.txt$$' | \
sed -e 's|Documentation/|documented |' \
-e 's/\.txt//'; \
) | while read how cmd; \
do \
-   case $$how,$$cmd in \
-   documented,gitattributes | \
-   documented,gitignore | \
-   documented,gitmodules | \
-   documented,gitcli | \
-   documented,git-tools | \
-   documented,gitcore-tutorial | \
-   documented,gitcvs-migration | \
-   documented,gitdiffcore | \
-   documented,gitglossary | \
-   documented,githooks | \
-   documented,gitrepository-layout | \
-   documented,gitrevisions | \
-   documented,gittutorial | \
-   documented,gittutorial-2 | \
-   documented,git-bisect-lk2009 | \
-   documented,git-remote-helpers | \
-   documented,gitworkflows | \
-   documented,gitcredentials | \
-   documented,gitnamespaces | \
-   documented,gitweb.conf | \
-   sentinel,not,matching,is,ok ) continue ;; \
-   esac; \
case " $(ALL_COMMANDS) " in \
*" $$cmd "*) ;; \
*) echo "removed but $$how: $$cmd" ;; \
-- 
1.7.12.rc2.36.gb1dc81b


Re: [PATCH] Enable HAVE_DEV_TTY for Solaris

2012-08-08 Thread Jeff King
On Wed, Aug 08, 2012 at 04:13:03PM +0200, Erik Faye-Lund wrote:

 On Tue, Aug 7, 2012 at 6:10 AM, Jeff King p...@peff.net wrote:
  Subject: [PATCH] terminal: seek when switching between reading and writing
 
  When a stdio stream is opened in update mode (e.g., w+),
  the C standard forbids switching between reading or writing
  without an intervening positioning function. Many
  implementations are lenient about this, but Solaris libc
  will flush the recently-read contents to the output buffer.
  In this instance, that meant writing the non-echoed password
  that the user just typed to the terminal.
 
  Fix it by inserting a no-op fseek between the read and
  write.
 
 My Windows-patches for git_terminal_prompt would probably also solve
 this problem. Instead of opening a read-write handle to /dev/tty, they
 open two handles to the terminal instead; one for reading and one for
 writing. This is because the terminal cannot be opened in read-write
 mode on Windows (we need to open CONIN$ and CONOUT$ separately).

Yeah, it would solve it, although it means opening /dev/tty twice (which
is probably not a big deal, though). I'm fine if we go that way in the
long run to share implementations, but let's treat it as a separate
topic.  This fix is an obvious one-liner, and is just fixing me being
stupid about actually following the C standard. So it's a no-brainer
as a maintenance fix.
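
The rule in question can be sketched in isolation. The helper names below are hypothetical, and the demonstration runs against a temporary file rather than a real terminal, so it shows only the required ordering of calls, not the Solaris behaviour described in the commit message.

```c
#include <stdio.h>

/*
 * Read one line from a stream opened in update mode ("w+" or "r+"), then
 * write a reply to the same stream. Returns 0 on success, -1 on error.
 */
int demo_read_then_write(FILE *fp, char *line, int len, const char *reply)
{
	if (!fgets(line, len, fp))
		return -1;

	/*
	 * No-op seek. The C standard requires a positioning call (or input
	 * hitting end-of-file) before output may follow input on one stream.
	 * The return value is ignored in this sketch; real code would have
	 * to decide how to treat a failure on a non-seekable device.
	 */
	fseek(fp, 0L, SEEK_CUR);

	if (fputs(reply, fp) == EOF)
		return -1;
	return fflush(fp) == 0 ? 0 : -1;
}

/* Exercise the helper on a temporary file instead of a real terminal. */
int demo_roundtrip(char *line, int len)
{
	FILE *fp = tmpfile();   /* opened in "wb+" update mode */
	int ret;

	if (!fp)
		return -1;
	fputs("secret\nplaceholder\n", fp);
	rewind(fp);             /* positioning call: reading may now follow writing */
	ret = demo_read_then_write(fp, line, len, "ok\n");
	fclose(fp);
	return ret;
}
```

The same requirement applies in the other direction: output must be followed by fflush() or a positioning call before input, which is what the rewind() in the demo provides.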

 You can have a look at the series here if you're interested:
 https://github.com/kusma/git/tree/work/terminal-cleanup
 
 That last patch is the reason why I haven't submitted the series yet,
 but perhaps some of the preparatory patches could be worth-while for
 other platforms in the mean time?

Yeah, that last patch is really gross. There's no explanation of the
race issue, so I'll refrain from thinking about it until you are ready
to post a series. :)

-Peff


Re: [PATCH v2 resend] gitk: Use an external icon file on Windows

2012-08-08 Thread Junio C Hamano
Sebastian Schuberth sschube...@gmail.com writes:

 Git for Windows now ships with the new Git icon from git-scm.com. Use that
 icon file instead of the old procedurally drawn one if it exists.

 Signed-off-by: Sebastian Schuberth sschube...@gmail.com
 ---

Forwarding a misdirected patch to the maintainer who is free to pick
or ignore.

Personally I am negative on it (nobody on the list asked for the
new Git icon as far as I recall), but my voice on this counts just
as little as others.

Thanks.

  gitk-git/gitk | 49 ++---
  1 file changed, 26 insertions(+), 23 deletions(-)

 diff --git a/gitk-git/gitk b/gitk-git/gitk
 index 59693c0..5127e55 100755
 --- a/gitk-git/gitk
 +++ b/gitk-git/gitk
 @@ -11664,7 +11664,6 @@ if { [info exists ::env(GITK_MSGSDIR)] } {
  set gitk_prefix [file dirname [file dirname [file normalize $argv0]]]
  set gitk_libdir [file join $gitk_prefix share gitk lib]
  set gitk_msgsdir [file join $gitk_libdir msgs]
 -unset gitk_prefix
  }
  
  ## Internationalization (i18n) through msgcat and gettext. See
 @@ -11821,28 +11820,32 @@ if {[expr {[exec git rev-parse --is-inside-work-tree] == true}]} {
  set worktree [exec git rev-parse --show-toplevel]
  setcoords
  makewindow
 -catch {
 -image create photo gitlogo  -width 16 -height 16
 -
 -image create photo gitlogominus -width  4 -height  2
 -gitlogominus put #C00000 -to 0 0 4 2
 -gitlogo copy gitlogominus -to  1 5
 -gitlogo copy gitlogominus -to  6 5
 -gitlogo copy gitlogominus -to 11 5
 -image delete gitlogominus
 -
 -image create photo gitlogoplus  -width  4 -height  4
 -gitlogoplus  put #008000 -to 1 0 3 4
 -gitlogoplus  put #008000 -to 0 1 4 3
 -gitlogo copy gitlogoplus  -to  1 9
 -gitlogo copy gitlogoplus  -to  6 9
 -gitlogo copy gitlogoplus  -to 11 9
 -image delete gitlogoplus
 -
 -image create photo gitlogo32    -width 32 -height 32
 -gitlogo32 copy gitlogo -zoom 2 2
 -
 -wm iconphoto . -default gitlogo gitlogo32
 +if {$::tcl_platform(platform) eq {windows} && [file exists $gitk_prefix/etc/git.ico]} {
 +wm iconbitmap . -default $gitk_prefix/etc/git.ico
 +} else {
 +catch {
 +image create photo gitlogo  -width 16 -height 16
 +
 +image create photo gitlogominus -width  4 -height  2
 +gitlogominus put #C00000 -to 0 0 4 2
 +gitlogo copy gitlogominus -to  1 5
 +gitlogo copy gitlogominus -to  6 5
 +gitlogo copy gitlogominus -to 11 5
 +image delete gitlogominus
 +
 +image create photo gitlogoplus  -width  4 -height  4
 +gitlogoplus  put #008000 -to 1 0 3 4
 +gitlogoplus  put #008000 -to 0 1 4 3
 +gitlogo copy gitlogoplus  -to  1 9
 +gitlogo copy gitlogoplus  -to  6 9
 +gitlogo copy gitlogoplus  -to 11 9
 +image delete gitlogoplus
 +
 +image create photo gitlogo32    -width 32 -height 32
 +gitlogo32 copy gitlogo -zoom 2 2
 +
 +wm iconphoto . -default gitlogo gitlogo32
 +}
  }
  # wait for the window to become visible
  tkwait visibility .


Re: [PATCH] Add Code Compare v2.80.4 as a merge / diff tool for Windows

2012-08-08 Thread Junio C Hamano
Sebastian Schuberth sschube...@gmail.com writes:

 Code Compare is a commercial file comparison tool for Windows, see

 http://www.devart.com/codecompare/

 Version 2.80.4 added support for command line arguments preceded by a
 dash instead of a slash. This is required for Git for Windows because
 slashes in command line arguments get mangled with according to these
 rules:

 http://www.mingw.org/wiki/Posix_path_conversion

 Signed-off-by: Sebastian Schuberth sschube...@gmail.com
 ---
  Documentation/merge-config.txt |  8 
  contrib/completion/git-completion.bash |  2 +-
  git-mergetool--lib.sh  |  2 +-
  mergetools/codecompare | 25 +
  4 files changed, 31 insertions(+), 6 deletions(-)
  create mode 100644 mergetools/codecompare

 diff --git a/Documentation/merge-config.txt b/Documentation/merge-config.txt
 index 861bd6f..e9e0d55 100644
 --- a/Documentation/merge-config.txt
 +++ b/Documentation/merge-config.txt
 @@ -54,10 +54,10 @@ merge.stat::
  merge.tool::
   Controls which merge resolution program is used by
   linkgit:git-mergetool[1].  Valid built-in values are: araxis,
 - bc3, diffuse, ecmerge, emerge, gvimdiff, kdiff3, meld,
 - opendiff, p4merge, tkdiff, tortoisemerge, vimdiff
 - and xxdiff.  Any other value is treated is custom merge tool
 - and there must be a corresponding mergetool.tool.cmd option.
 + bc3, codecompare, diffuse, ecmerge, emerge, gvimdiff,
 + kdiff3, meld, opendiff, p4merge, tkdiff, tortoisemerge,
 + vimdiff and xxdiff.  Any other value is treated is custom merge
 + tool and there must be a corresponding mergetool.tool.cmd option.
  
  merge.verbosity::
   Controls the amount of output shown by the recursive merge

I do not have a strong reason to vote for or against inclusion of
yet another tool as mergetool backends (read: Meh), but what this
patch does to Documentation/merge-config.txt is actively unwelcome.

As we discussed earlier in

http://thread.gmane.org/gmane.comp.version-control.git/201913/focus=201976

the longer term direction is to reduce the names of tools listed
there.

I am somewhat saddened to find your name in that thread; you should
have been aware of that discussion when you wrote this patch.


Re: [PATCH 8/4] check-docs: get documented command list from Makefile

2012-08-08 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 The current code tries to get a list of documented commands
 by doing ls Documentation/git*txt and culling a bunch of
 special cases from the result. Looking for git-*.txt would
 be more accurate, but would miss a few commands like
 gitweb and gitk.

 Fortunately, Documentation/Makefile already knows what this
 list is, so we can just ask it. Annoyingly, we still have to
 post-process its output a little, since make will print
 extra cruft like GIT-VERSION-FILE is up to date to stdout.

Yeah, traditional way to do this is to give special markers around
what you want your Makefile to tell you, e.g.

sayit:
echo "@@@ $(FOO) ###"
useit:
$(MAKE) sayit | \
sed -ne 's/^@@@ \(.*\) ###$/\1/p' | \
... use it ...

but in this case we know we want *.txt, so the way you filtered
the output is sufficient.
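
For comparison, the marker-based filtering performed by the sed expression above can be written out as a small C helper. This is only a sketch: demo_extract_marked is a hypothetical name, and it assumes the input line has already had its trailing newline removed.

```c
#include <string.h>

/*
 * If line has the form "@@@ <payload> ###", copy <payload> into out and
 * return 1; otherwise return 0 and leave out untouched. The line must not
 * include a trailing newline.
 */
int demo_extract_marked(const char *line, char *out, size_t outlen)
{
	static const char head[] = "@@@ ";
	static const char tail[] = " ###";
	size_t hlen = sizeof(head) - 1, tlen = sizeof(tail) - 1;
	size_t len = strlen(line), plen;

	if (len < hlen + tlen)
		return 0;
	if (strncmp(line, head, hlen) != 0)
		return 0;
	if (strcmp(line + len - tlen, tail) != 0)
		return 0;

	plen = len - hlen - tlen;
	if (plen + 1 > outlen)
		return 0;       /* payload would not fit, treat as no match */
	memcpy(out, line + hlen, plen);
	out[plen] = '\0';
	return 1;
}
```

Any line that make prints without both markers is rejected, so unrelated status messages cannot leak into the list.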

 Now that our list is accurate, we can remove all of the ugly
 special-cases.

 Signed-off-by: Jeff King p...@peff.net
 ---
  Documentation/Makefile |  3 +++
  Makefile   | 26 ++

Yay, maintainability comes with a large line reduction bonus ;-)

Thanks.


Re: [PATCH 2/4] check-docs: update non-command documentation list

2012-08-08 Thread Philip Oakley
- Original Message - 
From: Jeff King p...@peff.net

To: Junio C Hamano gits...@pobox.com
Cc: Matthieu Moy matthieu@imag.fr; git@vger.kernel.org
Sent: Wednesday, August 08, 2012 9:54 PM
Subject: Re: [PATCH 2/4] check-docs: update non-command documentation list




On Wed, Aug 08, 2012 at 12:24:29PM -0700, Junio C Hamano wrote:


Jeff King p...@peff.net writes:

 The check-docs target looks at Documentation/git*txt and
 complains if any entry does not have a matching command.
 Therefore we need to explicitly ignore any entries which are
 not meant to describe a command (like gitattributes.txt).
 This list has grown stale over time, so let's bring it up to
 date.

 Signed-off-by: Jeff King p...@peff.net
 ---
 I really wonder if we would do better to match git-*.txt, since most of
 the ignores are gitfoo(7) types of pages. We'd probably want to add back
 in git, gitweb and gitk explicitly, but they are already handled
 specially above and below.

Quite possibly, yes.


Actually, my already handled specially is not quite accurate. That
special list is things that are commands but are not necessarily
mentioned in the Makefile variables. But this list is things that are
documented but do not begin with git-. The two should mostly be the
same, but the whole point of this exercise is to make sure they _are_
the same.

A better solution is to simply ask the Documentation directory what the
commands are, since it already knows (in the form of MAN1_TXT).


Also git gitk gitweb may want to be made into a Makefile variable
to be shared in the above and below (I do not know what to call
them offhand---they are programs with special build rules that are
not covered by ALL/SCRIPT_LIB/BUILTIN).


I couldn't think of a special name, either, but I think it is sufficient
to just create a new ALL_COMMANDS variable that includes those other
things, and then add to it.


By the way, do we have a documentation for git-gui?  Perhaps it may
want to be added to that git gitk gitweb list as a reminder that
it lacks documentation.  One of the goals of the person who runs
make check-docs should be to reduce the special case that appears
at the beginning of that case statement.


Yes, it should be checked (and git-citool, too).


I also wonder why help is not treated as a built-in?  Perhaps we
should throw it in to git gitk gitweb list?  After all, it is a
command that is available in git foo form, is documented, and is
listed in the command-list.txt file.


One issue I noticed a few weeks ago is that `git help --all` does not
list all of the available git help pages; rather, it limits itself
to the available command pages.

This means that new users can't discover those additional help pages in
any easy manner.

I had an initial look at what might be involved in adding a --guides
option, shifting the current --all to --cmd (or --command), and then
making --all list both commands and guides.

The need for help to list all the guides is parallel to these patches. I
didn't get that far in working out how to approach a patch that would
discover the available guides - I'm on GfW-msysgit, which normally
uses web display.




Historically it was part of git.c, but these days it is a built-in and
does not need any special treatment from check-docs.

Patches for all to follow (on top of my previous 4).

 [5/4]: check-docs: factor out command-list
 [6/4]: check-docs: list git-gui as a command
 [7/4]: check-docs: drop git-help special-case
 [8/4]: check-docs: get documented command list from Makefile

-Peff


Re: [PATCH/RFC v2 0/16] Introduce index file format version 5

2012-08-08 Thread Junio C Hamano
Thomas Rast tr...@inf.ethz.ch writes:

 Junio C Hamano gits...@pobox.com writes:

 Thomas Rast tr...@student.ethz.ch writes:

 I like the general idea, too, but I think there is a long way ahead, and
 we shouldn't hold up v5 on this.

 We shouldn't rush, only to keep some deadline, and regret it later
 that we butchered the index format without thinking things through.
 When this was added to the GSoC idea page, I already said upfront
 that this was way too big a topic to be a GSoC project, didn't I?

 Let me spell out my concern.  There are two v5s here:

 * The extent of the GSoC task.

 * The eventual implementation of index-v5 that goes into Git mainline.

 IMHO this thread is mixing up the two.  There indeed must not be any
 rush in the final implementation of index-v5.  However, the GSoC ends in
 less than two weeks, and I have to evaluate Thomas on whatever is
 finished until then.

This is the primary reason why I have recused myself from the Mentor
pool.  My involvement in this thread is mostly about the latter.  It
is not like I do not really care about GSoC, but the maintainer
works for what is best for the project, not for GSoC schedule.

 AFAIK Thomas is now cleaning up the existing code to be in readable
 shape, using your feedback, which is great.  However, the above
 suggestion is such a fuzzily-specified task that there is no way to even
 find out what needs to be done within the next two weeks.

Yes, it is the mentor's job to (1) keep an eye on the progress of
the student, (2) avoid giving a task that is too big to chew within
the given timeframe, and (3) help the student learn the skill to
break down large tasks to manageable pieces.

 Perhaps it
 makes sense, at this point, to wrap anything that ended up having _v[25]
 suffixes in an index_ops like Duy did.

Yes, I think that suggestion was a welcome input for the mentor and
the student (item (3) above).

 That's a long way from actually
 following through on the idea, though.

I think that is perfectly fine, both from the point of view of the
project maintainer (who officially does not give a whit about GSoC
schedule) and from the point of view of somebody who cares about the
health of the development community (and as one part of it, cares
about the GSoC student project).

If Git GSoC admins initially picked a project that is too large by
mistake, finishing a subpart of it that is of reasonable size and
polishing the result into a nice shape would be the best the student
can do, and the grading should be done on the quality of that
subtask alone.  It may not directly help the project without the
remainder, but that is not the student's fault.  But as I am not
part of the Mentor pool, what I wrote in this paragraph is just my
opinion.

 I think the part you snipped

 the loops that iterate over the index [...] either
 skip unmerged entries or specifically look for them.  There are subtle
 differences between the loops on many points: what do they do when they
 hit an unmerged entry?  Or a CE_REMOVED or CE_VALID one?

 is a symptom of the same general problem: the data structures are sound,
 but they are leaking all over the code and now we have lots of
 complexity to do even simple operations like "for each unmerged entry".

I do not think I was arguing against an updated cleaner API, so we
are in agreement.  In fact, I was saying that the calling code
should be ported to such a cleaner API and in-core data structure
first, and only then an optimal on-disk representation of the
in-core data structure can be designed.

The mistaken title of this GSoC topic was one of the root causes of
the issues you are seeing, I think.  It said "faster file format",
but file format is a result of a design of the code that uses the
data, not the other way around.

That, and also the project scope is too large for a summer student
project as I said in the very beginning.


Re: [PATCH 2/4] check-docs: update non-command documentation list

2012-08-08 Thread Junio C Hamano
Philip Oakley philipoak...@iee.org writes:

 One issue I noticed a few weeks ago is that `git help --all` does not
 list all of the available git help pages, rather it just limits itself
 to the available command pages.

 This means that new users can't discover those additional help pages
 in any easy manner.

That would be a problem _only_ if these additional help pages are
of importance for new users.  I do not think things that come from
Documentation/technical and ARTICLES (in Documentation/Makefile)
qualify as such.

I'd be perfectly happy as long as all documents are reachable from
git.html in html-fied documentation (the man pages have equivalent
cross references, I think).


Re: [PATCH] git svn: reset invalidates the memoized mergeinfo caches

2012-08-08 Thread Eric Wong
Peter Baumann waste.mana...@gmx.de wrote:
 On Tue, Aug 07, 2012 at 08:45:10PM +, Eric Wong wrote:
  Peter Baumann waste.mana...@gmx.de wrote:
   + for my $suffix (qw(yaml db)) {
   + unlink("$cache_file.$suffix");
  
  Need to check for unlink() errors (and ignore ENOENT).
 
 I'm not sure what you mean here: Aren't we screwed either way if unlinking
 the file failed? There is nothing we can do about it if e.g. the user doesn't
 have the permissions to delete the file, besides terminating, e.g.
 
   for my $cache_file (("$cache_path/lookup_svn_merge",
                        "$cache_path/check_cherry_pick",
                        "$cache_path/has_no_changes")) {
   for my $suffix (qw(yaml db)) {
   next unless (-e "$cache_file.$suffix");
   unlink("$cache_file.$suffix") or
   die "Failed to delete $cache_file.$suffix";
   }

Yes we're screwed, but silent failure is the worst way to fail,
especially if it can lead us back to the problems your patch is meant to
address.

Perhaps something like this (with $! to show the error):

my $file = "$cache_file.$suffix";
next unless -e $file;
unlink($file) or die "unlink($file) failed: $!\n";
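
For comparison, the "check for unlink() errors (and ignore ENOENT)" idea
can also be written without the separate existence test, by inspecting
errno after the call. This is only a sketch in C, not git-svn code; the
helper name remove_cache_file() is made up for illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of git or git-svn): remove a cache
 * file, treating "file does not exist" as success.
 * Returns 0 on success, -1 on any other failure, after reporting it.
 */
static int remove_cache_file(const char *path)
{
	if (unlink(path) == 0 || errno == ENOENT)
		return 0;
	fprintf(stderr, "unlink(%s) failed: %s\n", path, strerror(errno));
	return -1;
}
```

Checking errno instead of testing for the file first avoids the small
window in which the file could disappear between the test and the unlink.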


Re: [PATCH/RFC] git svn: handle errors and concurrent commits in dcommit

2012-08-08 Thread Eric Wong
Robert Luberda rob...@debian.org wrote:
 Eric Wong wrote:
  +  echo "PATH=\"$PATH\"; export PATH" >> $hook
  +  echo "svnconf=\"$svnconf\"" >> $hook
  +  cat >> $hook <<- 'EOF2'
  +  cd work-auto-commits.svn
  +  svn up --config-dir $svnconf
  
  That doesn't seem to interact well with users who depend on
  svn_cmd pointing to something non-standard.  Not sure
  what to do about it, though

 I have no idea how to change it either. I've tried to source the
 lib-git-svn.sh file inside the hook, but it sources test-lib.sh, and the
 latter script doesn't work well if it is sourced by non-test script.
 Anyway, I left that part of my original patch unchanged.

Ah, so svn_cmd only cares about --config-dir and you already handled
that :)   I misremembered it also allowed for non-standard SVN
installations :x

I've pushed your updated patch to my maint branch on
git://bogomips.org/git-svn since master has larger pending changes.

Thanks!


Re: [PATCH/RFC v2 0/16] Introduce index file format version 5

2012-08-08 Thread Nguyen Thai Ngoc Duy
On Wed, Aug 8, 2012 at 11:31 PM, Junio C Hamano gits...@pobox.com wrote:
 The current code that accesses the nth entry via index->cache[nth]
 would need to be updated to use an accessor function, whether the
 nth comes from index_name_pos() or from the for-loop that iterates
 over the entire index.  For the latter, you would need to give the
 users a function that returns a cursor into the in-core index to
 allow iterating over it.

 When you use an in-core representation that is not a flat array, the
 type of nth, which is essentially a cursor, may have to change to
 something that is richer than a simple integer, in order to give the
 implementation of the in-core index a more efficient way to access
 the entry than traversing the leaves of the tree depth first, and
 you would need to update index_name_pos() to return such a cursor.
 That design and development cost is part of updating the in-core
 data structure. In the end result, the runtime cost to manipulate an
 index entry that the cursor refers to should be minimum, as that
 would be the cost paid by all the users of the API anyway, even if
 we _were_ starting from an ideal world where there weren't any flat
 in-core index in the first place.

Interesting. So you hide the entire tree walk behind the cursor
concept. And we can make pathspec filter as part of cursor
initialization. Index iteration code this way looks really neat
(compared to how we traverse SHA-1 trees nowadays). The hard part
is updating the index while iterating (or avoiding running into such a
situation). Maybe C++ STL has done it already with std::map::iterator.
I fear that by hiding the trees, we might miss some optimization
opportunities. But I haven't figured it all out yet so I may be wrong.
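
To make the cursor idea a bit more concrete, here is a rough sketch in C.
It is not git code: struct toy_entry, struct toy_index, cursor_init() and
cursor_next() are all invented names, the "pathspec" is reduced to a plain
leading-path prefix, and the backing store is still a flat array. The point
is only that callers see the cursor, not the layout behind it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the in-core index types. */
struct toy_entry {
	const char *name;
	int stage;		/* 0 = merged, 1..3 = unmerged */
};

struct toy_index {
	const struct toy_entry *entries;
	size_t nr;
};

/*
 * Callers treat this as opaque.  Today it wraps an array offset; with a
 * tree-shaped in-core index it could hold a tree position instead.
 */
struct index_cursor {
	const struct toy_index *idx;
	size_t pos;
	const char *prefix;	/* NULL means "match everything" */
};

/* The path filter is fixed once, when the cursor is set up. */
static void cursor_init(struct index_cursor *c, const struct toy_index *idx,
			const char *prefix)
{
	c->idx = idx;
	c->pos = 0;
	c->prefix = prefix;
}

/* Return the next entry matching the prefix, or NULL when exhausted. */
static const struct toy_entry *cursor_next(struct index_cursor *c)
{
	size_t plen = c->prefix ? strlen(c->prefix) : 0;

	while (c->pos < c->idx->nr) {
		const struct toy_entry *e = &c->idx->entries[c->pos++];
		if (!plen || !strncmp(e->name, c->prefix, plen))
			return e;
	}
	return NULL;
}
```

A caller would loop with "while ((e = cursor_next(&c)))" and never touch
the array directly. The sketch says nothing about updating the index
while a cursor is live, which is the hard part mentioned above.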
-- 
Duy


[PATCH] gitweb: URL-decode $my_url/$my_uri when stripping PATH_INFO

2012-08-08 Thread Jay Soffian
When gitweb is used as a DirectoryIndex, it attempts to strip
PATH_INFO on its own, as $cgi->url() fails to do so.

However, it fails to account for the fact that PATH_INFO has
already been URL-decoded by the web server, but the value
returned by $cgi->url() has not been. This causes the stripping
to fail whenever the URL contains encoded characters.

To see this in action, set up gitweb as a DirectoryIndex and
then use it on a repository with a directory containing a
space in the name. Navigate to tree view, examine the gitweb
generated html and you'll see a link such as:

  <a href="/test.git/tree/HEAD:/directory with spaces">directory with spaces</a>

When clicked on, the browser will URL-encode this link, giving
a $cgi->url() of the form:

   /test.git/tree/HEAD:/directory%20with%20spaces

While PATH_INFO is:

   /test.git/tree/HEAD:/directory with spaces

Fix this by calling unescape() on both $my_url and $my_uri before
stripping PATH_INFO from them.

Signed-off-by: Jay Soffian jaysoff...@gmail.com
---
 gitweb/gitweb.perl | 5 +
 1 file changed, 5 insertions(+)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 3d6a705388..7f8c1878d4 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -54,6 +54,11 @@ sub evaluate_uri {
 	# to build the base URL ourselves:
 	our $path_info = decode_utf8($ENV{PATH_INFO});
 	if ($path_info) {
+		# $path_info has already been URL-decoded by the web server, but
+		# $my_url and $my_uri have not. URL-decode them so we can properly
+		# strip $path_info.
+		$my_url = unescape($my_url);
+		$my_uri = unescape($my_uri);
 		if ($my_url =~ s,\Q$path_info\E$,, &&
 		    $my_uri =~ s,\Q$path_info\E$,, &&
 		    defined $ENV{'SCRIPT_NAME'}) {
-- 
1.7.11.3
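
The failure mode and the fix can be illustrated outside of gitweb with a
small sketch in C. This is not the gitweb implementation: url_unescape()
and strip_suffix_inplace() are made-up helpers, the decoder handles only
%XX escapes, and the host name used when trying it out is a placeholder.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Decode %XX escapes in place; all other bytes are copied unchanged. */
static void url_unescape(char *s)
{
	char *out = s;

	while (*s) {
		if (s[0] == '%' && isxdigit((unsigned char)s[1]) &&
		    isxdigit((unsigned char)s[2])) {
			char hex[3] = { s[1], s[2], '\0' };
			*out++ = (char)strtol(hex, NULL, 16);
			s += 3;
		} else {
			*out++ = *s++;
		}
	}
	*out = '\0';
}

/* Cut suffix off the end of s if it is there; return 1 if it was cut. */
static int strip_suffix_inplace(char *s, const char *suffix)
{
	size_t len = strlen(s), slen = strlen(suffix);

	if (slen > len || memcmp(s + len - slen, suffix, slen))
		return 0;
	s[len - slen] = '\0';
	return 1;
}
```

With a URL ending in "/directory%20with%20spaces" and a PATH_INFO ending
in "/directory with spaces", strip_suffix_inplace() finds nothing to strip
until url_unescape() has been applied to the URL, which mirrors what the
patch does with unescape() before the s,\Q$path_info\E$,, substitutions.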



Re: Sync production with Git

2012-08-08 Thread demerphq
On 8 August 2012 15:11, kiranpyati kiran.py...@infobeans.com wrote:
 I am new to GitHub.

 Until now we uploaded files to production manually over FTP, even though
 git was present on the production server. As a result, git status there
 shows many modified and untracked files.

 To bring git in sync, we downloaded all files from production and
 committed them. The repository now has the same files as production.

 We have not pulled on production for the last 6 months, which is why it
 shows modified and untracked files.

 If we pull on production now, there is a 100% chance of conflicts on all
 the modified files, since hundreds of files have been modified over the
 last six months. Git pull will report a conflict for every one of them. In
 that case the site would go down, and we cannot afford that.

 We want a way to seamlessly sync production and Git.

 Can anybody please help me on this?

 Thanks in advance..!!

Try git-deploy.

https://github.com/git-deploy

It provides full workflow management for handling rollouts from git.

Yves



-- 
perl -Mre=debug -e "/just|another|perl|hacker/"


Re: Sync production with Git

2012-08-08 Thread demerphq
On 9 August 2012 06:21, demerphq demer...@gmail.com wrote:
 On 8 August 2012 15:11, kiranpyati kiran.py...@infobeans.com wrote:
 I am new to GitHub.

 Until now we uploaded files to production manually over FTP, even though
 git was present on the production server. As a result, git status there
 shows many modified and untracked files.

 To bring git in sync, we downloaded all files from production and
 committed them. The repository now has the same files as production.

 We have not pulled on production for the last 6 months, which is why it
 shows modified and untracked files.

 If we pull on production now, there is a 100% chance of conflicts on all
 the modified files, since hundreds of files have been modified over the
 last six months. Git pull will report a conflict for every one of them. In
 that case the site would go down, and we cannot afford that.

 We want a way to seamlessly sync production and Git.

 Can anybody please help me on this?

 Thanks in advance..!!

 Try git-deploy.

 https://github.com/git-deploy

 It provides full workflow management for handling rollouts from git.

Better link:

https://github.com/git-deploy/git-deploy

Yves


-- 
perl -Mre=debug -e "/just|another|perl|hacker/"