Re: [PATCH v2 00/21] Support multiple worktrees

2013-12-22 Thread Duy Nguyen
On Sun, Dec 22, 2013 at 1:38 PM, Junio C Hamano gits...@pobox.com wrote:
 Duy Nguyen pclo...@gmail.com writes:

 I am not happy with the choice of main/HEAD that would squat on a
 good name for remote-tracking branch (i.e. s/origin/main/), though.
 $GIT_DIR/COMMON_HEAD perhaps?

 It's not just about HEAD. Anything worktree-specific of the main
 checkout can be accessed this way, e.g. main/index,
 main/FETCH_HEAD and it's not exactly common because it's
 worktree info. Maybe 1ST_ as the prefix (e.g. 1ST_HEAD, 1ST_index...)
 ?

 Do we even need to expose them as ref-like things as a part of the
 external API/UI in the first place?  For end-user scripts that want
 to operate in a real or borrowing worktree, there should be no
 reason to touch this other repository directly.  Things like if
 one of the wortrees tries to check out a branch that is already
 checked out elsewhere, error out policy may need to consult the
 other worktrees via $GIT_COMMON_DIR mechanism but at that level we
 have all the control without contaminating end-user facing ref
 namespace in a way main/FETCH_HEAD... do.

No, external API/UI is just extra bonus. We need to (or should) do so
in order to handle $GIT_COMMON_DIR/HEAD exactly like how we do normal
refs. Given any ref, git_path(ref) gives the path to that ref,
git_path(logs/%s, ref) gives the path of its reflog. By mapping
special names to real paths behind git_path(), We can feed
$GIT_COMMON_DIR/HEAD (under special names) to refs.c and it'll handle
correctly without any changes for special cases.

 You said

 This makes it possible for a linked checkout to detach HEAD of
 the main one.

 but I think that is misguided---it just makes it easier to confuse
 users, if done automatically and without any policy knob. It instead
 should error out, while saying which worktree has the branch in
 question checked out. After all, checking out a branch that is
 checked out in another worktree (not the common one) needs the same
 caution, so main/HEAD is not the only special one.

 And if your updated git checkout 'frotz' gives a clear report of
 which worktree has the branch 'frotz' checked out, the user can go
 there to detach the HEAD in that worktree to detach with

 git -C $the_other_one checkout HEAD^0

 if he chooses to.

Jonathan mentions about the checkout in portable device case that
would make the above a bit unnatural as you just can't cd there (git
update-ref still works).

I don't see any problems with checking out a branch multiple times. I
may want to try modifying something in the branch that will be thrown
away in the end. It's when the user updates a branch that we should
either error+reject or detach other checkouts. I guess it's up to the
user to decide which way they want. The error+reject way may make the
user hunt through dead checkouts waiting to be pruned. But we can
start with error+reject then add an option to auto-detach.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/POC 3/7] setup.c: add split-repo support to .git files

2013-12-22 Thread Duy Nguyen
On Sat, Dec 14, 2013 at 3:43 AM, Jonathan Nieder jrnie...@gmail.com wrote:
 Problems:

  * What if I move my worktree with mv?  Then I still need the
corresponding $GIT_SUPER_DIR/repos/id directory, and nobody told
the GIT_SUPER_DIR about it.

  * What if my worktree is on removable media (think network
filesystem) and has just been temporarily unmounted instead of
deleted?

 So maybe it would be nicer to:

   i. When the worktree is on the same filesystem, keep a *hard link* to
  some file in the worktree (e.g., the .git file).  If the link count
  goes down, it is safe to remove the $GIT_SUPER_DIR/repos/id
  directory.

Link count goes down to 1 if I move the worktree to a different
filesystem and it's not safe to remove $GIT_SUPER_DIR/repos/id in
this case, I think.

  ii. When the worktree is on another filesystem, always keep
  $GIT_SUPER_DIR/repos/id unless the user decides to manually
  remove it.  Provide documentation or a command to list basic
  information about $GIT_SUPER_DIR/repos directories (path to
  worktree, what branch they're on, etc).

 (i) doesn't require any futureproofing.  As soon as someone wants it,
 they can implement the check and fall back to (ii) for worktrees
 without the magic hard link.

 (ii) would benefit from recording the working tree directory as a
 possibly unreliable, human-friendly reminder.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] add: don't complain when adding empty project root

2013-12-23 Thread Duy Nguyen
On Tue, Dec 24, 2013 at 12:48 AM, Torsten Bögershausen tbo...@web.de wrote:
 +test_expect_success 'git add -A on empty repo does not error out' '
 + git init empty  ( cd empty  git add -A . )
 +'
 +
  test_done

 I am (a little bit) confused.

 This is what git does:
  rm -rf test  mkdir test  cd test  git init  touch A  mkdir D  cd 
 D  touch B  git add .  git status
 Initialized empty Git repository in /Users/tb/test/test/.git/
 On branch master

 Initial commit

 Changes to be committed:
   (use git rm --cached file... to unstage)

 new file:   B

 Untracked files:
   (use git add file... to include in what will be committed)

 ../A
 
 And the behaviour is in line with
 https://www.kernel.org/pub/software/scm/git/docs/git-add.html

 . stands for the current directory somewhere in the worktree,
 not only the project root.

Yes, except in this case . is project root because current dir is. I
could have done git add -A (without the dot) like reported, but that
will be deprecated soon. Another way to make it clear about project
root is use git add -A :/. I'll send an update if it makes it
clearer.

 -

 Could it make sense to mention that replace
 [PATCH] add: don't complain when adding empty project root
 with
 [PATCH] add: don't complain when adding empty directory.

We don't complain about adding an empty directory before or after this patch.


 (and similar in the commit message)
 /Torsten




-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] add: don't complain when adding empty project root

2013-12-24 Thread Duy Nguyen
On Wed, Dec 25, 2013 at 4:48 AM, Torsten Bögershausen tbo...@web.de wrote:
 On 2013-12-24 00.46, Duy Nguyen wrote:

 [snip]
 We don't complain about adding an empty directory before or after this patch.
 Ok, thanks for the explanation.

 I think that
 https://www.kernel.org/pub/software/scm/git/docs/git-add.html
 could deserve an update.

 My understanding is that filepattern is related to $GIT_DIR,
 but . is the current directory.

 I will be happy to write a patch,
 (or to review one, whatever comes first)
 /Torsten

filepattern is related to current directory too (e.g. *.sh from t
won't cover git-rebase.sh, :/*.sh does). Yes a patch to update
git-add.txt to use the term pathspec instead of filepattern would
be nice. A pointer to pathspec glossary could help discover
case-insensitive matching, negative matching..
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 00/21] Support multiple worktrees

2013-12-27 Thread Duy Nguyen
On Fri, Dec 27, 2013 at 12:12 AM, Junio C Hamano gits...@pobox.com wrote:
 Duy Nguyen pclo...@gmail.com writes:

 On Sun, Dec 22, 2013 at 1:38 PM, Junio C Hamano gits...@pobox.com wrote:

 Do we even need to expose them as ref-like things as a part of the
 external API/UI in the first place?  For end-user scripts that want
 to operate in a real or borrowing worktree, there should be no
 reason to touch this other repository directly.  Things like if
 one of the wortrees tries to check out a branch that is already
 checked out elsewhere, error out policy may need to consult the
 other worktrees via $GIT_COMMON_DIR mechanism but at that level we
 have all the control without contaminating end-user facing ref
 namespace in a way main/FETCH_HEAD... do.

 No, external API/UI is just extra bonus. We need to (or should) do so
 in order to handle $GIT_COMMON_DIR/HEAD exactly like how we do normal
 refs.

 And that is what I consider a confusion-trigger, not a nice bonus.

 I do not think it is particularly a good idea to contaminate
 end-user namespace for this kind of peek another repository
 special operation.

 In order to handle your multiple worktrees manipulating the same
 branch case sanely, you need to be aware of not just the real
 repository your worktree is borrowing from, but also _all_ the other
 worktrees that borrow from that same real repository, so a single
 main virtual namespace will not cut it. You will be dealing with
 an unbounded set of HEADs, one for each such worktree.

Yes. My problem is, while all secondary worktrees are in
$GIT_DIR/repos and their HEADs can be accessed there with
repos/xxx/HEAD, the first worktree's HEAD can't be accessed this way
because HEAD in a linked checkouts means repos/my worktree/HEAD.

 Can't we do this by adding a with this real repository layer,
 e.g. resolve HEAD wrt that repository, somewhat similar to how we
 peek into submodule namespaces?

Hmm.. never thought of it like a submodule. Thanks for the idea.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] gc: notice gc processes run by other users

2013-12-31 Thread Duy Nguyen
On Tue, Dec 31, 2013 at 7:07 PM, Kyle J. McKay mack...@gmail.com wrote:
 Since 64a99eb4 git gc refuses to run without the --force option if
 another gc process on the same repository is already running.

 However, if the repository is shared and user A runs git gc on the
 repository and while that gc is still running user B runs git gc on
 the same repository the gc process run by user A will not be noticed
 and the gc run by user B will go ahead and run.

 The problem is that the kill(pid, 0) test fails with an EPERM error
 since user B is not allowed to signal processes owned by user A
 (unless user B is root).

 Update the test to recognize an EPERM error as meaning the process
 exists and another gc should not be run (unless --force is given).

Ack. Looking at kill(2) the other errors are EINVAL and ESRCH, which
are fine to ignore.

 ---

 I suggest this be included in maint as others may also have expected the
 shared repository, different user gc scenario to be caught by the new
 code when in fact it's not without this patch.

  builtin/gc.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/builtin/gc.c b/builtin/gc.c
 index c14190f8..25f2237c 100644
 --- a/builtin/gc.c
 +++ b/builtin/gc.c
 @@ -222,7 +222,7 @@ static const char *lock_repo_for_gc(int force, pid_t* 
 ret_pid)
 time(NULL) - st.st_mtime = 12 * 3600 
 fscanf(fp, %PRIuMAX %127c, pid, locking_host) == 
 2 
 /* be gentle to concurrent gc on remote hosts */
 -   (strcmp(locking_host, my_host) || !kill(pid, 0));
 +   (strcmp(locking_host, my_host) || !kill(pid, 0) || 
 errno == EPERM);
 if (fp != NULL)
 fclose(fp);
 if (should_exit) {
 --
 1.8.5.2




-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] shallow: Remove unused code

2014-01-05 Thread Duy Nguyen
On Mon, Jan 6, 2014 at 7:00 AM, Ramsay Jones ram...@ramsay1.demon.co.uk wrote:

 Commit 58babfff (shallow.c: the 8 steps to select new commits for
 .git/shallow, 05-12-2013) added a function to implement step 5 of
 the quoted eight steps, namely 'remove_nonexistent_ours_in_pack()'.
 This function implements an optional optimization step in the new
 shallow commit selection algorithm. However, this function has no
 callers. (The commented out call sites would need to change, in
 order to provide information required by the function.)

 Signed-off-by: Ramsay Jones ram...@ramsay1.demon.co.uk
 ---

 Hi Junio,

 This should, perhaps, be marked as RFC; if the patch to actually
 use this function is coming soon, then this is not needed.
 However, I suspect it could be slow in coming ... :-P

I'm not against this patch. I think the function is to demonstrate
what step 5 is, which is already in the history (unless there are
major errors that I'll need to resend the whole series). I may do the
real step 5 differently anyway to fit better how index-pack and
unpack-objects are run. Not fully realized yet.


 ATB,
 Ramsay Jones

  builtin/receive-pack.c |  1 -
  commit.h   |  2 --
  fetch-pack.c   |  1 -
  shallow.c  | 16 
  4 files changed, 20 deletions(-)

 diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
 index de869b5..85bba35 100644
 --- a/builtin/receive-pack.c
 +++ b/builtin/receive-pack.c
 @@ -1059,7 +1059,6 @@ static void update_shallow_info(struct command 
 *commands,
 struct command *cmd;
 int *ref_status;
 remove_nonexistent_theirs_shallow(si);
 -   /* XXX remove_nonexistent_ours_in_pack() */
 if (!si-nr_ours  !si-nr_theirs) {
 shallow_update = 0;
 return;
 diff --git a/commit.h b/commit.h
 index 5507460..30598c6 100644
 --- a/commit.h
 +++ b/commit.h
 @@ -230,8 +230,6 @@ struct shallow_info {
  extern void prepare_shallow_info(struct shallow_info *, struct sha1_array *);
  extern void clear_shallow_info(struct shallow_info *);
  extern void remove_nonexistent_theirs_shallow(struct shallow_info *);
 -extern void remove_nonexistent_ours_in_pack(struct shallow_info *,
 -   struct packed_git *);
  extern void assign_shallow_commits_to_refs(struct shallow_info *info,
uint32_t **used,
int *ref_status);
 diff --git a/fetch-pack.c b/fetch-pack.c
 index 1d0328d..d52de74 100644
 --- a/fetch-pack.c
 +++ b/fetch-pack.c
 @@ -984,7 +984,6 @@ static void update_shallow(struct fetch_pack_args *args,
 return;

 remove_nonexistent_theirs_shallow(si);
 -   /* XXX remove_nonexistent_ours_in_pack() */
 if (!si-nr_ours  !si-nr_theirs)
 return;
 for (i = 0; i  nr_sought; i++)
 diff --git a/shallow.c b/shallow.c
 index f48f732..bbc98b5 100644
 --- a/shallow.c
 +++ b/shallow.c
 @@ -358,22 +358,6 @@ void remove_nonexistent_theirs_shallow(struct 
 shallow_info *info)
 info-nr_theirs = dst;
  }

 -/* Step 5, remove non-existent ones in ours in the pack */
 -void remove_nonexistent_ours_in_pack(struct shallow_info *info,
 -struct packed_git *p)
 -{
 -   unsigned char (*sha1)[20] = info-shallow-sha1;
 -   int i, dst;
 -   trace_printf_key(TRACE_KEY, shallow: 
 remove_nonexistent_ours_in_pack\n);
 -   for (i = dst = 0; i  info-nr_ours; i++) {
 -   if (i != dst)
 -   info-ours[dst] = info-ours[i];
 -   if (find_pack_entry_one(sha1[info-ours[i]], p))
 -   dst++;
 -   }
 -   info-nr_ours = dst;
 -}
 -
  define_commit_slab(ref_bitmap, uint32_t *);

  struct paint_info {
 --
 1.8.5



-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/6] read-cache: new extension to mark what file is watched

2014-01-13 Thread Duy Nguyen
On Tue, Jan 14, 2014 at 12:02 AM, Jonathan Nieder jrnie...@gmail.com wrote:
 Hi,

 Nguyễn Thái Ngọc Duy wrote:

 If an entry is watched, git lets an external program decide if the
 entry is modified or not. It's more like --assume-unchanged, but
 designed to be controlled by machine.

 We are running out of on-disk ce_flags, so instead of extending
 on-disk entry format again, watched flags are in-core only and
 stored as extension instead.

 Makes sense.

 Care to add a brief description of the on-disk format for
 Documetnation/technical/index-format.txt as well?

Sure, in the reroll after I fix inotify bugs that make the test suite fail.

 +static void read_watch_extension(struct index_state *istate, uint8_t *data,
 +  unsigned long sz)
 +{
 + int i;
 + if ((istate-cache_nr + 7) / 8 != sz) {
 + error(invalid 'WATC' extension);
 + return;
 + }
 + for (i = 0; i  istate-cache_nr; i++)
 + if (data[i / 8]  (1  (i % 8)))
 + istate-cache[i]-ce_flags |= CE_WATCHED;
 +}

 So the WATC section has one bit per index entry, encoding whether that
 entry is WATCHED.  Makes sense.

 Do I understand correctly that this patch just takes care of the
 bookkeeping for the CE_WATCHED bit and the actual semantics will
 come in a later patch?

Correct.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/6] read-cache: new extension to mark what file is watched

2014-01-13 Thread Duy Nguyen
On Sun, Jan 12, 2014 at 6:03 PM, Nguyễn Thái Ngọc Duy pclo...@gmail.com wrote:
 We are running out of on-disk ce_flags,

Correction, we're not. I saw

/*
 * Extended on-disk flags
 */
#define CE_INTENT_TO_ADD (1  29)
#define CE_SKIP_WORKTREE (1  30)

followed by

/* CE_EXTENDED2 is for future extension */
#define CE_EXTENDED2 (1  31)

and panicked, but on-disk flags could be added backward (e.g. bit 28,
27...). Anyway using extended flags means 2 extra bytes per entry for
almost every entry in this case (and for index v5 it means redoing
crc32 for almost every entry too when the bit is updated) so it may
still be a good idea to keep the new flag separate.

 so instead of extending
 on-disk entry format again, watched flags are in-core only and
 stored as extension instead.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Error logging for git over ssh?

2014-01-14 Thread Duy Nguyen
On Wed, Jan 15, 2014 at 1:44 AM, Martin Langhoff
martin.langh...@gmail.com wrote:
 Diagnosing errors with git over ssh has historically required tooling
 up for debugging or looking at things from the client side, because
 git does not log anything on the server side.

 It would be a boon to those running busy git servers to be able to
 diagnose by looking at a log. It can be both old-fashioned, or very
 modern (if you are using journald).

Maybe we could put a pager program in front of stderr and let it
does the logging? We'll need to output the error side bands to stderr
too in case side-band is used.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] git symbolic-ref does not recognize @ as the shortcut for HEAD

2014-01-15 Thread Duy Nguyen
On Thu, Jan 9, 2014 at 10:05 PM, Mathieu Lemoine math...@mlemoine.name wrote:
 Hello,

 In https://raw.github.com/git/git/master/Documentation/RelNotes/1.8.5.txt
 is mentioned:

  * Instead of typing four capital letters HEAD, you can say @ now,
 e.g. git log @.

 However, `git symbolic-ref @`  gives fatal: No such ref: @, while
 `git symbolic-ref @`  gives refs/heads/master.

you meant while `git symbolic-ref _HEAD_` gives refs/heads/master?

 I looked around in the archive and #git, but nobody seemed to be aware
 of the behaviour.

 I wonder if it's on purpose given the low level of symbolic-ref or if
 it's a bug.

I think this is because symbolic-ref (and show-ref) takes plain refs,
either in full or short form, but nothing else fancier. I'm not sure
if we should add support for @ (and maybe @{u} as it's in the same
boat) to these commands. If it's confusing and hard to explain in
documents then maybe we should.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Diagnosing stray/stale .keep files -- explore what is in a pack?

2014-01-15 Thread Duy Nguyen
On Thu, Jan 16, 2014 at 6:50 AM, Martin Langhoff
martin.langh...@gmail.com wrote:
 On Wed, Jan 15, 2014 at 12:49 PM, Junio C Hamano gits...@pobox.com wrote:
 As long as we can reliably determine that it is safe to do so
 without risking races, automatically cleaning .lock files is a good
 thing to do.

 If the .lock file is a day old, it seems to me that it should be safe
 to call it stale.

Perhaps report those stale locks (and stale .keep files as well if you
can detct them) as garbage in count-objects too.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] .gitignore: Ignore editor backup and swap files

2014-01-16 Thread Duy Nguyen
On Fri, Jan 17, 2014 at 5:43 AM, Jonathan Nieder jrnie...@gmail.com wrote:
 Subject: gitignore doc: add global gitignore to synopsis

 The gitignore(5) manpage already documents $XDG_CONFIG_HOME/git/ignore
 but it is easy to forget that it exists.  Add a reminder to the
 synopsis.

Yes! I knew about the xdg thing but was not aware about
$xdg/git/ignore. No more updating .git/info/exclude on every newly
cloned repo..
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/WIP v2 03/14] read-cache: connect to file watcher

2014-01-17 Thread Duy Nguyen
On Fri, Jan 17, 2014 at 10:24 PM, Torsten Bögershausen tbo...@web.de wrote:
 On 2014-01-17 10.47, Nguyễn Thái Ngọc Duy wrote:
 [snip[
 diff --git a/file-watcher-lib.c b/file-watcher-lib.c


 +int connect_watcher(const char *path)
 Could it be worth to check if we can use some code from unix-socket.c ?

 Especially important could be that unix_sockaddr_init() wotks around a problem
 when long path names are used.


Thanks! I did not even know about unix-socket.c. Well, I never paid
attention to credential-cache.c :(
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/WIP v2 00/14] inotify support

2014-01-19 Thread Duy Nguyen
On Mon, Jan 20, 2014 at 12:04 AM, Thomas Rast t...@thomasrast.ch wrote:
 I never got the last three patches, did you send them?

 Also, this doesn't cleanly apply anywhere at my end.  Can you push it
 somewhere for easier experimentation?

Sorry I rebased it on top of kb/fast-hashmap but never got around to
actually using hash tables. The code is here (and on top of master)

https://github.com/pclouds/git.git file-watcher

 So I think the watcher protocol can be specified roughly as:

   Definitions: The watcher is a separate process that maintains a set of
   _watched paths_.  Git uses the commands below to add or remove paths
   from this set.

   In addition it maintains a set of _changed paths_, which is a subset
   of the watched paths.  Git occasionally queries the changed paths.  If
   at any time after a path P was added to the watched set, P has or
   may have changed, it MUST be added to the changed set.

   Note that a path being unchanged under this definition does NOT mean
   that it is unchanged in the index as per `git diff`.  It may have been
   changed before the watch was issued!  Therefore, Git MUST test whether
   the path started out unchanged using lstat() AFTER it was added to the
   watched set.

Correct.

   Handshake::
 On connecting, git sends hello magic.  magic is an opaque
 string that identifies the state of the index.  The watcher MUST
 respond with hello previous_magic, where previous_magic is
 obtained from the bye command (below).  If magic !=
 previous_magic, the watcher MUST reset state for this repository.

In addition, git must reset itself (i.e. clear all CE_WATCHED flags)
because nothing can be trusted at this point.

...

 Did I get that approximately straight?  Perhaps you can fix up as needed
 and then turn it into the documentation for the protocol.

Will do (and probably steal some of your text above).

 There are several points about this that I don't like:

 * What does the datagram socket buy us?  You already do a lot of
   pkt-line work for the really big messages, so why not just use
   pkt-line everywhere and a streaming socket?

With datagram sockets I did not have to maintain the connected
sockets, which made it somewhat simpler to handle so far.

The default SO_SNDBUF goes up to 212k, so we can send up to that
amount without blocking. With current pkt-line we send up to 64k in
watch message then we have to wait for fine, which results in more
context switches. But I think we can extend pkt-line's length field to
5 hex digit to cover this.

Streaming sockets are probably the way to go for per-user daemon, so
we can identify a socket with a repo.

 * The handshake should have room for capabilities, so that we can extend
   the protocol.

Yeah. One more point for streaming sockets. With datagram sockets it's
harder to define a session and thus hard to add capabilities.

 * The hello/hello handshake is confusing, and probably would allow you
   to point two watchers at each other without them noticing.  Make it
   hello/ready, or some other unambiguous choice.

OK

 * I took some liberty and tried to fully define the transitions between
   the sets.  So even though I'm not sure this is currently handled, I
   made it well-defined to issue watch for a path that is in the
   changed set.

Yes that should avoid races. The path can be removed from watched
set later after git acknowledges it.

 * bye is confusing, because in practice git issues forgets after
   bye.  The best I can come up with is setstate, I'm sure you have
   better ideas.

Originally it was forget, forget, forget then bye. But with
that order, if git crashes before sending bye we could lose info in
changed set so the order was changed but I did not update the
command name.

 There's also the problem of ordering guarantees between the socket and
 inotify.  I haven't found any, so I would conservatively assume that the
 socket messages may in fact arrive before inotify, which is a race in
 the current code.  E.g., in the sequence 'touch foo; git status' the
 daemon may see

   socketinotify
hello...
status
new empty list
 touch foo

 I think a clever way to handle this would be to add a new command:

   Wait::
 This command serves synchronization.  Git creates a file of its
 choice in $GIT_DIR/watch (say, `.git/watch/wait.random`).  Then it
 sends wait path.  The watcher MUST block until it has processed
 all change notifications up to and including path.

So wait.random inotify event functions as a barrier. Nice.

 As far as inotify corner-cases go, the only one I'm aware of is
 directory renames.  I suspect we'll have to watch directories all the
 way up to the repository root to reliably detect when this happens.  Not
 sure how to best handle this.  Perhaps we should declare Git completely
 agnostic wrt such issues, and behind the scenes issue all watches up to
 the root even if we don't need 

Re: [PATCH/WIP v2 08/14] read-cache: add GIT_TEST_FORCE_WATCHER for testing

2014-01-19 Thread Duy Nguyen
On Mon, Jan 20, 2014 at 12:04 AM, Thomas Rast t...@thomasrast.ch wrote:
 Nguyễn Thái Ngọc Duy pclo...@gmail.com writes:

 This can be used to force watcher on when running the test
 suite.

 git-file-watcher processes are not automatically cleaned up after each
 test. So after running the test suite you'll be left with plenty
 git-file-watcher processes that should all end after about a minute.

 Probably not a very good idea, especially in noninteractive use?  E.g.,
 a bisection through the test suite or parallel test runs on different
 commits may exhaust the available processes and/or memory.

I think we run out of inotify resources before hitting process/memory
limits. At least that's the case when running tests in parallel.

 Each test should make an effort to clean up all watchers before
 terminating.

For now it's hard to do this correctly. Maybe once we get the
multi-repo file watcher, we could launch one watcher per trash
directory and cleaning up would be easier.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/WIP v2 05/14] read-cache: put some limits on file watching

2014-01-19 Thread Duy Nguyen
On Mon, Jan 20, 2014 at 12:06 AM, Thomas Rast t...@thomasrast.ch wrote:
 Nguyễn Thái Ngọc Duy pclo...@gmail.com writes:

 watch_entries() is a lot of computation and could trigger a lot more
 lookups in file-watcher. Normally after the first set of watches are
 in place, we do not need to update often. Moreover if the number of
 entries is small, the overhead of file watcher may actually slow git
 down.

 This patch only allows to update watches if the number of watchable
 files is over a limit (and there are new files added if this is not
 the first time). Measurements on Core i5-2520M and Linux 3.7.6, about
 920 lstat() take 1ms. Somewhere between 2^16 and 2^17 lstat calls that
 it starts to take longer than 100ms. 2^16 is chosen at the minimum
 limit to start using file watcher.

 Recently updated files are not considered watchable because they are
 likely to be updated again soon, not worth the ping-pong game with
 file watcher. The default limit 30min is just a random value.

 But then a fresh clone of a big repository would not get any benefit
 from the watcher?

 Not yet sure how to best handle this.

Gaahh, perhaps limit the number of unwatchable recent files to a
hundred or so in addition to time limit.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/WIP v2 02/14] read-cache: new extension to mark what file is watched

2014-01-19 Thread Duy Nguyen
On Mon, Jan 20, 2014 at 12:06 AM, Thomas Rast t...@thomasrast.ch wrote:
 Nguyễn Thái Ngọc Duy pclo...@gmail.com writes:

 If an entry is watched, git lets an external program decide if the
 entry is modified or not. It's more like --assume-unchanged, but
 designed to be controlled by machine.

 We are running out of on-disk ce_flags, so instead of extending
 on-disk entry format again, watched flags are in-core only and
 stored as extension instead.

 I wonder if this would be a good use-case for EWAH bitmaps?  Presumably
 most users would end up having only a few large ranges of files that are
 being watched.  Quite possibly most users would watch *all* files.

Oh yeah. I edited my commit message locally to this a while ago

On webkit.git with
182k entries, that's 364k more to be SHA-1'd on current index
versions, compared to 22k in this format (and even less when
jk/pack-bitmap ewah graduates and we can use ewah compression)
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] list-objects: only look at cmdline trees with edge_hint

2014-01-20 Thread Duy Nguyen
On Tue, Jan 21, 2014 at 4:32 AM, Jeff King p...@peff.net wrote:
 This patch therefore ties the extra tree examination to the
 revs-edge_hint flag; it is the presence of that flag that
 makes the tradeoff worthwhile.

 Here is output from the p0001-rev-list showing the
 improvement in performance:

 Test HEAD^ HEAD
 -
 0001.1: rev-list --all   0.69(0.65+0.02)   
 0.69(0.66+0.02) +0.0%
 0001.2: rev-list --all --objects 3.22(3.19+0.03)   
 3.23(3.20+0.03) +0.3%
 0001.4: rev-list $commit --not --all 0.04(0.04+0.00)   
 0.04(0.04+0.00) +0.0%
 0001.5: rev-list --objects $commit --not --all   0.27(0.26+0.01)   
 0.04(0.04+0.00) -85.2%

You must have so much fun (or headache, depending on your view) with
the variety of repos on github :)

 diff --git a/list-objects.c b/list-objects.c
 index 6cbedf0..43ce1d9 100644
 --- a/list-objects.c
 +++ b/list-objects.c
 @@ -162,15 +162,17 @@ void mark_edges_uninteresting(struct rev_info *revs, 
 show_edge_fn show_edge)
 }
 mark_edge_parents_uninteresting(commit, revs, show_edge);
 }
 -   for (i = 0; i  revs-cmdline.nr; i++) {
 -   struct object *obj = revs-cmdline.rev[i].item;
 -   struct commit *commit = (struct commit *)obj;
 -   if (obj-type != OBJ_COMMIT || !(obj-flags  UNINTERESTING))
 -   continue;
 -   mark_tree_uninteresting(commit-tree);
 -   if (revs-edge_hint  !(obj-flags  SHOWN)) {
 -   obj-flags |= SHOWN;
 -   show_edge(commit);
 +   if (revs-edge_hint) {
 +   for (i = 0; i  revs-cmdline.nr; i++) {
 +   struct object *obj = revs-cmdline.rev[i].item;
 +   struct commit *commit = (struct commit *)obj;
 +   if (obj-type != OBJ_COMMIT || !(obj-flags  
 UNINTERESTING))
 +   continue;
 +   mark_tree_uninteresting(commit-tree);
 +   if (revs-edge_hint  !(obj-flags  SHOWN)) {

Not really important, but perhaps remove revs-edge_hint here because
it's already checked?

 +   obj-flags |= SHOWN;
 +   show_edge(commit);
 +   }
 }
 }
  }
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Anomalous AUTHOR and DOCUMENTATION sections in manpages

2014-01-22 Thread Duy Nguyen
On Wed, Jan 22, 2014 at 6:22 PM, Michael Haggerty mhag...@alum.mit.edu wrote:
 These sections are inconsistent with the other manpages and seem
 superfluous in a project that has, on the one hand, a public history
 and, on the other hand, hundreds of contributors.  Would the mentioned
 authors (CCed) consent to the removal of these sections?

No problem from me.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: problematic git log submodule-dir/

2014-01-22 Thread Duy Nguyen
On Thu, Jan 23, 2014 at 3:35 AM, Jens Lehmann jens.lehm...@web.de wrote:
 Am 20.01.2014 19:25, schrieb Paweł Sikora:
 i've noticed that 'git log submodule-dir' and 'git log submodule-dir/'
 return different results (module's head changes vs. nothing). is it a bug?
 looks like a trailing slash is a problem for git-log.

 I think this is a bug, and for example git diff has similar problems.
 This is especially nasty as shell auto-completion adds the '/' at the
 end.

 Duy, without having looked into the code myself yet: is this something
 that might be easily fixed by using PATHSPEC_STRIP_SUBMODULE_SLASH*?

I think so. But I dislike those preprocessing because it looks
inefficient and the same change may be needed for other diff commands
as well. Maybe we can handle this at diff or log-tree.c level. Will
look further into it tonight.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] t4010: match_pathspec_depth() and trailing slash after submodule

2014-01-23 Thread Duy Nguyen
On Fri, Jan 24, 2014 at 4:38 AM, Junio C Hamano gits...@pobox.com wrote:
 Jens Lehmann jens.lehm...@web.de writes:

 ... But a single
 trailing '/' does mark submod as a directory, which I think is
 ok for a submodule. And it makes life easier for the user if we
 accept that, as shell completion will add it there automatically.

 OK, that would be annoying.

 Perhaps the completion is what is broken here, then?  I dunno, and
 haven't thought things through.

My reasoning is more simple minded: if git diff HEAD HEAD^ submod/
works, the user would expect git diff HEAD submod/ to work too. I
think I've got something working with a bit refactoring. Will send out
soon.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 0/9] About the trailing slashes

2014-01-24 Thread Duy Nguyen
On Sat, Jan 25, 2014 at 2:22 AM, Junio C Hamano gits...@pobox.com wrote:
 How does git status '[b]/' behave?

It outputs b/ (like git status b). I think that's because
common_prefix_len() stops looking at the first wildcard char, '['. So
common prefix is empty, just like b.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: bug with git-diff --quiet

2014-01-24 Thread Duy Nguyen
On Thu, Jan 23, 2014 at 11:45:25AM +0900, IWAMOTO Toshihiro wrote:
 I found git-diff --quiet returns a zero exit status even if there's
 a change.  The following sequence reproduces the bug:
 
   $ mkdir foo
   $ cd foo
   $ git init
   $ echo a  a.txt
   $ echo b b.txt
   $ git add ?.txt
   $ git commit
   $ echo b  b.txt
   $ touch a.txt
   $ git diff --quiet; echo $?
   
 Interestingly, if you issue git-diff --quiet again, it returns the
 expected exit status 1.

Because stat info in index is updated and diff_change() won't be
called again on a.txt.

 The problem is in the optimization code in run_diff_files().  The
 function finds a.txt has different stat(2) data from .git/index and
 calls diff_change(), which sets DIFF_OPT_HAS_CHANGES.  As the flag
 makes diff_can_quit_early() return 1, run_diff_files()'s loop finishes
 without calling diff_change() for b.txt.
 
 Then, diffcore_std() examines diff_queued_diff and clears
 DIFF_OPT_HAS_CHANGES, because a.txt is unchanged.
 This is how a change in b.txt is ignored by git-diff --quiet.

Thanks for the analysis. Perhaps we could make diff_change test
whether a.txt is unchanged so it does not set HAS_CHANGES prematurely?
Maybe something like below.

By the time diffcore_skip_stat_unmatch() is called, everything is
cached, so there's not much of performance regression. We still do
memcmp() twice (in diff_filespec_is_identical), but I think that has
less impact than removing diff_can_quit_early().

-- 8 --
diff --git a/diff.c b/diff.c
index 6b4cd0e..5226fc0 100644
--- a/diff.c
+++ b/diff.c
@@ -4697,6 +4697,33 @@ static int diff_filespec_is_identical(struct 
diff_filespec *one,
return !memcmp(one-data, two-data, one-size);
 }
 
+static int diff_filespec_check_stat_unmatch(struct diff_filepair *p)
+{
+   /*
+* 1. Entries that come from stat info dirtiness
+*always have both sides (iow, not create/delete),
+*one side of the object name is unknown, with
+*the same mode and size.  Keep the ones that
+*do not match these criteria.  They have real
+*differences.
+*
+* 2. At this point, the file is known to be modified,
+*with the same mode and size, and the object
+*name of one side is unknown.  Need to inspect
+*the identical contents.
+*/
+   if (!DIFF_FILE_VALID(p-one) || /* (1) */
+   !DIFF_FILE_VALID(p-two) ||
+   (p-one-sha1_valid  p-two-sha1_valid) ||
+   (p-one-mode != p-two-mode) ||
+   diff_populate_filespec(p-one, 1) ||
+   diff_populate_filespec(p-two, 1) ||
+   (p-one-size != p-two-size) ||
+   !diff_filespec_is_identical(p-one, p-two)) /* (2) */
+   return 1;
+   return 0;
+}
+
 static void diffcore_skip_stat_unmatch(struct diff_options *diffopt)
 {
int i;
@@ -4707,27 +4734,7 @@ static void diffcore_skip_stat_unmatch(struct 
diff_options *diffopt)
for (i = 0; i  q-nr; i++) {
struct diff_filepair *p = q-queue[i];
 
-   /*
-* 1. Entries that come from stat info dirtiness
-*always have both sides (iow, not create/delete),
-*one side of the object name is unknown, with
-*the same mode and size.  Keep the ones that
-*do not match these criteria.  They have real
-*differences.
-*
-* 2. At this point, the file is known to be modified,
-*with the same mode and size, and the object
-*name of one side is unknown.  Need to inspect
-*the identical contents.
-*/
-   if (!DIFF_FILE_VALID(p-one) || /* (1) */
-   !DIFF_FILE_VALID(p-two) ||
-   (p-one-sha1_valid  p-two-sha1_valid) ||
-   (p-one-mode != p-two-mode) ||
-   diff_populate_filespec(p-one, 1) ||
-   diff_populate_filespec(p-two, 1) ||
-   (p-one-size != p-two-size) ||
-   !diff_filespec_is_identical(p-one, p-two)) /* (2) */
+   if (diff_filespec_check_stat_unmatch(p))
diff_q(outq, p);
else {
/*
@@ -4890,6 +4897,7 @@ void diff_change(struct diff_options *options,
 unsigned old_dirty_submodule, unsigned new_dirty_submodule)
 {
struct diff_filespec *one, *two;
+   struct diff_filepair *p;
 
if (S_ISGITLINK(old_mode)  S_ISGITLINK(new_mode) 
is_submodule_ignored(concatpath, options))
@@ -4916,10 +4924,17 @@ void diff_change(struct diff_options *options,
fill_filespec(two, new_sha1, new_sha1_valid, new_mode);
one-dirty_submodule = old_dirty_submodule;
two-dirty_submodule = new_dirty_submodule;
+   p = diff_queue(diff_queued_diff, one, two);
 
-   

Re: 'git diff' command doesn't honor utf8 symbol boundaries, and doesn't use actual terminal width

2014-01-24 Thread Duy Nguyen
On Sat, Jan 25, 2014 at 10:34 AM, Yuri y...@rawbw.com wrote:
 The fragment of 'git diff' output looks like this:
 - translationОшибка: адрес %1 содержит запрещенные символы. Пожалуйста,
 перепро�
 + translationОшибка: адрес %1 содержит запрещённые символы. Пожалуйста,
 перепро�

 Two issues here:
 1. Cyrilic text in utf8 got truncated not at utf8 boundary, causing this
 garbage symbol in the end
 2. Truncation is done at somewhat ~70% of the actual screen width. git could
 go all the way to the whole screen with, but it didn't. Shrinking terminal
 width causes the output truncation limit to shrink in the same proportion.
 So some bad decision about truncation size is made somewhere in 'git diff'
 command.

 Suggested behavior:
 1. git should respect utf8, and in case of truncation it should add …
 symbol.
 2. truncation algorithm should read current terminal width and truncate at
 width-1 length to allow for the above-mentioned symbol

I don't think git diff truncates its output (it does truncate @@
lines, but not the real diff). Perhaps your pager did that?
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 6/9] Pass directory indicator to match_pathspec_item()

2014-01-24 Thread Duy Nguyen
On Sat, Jan 25, 2014 at 4:22 AM, Eric Sunshine sunsh...@sunshineco.com wrote:
 On Fri, Jan 24, 2014 at 8:40 AM, Nguyễn Thái Ngọc Duy pclo...@gmail.com 
 wrote:
 diff --git a/t/t4010-diff-pathspec.sh b/t/t4010-diff-pathspec.sh
 index af5134b..167af53 100755
 --- a/t/t4010-diff-pathspec.sh
 +++ b/t/t4010-diff-pathspec.sh
 @@ -110,4 +110,21 @@ test_expect_success 'diff-tree -r with wildcard' '
 test_cmp expected result
  '

 +test_expect_success 'setup submodules' '
 +   test_tick 

Also the test_tick here. I thought I would need to match SHA-1 but I
did not and still forgot to take this line out.

 +   git init submod 
 +   ( cd submod  test_commit first; ) 

 Unnecessary semicolon might confuse the reader into thinking something
 unusual is going on here.

 +   git add submod 
 +   git commit -m first 
 +   ( cd submod  test_commit second; ) 

 Ditto.

 +   git add submod 
 +   git commit -m second
 +'
 +
 +test_expect_success 'diff-cache ignores trailing slash on submodule path' '
 +   git diff --name-only HEAD^ submod expect 
 +   git diff --name-only HEAD^ submod/ actual 
 +   test_cmp expect actual
 +'
 +
  test_done



-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] tree_entry_interesting: match against all pathspecs

2014-01-25 Thread Duy Nguyen
On Sat, Jan 25, 2014 at 10:06:46PM +, Andy Spencer wrote:
 The current basedir compare aborts early in order to avoid futile
 recursive searches. However, a match may still be found by another
 pathspec. This can cause an error while checking out files from a branch
 when using multiple pathspecs:
 
 $ git checkout master -- 'a/*.txt' 'b/*.txt'
 error: pathspec 'a/*.txt' did not match any file(s) known to git.
 
 Signed-off-by: Andy Spencer andy753...@gmail.com
 ---
  tree-walk.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/tree-walk.c b/tree-walk.c
 index 5ece8c3..e06f240 100644
 --- a/tree-walk.c
 +++ b/tree-walk.c
 @@ -743,7 +743,7 @@ match_wildcards:
  
   if (item-nowildcard_len 
   !match_wildcard_base(item, base_str, baselen, matched))
 - return entry_not_interesting;
 + continue;
  
   /*
* Concatenate base and entry-path into one and do

Ack. Perhaps this on top to verify it

-- 8 --
diff --git a/t/t4010-diff-pathspec.sh b/t/t4010-diff-pathspec.sh
index af5134b..d9f37c3 100755
--- a/t/t4010-diff-pathspec.sh
+++ b/t/t4010-diff-pathspec.sh
@@ -110,4 +110,17 @@ test_expect_success 'diff-tree -r with wildcard' '
test_cmp expected result
 '
 
+test_expect_success 'diff multiple wildcard pathspecs' '
+   mkdir path2 
+   echo rezrov path2/file1 
+   git update-index --add path2/file1 
+   tree3=`git write-tree` 
+   git diff --name-only $tree $tree3 -- path2*1 path1*1 actual 
+   cat EOF expect 
+path1/file1
+path2/file1
+EOF
+   test_cmp expect actual
+'
+
 test_done
-- 8 --
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/2] setup: Don't dereference in-tree symlinks for absolute paths

2014-01-26 Thread Duy Nguyen
On Mon, Jan 27, 2014 at 7:07 AM, Martin Erik Werner
martinerikwer...@gmail.com wrote:
 diff --git a/setup.c b/setup.c
 index 5432a31..0789a96 100644
 --- a/setup.c
 +++ b/setup.c
 @@ -22,11 +22,51 @@ char *prefix_path_gently(const char *prefix, int len,
 const char *orig = path;
 char *sanitized;
 if (is_absolute_path(orig)) {
 -   const char *temp = real_path(path);
 -   sanitized = xmalloc(len + strlen(temp) + 1);
 -   strcpy(sanitized, temp);
 +   int i, match;
 +   size_t wtpartlen;
 +   char *npath, *wtpart;
 +   struct string_list list = STRING_LIST_INIT_DUP;
 +   const char *work_tree = get_git_work_tree();
 +   if (!work_tree)
 +   return NULL;
 +   npath = xmalloc(strlen(path) + 1);
 if (remaining_prefix)
 *remaining_prefix = 0;
 +   if (normalize_path_copy_len(npath, path, remaining_prefix)) {
 +   free(npath);
 +   return NULL;
 +   }
 +
 +   string_list_split(list, npath, '/', -1);
 +   wtpart = xmalloc(strlen(npath) + 1);
 +   i = 0;
 +   match = 0;

 +   strcpy(wtpart, list.items[i++].string);
 +   strcat(wtpart, /);
 +   if (strcmp(real_path(wtpart), work_tree) == 0) {
 +   match = 1;
 +   } else {

Could we remove this part and let the while loop handle the first path
component too? The only difference I see is if this code matches, we
have a trailing slash, while the while loop does not have a trailing
slash in wtpart.

 +   while (i  list.nr) {
 +   strcat(wtpart, list.items[i++].string);
 +   if (strcmp(real_path(wtpart), work_tree) == 
 0) {
 +   match = 1;
 +   break;
 +   }
 +   strcat(wtpart, /);
 +   }
 +   }
 +   string_list_clear(list, 0);
 +   if (!match) {
 +   free(npath);
 +   free(wtpart);
 +   return NULL;
 +   }
 +
 +   wtpartlen = strlen(wtpart);
 +   sanitized = xmalloc(strlen(npath) - wtpartlen);
 +   strcpy(sanitized, npath + wtpartlen + 1);

This + 1 is to ignore '/', isn't it? If so we should not do if match
is set 1 outside while loop.

 +   free(npath);
 +   free(wtpart);

All this new code looks long enough to be a separate function with a
meaningful name. And we could travese through each path component in
npath without wtpart (replacing '/' with '\0' to terminate the string
temporarily for real_path()). But it's up to you. Whichever way is
easier to read to you.

 } else {
 sanitized = xmalloc(len + strlen(path) + 1);
 if (len)
 @@ -34,26 +74,10 @@ char *prefix_path_gently(const char *prefix, int len,
 strcpy(sanitized + len, path);
 if (remaining_prefix)
 *remaining_prefix = len;
 -   }
 -   if (normalize_path_copy_len(sanitized, sanitized, remaining_prefix))
 -   goto error_out;
 -   if (is_absolute_path(orig)) {
 -   size_t root_len, len, total;
 -   const char *work_tree = get_git_work_tree();
 -   if (!work_tree)
 -   goto error_out;
 -   len = strlen(work_tree);
 -   root_len = offset_1st_component(work_tree);
 -   total = strlen(sanitized) + 1;
 -   if (strncmp(sanitized, work_tree, len) ||
 -   (len  root_len  sanitized[len] != '\0'  
 sanitized[len] != '/')) {
 -   error_out:
 +   if (normalize_path_copy_len(sanitized, sanitized, 
 remaining_prefix)) {
 free(sanitized);
 return NULL;
 }
 -   if (sanitized[len] == '/')
 -   len++;
 -   memmove(sanitized, sanitized + len, total - len);
 }
 return sanitized;
  }
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git blame vs git log --follow performance

2014-01-26 Thread Duy Nguyen
On Mon, Jan 27, 2014 at 4:10 AM, Joe Perches j...@perches.com wrote:
 For instance (using the Linus' linux kernel git):

 $ time git log --follow -- drivers/firmware/google/Kconfig  /dev/null

 real0m42.329s
 user0m40.984s
 sys 0m0.792s

 $ time git blame -- drivers/firmware/google/Kconfig  /dev/null

 real0m0.963s
 user0m0.860s
 sys 0m0.096s


It's not fair to compare blame and log. If you compare, compare it to
non follow version

$ time git log --follow -- drivers/firmware/google/Kconfig  /dev/null

real0m35.552s
user0m35.120s
sys 0m0.383s

$ time git log -- drivers/firmware/google/Kconfig  /dev/null

real0m4.366s
user0m4.215s
sys 0m0.144s

Although because we need to detect rename, we can't really filter to
one path. So the base line is more like

$ time git log  /dev/null

real0m29.338s
user0m28.485s
sys 0m0.813s

with rename detection taking some more time.

 Perhaps adding a whole-file rename option to the
 git log history simplification mechanism could
 help?

 Thoughts?

I tested a version with rename detection logic removed. It did not
change the timing significantly. To improve --follow I think we need
to do something about path filtering.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git blame vs git log --follow performance

2014-01-26 Thread Duy Nguyen
On Mon, Jan 27, 2014 at 4:10 AM, Joe Perches j...@perches.com wrote:
 Is there something that can be done about improving
 git log --follow -- file performance to be nearly
 equivalent speed to git blame -- file ?

Not strictly about --follow, but there is room for improvement for
diff'ing in log in general. Right now we do diff HEAD HEAD~1, diff
HEAD~1 HEAD~2 and so on (--follow needs diff to detect rename). At
each step we load new tree objects and reparse. Notice after diff
HEAD HEAD~1 we may have HEAD~1 and its subtrees read and parsed
(not entirely). We could reuse that diff HEAD~1 HEAD~2.

On git.git, git log --raw takes 10s and it seems tree object reading
is about 2s.In ideal case we might be able to cut that to 1s. The tree
parsing code (update_tree_entry) takes about 5s. We might be able to
cut that in half, I'm not entirely sure. But there could be a lot of
work in caching HEAD~1 and the overhead may turn out too high for
any gain.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's with git blame --reverse ?

2014-01-27 Thread Duy Nguyen
On Mon, Jan 27, 2014 at 7:45 PM, David Kastrup d...@gnu.org wrote:

 The git blame manual page talks about using git blame --reverse to
 figure out when a particular change disappeared, but I cannot make it
 produce anything useful regardless of what range I give it.  Using
 --root delivers a different state of uselessness.

 Can anyone give a recipe for using git blame --reverse on the Git code
 base for figuring out anything of relevance?

I rarely use blame, but the commit that introduces --reverse seems to
have an example. See 85af792 (git-blame --reverse - 2008-04-02)
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/WIP v2 00/14] inotify support

2014-01-28 Thread Duy Nguyen
On Mon, Jan 20, 2014 at 12:04 AM, Thomas Rast t...@thomasrast.ch wrote:
 There's also the problem of ordering guarantees between the socket and
 inotify.  I haven't found any, so I would conservatively assume that the
 socket messages may in fact arrive before inotify, which is a race in
 the current code.  E.g., in the sequence 'touch foo; git status' the
 daemon may see

   socketinotify
hello...
status
new empty list
 touch foo

 I think a clever way to handle this would be to add a new command:

   Wait::
 This command serves synchronization.  Git creates a file of its
 choice in $GIT_DIR/watch (say, `.git/watch/wait.random`).  Then it
 sends wait path.  The watcher MUST block until it has processed
 all change notifications up to and including path.

Assuming that the time between foo is touched and the time an event is
put in the daemon's queue is reasonably small, would emptying the
event queue at hello be enough? To my innocent eyes (at the kernel),
it seems inotify handling happens immediately after an fs event, and
it's uninterruptable (or at least not interruptable by another user
space process, I don't think we need to care about true interrupts).
If that's true, by the time the touch syscall is finished, the event
is already sitting in the daemon's queue.

The problem with wait.random is we need to tell the daemon to expect
it. Otherwise if the daemon processes the wait.random even before
wait is sent, it would try to wait for the (lost) wait.random
event forever. An extension is git touch wait.random regularly. Or
to keep a queue of processed wait.* events. Both look ugly imo.
Another option is send expect wait.random first, wait for ack,
touch wait.random, then send wait, which is too much.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 3/3] diff: turn skip_stat_unmatch on selectively

2014-01-28 Thread Duy Nguyen
On Tue, Jan 28, 2014 at 02:51:45PM -0800, Junio C Hamano wrote:
   This replaces 'diff: turn off skip_stat_unmatch on diff --cached'
   The previous patch obviously leaves skip_stat_unmatch on in diff
   rev rev and maybe other cases.
 
  Oops, I lost track.  Sorry.
 
 Together with {1,2}/3 applied on maint-1.8.4, this sems to break
 t3417 (there may be others, but I didn't have time to check).

My bad. I thought I covered all cases in my last patch (and didn't
retest it!). It turns out I should have set skip_stat_unmatch in
builtin_diff_b_f() too. This on top of 3/3 passes the tests

-- 8 --
diff --git a/builtin/diff.c b/builtin/diff.c
index 88542d9..8ab5e3d 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -89,6 +89,7 @@ static int builtin_diff_b_f(struct rev_info *revs,
if (blob[0].mode == S_IFINVALID)
blob[0].mode = canon_mode(st.st_mode);
 
+   revs-diffopt.skip_stat_unmatch = !!diff_auto_refresh_index;
stuff_change(revs-diffopt,
 blob[0].mode, canon_mode(st.st_mode),
 blob[0].sha1, null_sha1,
-- 8 --
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


inotify support, nearly there

2014-01-28 Thread Duy Nguyen
Just a quick update for the enthusiasts. My branch file-watcher [1]
has got working per-user inotify support. It's a 20 patch series so
I'll refrain from spamming git@vger for a while, even though it hurts
your eyes a lot less than what I have posted so far. The test suite
ran fine with it so it's not that buggy. It has new tests too, even
though real inotify is not tested in the new tests. Documentation is
there, either in .txt or comments. Using it is simple:

$ mkdir ~/.watcher
$ git file-watcher --detach ~/.watcher
$ git config --global filewatcher.path $HOME/.watcher

There's still some polishing work to do. But I think the core logic is
done. I have some ideas what to be polished, but I'd appreciate
feedback if anyone uses it. We may need to make lookup code faster
later.

MacOS, FreeBSD and Windows contributors. If you have time and are
interested, have a look at the protocol, which is basically documented
in file-watcher.c:handle_command(), and see if something is
incompatible with your file notification mechanism. MacOS and FreeBSD
may reuse code from file-watcher.c, at least the unix socket part. I'm
not so sure about Windows. It probably needs a brand new daemon
because little could be shared, I don't know. I deliberately design
the daemon dumb so writing a completely new one won't be so hard. My
plan is focus on inotify and get it merged first, then new OS support
can come later (with refactoring if needed, but should not change the
protocol drastically).

[1] git clone https://github.com/pclouds/git.git file-watcher
--
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: inotify support, nearly there

2014-01-29 Thread Duy Nguyen
On Wed, Jan 29, 2014 at 2:44 PM, Mike Hommey m...@glandium.org wrote:
 Haven't looked at the code, so I don't know if you've done that, but in
 case you haven't, it would be nice to have an environment variable or a
 config option to make git use the file-watcher *and* normal lstat
 operations, to check consistency.

No I haven't. So far I depend on my manual tests and the test suite to
check for consistency. Will do.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 3/4] setup: Add 'abspath_part_inside_repo' function

2014-01-31 Thread Duy Nguyen
On Sat, Feb 1, 2014 at 3:22 AM, Martin Erik Werner
martinerikwer...@gmail.com wrote:
 diff --git a/setup.c b/setup.c
 index 5432a31..e606846 100644
 --- a/setup.c
 +++ b/setup.c
 @@ -77,6 +77,69 @@ int path_inside_repo(const char *prefix, const char *path)
 return 0;
  }

 +/*
 + * It is ok if dst == src, but they should not overlap otherwise.
 + * No checking if the path is in fact an absolute path is done, and it must
 + * already be normalized.
 + *
 + * Find the part of an absolute path that lies inside the work tree by
 + * dereferencing symlinks outside the work tree, for example:
 + * /dir1/repo/dir2/file   (work tree is /dir1/repo)  - dir2/file
 + * /dir/file  (work tree is /)   - dir/file
 + * /dir/symlink1/symlink2 (symlink1 points to work tree) - symlink2
 + * /dir/repo  (exactly equal to work tree)   - (empty string)
 + */
 +int abspath_part_inside_repo(char *dst, const char* src)
 +{
 +   size_t len;
 +   char *dst0;
 +   char temp;
 +
 +   const char* work_tree = get_git_work_tree();
 +   if (!work_tree)
 +   return -1;
 +   len = strlen(src);
 +   dst0 = dst;
 +
 +   // check root level

Um.. no C++ style comments. And there should be a test that work_tree
is the prefix of src (common case). If so we can return early and do
not need to do real_path() on every path component.

 +   if (has_dos_drive_prefix(src)) {
 +   *dst++ = *src++;
 +   *dst++ = *src++;
 +   *dst++ = *src++;
 +   } else {
 +   *dst++ = *src++;
 +   }
 +   temp = *dst;
 +   *dst = '\0';
 +   if (strcmp(real_path(dst0), work_tree) == 0) {
 +   *dst = temp;
 +   memmove(dst0, src, len - (dst - dst0) + 1);
 +   return 0;
 +   }
 +   *dst = temp;
 +
 +   // check each level
 +   while (*dst != '\0') {
 +   *dst++ = *src++;
 +   if (*dst == '/') {
 +   *dst = '\0';
 +   if (strcmp(real_path(dst0), work_tree) == 0) {
 +   memmove(dst0, src + 1, len - (dst - dst0));
 +   return 0;
 +   }
 +   *dst = '/';
 +   }
 +   }
 +
 +   // check whole path
 +   if (strcmp(real_path(dst0), work_tree) == 0) {
 +   *dst0 = '\0';
 +   return 0;
 +   }
 +
 +   return -1;
 +}
 +
  int check_filename(const char *prefix, const char *arg)
  {
 const char *name;
 --
 1.8.5.2



-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


splitting a commit that adds new files

2014-02-01 Thread Duy Nguyen
I usually start splitting a commit with reset @^ then add -p back.
The problem is reset @^ does not keep track of new files added in
HEAD, so I often end up forgetting to add new files back (with add
-p). I'm thinking of making reset to do add -N automatically for
me so I won't miss changes in add -p. But maybe people already know
how to deal with this case without adding more code?
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 3/4] setup: Add 'abspath_part_inside_repo' function

2014-02-01 Thread Duy Nguyen
On Sun, Feb 2, 2014 at 8:59 AM, Martin Erik Werner
martinerikwer...@gmail.com wrote:
 +   /* check if work tree is already the prefix */
 +   if (strncmp(path, work_tree, wtlen) == 0) {
 +   if (path[wtlen] == '/')
 +   memmove(path, path + wtlen + 1, len - wtlen);
 +   else
 +   /* work tree is the root, or the whole path */
 +   memmove(path, path + wtlen, len - wtlen + 1);
 +   return 0;
 +   }

No the 4th time is not the charm yet :) if path is /abc/defghi and
work_tree is /abc/def you don't want to return ghi as the prefix
here.

 +   path0 = path;
 +   path += offset_1st_component(path);
 +
 +   /* check each level */
 +   while (*path != '\0') {
 +   path++;

To me it looks like we could write

for (; *path; path++) {

or even

for (path += offset_1st_component(path); *path; path++) {

but it's personal taste..

 +   if (*path == '/') {
 +   *path = '\0';
 +   if (strcmp(real_path(path0), work_tree) == 0) {
 +   memmove(path0, path + 1, len - (path - 
 path0));
 +   return 0;
 +   }
 +   *path = '/';
 +   }
 +   }
 +
 +   /* check whole path */
 +   if (strcmp(real_path(path0), work_tree) == 0) {
 +   *path0 = '\0';
 +   return 0;
 +   }

I think this is already handled by the check if work tree is already
the prefix block.

 +
 +   return -1;
 +}
 +
 +/*
   * Normalize path, prepending the prefix for relative paths. If
   * remaining_prefix is not NULL, return the actual prefix still
   * remains in the path. For example, prefix = sub1/sub2/ and path is
 --
 1.8.5.2




-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 3/4] setup: Add 'abspath_part_inside_repo' function

2014-02-01 Thread Duy Nguyen
On Sun, Feb 2, 2014 at 9:19 AM, Duy Nguyen pclo...@gmail.com wrote:
 +   /* check whole path */
 +   if (strcmp(real_path(path0), work_tree) == 0) {
 +   *path0 = '\0';
 +   return 0;
 +   }

 I think this is already handled by the check if work tree is already
 the prefix block.

No, the check if.. block does not do real_path() so we still need it
here. Sorry for the noise.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Creating own hierarchies under $GITDIR/refs ?

2014-02-02 Thread Duy Nguyen
On Sun, Feb 2, 2014 at 5:37 PM, David Kastrup d...@gnu.org wrote:
 in the context of an ongoing discussion on the Emacs developer list of
 converting the Bzr repository of Emacs, one question (with different
 approaches) is where to put the information regarding preexisting Bazaar
 revision numbers and bug tracker ids: those are not present in the
 current Git mirror.

 Putting them in the commit messages would require a full history
 rewrite, and if some are missed in the process, this cannot be fixed
 afterwards.

What do you need them for? Perhaps putting everything in a file, maybe
sorted by SHA-1, would suffice? It should not be too hard to write a
script to map bug tracker id to a commit id. The file is for past
commits only. New commits can contain these info in their messages.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Creating own hierarchies under $GITDIR/refs ?

2014-02-02 Thread Duy Nguyen
On Sun, Feb 2, 2014 at 6:19 PM, David Kastrup d...@gnu.org wrote:
 Since Git has a working facility for references that is catered to do
 exactly this kind of mapping and already _does_, it seems like a
 convenient path to explore.

It will not scale. If you make those refs available for
cloning/fetching, all of them will be advertised first thing when git
starts negotiate. Imagine thousands of refs (and keep increasing) sent
to the receiver at the beginning of every connection. Something like
reverse git-notes may transfer more efficiently. Or we need to
improve git protocol to handle massive refs better, something that's
been discussed for a while without any outcome.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 3/4] setup: Add 'abspath_part_inside_repo' function

2014-02-02 Thread Duy Nguyen
On Sun, Feb 2, 2014 at 6:13 PM, Martin Erik Werner
martinerikwer...@gmail.com wrote:
 diff --git a/setup.c b/setup.c
 index 2270bd4..5817875 100644
 --- a/setup.c
 +++ b/setup.c
 @@ -32,9 +32,11 @@ static inline int abspath_part_inside_repo(char *path)
 if (strncmp(path, work_tree, wtlen) == 0) {
 if (path[wtlen] == '/')
 memmove(path, path + wtlen + 1, len - wtlen);
 -   else
 +   else if (path[wtlen - 1] == '/' || path[wtlen] == '\0')
 /* work tree is the root, or the whole path */
 memmove(path, path + wtlen, len - wtlen + 1);
 +   else
 +   return -1;

No, you can't return here. /abc/defghi might actually be a symlink
to /abc/def. If it does not match, let the following loop handle it.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 5/5] setup: Don't dereference in-tree symlinks for absolute paths

2014-02-02 Thread Duy Nguyen
On Sun, Feb 2, 2014 at 11:35 PM, Martin Erik Werner
martinerikwer...@gmail.com wrote:
 diff --git a/setup.c b/setup.c
 index a2e60ab..230505c 100644
 --- a/setup.c
 +++ b/setup.c
 @@ -86,11 +86,23 @@ char *prefix_path_gently(const char *prefix, int len,
 const char *orig = path;
 char *sanitized;
 if (is_absolute_path(orig)) {
 -   const char *temp = real_path(path);
 -   sanitized = xmalloc(len + strlen(temp) + 1);
 -   strcpy(sanitized, temp);
 +   char *npath;
 +
 +   npath = xmalloc(strlen(path) + 1);
 if (remaining_prefix)
 *remaining_prefix = 0;
 +   if (normalize_path_copy_len(npath, path, remaining_prefix)) {
 +   free(npath);
 +   return NULL;
 +   }
 +   if (abspath_part_inside_repo(npath)) {
 +   free(npath);
 +   return NULL;
 +   }
 +
 +   sanitized = xmalloc(strlen(npath) + 1);
 +   strcpy(sanitized, npath);
 +   free(npath);

We could replace these three lines with sanitized = npath;. But it's
not a big deal imo. The rest of the series looks good.

Reviewed-by: Duy Nguyen pclo...@gmail.com

 } else {
 sanitized = xmalloc(len + strlen(path) + 1);
 if (len)
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [WIP/PATCH 9/9] submodule: teach unpack_trees() to update submodules

2014-02-03 Thread Duy Nguyen
On Tue, Feb 4, 2014 at 2:54 AM, Jens Lehmann jens.lehm...@web.de wrote:
 Implement the functionality needed to enable work tree manipulating
 commands so that an changed submodule does not only affect the index but
 it also updates the work tree of any initialized submodule according to
 the SHA-1 recorded in the superproject.

 Signed-off-by: Jens Lehmann jens.lehm...@web.de
 ---
  entry.c| 15 --
  submodule.c| 86 
 ++
  submodule.h|  3 ++
  unpack-trees.c | 69 --
  unpack-trees.h |  1 +
  5 files changed, 157 insertions(+), 17 deletions(-)

 diff --git a/entry.c b/entry.c
 index d1bf6ec..61a2767 100644
 --- a/entry.c
 +++ b/entry.c
 @@ -265,7 +265,7 @@ int checkout_entry(struct cache_entry *ce,

 if (!check_path(path, len, st, state-base_dir_len)) {
 unsigned changed = ce_match_stat(ce, st, 
 CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE);
 -   if (!changed)
 +   if (!changed  (!S_ISDIR(st.st_mode) || 
 !S_ISGITLINK(ce-ce_mode)))
 return 0;

Should we report something when ce is a gitlink, but path is not a
directory, instead of siliently exit?

 diff --git a/submodule.c b/submodule.c
 index 3907034..83e7595 100644
 --- a/submodule.c
 +++ b/submodule.c
 @@ -520,6 +520,42 @@ int depopulate_submodule(const char *path)
 return 0;
  }

 +int update_submodule(const char *path, const unsigned char sha1[20], int 
 force)
 +{
 +   struct strbuf buf = STRBUF_INIT;
 +   struct child_process cp;
 +   const char *hex_sha1 = sha1_to_hex(sha1);
 +   const char *argv[] = {
 +   checkout,
 +   force ? -fq : -q,

respect state-quiet in checkout_entry() as well?

 +   hex_sha1,
 +   NULL,
 +   };
 +   const char *git_dir;
 +
 +   strbuf_addf(buf, %s/.git, path);
 +   git_dir = read_gitfile(buf.buf);
 +   if (!git_dir)
 +   git_dir = buf.buf;
 +   if (!is_directory(git_dir)) {
 +   strbuf_release(buf);
 +   /* The submodule is not populated, so we can't check it out */
 +   return 0;
 +   }
 +   strbuf_release(buf);
 +
 +   memset(cp, 0, sizeof(cp));
 +   cp.argv = argv;
 +   cp.env = local_repo_env;
 +   cp.git_cmd = 1;
 +   cp.no_stdin = 1;
 +   cp.dir = path;   /* GIT_WORK_TREE doesn't work for git checkout */

And if we do respect --quiet and it's not specified, paths printed by
this process is relative to dir, not to user cwd. Could be
confusing.

 +   if (run_command(cp))
 +   return error(Could not checkout submodule %s, path);
 +
 +   return 0;
 +}
 +
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: splitting a commit that adds new files

2014-02-03 Thread Duy Nguyen
On Tue, Feb 4, 2014 at 1:11 AM, Junio C Hamano gits...@pobox.com wrote:
 Jeff King p...@peff.net writes:

 [1] I _do_ use reset -p when splitting commits, but I do not think it
 is useful here. I use it for oops, I staged this change, but it
 actually belongs in the next commit. Undo my staging, but leave the
 changes in the working tree for the next one.

 Sure.  I thought that was exactly what Duy was attempting to do when
 he splitted a commit into two (or more).

For splitting into two commits, reset -p or reset @^; add -p would
be more or less the same, although I still prefer to think this is
what I need than this is what I do _not_ need. add -p is more
convenient when the commit is big and you need to split into more than
two because the number of revert chunks may be higher than the number
of added chunks. I recall editing a patch with checkout -p sometimes
does not work, not sure it happens with reset -p too.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: bug? git push triggers auto pack when gc.auto = 0

2014-02-03 Thread Duy Nguyen
On Tue, Feb 4, 2014 at 9:20 AM, chris j...@hotmail.com wrote:
 $ git push origin next
 Counting objects: 56, done.
 Delta compression using up to 4 threads.
 Compressing objects: 100% (9/9), done.
 Writing objects: 100% (9/9), 895 bytes | 0 bytes/s, done.
 Total 9 (delta 8), reused 0 (delta 0)
 Auto packing the repository for optimum performance.

This string only appears in versions before 1.8.0. It's longer after 1.8.0.

 To ssh://g...@my.server.com/my_project
3560275..f508080  next - next
 $ git config gc.auto
 0
 $ git config gc.autopacklimit
 0
 $ git --version
 git version 1.8.5.3

but your client is after 1.8.0 so the string printed above is from the
server side. git config gc.auto here does not matter. Run that
command again on my.server.com.

 So my question is, should gc.auto = 0 disable auto-packing from occurring on
 git push and other non-gc commands?

Yes it should.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: bug? git push triggers auto pack when gc.auto = 0

2014-02-03 Thread Duy Nguyen
On Tue, Feb 4, 2014 at 12:13 PM, chris j...@hotmail.com wrote:
 However, I question why I should even care about this message?  I'm going to
 assume that simply it is a lengthy synchronous operation that someone felt
 deserved some verbosity to why the client push action is taking longer than
 it should.  Yet that makes me question why I'm being penalized for this
 server side operation.  My client time should not be consumed for server
 side house keeping.

 An obvious fix is to disable gc on the server and implement a cron job for
 the house keeping task.  However, as often the case one does not have
 control over the server, so it is unfortunate that git has this server side
 house keeping as a blocking operation to a client action.

I agree it should not block the client. I think you can Ctrl-C git
push at this point without losing anything (data has already been
pushed at this point) but that's not a good advice to general cases.
Maybe we can do something at the server side to not block the client..

Another thing we could do is put remote:  in front of these strings,
even in ssh case. They seem to confuse you (and me too) that things
happened locally.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] reset: support --mixed --intent-to-add mode

2014-02-04 Thread Duy Nguyen
On Tue, Feb 04, 2014 at 02:25:25PM -0800, Junio C Hamano wrote:
 Junio C Hamano gits...@pobox.com writes:
 
  While I do not have any problem with adding an optional keep lost
  paths as intent-to-add entries feature, I am not sure why this has
  to be so different from the usual add-cache-entry codepath.  The
  if/elseif chain you are touching inside this loop does:
 
   - If the tree you are resetting to has something at the path
 (which is different from the current index, obviously), create
 a cache entry to represent that state from the tree and stuff
 it in the index;
 
   - Otherwise, the tree you are resetting to does not have that
 path.  We used to say remove it from the index, but now we have
 an option to instead add it as an intent-to-add entry.
 
  So, why doesn't the new codepath do exactly the same thing as the
  first branch of the if/else chain and call add_cache_entry but with
  a ce marked with CE_INTENT_TO_ADD?  That would parallel what happens
  in git add -N better, I would think, no?
 
 In other words, something along this line, perhaps?

snip

Yes. But you need something like this on top to actually set
CE_INTENT_TO_ADD

-- 8 --
diff --git a/read-cache.c b/read-cache.c
index 325d193..87f1367 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -585,6 +585,7 @@ void mark_intent_to_add(struct cache_entry *ce)
if (write_sha1_file(, 0, blob_type, sha1))
die(cannot create an empty blob in the object database);
hashcpy(ce-sha1, sha1);
+   ce-ce_flags |= CE_INTENT_TO_ADD;
 }
 
 int add_to_index(struct index_state *istate, const char *path, struct stat 
*st, int flags)
-- 8 --
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] reset: support --mixed --intent-to-add mode

2014-02-05 Thread Duy Nguyen
On Thu, Feb 6, 2014 at 1:25 AM, Junio C Hamano gits...@pobox.com wrote:
 Yes, indeed.  I wonder why your new test did not notice it, though
 ;-)

 ... and the answer turns out to be that it was not testing the right
 thing.  On top of that faulty version, this will fix it.

Yes, write-tree should test that well.

 Your suggestion to move CE_INTENT_TO_ADD to mark-intent-to-add makes
 sense but a caller needs to be adjusted to drop the duplicated flag
 manipulation.

No no. I found that duplicate, but I did not suggest removing it
because it is needed there..

 @@ -613,8 +614,6 @@ int add_to_index(struct index_state *istate, const char 
 *path, struct stat *st,
 ce-ce_namelen = namelen;
 if (!intent_only)
 fill_stat_cache_info(ce, st);
 -   else
 -   ce-ce_flags |= CE_INTENT_TO_ADD;

 if (trust_executable_bit  has_symlinks)
 ce-ce_mode = create_ce_mode(st_mode);

A few lines down, there's ie_match_stat() call that will check
CE_INTENT_TO_ADD and returns changed immediately without looking at
stat data. If stat info is used, it may (not so sure) return not
changed, the exit path is taken and mark_intent_to_add() is ignored.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] t7101, t7014: rename test files to indicate what that file is for

2014-02-05 Thread Duy Nguyen
On Tue, Feb 4, 2014 at 11:05 PM, Jonathan Nieder jrnie...@gmail.com wrote:
  t/t7101-reset-empty-subdirs.sh (new +x) | 63 
 +
  t/t7101-reset.sh (gone) | 63 
 -
  t/t7104-reset-hard.sh (new +x)  | 46 
  t/t7104-reset.sh (gone) | 46 

 Hm, summary incorporated in the diffstat.  Neat. :)

Yeah, it's from this patch [1]. Maybe I should give it another push,
then remove --summary from format-patch diffstat.

[1] http://article.gmane.org/gmane.comp.version-control.git/213752
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3] Add a function skip_prefix_if_present()

2014-02-06 Thread Duy Nguyen
On Wed, Feb 5, 2014 at 1:55 PM, Michael Haggerty mhag...@alum.mit.edu wrote:
 * Duy seemed to offer to rewrite his patch series, but I don't think
   that it has happened yet.

It's really low in my todo list. So if you want to pick it up, please do.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] shallow clones over http

2014-02-06 Thread Duy Nguyen
(Digging back an old topic after Jeff mentioned it)

On Thu, May 9, 2013 at 2:12 AM, Junio C Hamano gits...@pobox.com wrote:
 Jeff King p...@peff.net writes:

 I'm trying to track down a protocol bug that happens with shallow clones
 over smart-http. As far as I can tell, the bug has existed in all
 versions.

 You can reproduce it using the attached repository, which is a shallow
 clone of https://github.com/mileszs/ack.vim.git, like:

   $ tar xzf repo.tar.gz
   $ cd repo.git
   $ git fetch --depth=10
   fatal: git fetch-pack: expected shallow list

 In that test my fetch actually hit github.com as the upstream full repo,
 but you can also clone it down locally and demonstrate it with purely
 local copies of git (but it's more of a pain, because you have to set up
 a smart http server).

 The last part of the conversation looks like this:

   packet:   fetch-pack 
   packet:   fetch-pack ACK f183a345a0c10caed7684d07dabae33e007c7590 common
   packet:   fetch-pack have f183a345a0c10caed7684d07dabae33e007c7590
   packet:   fetch-pack ACK 33312d4db4e91468957b1b41dd039c5d88e85fda common
   packet:   fetch-pack ACK 34d0b2fbc182b31d926632d170bc07d6a6fc3f9b common
   packet:   fetch-pack ACK 45c802e07c60686986474b6b05b2c7048330b6b5 common
   packet:   fetch-pack ACK e93f693fd2a9940d6421bf9e4ddd1f535994eaa5 common
   packet:   fetch-pack ACK 132ee41e8e2c8c545b3aed120171e1596c9211a4 common
   packet:   fetch-pack ACK 973deb3145a2638b2301cfd654721cf35d68 common
   packet:   fetch-pack ACK e53a88a4e72d84562493313e8911ada4def787da common
   packet:   fetch-pack ACK 90be0bf3eee6f7a0cb9c2377a50610f4ce738da3 common
   packet:   fetch-pack ACK aeab88ccf41bf216fde37983bd403d9b913391e7 common
   packet:   fetch-pack ACK 5f480935d3ce431c393657c3000337bcbdbd5535 common
   packet:   fetch-pack ACK db81e01b433501b159983ea38690aeb01eea1e6b common
   packet:   fetch-pack ACK 06c44b8cab93e780a29ff7f7b5b1dd41dba4b2d5 common
   packet:   fetch-pack ACK 65f3966becdb2d931d5afbdcc6a28008d154668a common
   packet:   fetch-pack ACK 10e8caef9f2ed308231ce1abc326c512e86a5d4c common
   packet:   fetch-pack ACK 6b55dd91f2e7fc64c23eea57e85171cb958f9cd2 common
   packet:   fetch-pack ACK 6b55dd91f2e7fc64c23eea57e85171cb958f9cd2 ready
   packet:   fetch-pack NAK
   packet:   fetch-pack ACK 6b55dd91f2e7fc64c23eea57e85171cb958f9cd2
   fatal: git fetch-pack: expected shallow list

 So we see that upload-pack sends a bunch of detailed ACKs, followed by a
 NAK, and then it sends another ACK.

 Fetch-pack is inside find_common, reading these acks. At the beginning
 of each stateless-rpc response, it expects to consume any
 shallow/unshallow lines up to a flush packet (the call to
 consume_shallow_list). And then it reads the acks in a loop. After it
 sees the NAK, it assumes that the server is done sending the packet, and
 loops again, expecting another set of shallow/unshallow lines. But we
 get the next ACK instead, and die.

 So who is wrong? Is upload-pack wrong to send an ACK after the NAK?

 3e63b21aced1 (upload-pack: Implement no-done capability, 2011-03-14)
 claims that the above sequence of acks and naks is what upload-pack
 wants to show.

 What happens when you disable no-done extension handling on the
 server end, I wonder?

fetch succeeded when no-done was disabled. An immediate workaround
would be disable no-done in fetch-pack.c in a shallow repo but maybe
we can do better..
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Sparse checkout leaves no entry on working directory all the time on Windows 7 on Git 1.8.5.2.msysgit.0

2014-02-06 Thread Duy Nguyen
On Thu, Feb 6, 2014 at 8:20 PM, Johannes Sixt j.s...@viscovery.net wrote:
 Am 2/6/2014 12:54, schrieb konst...@ngs.ru:
 However I typed the checkout directory in file
 ..git/info/sparse-checkout by using different formats with
 and without the leading and the trailing slashes, with and
 without asterisk after trailing slash, having tried all
 the possible combinations, but, all the same,
 nevertheless, the error occured.

 Make sure that you do not use CRLF line terminators in the sparse-checkout
 file.

I suspected the same thing. But it looks like we do trim CRLF. Another
option is the file is encoded in utf-16 or some encoding which ascii
is not a subset of.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/6] Fix the shallow deepen bug with no-done

2014-02-06 Thread Duy Nguyen
On Fri, Feb 7, 2014 at 2:31 AM, Junio C Hamano gits...@pobox.com wrote:
 Nguyễn Thái Ngọc Duy  pclo...@gmail.com writes:

 Reported by Jeff [1]. Junio spotted it right but nobody made any move
 since then.

 Hrm.  Was I supposed to make any move at that point?  The discussion
 ended by me asking a question what happens if we did X? and there
 was no discussion.

Don't take it the wrong way. I was just summarizing the last round. It
surprised me though that this went under my radar. Perhaps a bug
tracker is not a bad idea after all (if Jeff went missing, this bug
could fall under the crack)
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/6] fetch-pack: fix deepen shallow over smart http with no-done cap

2014-02-06 Thread Duy Nguyen
On Fri, Feb 7, 2014 at 2:16 AM, Junio C Hamano gits...@pobox.com wrote:
 Nguyễn Thái Ngọc Duy  pclo...@gmail.com writes:

 In smart http, upload-pack adds new shallow lines at the beginning of
 each rpc response. Only shallow lines from the first rpc call are
 useful. After that they are thrown away. It's designed this way
 because upload-pack is stateless and has no idea when its shallow
 lines are helpful or not.

 So after refs are negotiated with multi_ack_detailed and both sides
 happy. The server sends ACK obj-id ready, terminates the rpc call

 Is the above ... and both sides are happy, the server sends ...?

Yes. Although think again, both sides is incorrect. If the server
side is happy, then it'll send ACK.. ready to stop the client. The
client can hardly protest.

 Do I understand the situation correctly if I said The call to
 consume-shallow-list has been here from the very beginning, but it
 should have been adjusted like this patch when no-done was
 introduced.?

It's been there since the introduction of smart http (there are so
many beginnings in git). The rest is right.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Confusing git log --- First time bug submission please advise on best practices

2014-02-07 Thread Duy Nguyen
On Fri, Feb 07, 2014 at 09:43:46AM +, Francis Stephens wrote:
 Thanks for your clear response. I can see where I went wrong now.

Maybe something like this would help avoid confusion a bit in the
future? This toy patch puts a horizontal line as a break between two
commits if they are not related, so we can clearly see linear commit
segments.

--graph definitely helps, but it's too many threads for topic-based
development model like git.git that I avoid it most of the time.

-- 8 --
diff --git a/log-tree.c b/log-tree.c
index 08970bf..7841bf2 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -795,6 +795,7 @@ static int log_tree_diff(struct rev_info *opt, struct 
commit *commit, struct log
 
 int log_tree_commit(struct rev_info *opt, struct commit *commit)
 {
+   static struct commit_list *last_parents;
struct log_info log;
int shown;
 
@@ -805,6 +806,17 @@ int log_tree_commit(struct rev_info *opt, struct commit 
*commit)
if (opt-line_level_traverse)
return line_log_print(opt, commit);
 
+   if (last_parents) {
+   struct commit_list *p = last_parents;
+   int got_parent = 0;
+   for (; p  !got_parent; p = p-next)
+   got_parent = !hashcmp(p-item-object.sha1,
+ commit-object.sha1);
+   if (!got_parent)
+   
printf(__\n);
+   free_commit_list(last_parents);
+   last_parents = NULL;
+   }
shown = log_tree_diff(opt, commit, log);
if (!shown  opt-loginfo  opt-always_show_header) {
log.parent = NULL;
@@ -813,5 +825,6 @@ int log_tree_commit(struct rev_info *opt, struct commit 
*commit)
}
opt-loginfo = NULL;
maybe_flush_or_die(stdout, stdout);
+   last_parents = copy_commit_list(commit-parents);
return shown;
 }
-- 8 --

 
 On Thu, Feb 6, 2014 at 4:10 PM, David Kastrup d...@gnu.org wrote:
  Vincent van Ravesteijn v...@lyx.org writes:
 
  The commits that are in the log for master and which are not in the
  log for originssh/master are merged in at 6833fd4 (HEAD, master);
  Completed merge.
 
  As git log can only present the commits in a linear way, it shows
  the commits from the ancentry of both parents of HEAD in a reverse
  chronological order. This means that the commits from the two
  ancestries are mixed and commits that are shown after each other don't
  have to be parent and child. See the documentation of git log and
  the section Commit Ordering: By default, the commits are shown in
  reverse chronological order.
 
  git log --graph can help with getting a better picture.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC 2/2] receive-pack: hint that the user can stop

2014-02-07 Thread Duy Nguyen
On Fri, Feb 7, 2014 at 7:36 PM, chris j...@hotmail.com wrote:
 Junio C Hamano gitster at pobox.com writes:
 Instead of adding a boolean --break-ok that is hidden, why not
 adding an exposed boolean --daemonize, and let auto-gc run in the
 background?  With the recent do not let more than one gc run at the
 same time, that should give a lot more pleasant end user
 experience, no?

 That sounds quite useful to me.  Duy, are you up for generating such a patch?

It would not be so hard for that patch. I'm still thinking whether it
should be done if auto-gc is started on the client side too (sometimes
it does, which is equally annoying)..
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/6] fetch-pack: fix deepen shallow over smart http with no-done cap

2014-02-07 Thread Duy Nguyen
On Fri, Feb 07, 2014 at 10:01:08AM -0800, Junio C Hamano wrote:
 Here is the difference between the posted series and what I queued
 after applying the changes suggested during the review.
 
 Thanks.

I was going to send a reroll after the received comments. Could you
put this on top of 6/6, just to make sure future changes in t5537
(maybe more or less commits created..) does not change the test
behavior?

It fixes the test name too. I originally thought, ok let's create
commits in one test and do fetch in another. But it ended up in the
same test and I forgot to update test name.

-- 8 --
diff --git a/t/t5537-fetch-shallow.sh b/t/t5537-fetch-shallow.sh
index 1413caf..b300383 100755
--- a/t/t5537-fetch-shallow.sh
+++ b/t/t5537-fetch-shallow.sh
@@ -203,7 +203,7 @@ EOF
 # This test is tricky. We need large enough haves that fetch-pack
 # will put pkt-flush in between. Then we need a have the server
 # does not have, it'll send ACK %s ready
-test_expect_success 'add more commits' '
+test_expect_success 'no shallow lines after receiving ACK ready' '
(
cd shallow 
for i in $(test_seq 10)
@@ -224,7 +224,9 @@ test_expect_success 'add more commits' '
cd clone 
git checkout --orphan newnew 
test_commit new-too 
-   git fetch --depth=2
+   GIT_TRACE_PACKET=$TRASH_DIRECTORY/trace git fetch --depth=2 
+   grep fetch-pack ACK .* ready ../trace 
+   ! grep fetch-pack done ../trace
)
 '
 
-- 8 -- 
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Bug tracker (again)

2014-02-07 Thread Duy Nguyen
(Dropped some CC as this becomes a different topic)

On Sat, Feb 8, 2014 at 2:20 AM, Jonathan Nieder jrnie...@gmail.com wrote:
 Duy Nguyen wrote:

 Don't take it the wrong way. I was just summarizing the last round. It
 surprised me though that this went under my radar. Perhaps a bug
 tracker is not a bad idea after all (if Jeff went missing, this bug
 could fall under the crack)

 I'm happy to plug
 - http://bugs.debian.org/cgi-bin/pkgreport.cgi?src=git;include=tags:upstream
 - http://packages.qa.debian.org/common/index.html (email subscription link:
   source package = git; under Advanced it's possible to subscribe to
   bug-tracking system emails and skip e.g. the automated build stuff)
 - https://www.debian.org/Bugs/Reporting (bug reporting interface -
   unfortunately the important part is buried under Sending the bug
   report via e-mail)
 again. :)

So I wonder if we use debian bug tracker for git upstream. I haven't
used debian tracker much (or debian for that matter). It's probably
best just ask instead of searching and guessing.

I suppose if debian people (mostly debian git maintainer?) are not
opposed to us using their tracker for upstream bugs, then it's just a
matter of associating a mail thread with a bug number for tracking.
That could be probably be done via email, then reply all to the thread
in question with a bug email address. After that all email discussions
are also tracked via this bug email. Anybody can help track bugs. Say
if 3 weekdays are over and nobody said a thing about something that
looks a lot like bug, then it should be tracked (problems that can be
quickly fixed do not need tracking). Hmm?
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/6] t5538: fix default http port

2014-02-07 Thread Duy Nguyen
On Sat, Feb 8, 2014 at 6:47 AM, Jeff King p...@peff.net wrote:
 Thinking on this more, I wonder if we should just do something like
 this:

 diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
 index bfdff2a..c82b4ee 100644
 --- a/t/lib-httpd.sh
 +++ b/t/lib-httpd.sh
 @@ -64,7 +64,9 @@ case $(uname) in
  esac

  LIB_HTTPD_PATH=${LIB_HTTPD_PATH-$DEFAULT_HTTPD_PATH}
 -LIB_HTTPD_PORT=${LIB_HTTPD_PORT-'8111'}
 +if test -z $LIB_HTTPD_PORT; then
 +   LIB_HTTPD_PORT=${this_test#t}
 +fi

  TEST_PATH=$TEST_DIRECTORY/lib-httpd
  HTTPD_ROOT_PATH=$PWD/httpd

 and drop the manual LIB_HTTPD_PORT setting in each of the test scripts.
 That would prevent this type of error in the future, and people can
 still override if they want to.

 Any test scripts that are not numbered would need to set it, but we
 should not have any of those (and if we did, the failure mode would be
 OK, as Apache would barf on the bogus port number).

Yes! Next stop, attempt to start httpd at start_httpd regardless of
GIT_TEST_HTTPD. If successful, test_set_prereq HTTPD or something that
the following tests can use. If GIT_TEST_HTTPD is set and start_httpd
fails, error out, not set prereq.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 00/26] inotify support

2014-02-08 Thread Duy Nguyen
Thanks for the comments. I can see I now have some work to do in the
coming weeks :)

On Sat, Feb 8, 2014 at 3:04 PM, Torsten Bögershausen tbo...@web.de wrote:
 I would appreciate if we could have an outline of the protocol
 as a seperate document somewhere, to be able to have a look at the protocol
 first, before looking into the code.

My fear is document becomes outdated because people don't always
remember to update doc when they change code. Which is why I embed the
protocol as comments in handle_command() function. If the final
version [1] is still not easy to read, I'll split the protocol out
into a separate document.

[1] https://github.com/pclouds/git/blob/file-watcher/file-watcher.c#L672

 (Am I using wireshark too much to dream about a dissector ?)

Try GIT_TRACE_PACKET=2

 1)
   write_in_full_timeout()
   packet_read_line_timeout()
   At other places we handle EINTR after calling poll().
   Looking at the code, it could be easier to introduce
   a new function xpoll() in wrapper.c, and use that instead
   of poll().

Yeah there are already 4 poll call sites before file-watcher jumps in. Will do.

 2)
   Similar for the usage of accept().
   I like the idea of xread() xwrite() and all the x functions
   so it coud make sense to establish xpoll() and xaccept()
   before inotify suppport.

OK

 3)
 -int unix_stream_listen(const char *path)
 +int unix_stream_listen(const char *path, int replace)
  {
   int fd, saved_errno;
   struct sockaddr_un sa;
 @@ -103,7 +103,8 @@ int unix_stream_listen(const char *path)
   return -1;
   fd = unix_stream_socket();

 - unlink(path);
 + if (replace)
 + unlink(path);

 Minor remark:
 As we do not do the replace here:
 s/replace/un_link/ may be ?

Heh, I thought of using the name unlink but it's taken so I chose
replace and did not think of underscore.Will do.

 4)
 +++ b/file-watcher.c
 {}
 +static const char *const file_watcher_usage[] = {
 + N_(git file-watcher [options] socket directory),
 + NULL
 +};
 Do we already have options here?
 I can think about having
 -d daemoniye
 -u uses Unix doain socket
 (And later -t to use a TCP/IP socket, when the repo
  is on a mounted NFS (or SAMBA) drive, and  the daemon is on a
  different machine.
  I don't say this patch should include this logic in first round,
  But I can see a gain for this kind of setup)

Later on we have two, --detach (i.e. daemonize, I reuse the same name
from git-daemon) and --check-support. Transport settings (like unix vs
tcp/ip...) should be config options though. You don't want to specify
it here, then again when you run git status. Actually I should add
--default, that will retrieve socket directory from config key
filewatcher.path, so the user does not have to specify it..

 5)
 +++ b/file-watcher.c
 []
 +static int shutdown_connection(int id)
 +{
 + struct connection *conn = conns[id];
 + conns[id] = NULL;
 + pfd[id].fd = -1;
 + if (!conn)
 + return 0;
 + close(conn-sock);
 + free(conn);
 + return 0;
 +}
 The function is called shutdown_connection(), but it does a close()
 Could it be named close_connection() ?

Yes, there was a close_connection() which did something similar, and
then it was killed off. Will rename.

 6)
 +++ b/file-watcher.c
 []
 Do we have a sequence chart about the command flow between the watcher
 daemon and the client ?

I suggest you have a look at the file-watcher test. For example, from
[2] we have this sequence

hello
hello
index $SIG $TRASH_DIRECTORY
inconsistent
# Do not call inotify_add_watch()
test-mode
test mode on
# First batch should be all ok
watch
dir1/foo
dir2/foo
dir3/foo
dir4/foo
...

 denotes a packet from the client to file-watcher,  the opposite
direction. But you can always obtain a real flow with
GIT_TRACE_PACKET=2 (path lists not available though, so really just a
flow).

[2] 
https://github.com/pclouds/git/blob/file-watcher/t/t7513-file-watcher.sh#L273

 in 03/26:
This version is also gentler than its friend packet_read_line()
 gentler, what does this mean?

No dying for whatever error. packet_read_line is designed for git
protocol. It's the main job so if there's an error, dying is the right
thing to do. file-watcher on the other hand is a side job and should
not stop whatever command from doing. Will update commit message.

because it's designed for side channel I/O that should not abort the
program even if the channel is broken.
 I'm not so familar with side-channel I/O. How does it fit in here?

To sum up, whatever error in communication with file-watcher should
not stop you from doing whatever you're doing. file-watcher is
contacted whenever $GIT_DIR/index is read, so whatever you're doing
is basically all git commands that involve worktree or index.

 Does this make sense:
 In opposite to packet_read_line() which can call die()
 to abort the program, read_in_full_timeout() will keep the program running.
 (or something like this)

Exactly!

diff 

Re: git: problematic git status output with some translations (such as fr_FR)

2014-02-08 Thread Duy Nguyen
On Fri, Dec 20, 2013 at 3:50 AM, Jonathan Nieder jrnie...@gmail.com wrote:
 Junio C Hamano wrote:
 Jonathan Nieder jrnie...@gmail.com writes:

 This includes the colon in the translated string, to make it easier to
 remember to keep the non-breaking space before it.

 Hmph, recent 3651e45c (wt-status: take the alignment burden off
 translators, 2013-11-05) seems to have gone in the different
 direction when it updated similar code for the non-unmerged paths.

 Yes, if this seems to go in the right direction, I'd add a follow-up
 for that when rerolling.

Just checking. Did you reroll it or did my filters move some mails to
spam/trash folder again?
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2] Ignore trailing spaces in .gitignore

2014-02-08 Thread Duy Nguyen
On Sat, Feb 8, 2014 at 11:45 PM, Jeff King p...@peff.net wrote:
 On Sat, Feb 08, 2014 at 03:10:02PM +0700, Nguyễn Thái Ngọc Duy wrote:

 Trailing spaces are invisible in most standard editors (*). git diff
 does show trailing spaces by default. But that does not help newly
 written .gitignore files. And trailing spaces are the source of
 frustration when writing .gitignore.

 I guess leading spaces are not as frustrating, but I wonder if it would
 be more consistent to say that we soak up all whitespace. That is also a
 regression, but I agree that these are exceptional cases, and as long as
 we provide an out for people to cover them via quoting, ignoring the
 whitespace seems like a sane default.

Hm...


 People can still quote trailing spaces, which will not be ignored (and
 much more visible). Quoting comes with a cost of doing fnmatch(). But
 I won't optimize it until I see someone shows me they have a use case
 for it.

 I naively expected that:

   echo 'trailing\ \ ' .gitignore

 would count as quoting, but your patch doesn't handle that. It _does_
 seem to work currently (that is, the backslashes go away and we treat it
 literally), but I am not sure if that is planned, or simply happens to
 work.

No that's what I had in mind. But yeah my patches are flawed.


 I guess by quoting you meant:

   echo 'trailing  ' .gitignore

This makes  special. If we follow shell convention then things
between .. should be literal (e.g. * is no longer a wildcard). We
don't support it yet. So I rather go with backslash as it adds less
code.


 Anyway, here are some tests I wrote up while playing with this. They do
 not pass with your current patch for the reasons above, but maybe they
 will be useful to squash in (either after tweaking the tests, or
 tweaking the patch).

 diff --git a/t/t0008-ignores.sh b/t/t0008-ignores.sh
 index b4d98e6..4dde314 100755
 --- a/t/t0008-ignores.sh
 +++ b/t/t0008-ignores.sh
 @@ -775,4 +775,33 @@ test_expect_success PIPE 'streaming support for --stdin' 
 '
 echo $response | grep ^::two
  '

 +
 +#
 +# test whitespace handling
 +
 +test_expect_success 'leading/trailing whitespace is ignored' '
 +   mkdir whitespace 
 +   whitespace/leading 
 +   whitespace/trailing 
 +   whitespace/untracked 
 +   {
 +   echo whitespace/leading 
 +   echo whitespace/trailing   
 +   } ignore 
 +   echo whitespace/untracked expect 
 +   git ls-files -o -X ignore whitespace actual 
 +   test_cmp expect actual
 +'
 +
 +test_expect_success 'quoting allows trailing whitespace' '
 +   rm -rf whitespace 
 +   mkdir whitespace 
 +   whitespace/trailing   
 +   whitespace/untracked 
 +   echo whitespace/trailing\\ \\  ignore 
 +   echo whitespace/untracked expect 
 +   git ls-files -o -X ignore whitespace actual 
 +   test_cmp expect actual
 +'
 +
  test_done



-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Bug: relative core.worktree is resolved from symlink and not its target

2014-02-09 Thread Duy Nguyen
On Tue, Feb 04, 2014 at 11:20:39AM +0100, Daniel Hahler wrote:
 Hi,
 
 when using a submodule sm, there is a relative worktree in its config:
 
.git/modules/sm/config:
[core]
 worktree = ../../../smworktree
 
 git-new-worktree (from contrib) symlinks this config the new worktree.
 
 From inside the new worktree, git reads the config, but resolves the
 relative worktree setting based on the symlink's location.

Hmm.. core.worktree is relative to $GIT_DIR. Whether config is a
symlink should have no effects.

$ pwd
/tmp/abc
$ ls -l .git/config 
lrwxrwxrwx 1 pclouds users 11 Feb  9 15:57 .git/config - /tmp/config
$ cat /tmp/config 
[core]
repositoryformatversion = 0
filemode = true
bare = false
logallrefupdates = true
worktree = ../../worktree
$ ls -l /tmp/worktree/
total 4
-rw-r--r-- 1 pclouds users 5 Feb  9 15:59 abc
$ ~/w/git/git ls-files -o
abc

Maybe it's something else. Could you produce a small test case?

 A fix would be to resolve any relative worktree setting based on the
 symlink target's location (the actual config file), and not from the
 symlink.
 
 This is with git version 1.8.5.3.
--
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2] Ignore trailing spaces in .gitignore

2014-02-09 Thread Duy Nguyen
On Mon, Feb 10, 2014 at 11:07 AM, Junio C Hamano gits...@pobox.com wrote:
 Nguyễn Thái Ngọc Duy  pclo...@gmail.com writes:

 Trailing spaces are invisible in most standard editors (*). git diff
 does show trailing spaces by default. But that does not help newly
 written .gitignore files. And trailing spaces are the source of
 frustration when writing .gitignore.

 So let's ignore them. Nobody sane would put a trailing space in file
 names. But we could be careful and do it in two steps: warn first,
 then ignore trailing spaces. Another option is merge two patches in
 one and be done with it.

 People can still quote trailing spaces, which will not be ignored (and
 much more visible). Quoting comes with a cost of doing fnmatch(). But

 Hmph, sorry but I fail to see why we need to incur cost for
 fnmatch().  We read and parse the file and keep them as internal
 strings, so your unquoting (and complaining the unquoted trailng
 spaces) can be done at the parse time, while keeping the trailing
 spaces the user explicitly told us to keep by quoting in the
 internal string that we eventually feed fnmatch() with _after_
 unquoting, no?

That's the optimization in the but sentence. Another (off topic)
opt. we could do is make *.[ch] behave more like *.c, where we
just try to match the tail part.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 00/26] inotify support

2014-02-10 Thread Duy Nguyen
On Mon, Feb 10, 2014 at 3:19 AM, Torsten Bögershausen tbo...@web.de wrote:
 Please see filewatcher.c:
 +   if (daemon) {
 +   int err;
 +   strbuf_addf(sb, %s/log, socket_path);
 +   err = open(sb.buf, O_CREAT | O_TRUNC | O_WRONLY, 0600);
 +   adjust_shared_perm(sb.buf);
 (And now we talk about the logfile:
 In daemon mode, stdout and stderr are saved in $WATCHER/log.
 It could be nice to make this feature configrable,
 either XXX/log or /dev/null.
 On the long run we may eat to much disc space on a machine.
 The other thing is that we may want to seperate stdout
 from stderr, but even this is a low prio comment.

I probably should follow git-daemon and put these to syslog.

 
 There is a small issue when I tested on a machine,
 where the data directory called daten is softlinked to another disk:
 daten - /disk3/home2/tb/daten

 and the projects directory is softlinked to daten/projects
 projects - daten/projects/

 t7514 fails like this:
 --- expect  2014-02-08 14:37:07.0 +
 +++ actual  2014-02-08 14:37:07.0 +
 @@ -1,6 +1,6 @@
  packet:  git hello
  packet:  git hello
 -packet:  git index 6cb9741eee29ca02c5b79e9c0bc647bcf47ce948 
 /home/tb/projects/git/tb/t/trash directory.t7514-file-watcher-lib
 +packet:  git index 6cb9741eee29ca02c5b79e9c0bc647bcf47ce948 
 /disk3/home2/tb/daten/projects/git/tb/t/trash directory.t7514-file-watcher-lib

 Could we use relative path names internally, relative to $GIT_DIR ?

No because this is when the client tell the server about $GIT_DIR. I
guess we can use realpath(1) here.

 ---
 Another thing:
 Compiling under Mingw gives this:
 LINK git-credential-store.exe
 libgit.a(file-watcher-lib.o): In function `connect_watcher':
 c:\Dokumente und Einstellungen\tb\projects\git\tb/file-watcher-lib.c:21: 
 undefined reference to `unix_stream_connect'
 collect2: ld returned 1 exit status
 make: *** [git-credential-store.exe] Error 1

 We may need a compiler option like HAS_UNIX_SOCKETS or so.

I'll make unix-socket.o build unconditionally and return error at runtime.

 --
 +++ b/file-watcher.c

 +#define INOTIFY_MASKS (IN_DELETE_SELF | IN_MOVE_SELF | \
 +  IN_CREATE | IN_ATTRIB | IN_DELETE | IN_MODIFY |  \
 +  IN_MOVED_FROM | IN_MOVED_TO)
 This feels confusing:
 a) we have inotify_masks with lower case below.
 b) how about INOTIFY_NEEDED_BITS ?
 ---

OK

 I'm OK with having the protocol having specified in the
 test cases.
 One thing that I have on the wish list is to make the
 commands/responses more unique, to be able to run grep
 on the code base.

 One idea could be to use a prefix
 fwr for file watcher request or
 fwr for file watcher response.
 This does not work, hihi, so

 fwq for file watcher reQuest and
 fwe for file watcher rEsponse.
 Or
 ffw as from file watcher and
 tfw as to file watcher for the people who have problems
 with left and right,  and  could work.

If you want I can update test-file-watcher to accept send and
recv instead of  and , respectively. The only command with
the same name for response and request is hello. I can make it
hello and helloack (or bonjour as response?).
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/2] gc: config option for running --auto in background

2014-02-10 Thread Duy Nguyen
On Mon, Feb 10, 2014 at 6:03 PM, Erik Faye-Lund kusmab...@gmail.com wrote:
 `gc --auto` takes time and can block the user temporarily (but not any
 -   if (!quiet)
 -   fprintf(stderr,
 -   _(Auto packing the repository for 
 optimum performance. You may also\n
 -   run \git gc\ manually. See 
 -   \git help gc\ for more 
 information.\n));
 +   if (!quiet) {
 +   if (detach_auto)
 +   fprintf(stderr, _(Auto packing the 
 repository in background for optimum performance.\n));
 +   else
 +   fprintf(stderr, _(Auto packing the 
 repository for optimum performance.\n));
 +   fprintf(stderr, _(See \git help gc\ for manual 
 housekeeping.\n));
 +   }
 +   if (detach_auto)
 +   /*
 +* failure to daemonize is ok, we'll continue
 +* in foreground
 +*/
 +   daemonize();

 While I agree that it should be OK, shouldn't we warn the user?

If --quiet is set, we should not be printing anyway. If not, I thinkg
we could only print auto packing in background.. when we actually
can do that, else just print the old message. It means an #ifdef
NO_POSIX_GOODIES here again though..
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/2] daemon: move daemonize() to libgit.a

2014-02-10 Thread Duy Nguyen
On Tue, Feb 11, 2014 at 1:46 AM, Junio C Hamano gits...@pobox.com wrote:
 Nguyễn Thái Ngọc Duy  pclo...@gmail.com writes:

 diff --git a/setup.c b/setup.c
 index 6c3f85f..b09a412 100644
 --- a/setup.c
 +++ b/setup.c
 @@ -787,3 +787,27 @@ void sanitize_stdfds(void)
   if (fd  2)
   close(fd);
  }
 +
 +int daemonize(void)
 +{
 +#ifdef NO_POSIX_GOODIES
 + errno = -ENOSYS;

 Negated?

Facepalm. I remember I wrote this somewhere but don't remember what
topic :( Should I resend?
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 00/26] inotify support

2014-02-10 Thread Duy Nguyen
On Mon, Feb 10, 2014 at 11:55 PM, Torsten Bögershausen tbo...@web.de wrote:
 On 2014-02-10 11.37, Duy Nguyen wrote:

 Could we use relative path names internally, relative to $GIT_DIR ?

 No because this is when the client tell the server about $GIT_DIR. I
 guess we can use realpath(1) here.
 Good.

 I realized that the watcher can watch several repos at the same time.

 However, we could allow relative path names, which will be relative to 
 $SOCKET_DIR,
 and loosen the demand for an absolut path name a little bit.
 And $SOCKET_DIR can be the same as $GIT_DIR, when we are watching only one 
 repo.

It does not help much anyway because file-watcher-lib.c sends
get_git_work_tree(), which is absolute/normalized path, to
file-watcher. There's no sources of sending $GIT_DIR relative to
$SOCKET_DIR (and I don't think we want to make get_git_work_tree()
relative before sending, it's more work on both sides we no benefits,
except for tracing).
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Profiling support?

2014-02-11 Thread Duy Nguyen
On Tue, Feb 11, 2014 at 6:17 PM, David Kastrup d...@gnu.org wrote:

 Looking in the Makefile, I just find support for coverage reports using
 gcov.  Whatever is there with profile in it seems to be for
 profile-based compilation rather than using gprof.

 Now since I've managed to push most of the runtime for basic git-blame
 operation out of blame.c proper, it becomes important to figure out
 where most of the remaining runtime (a sizable part of that being system
 time) is being spent.  Loop counts like that provided by gcov (or am I
 missing something here?) are not helpful for that, I think I rather need
 the kind of per-function breakdown that gprof provides.

 Is there a reason there are no prewired recipes or advice for using
 gprof on git?  Is there a way to get the work done, namely seeing the
 actual distribution of call times (rather than iterations) using gcov so
 that this is not necessary?

Would perf help? No changes required, and almost no overhead, I think.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/2] gc: config option for running --auto in background

2014-02-11 Thread Duy Nguyen
On Tue, Feb 11, 2014 at 2:11 AM, Junio C Hamano gits...@pobox.com wrote:
 On Mon, Feb 10, 2014 at 10:43 AM, Junio C Hamano gits...@pobox.com wrote:
 If --quiet is set, we should not be printing anyway. If not, I thinkg
 we could only print auto packing in background.. when we actually
 can do that, else just print the old message. It means an #ifdef
 NO_POSIX_GOODIES here again though..

 Didn't you change it not to die but return nosys or something?

 Ah, the problem is that it is too late to take back ... will do so in
 the background when you noticed that daemonize() did not succeed, so
 you would need a way to see if we can daemonize() before actually
 doing so if you want to give different messages.

 int can_daemonize(void) could be an answer that is nicer than
 NO_POSIX_GOODIES, but I am not sure if it is worth it.

Or we could pass the quiet flag to daemonize() and let it print
something in the #ifdef NO_POSIX_GOODIES part.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-11 Thread Duy Nguyen
On Wed, Feb 12, 2014 at 8:54 AM, Stefan Zager sza...@chromium.org wrote:
 We in the chromium project have a keen interest in adding threading to
 git in the pursuit of performance for lengthy operations (checkout,
 status, blame, ...).  Our motivation comes from hitting some
 performance walls when working with repositories the size of chromium
 and blink:

 https://chromium.googlesource.com/chromium/src
 https://chromium.googlesource.com/chromium/blink

I have no comments about thread safety improvements (well, not yet).
If you have investigated about git performance on chromium
repositories, could you please sum it up? Threading may be an option
to improve performance, but it's probably not the only option.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-11 Thread Duy Nguyen
On Wed, Feb 12, 2014 at 9:02 AM, Robin H. Johnson robb...@gentoo.org wrote:
 On Tue, Feb 11, 2014 at 05:54:51PM -0800,  Stefan Zager wrote:
 We in the chromium project have a keen interest in adding threading to
 git in the pursuit of performance for lengthy operations (checkout,
 status, blame, ...).  Our motivation comes from hitting some
 performance walls when working with repositories the size of chromium
 and blink:
 +1 from Gentoo on performance improvements for large repos.

 The main repository in the ongoing Git migration project looks to be in
 the 1.5GB range (and for those that want to propose splitting it up, we
 have explored that option and found it lacking), with very deep history
 (but no branches of note, and very few tags).

From v1.9 shallow clone should work for all push/pull/clone... so
history depth does not matter (on the client side). As for
gentoo-x86's large worktree, using index v4 and avoid full-tree
operations (e.g. status ., not status..) should make all
operations reasonably fast. I plan to make status fast even without
path limiting with the help of inotify, but that's not going to be
finished soon. Did I miss anything else?
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pack bitmap woes on Windows

2014-02-12 Thread Duy Nguyen
On Wed, Feb 12, 2014 at 8:23 PM, David Kastrup d...@gnu.org wrote:
 Johannes Sixt j.s...@viscovery.net writes:

 Am 2/12/2014 13:55, schrieb David Kastrup:
 Johannes Sixt j.s...@viscovery.net writes:

 Running test suite of 'next' on Windows fails in t5310-pack-bitmaps with
 the following symptoms. I haven't followed the topic. Have there been
 patches floating that addressed the problem in one way or another?

 (gdb) run
 Starting program: D:\Src\mingw-git\t\trash
 directory.t5310-pack-bitmaps/..\..\git.exe rev-list --test-bitmap
 HEAD
 [New thread 3528.0x8d4]
 Bitmap v1 test (20 entries loaded)
 Found bitmap for 537ea4d3eb79c95f602873b1167c480006d2ac2d. 64 bits
 / 15873b36 checksum

 Does reverting a201c20b41a2f0725977bcb89a2a66135d776ba2 help?

 YES! t5310 passes after reverting this commit.

 Oh.  I just looked through the backtrace until finding a routine
 reasonably related with the regtest and checked for the last commit
 changing it, then posted my question.

 Then I looked through the diff of the patch and considered it
 unconspicuous.  So I commenced reading through earlier commits.

 I actually don't have a good idea what might be wrong here.  The code is
 somewhat distasteful as it basically uses eword_t and uint64_t
 interchangeably, but then this does match its current definition.

Perhaps __BYTE_ORDER or __BIG_ENDIAN is misdefined and the ntohll() is skipped?
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] tests: turn on network daemon tests by default

2014-02-12 Thread Duy Nguyen
On Thu, Feb 13, 2014 at 5:12 AM, Jeff King p...@peff.net wrote:
 lib-httpd was never designed to be included from anywhere except the
 beginning of the file. But that wouldn't be right for t5537, because it
 wants to run some of the tests, even if apache setup fails. The right
 way to do it is probably to have lib-httpd do all of its work in a lazy
 prereq. I don't know how clunky that will end up, though; it might be
 simpler to just move the shallow http test into one of the http-fetch
 scripts.

I'll move it out later.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: error: src refspec refs/heads/master matches more than one.

2014-02-14 Thread Duy Nguyen
On Fri, Feb 14, 2014 at 7:45 PM, Andreas Schwab sch...@linux-m68k.org wrote:
 Josef Wolf j...@raven.inka.de writes:

 Notice the refs/heads _within_ refs/heads!

 Now I wonder how I managed to get into this situation and what's the best way
 to recover?

 Probably you did something like git branch refs/heads/master.  You can
 remove it again with git branch -d refs/heads/master.

As a porcelain, git branch should prevent (or at least warn) users
from creating such refs, I think.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-14 Thread Duy Nguyen
On Sat, Feb 15, 2014 at 2:16 AM, Zachary Turner ztur...@chromium.org wrote:
 (Gah, sorry if you're receiving multiple emails to your personal
 addresses, I need to get used to manually setting Plain-text mode
 every time I send a message).

 For the mixed read, we wouldn't be looking for another caller of
 pread() (since it doesn't care what the file pointer is), but instead
 a caller of read() or lseek() (since those do depend on the current
 file pointer).  In index-pack.c, I see two possible culprits:

 1) A call to xread() from inside fill()
 2) A call to lseek in parse_pack_objects()

 Do you think these could be related?  If so, maybe that opens up some
 other solutions?

For index-pack alone, what's wrong with open one file handle per thread?
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-14 Thread Duy Nguyen
On Sat, Feb 15, 2014 at 7:50 AM, Stefan Zager sza...@google.com wrote:
 On Fri, Feb 14, 2014 at 4:45 PM, Duy Nguyen pclo...@gmail.com wrote:
 On Sat, Feb 15, 2014 at 2:16 AM, Zachary Turner ztur...@chromium.org wrote:
 (Gah, sorry if you're receiving multiple emails to your personal
 addresses, I need to get used to manually setting Plain-text mode
 every time I send a message).

 For the mixed read, we wouldn't be looking for another caller of
 pread() (since it doesn't care what the file pointer is), but instead
 a caller of read() or lseek() (since those do depend on the current
 file pointer).  In index-pack.c, I see two possible culprits:

 1) A call to xread() from inside fill()
 2) A call to lseek in parse_pack_objects()

 Do you think these could be related?  If so, maybe that opens up some
 other solutions?

 For index-pack alone, what's wrong with open one file handle per thread?

 Nothing wrong with that, except that it would mean either using
 thread-local storage (which the code doesn't currently use); or
 plumbing pack_fd through the call stack, which doesn't sound very fun.

Current code does use thread-local storage (struct thread_local and
get_thread_data). Adding a new file handle when NO_THREAD_SAFE_PREAD
is defined is simpler imo.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-14 Thread Duy Nguyen
On Sat, Feb 15, 2014 at 8:15 AM, Zachary Turner ztur...@chromium.org wrote:
 Even if we make that change to use TLS for this case, the
 implementation of pread() will still change in such a way that the
 semantics of pread() are different on Windows.  Is that ok?

 Just to summarize, here's the viable approaches I've seen discussed so far:

 1) Use _WINVER at compile time to select either a thread-safe or
 non-thread-safe implementation of pread.  This is the easiest possible
 code change, but would necessitate 2 binary distributions of git for
 windows.
 2) Use TLS as you suggest and have one fd per pack thread.  Probably
 the most complicated code change (at least for me, being a first-time
 contributor)

It's not so complicated. I suggested a patch [1] before (surprise!).

 3) Use Karsten's suggested implementation from earlier in the thread.
 Seems to work, but it's a little confusing from a readability
 standpoint since the implementation is not-thread safe except in this
 specific usage contex

[1] http://article.gmane.org/gmane.comp.version-control.git/196042
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/4] Good bye fnmatch

2014-02-15 Thread Duy Nguyen
On Sat, Feb 15, 2014 at 3:23 PM, David Kastrup d...@gnu.org wrote:
 Nguyễn Thái Ngọc Duy pclo...@gmail.com writes:

 Long story short, we wanted globbing wildcard ** so I ripped
 wildmatch library from rsync to do it.

 Since version 3.0.0, rsync is GPLv3
 URL:https://rsync.samba.org/GPL.html.  So be sure to take an older
 version.

Yes. See 5230f60 (Import wildmatch from rsync - 2012-10-15), the code
was taken from the last GPL-2 commit.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: error: src refspec refs/heads/master matches more than one.

2014-02-15 Thread Duy Nguyen
On Fri, Feb 14, 2014 at 08:32:07AM -0800, Junio C Hamano wrote:
 Duy Nguyen pclo...@gmail.com writes:
 
  On Fri, Feb 14, 2014 at 7:45 PM, Andreas Schwab sch...@linux-m68k.org 
  wrote:
  Josef Wolf j...@raven.inka.de writes:
 
  Notice the refs/heads _within_ refs/heads!
 
  Now I wonder how I managed to get into this situation and what's the best 
  way
  to recover?
 
  Probably you did something like git branch refs/heads/master.  You can
  remove it again with git branch -d refs/heads/master.
 
  As a porcelain, git branch should prevent (or at least warn) users
  from creating such refs, I think.
 
 warn, possibly, but I do not see a reason to *prevent*.
 
  A. You are not allowed to call your branch with a string that begins with
 'refs/heads/'.
  B. Why?
  A. Because it will confuse you.
  B. I know what I am doing.
  A. ???

Prevent is a strong word. I meant we only do it if they force
it. Something like this..

-- 8 --
diff --git a/branch.c b/branch.c
index 723a36b..3f0540f 100644
--- a/branch.c
+++ b/branch.c
@@ -251,6 +251,11 @@ void create_branch(const char *head,
forcing = 1;
}
 
+   if (!force  dwim_ref(name, strlen(name), sha1, real_ref))
+   die(_(creating ref refs/heads/%s makes %s ambiguous.\n
+ Use -f to create it anyway.),
+   name, name);
+
real_ref = NULL;
if (get_sha1(start_name, sha1)) {
if (explicit_tracking) {
-- 8 --
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] reset: setup worktree on --mixed

2014-02-15 Thread Duy Nguyen
On Sat, Feb 15, 2014 at 4:14 PM, Nguyễn Thái Ngọc Duy pclo...@gmail.com wrote:
 --mixed does mainly two things: read_from_tree(), which does not
 require a worktree, and refresh_index(), which does.

.. and I should have run the entire test suite before sending this. It
breaks t7103.5 (reset --mixed in bare repository). That test seems
wrong though, in my opinon..


 Reported-by: Patrick Palka patr...@parcs.ath.cx
 Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com
 ---
  builtin/reset.c  |  2 +-
  t/t7102-reset.sh | 11 +++
  2 files changed, 12 insertions(+), 1 deletion(-)

 diff --git a/builtin/reset.c b/builtin/reset.c
 index 6004803..9928c28 100644
 --- a/builtin/reset.c
 +++ b/builtin/reset.c
 @@ -320,7 +320,7 @@ int cmd_reset(int argc, const char **argv, const char 
 *prefix)
 if (reset_type == NONE)
 reset_type = MIXED; /* by default */

 -   if (reset_type != SOFT  reset_type != MIXED)
 +   if (reset_type != SOFT)
 setup_work_tree();

 if (reset_type == MIXED  is_bare_repository())
 diff --git a/t/t7102-reset.sh b/t/t7102-reset.sh
 index 8d4b50d..ee117e2 100755
 --- a/t/t7102-reset.sh
 +++ b/t/t7102-reset.sh
 @@ -535,4 +535,15 @@ test_expect_success 'reset with paths accepts tree' '
 git diff HEAD --exit-code
  '

 +test_expect_success 'reset --mixed sets up work tree' '
 +   git init mixed_worktree 
 +   (
 +   cd mixed_worktree 
 +   test_commit dummy
 +   ) 
 +   : expect 
 +   git --git-dir=mixed_worktree/.git --work-tree=mixed_worktree reset 
 actual 
 +   test_cmp expect actual
 +'
 +
  test_done
 --
 1.8.5.2.240.g8478abd




-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git GSoC 2014

2014-02-15 Thread Duy Nguyen
On Sat, Feb 15, 2014 at 7:17 PM, Thomas Rast t...@thomasrast.ch wrote:
 It also includes an ok from Nicolas Pitre, who has been the driving
 force behind packv5 development.  The only thing that makes me uneasy is

Nit: pack v4. You are probably confused with index v5, which is also
cooking for a while now.

 that Duy is not in the list (Duy, have you been asked by libgit2 about
 possible relicensing?).

I don't remember. But for the record I'm OK with relicensing my
contributions in git.git for inclusion in libgit2.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3] Wider exposure for index-v4

2014-02-15 Thread Duy Nguyen
On Sun, Feb 16, 2014 at 2:23 AM, Thomas Gummerer t.gumme...@gmail.com wrote:
 Hi,

 since index-v5 didn't seem to generate enough interest to be merged, I

I thought there were some comments last time that you were going to
address and resubmit?

 have a few patches that give users users easier access to index-v4.
 Until now users have to go into the source code and compile git
 themselves to use index-v4 by default, or use git-update-index to
 change the index file to the new version.

Not objecting this, but I think something like [1] would give v4 more
exposure. Reading the patch again, I think putting that detection code
in unpack_trees() or git-merge may make more sense because people will
be advised about upgrading to v4 at the next fast-forward.

[1] http://article.gmane.org/gmane.comp.version-control.git/216307

 With this patches it's possible to set the default index file format
 either in gitconfig or in an environment variable.  It also simplifies
 testing index-v4 by adding a Makefile knob to use it for running the
 test suite.  For safety, existing repositories are not changed when
 the environment or the config variables are set.

 I'm not sure about the precedence in patch 3, right now the environment
 variable has precedence, but it should be easy to give the config
 option precedence over that.

 Thomas Gummerer (3):
 introduce GIT_INDEX_VERSION environment variable
 test-lib: allow setting the index format version
 read-cache: add index.version config variable

 Documentation/config.txt  |  4 +++
 Documentation/git.txt |  5 
 Makefile  |  7 +
 read-cache.c  | 36 +++-
 t/t1600-index.sh  | 52 +++
 t/t2104-update-index-skip-worktree.sh |  2 ++
 t/test-lib-functions.sh   |  5 
 t/test-lib.sh |  3 ++
 8 files changed, 113 insertions(+), 1 deletion(-)
 create mode 100755 t/t1600-index.sh

 --
 1.8.3.2

 --
 To unsubscribe from this list: send the line unsubscribe git in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFH] hackday and GSoC topic suggestions

2014-02-16 Thread Duy Nguyen
On Sun, Feb 9, 2014 at 2:03 AM, Thomas Rast t...@thomasrast.ch wrote:
 Easy:

 * Add -p 'e' when it fails to apply should offer an obvious way of
   starting from the original hunk (not the broken one) or both

If it's too easy, you can add a command to change diff display
settings (--color-words, context size...)
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Ambiguous branch names...

2014-02-16 Thread Duy Nguyen
On Mon, Feb 17, 2014 at 3:34 AM, Ingo Rohloff lund...@gmx.de wrote:
 Hello,

 while trying out git (version 1.7.9.5), I did this:

 git clone -- ssh://myserver/~rohloff/git/w1.git w1

 So I just cloned a test repository.
 The in the cloned w1 repository I executed:

 git branch origin/master
 git branch remotes/origin/master
 git branch refs/remotes/origin/master

 I now have got three *local* branches with the above names, which
 now seems to make it impossible to refer to the master branch
 from the origin *remote* repository.

 Wouldn't it make sense to forbid such names for local branches ?
 For example to enforce some rules like a local branch name *must not*
 start with remotes/ or refs/ ?

See http://thread.gmane.org/gmane.comp.version-control.git/242096/focus=242181.
The proposal basically is you can't create those branches without -f.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Bug: relative core.worktree is resolved from symlink and not its target

2014-02-17 Thread Duy Nguyen
On Mon, Feb 17, 2014 at 4:36 PM, Daniel Hahler
genml+git-2...@thequod.de wrote:
 On 09.02.2014 10:08, Duy Nguyen wrote:
 On Tue, Feb 04, 2014 at 11:20:39AM +0100, Daniel Hahler wrote:

 Thanks for looking into this.

 when using a submodule sm, there is a relative worktree in its config:

.git/modules/sm/config:
[core]
 worktree = ../../../smworktree

 git-new-worktree (from contrib) symlinks this config the new worktree.

 From inside the new worktree, git reads the config, but resolves the
 relative worktree setting based on the symlink's location.

 Hmm.. core.worktree is relative to $GIT_DIR. Whether config is a
 symlink should have no effects.

 If config is a symlink, the relative path for worktree is meant to be
 resolved based on the config file's location, and not from the symlink
 ($GIT_DIR).

I think you started with a wrong assumption. See config.txt it says

-- 8 --
core.worktree::
Set the path to the root of the working tree.
This can be overridden by the GIT_WORK_TREE environment
variable and the '--work-tree' command line option.
The value can be an absolute path or relative to the path to
the .git directory, which is either specified by --git-dir
or GIT_DIR, or automatically discovered.
-- 8 --

So I think it fails correctly (by the book).

 Here is a test case to reproduce it:

   # Create a submodule repo
   mkdir /tmp/t-sm
   cd /tmp/t-sm
   git init
   touch foo
   git add foo
   git commit -m init

   # Create the root repo
   mkdir /tmp/t-root
   cd /tmp/t-root
   git init
   mkdir submodules
   git submodule add /tmp/t-sm submodules/sm
   git commit -m init

   # Create a new worktree from the submodule
   cd /tmp/t-root/submodules/sm
   git-new-workdir . /tmp/new-workdir

 This then fails when checking out:
 + git checkout -f
 fatal: Could not chdir to '../../../../submodules/sm': No such file or 
 directory

 % ls -l /tmp/new-workdir/.git/config
 […] /tmp/new-workdir/.git/config - 
 /tmp/t-root/.git/modules/submodules/sm/config

 % cat /tmp/new-workdir/.git/config
 [core]
 repositoryformatversion = 0
 filemode = true
 bare = false
 logallrefupdates = true
 worktree = ../../../../submodules/sm


 From inside of /tmp/new-workdir `git rev-parse --git-dir` fails already
 (with the same cannot chdir error).

 The problem appears to be that it tries to chdir based on
 /tmp/new-workdir/.git, but should do so based on
 $(readlink -f .git/config).

 I recognize that this case is constructed anyway, because even if
 `worktree` would get resolved correctly, it would not be what you'd
 expect: the point of git-new-workdir is to get a separate worktree, and
 not use the existing one.

 Therefore I see two problems here:
 1. worktree is not resolved correctly by git itself (with .git/config
being a symlink)
 2. git-new-workdir should handle this better, e.g. by creating a copy of
the config file with the worktree setting removed and printing a
warning about it.

 The workaround appears to be to explicitly set
 GIT_WORK_TREE=/tmp/new-workdir.

No if you copy config out then it may be not what you want anymore
(e.g. new remotes not reflected in new worktree). A better solution is
move core.worktree out of config. Notice t-root/submodules/sm is a
file that contains the path to the true .git directory. We could have
stored worktree path in that file instead of in config.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 00/26] inotify support

2014-02-17 Thread Duy Nguyen
On Mon, Feb 10, 2014 at 3:19 AM, Torsten Bögershausen tbo...@web.de wrote:

 On 2014-02-08 09.53, Duy Nguyen wrote:
 file-watcher.c | 32 
 1 file changed, 32 insertions(+)

 I feel a little bit unsure about the 700.
 Most often Git does not care about permissions,
 and relies on umask being set appropriatly.
 (Please correct me if I'm wrong)

Git does care. See credential-cache--daemon.c. In fact this function
is a copy of check_socket_directory() from that file.

 I was probably a little bit unclear.
 Of course credentials should be protected well and stored with 700.
 The rest of the repo could be more loose by using adjust_shared_perm().
 Because the whole repo can be shared (or not) and data is visible
 to the group or everyone.
 (this is a minor issue)

So how about a check whenever a worktree is connected to the daemon,
if that worktree has stricter permission, e.g. 0700 vs 0770 of the
daemon socket directory, then the daemon refuses the worktree (maybe
with a warning)?
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git gc --aggressive led to about 40 times slower git log --raw

2014-02-18 Thread Duy Nguyen
On Tue, Feb 18, 2014 at 3:55 PM, David Kastrup d...@gnu.org wrote:
 Christian Jaeger chr...@gmail.com writes:

 I've got a repository where git log --raw  _somefile took a few
 seconds in the past, but after an attempt at merging some commits that
 were collected in a clone of the same repo that was created about a
 year ago, I noticed that this command was now taking 3 minutes 7
 seconds. git gc, git fsck, git clone file:///the/repo/.git also
 now each took between ~4-10 minutes, also git log --raw somefile got
 equally unusably slow. With the help of the people on the IRC, I
 tracked it down to my recent use of git gc --aggressive in this
 repo. Running git repack -a -d -f solved it, now it's again taking
 4-5 seconds. After running git gc --aggressive again for
 confirmation, git log --raw  _somefile was again slowed down,
 although now 'only' to 1 minute 34 seconds;

 [...]

 I've now learned to avoid git gc --aggressive. Perhaps there are
 some other conclusions to be drawn, I don't know.

 I've seen the same with my ongoing work on git-blame with the current
 Emacs Git mirror.  Aggressive packing reduces the repository size to
 about a quarter, but it blows up the system time (mainly I/O)
 significantly, quite reducing the total benefits of my algorithmic
 improvements there.

Likely because --aggressive passes --depth=250 to pack-objects. Long
delta chains could reduce pack size and increase I/O as well as zlib
processing signficantly. Christian can try git repack -adf which is
really close to --aggressive (except it uses default --depth=50) and
see if it makes any difference.

 There is also some quite visible additional time spent in zlib, so a
 wild guess would be that zlib is not really suited to the massive amount
 of directory entries of a Git object store.  Since the system time still
 dominates, this guess would only make sense if Git over zlib kept
 rereading the directory section of whatever compressed file we are
 talking about.  But that's really a rather handwavy wild guess without
 anything better than a hunch to back it up.  I don't even know what kind
 of compression and/or packs are used: I've only ever messed myself with
 the delta coding of the normal unpacked operation (there are a few
 older commits from me on that).

 --
 David Kastrup
 --
 To unsubscribe from this list: send the line unsubscribe git in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git gc --aggressive led to about 40 times slower git log --raw

2014-02-18 Thread Duy Nguyen
On Wed, Feb 19, 2014 at 3:59 AM, Junio C Hamano gits...@pobox.com wrote:
 Let's do something like this first and then later make --depth
 configurable just like --width, perhaps?  For aggressive, I think
 the default width (hardcoded to 250 but configurable) is a bit too
 narrow.

  builtin/gc.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/builtin/gc.c b/builtin/gc.c
 index 6be6c8d..0d010f0 100644
 --- a/builtin/gc.c
 +++ b/builtin/gc.c
 @@ -204,7 +204,7 @@ int cmd_gc(int argc, const char **argv, const char 
 *prefix)

 if (aggressive) {
 argv_array_push(repack, -f);
 -   argv_array_push(repack, --depth=250);
 +   argv_array_push(repack, --depth=20);
 if (aggressive_window  0)
 argv_array_pushf(repack, --window=%d, 
 aggressive_window);
 }

Lower depth than default (50) does not sound aggressive to me, at
least from disk space utilization. I agree it should be configurable
though.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH] Supporting non-blob notes

2014-02-18 Thread Duy Nguyen
On Tue, Feb 18, 2014 at 9:46 PM, Johan Herland jo...@herland.net wrote:
 On Mon, Feb 17, 2014 at 11:48 AM,  yann.dir...@bertin.fr wrote:
 The recent git-note -C changes commit type? thread
 (http://thread.gmane.org/gmane.comp.version-control.git/241950) looks
 like a good occasion to discuss possible uses of non-blob notes.

 The use-case we're thinking about is the storage of testrun logs as
 notes (think: being able to justify that a given set of tests were
 successfully run on a given revision).

 I think this is a good use of notes, and organizing the testrun logs
 into a tree of files seems like a natural way to proceed.

Notes from the previous attempt to store trees as notes (something to
watch out maybe, when you do it again)

http://article.gmane.org/gmane.comp.version-control.git/197712
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git gc --aggressive led to about 40 times slower git log --raw

2014-02-18 Thread Duy Nguyen
On Wed, Feb 19, 2014 at 7:10 AM, Junio C Hamano gits...@pobox.com wrote:
 Duy Nguyen pclo...@gmail.com writes:

 Lower depth than default (50) does not sound aggressive to me, at
 least from disk space utilization. I agree it should be configurable
 though.

 Do you mean you want to keep --aggressive to mean too aggressive
 in resulting size, to the point that it is not useful to anybody?

git-gc.txt is pretty vague about this --aggressive. I assume we would
want both, better disk utilization and performance. But if it produces
a tiny pack that takes forever to access, then it's definitely bad
aggression.

 Shallow and wide will give us, with a large window, the most
 aggressively efficient packfiles that are useful, and we would
 rather want to fix it to be usable, I would think.

fwiw this is the thread that added --depth=250

http://thread.gmane.org/gmane.comp.gcc.devel/94565/focus=94626

yes, if reducing depth leads to better performance and does not use
much disk in general case, then of course we should do it. General
case may be hard to define though. It'd be best if we have some sort
of heuristics to try out different combinations on a specific repo and
return the best combination of parameters. It could even take longer
time, but once we have good parameters, they should remain good for a
long time, I think.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git gc --aggressive led to about 40 times slower git log --raw

2014-02-19 Thread Duy Nguyen
On Wed, Feb 19, 2014 at 3:38 PM, Philippe Vaucher
philippe.vauc...@gmail.com wrote:
 fwiw this is the thread that added --depth=250

 http://thread.gmane.org/gmane.comp.gcc.devel/94565/focus=94626

 This post is quite interesting:
 http://article.gmane.org/gmane.comp.gcc.devel/94637

Especially this part

-- 8 --
And quite frankly, a delta depth
of 250 is likely going to cause overflows in the delta cache (which is
only 256 entries in size *and* it's a hash, so it's going to start having
hash conflicts long before hitting the 250 depth limit).
-- 8 --

So in order to get file A's content, we go through its 250 level chain
(and fill the cache), then we get to file B and do the same, which
evicts nearly everything from A. By the time we go to the next commit,
we have to go through 250 levels for A again because the cache is
pretty much useless.

I can think of two improvements we could make, either increase cache
size dynamically (within limits) or make it configurable. If we have N
entries in worktree (both trees and blobs) and depth M, then we might
need to cache N*M objects for it to be effective. Christian, if you
want to experiment this, update MAX_DELTA_CACHE in sha1_file.c and
rebuild.

The other is smarter eviction, instead of throwing all A's cached
items out (based on recent order), keep the last few items of A and
evict B's oldest cached items. Hopefully by the next comit, we can
still reuse some cache for A and other files/trees. Delta cache needs
to learn about grouping to achieve this.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git gc --aggressive led to about 40 times slower git log --raw

2014-02-19 Thread Duy Nguyen
On Wed, Feb 19, 2014 at 4:01 PM, David Kastrup d...@gnu.org wrote:
 Calling git blame via C-x v g is a rather important part of the
 workflow, and it's currently intolerable to work with on a number of
 files.

 While I'm fixing the basic shortcomings in builtin/blame.c itself, the
 operation fetch the objects is necessary for all objects at least
 once.  It's conceivable that some nice caching strategy would help with
 avoiding the repeated traversal of long delta chain tails.  That could
 also help defusing the operation of basic stuff like git-log.

Pack v4 is supposed to tackle this delta chain thing, but its future
is a bit uncertain (you can give a hand btw). If you often do git
blame, you might consider unpack most accessed objects (make it part
of blame process), which would function exactly like a cache with no
extra code. The downside is git-gc --auto is more likely to kick in
because of too many loose objects and pack everything up again.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] tag: support --sort=version

2014-02-19 Thread Duy Nguyen
On Wed, Feb 19, 2014 at 9:09 PM, Jeff King p...@peff.net wrote:
 On Wed, Feb 19, 2014 at 08:39:27PM +0700, Nguyễn Thái Ngọc Duy wrote:

 --sort=version sorts tags as versions. GNU extension's strverscmp is
 used and no real compat implementation is provided so this is Linux only.

 Sounds like a good goal.

 I wonder, if we were to merge the for-each-ref and tag
 implementations[1], how this would integrate with for-each-ref's
 sorting. It can sort on arbitrary fields like --sort=-*authordate. I
 think given the syntax you provide, this would fall out naturally as
 just another key (albeit a slightly magical one, as it is building on
 the %(refname:short) field but using a different comparator).

 Would we ever want to version-sort any other field? Perhaps
 %(content:subject) for a tag? I'm not sure what would be the most
 natural way to specify that. Maybe --sort=version:content:subject,
 with just --version as a synonym for version:refname:short.

If f-e-r and tag share code and syntax then we could use another
letter for this sort type (like we use '-' to reverse sort order). I
think that would make the implementation easier as well. So we could
have --sort=.refname for version sorting refname, --sort=-.refname for
reversed version sort, and sort=-refname for reverse alphabetical
sort.

 We don't need to do any of that immediately.  This is mostly just me
 thinking aloud, to make sure we do not paint ourselves into a corner
 compatibility-wise.

Good thinking. If you plan to use the exact same sort syntax f-e-r
uses now, pick a letter (the dot I used above is probably not the
best), I'll rewrite this patch to use the same syntax.


 -Peff

 [1] I have patches which I really need to polish and send out that
 combine the ref-selection code (so tag, branch, and for-each-ref all
 know --merged, --contains, etc). I'd really like to combine the
 sorting and formatting code, too, so everybody can use --sort and
 --format with all of the associated fields.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/6] inotify support

2014-02-19 Thread Duy Nguyen
On Thu, Feb 20, 2014 at 3:35 AM, Shawn Pearce spea...@spearce.org wrote:
 Why a new daemon? Why don't we reuse the stable
 https://github.com/facebook/watchman project Facebook built to make
 Hg's status system fast?

I did look briefly through its readme before but there were a few
off-factors like CLA, JSON.. that made me go away. I agree that
reusing the same daemon would save us maintenance code and buy the
stability (and maybe sharing some inotify handles for users that use
both git and hg). I'll need to have a closer look at it and compare
with what I've got, now that I have a better picture of how things
should work.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 02/25] Convert git_snpath() to strbuf_git_path()

2014-02-19 Thread Duy Nguyen
On Thu, Feb 20, 2014 at 6:48 AM, Junio C Hamano gits...@pobox.com wrote:
 Nguyễn Thái Ngọc Duy  pclo...@gmail.com writes:

 @@ -651,14 +653,10 @@ static void update_refs_for_switch(const struct 
 checkout_opts *opts,
   new-name);
   }
   }
 - if (old-path  old-name) {
 - char log_file[PATH_MAX], ref_file[PATH_MAX];
 -
 - git_snpath(log_file, sizeof(log_file), logs/%s, 
 old-path);
 - git_snpath(ref_file, sizeof(ref_file), %s, 
 old-path);
 - if (!file_exists(ref_file)  file_exists(log_file))
 - remove_path(log_file);
 - }
 + if (old-path  old-name 
 + !file_exists(git_path(%s, old-path)) 
 +  file_exists(git_path(logs/%s, old-path)))
 + remove_path(git_path(logs/%s, old-path));

 Hmph.  Is this conversion safe?

 This adds three uses of the round-robin path buffer; if a caller of
 this function used two or more path buffers obtained from
 get_pathname() and expected their contents to remain stable across
 the call to this, it will silently break.

three round-robin buffers but not all required at the same time, once
the first file_exists() returns the first round-robin buffer could be
free, and remove_path() calls git_path again, not reusing the result
from the second file_exists, so the second buffer is free to go too.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 02/25] Convert git_snpath() to strbuf_git_path()

2014-02-19 Thread Duy Nguyen
On Thu, Feb 20, 2014 at 10:41 AM, Junio C Hamano gits...@pobox.com wrote:
 Duy Nguyen pclo...@gmail.com writes:

 - }
 + if (old-path  old-name 
 + !file_exists(git_path(%s, old-path)) 
 +  file_exists(git_path(logs/%s, old-path)))
 + remove_path(git_path(logs/%s, old-path));

 Hmph.  Is this conversion safe?

 This adds three uses of the round-robin path buffer; if a caller of
 this function used two or more path buffers obtained from
 get_pathname() and expected their contents to remain stable across
 the call to this, it will silently break.

 three round-robin buffers but not all required at the same time, once
 the first file_exists() returns the first round-robin buffer could be
 free, and remove_path() calls git_path again, not reusing the result
 from the second file_exists, so the second buffer is free to go too.

 I know these three callers to git_path() will not step on each
 other's toes.  But that is not the question I asked.

OK so your question was if there was a git_path() or mkpath() call
earlier in update_refs_for_switch() and the result was expected to
remain stable till the end of update_refs_for_switch(), then this
conversion could ruin it, correct? I didn't think about that, but I
have checked and the only mkpath() call in this function is not
supposed to last long. If it's about a git_pathname() call outside
update_refs_...() that still expects to be stable, we should fix that
code instead.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 14/25] setup.c: convert is_git_directory() to use strbuf

2014-02-20 Thread Duy Nguyen
On Thu, Feb 20, 2014 at 3:17 AM, Junio C Hamano gits...@pobox.com wrote:
 + strbuf_setlen(sb, len);
 + strbuf_add(sb, s, strlen(s));

 I am not sure addstr_at() gives us a good abstraction, or at least
 the name conveys what it does well not to confuse readers.

 At first after only seeing its name, I would have expected that it
 would splice the given string into an existing strbuf at the
 location, not chopping the existing strbuf at the location and
 appending.

I think I invented a few new strbuf_* in this series and this is one
of them. We have about ~14 other places in current code that do
similar pattern: set length back, then add something on top. Not sure
if it's worth a convenient wrapper. I don't know, maybe it's not worth
reducing one line and causing more confusion.


 +}
  static inline void strbuf_addbuf(struct strbuf *sb, const struct strbuf 
 *sb2) {
   strbuf_grow(sb, sb2-len);
   strbuf_add(sb, sb2-buf, sb2-len);



-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 24/25] prune: strategies for linked checkouts

2014-02-20 Thread Duy Nguyen
On Thu, Feb 20, 2014 at 3:32 AM, Junio C Hamano gits...@pobox.com wrote:
 Nguyễn Thái Ngọc Duy  pclo...@gmail.com writes:

 (Only nitpicks during this round of review).

 + if (get_device_or_die(path) != get_device_or_die(get_git_dir())) {
 + strbuf_reset(sb);
 + strbuf_addf(sb, %s/locked, sb_repo.buf);
 + write_file(sb.buf, 1, located on a different file system\n);
 + keep_locked = 1;
 + } else {
 + strbuf_reset(sb);
 + strbuf_addf(sb, %s/link, sb_repo.buf);
 + link(sb_git.buf, sb.buf); /* it's ok to fail */

 If so, perhaps tell that to the code by saying something like

 (void) link(...);

 ?

 But why is it OK to fail in the first place?  If we couldn't link,
 don't you want to fall back to the lock codepath?  After all, the
 are we on the same device? check is to cope with the case where
 we cannot link and that alternate codepath is supposed to be
 prepared to handle the ah, we cannot link case correctly, no?

Filesystems not supporting hardlinks are one reason. The idea behind
locked is that the new checkout could be on a portable device and we
don't want to prune its metadata just because the device is not
plugged in. Checking same device help but even that can't guarantee no
false positives, unless your only mount point is /. So no I don't
really think we should go lock whenever link() fails, that would just
make fewer checkouts prunable.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


  1   2   3   4   5   6   7   8   9   10   >