Re: [PATCH v7 2/3] branch: mark missing tracking branch as gone

2013-08-21 Thread Matthieu Moy
Jiang Xin  writes:

> $ git status
> # On branch topic
> # Your branch is based on 'topicbase', but the upstream is gone.
> #   (use "git branch --unset-upstream" to fixup)

Sorry, I didn't follow closely the previous discussions. I'm not sure
"gone" is right either, since the user may just have configured an
upstream that does not exist and never existed. Perhaps "absent" would
be better.

Just a thought, shouldn't block the patch.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] git-remote-mediawiki: add test and check Makefile targets

2013-08-21 Thread Matthieu Moy
There are a few level 4 and 2 perlcritic issues in the current code.
We make level 5 fatal, and keep level 2 as warnings so that "make
check" pass.

Signed-off-by: Matthieu Moy 
---
 contrib/mw-to-git/Makefile | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/contrib/mw-to-git/Makefile b/contrib/mw-to-git/Makefile
index 76fcd4d..f206f96 100644
--- a/contrib/mw-to-git/Makefile
+++ b/contrib/mw-to-git/Makefile
@@ -24,6 +24,11 @@ INSTLIBDIR=$(shell $(MAKE) -C $(GIT_ROOT_DIR)/perl \
 
 all: build
 
+test: all
+   $(MAKE) -C t
+
+check: perlcritic test
+
 install_pm:
install $(GIT_MEDIAWIKI_PM) $(INSTLIBDIR)/$(GIT_MEDIAWIKI_PM)
 
@@ -41,4 +46,7 @@ clean:
rm $(INSTLIBDIR)/$(GIT_MEDIAWIKI_PM)
 
 perlcritic:
-   perlcritic -2 *.perl
+   perlcritic -5 $(SCRIPT_PERL)
+   -perlcritic -2 $(SCRIPT_PERL)
+
+.PHONY: all test check install_pm install clean perlcritic
-- 
1.8.4.rc2.18.g030d947

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] git-remote-mediawiki: reset private ref after non-dumb push

2013-08-21 Thread Matthieu Moy
Git-mediawiki's "dumb push" sends the local revisions to the remote wiki,
but does not update the local metadata to reflect the push (hence, the
next pull will have to re-import the exported revisions).

The previous implementation was simply omitting the update to the private
ref after a dumb push. This was broken by 664059fb62 (Felipe Contreras,
Apr 17 2013, transport-helper: update remote helper namespace), which
does an automatic update of the private ref (not just the
remote-tracking) on push.

This patch fixes git-remote-mediawiki to reset the private ref after the
push is completed, cancelling the automatic update triggered by
664059fb62.

Signed-off-by: Matthieu Moy 
---
Just a resend of the RFC
( http://thread.gmane.org/gmane.comp.version-control.git/232224 ),
which received no comment.

 contrib/mw-to-git/git-remote-mediawiki.perl | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/contrib/mw-to-git/git-remote-mediawiki.perl 
b/contrib/mw-to-git/git-remote-mediawiki.perl
index f8d7d2c..13919ad 100755
--- a/contrib/mw-to-git/git-remote-mediawiki.perl
+++ b/contrib/mw-to-git/git-remote-mediawiki.perl
@@ -53,6 +53,7 @@ if (@ARGV != 2) {
 
 my $remotename = $ARGV[0];
 my $url = $ARGV[1];
+my $reset_private_ref_to = undef;
 
 # Accept both space-separated and multiple keys in config file.
 # Spaces should be written as _ anyway because we'll use chomp.
@@ -161,6 +162,9 @@ sub parse_command {
my ($line) = @_;
my @cmd = split(/ /, $line);
if (!defined $cmd[0]) {
+   if ($reset_private_ref_to) {
+   run_git("update-ref -m \"Git-MediaWiki non-dumb push\" 
refs/mediawiki/$remotename/master $reset_private_ref_to");
+   }
return 0;
}
if ($cmd[0] eq 'capabilities') {
@@ -1209,9 +1213,10 @@ sub mw_push_revision {
die("Unknown error from mw_push_file()\n");
}
}
-   if (!$dumb_push) {
+   if ($dumb_push) {
+   $reset_private_ref_to = $remoteorigin_sha1;
+   } else {
run_git(qq(notes --ref=${remotename}/mediawiki add -f 
-m "mediawiki_revision: ${mw_revision}" ${sha1_commit}));
-   run_git(qq(update-ref -m "Git-MediaWiki push" 
refs/mediawiki/${remotename}/master ${sha1_commit} ${sha1_child}));
}
}
 
-- 
1.8.4.rc2.18.g030d947

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] repack: rewrite the shell script in C.

2013-08-21 Thread Jonathan Nieder
Hi,

Stefan Beller wrote:

> [PATCH] repack: rewrite the shell script in C.

Thanks for your work so far.  This review will have mostly cosmetic
notes.  Hopefully others can try it out to see if the actual behavior
is good.

As a first nit: in git, as usual in emails, the style in subject lines
is not to end with a period.  The above subject line is otherwise good
(a nice summary that quickly explains the effect, which is handy in
e.g. abbreviated changelogs from release announcements).

> This is the beginning of the rewrite of the repacking.

This is a place to explain

 - the motivation / intended positive effect of the patch
 - any noticeable behavior changes
 - complications and other hints for people looking back and trying
   to understand this code

Based on the discussion before, I think the motivation is to get
closer to a goal of being able to have a core subset of git
functionality built in to git.  That would mean

 * people on Windows could get a copy of at least the core parts
   of Git without having to install a Unix-style shell

 * people deploying to servers don't have to rewrite the #! line
   or worry about the PATH and quality of installed POSIX
   utilities, if they are only using the built-in part written
   in C 

This patch is meant to be mostly a literal translation of the
git-repack script; the intent is that later patches would start using
more library facilities, but this patch is meant to be as close to a
no-op as possible so it doesn't do that kind of thing.

> All tests are constantly positive now.

This kind of changes-since-the-previous-iteration information that
doesn't need to be recorded in the commit log for posterity goes
after the "---" marker.

> 
> Signed-off-by: Stefan Beller 
> ---
>  Makefile|   2 +-
[...]
> --- /dev/null
> +++ b/builtin/repack.c
> @@ -0,0 +1,361 @@
[...]
> +static int delta_base_offset = 0;

The "= 0" is automatic for statics without an initializer.  The
prevailing style in git is to leave it out.

Behavior change: in the script, wasn't the default "true"?

[...]
> +static void remove_temporary_files() {

Style: argument list should have "void".  (In C89 and C99, an empty
argument list means "having unspecified arguments" instead of "having
no arguments" as in C++.)

> + DIR *dir;
> + struct dirent *e;
> + char *prefix, *path;
> +
> + prefix = mkpathdup(".tmp-%d-pack", getpid());

pid_t is not guaranteed to be an "int", so this needs a cast.

> + path = mkpathdup("%s/pack", get_object_directory());

The names "prefix" and "path" are quite generic.  What does this
function do?  A comment could help, e.g.:

/*
 * Remove temporary $GIT_OBJECT_DIRECTORY/pack/.tmp-$$-pack-* files.
 */

> +
> + dir = opendir(path);
> + while ((e = readdir(dir)) != NULL) {

What happens if the directory does not exist?

> + if (!prefixcmp(e->d_name, prefix)) {

The git-repack script removes $PACKTMP-*, but this code matches $PACKTMP*
instead.  Intentional?

> + struct strbuf fname = STRBUF_INIT;
> + strbuf_addf(&fname, "%s/%s", path, e->d_name);
> + unlink(strbuf_detach(&fname, NULL));

This leaks fname (see Documentation/technical/api-strbuf.txt for an
explanation of strbuf_detach).

> + }
> + }
> + free(prefix);
> + free(path);
> + closedir(dir);

I wonder if it would make sense for buffers to share space here.
E.g. something like

 {
/*
 * Remove temporary $GIT_OBJECT_DIRECTORY/pack/.tmp-$$-pack-* files.
 */

struct strbuf buf = STRBUF_INIT;
size_t dirlen, prefixlen;
DIR *dir;
struct dirent *e;

/* .git/objects/pack */
strbuf_addstr(&buf, get_object_directory());
strbuf_addstr(&buf, "/pack");
dir = opendir(buf.buf);
if (!dir)
... handle error ...

/* .git/objects/pack/.tmp-$$-pack-* */
dirlen = buf.len + 1;
strbuf_addf(&buf, "/.tmp-%d-pack-", getpid());
prefixlen = buf.len - dirlen;

while ((e = readdir(dir))) {
if (strncmp(e->d_name, buf.buf + dirlen, prefixlen))
continue;
strbuf_setlen(&buf, dirlen);
strbuf_addstr(&buf, e->d_name);
unlink(buf.buf);
}
if (closedir(dir))
... handle error ...

strbuf_release(&buf);
 }

I dunno.

[...]
> +/*
> + * Fills the filename list with all the files found in the pack directory
> + * ending with .pack, without that extension.
> + */

Ideally a comment opening a function will save lazy readers the
trouble of reading the body of the function, by explaining what the
function is for and giving them some reliable summary of what its
effect will be.

The above comment doesn't do either: it doesn't make it clear why
the function exists, and it doesn't make the semantics 

Re: [PATCH] repack: rewrite the shell script in C.

2013-08-21 Thread Matthieu Moy
Stefan Beller  writes:

> All tests are constantly positive now.

Cool!

> +/*
> + * Fills the filename list with all the files found in the pack directory

Detail: "filename list" could be "fname_list" to match the actual
argument below.

> + * ending with .pack, without that extension.
> + */
> +void get_pack_filenames(char *packdir, struct string_list *fname_list)
> +{
> + DIR *dir;
> + struct dirent *e;
> + char *path, *suffix, *fname;
> +
> + path = mkpath("%s/pack", get_object_directory());
> + suffix = ".pack";
> +
> + dir = opendir(path);

I think you should test and complain if dir is NULL ("cannot open pack
directory: ...")

> +void remove_pack(char *path, char* sha1)
> +{
> + char *exts[] = {".pack", ".idx", ".keep"};
> + int ext = 0;
> + for (ext = 0; ext < 3; ext++) {
> + char *fname;
> + fname = mkpath("%s/%s%s", path, sha1, exts[ext]);
> + unlink(fname);

Here also, the return value from unlink is not checked. Probably not
serious (mistakenly deleting a pack file would be very serious, but
keeping it around by mistake shouldn't harm), but a warning message may
be welcome.

These kinds of warnings are never shown in normal usage, but may be
welcome when something goes really wrong with the repo, as a diagnosis
tool for the user. The shell version had these warnings implicitly since
"rm" displays the message on stderr when it fails.

> + struct child_process cmd;
> + struct argv_array cmd_args = ARGV_ARRAY_INIT;

Since the command is going to be "pack-objects", you may call the
variables pack_cmd and pack_cmd_args or so.

> + if (local)
> + argv_array_push(&cmd_args,  "--local");
> + if (quiet)
> + argv_array_push(&cmd_args,  "--quiet");
> + if (delta_base_offset)
> + argv_array_push(&cmd_args,  "--delta-base-offset");
> +
> + argv_array_push(&cmd_args, packtmp);
> +
> + memset(&cmd, 0, sizeof(cmd));
> + cmd.argv = cmd_args.argv;
> + cmd.git_cmd = 1;
> + cmd.out = -1;
> + cmd.no_stdin = 1;
> +
> + if (start_command(&cmd))
> + return 1;

A warning message would be welcome in addition to returning 1.

> + if (!count_packs && !quiet)
> + printf("Nothing new to pack.\n");
> +
> + int failed = 0;

Don't declare variables inside code, it's not C90.

> + for_each_string_list_item(item, &names) {
> + for (ext = 0; ext < 2; ext++) {
> + char *fname, *fname_old;
> + fname = mkpathdup("%s/%s%s", packdir, item->string, 
> exts[ext]);
> + if (!file_exists(fname)) {
> + free(fname);
> + continue;
> + }
> +
> + fname_old = mkpath("%s/old-%s%s", packdir, 
> item->string, exts[ext]);
> + if (file_exists(fname_old))
> + unlink(fname_old);

Unchecked returned value.

> + if (rename(fname, fname_old)) {
> + failed = 1;
> + break;
> + }
> + string_list_append_nodup(&rollback, fname);
> + free(fname);
> + }
> + if (failed)
> + break;
> + }

I tend to dislike these "set a variable and break twice" to exit nested
loops. Using an auxiliary function, you could just do

int f()
{
for_each {
for () {
...
if ()
return 1;
...
}
}
return 0;
}

(Matter of taste, though. Some people may disagree)

A good side effect would be to move some code out of cmd_repack, which
is rather long.

> + /* Now the ones with the same name are out of the way... */
> + for_each_string_list_item(item, &names) {
> + for (ext = 0; ext < 2; ext++) {
> + char *fname, *fname_old;
> + struct stat statbuffer;
> + fname = mkpathdup("%s/pack-%s%s", packdir, 
> item->string, exts[ext]);
> + fname_old = mkpath("%s-%s%s", packtmp, item->string, 
> exts[ext]);
> + if (!stat(fname_old, &statbuffer)) {
> + statbuffer.st_mode &= ~S_IWUSR | ~S_IWGRP | 
> ~S_IWOTH;
> + chmod(fname_old, statbuffer.st_mode);

Unchecked return value.

> + /* Remove the "old-" files */
> + for_each_string_list_item(item, &names) {
> + char *fname;
> + fname = mkpath("%s/old-pack-%s.idx", packdir, item->string);
> + if (remove_path(fname))
> + die_errno(_("removing '%s' failed"), fname);
> +
> + fname = mkpath("%s/old-pack-%s.pack", packdir, item->string);
> + if (remove_path(fname))
> + 

Re: [PATCH v4 2/4] gitweb: vertically centre contents of page footer

2013-08-21 Thread Tony Finch
Junio C Hamano  wrote:
> Tony Finch  writes:
>
> >  div.page_footer {
> > -   height: 17px;
> > +   height: 22px;
> > padding: 4px 8px;
> > background-color: #d9d8d1;
> >  }
> >
> >  div.page_footer_text {
> > +   line-height: 22px;
> > float: left;
> > color: #55;
> > font-style: italic;
>
> Hmmm, is it a good idea to do "px" here, or are they ways to do
> relative to x-height or something to make sure the text fits?

Good question. I also don't know much about css. I basically followed the
style that was already there, and found out about vertical centering using
line-height by searching the web.

I think font-size relative scaling would require a bigger overhaul.

Tony.
-- 
f.anthony.n.finchhttp://dotat.at/
Forties, Cromarty: East, veering southeast, 4 or 5, occasionally 6 at first.
Rough, becoming slight or moderate. Showers, rain at first. Moderate or good,
occasionally poor at first.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCHv4] repack: rewrite the shell script in C.

2013-08-21 Thread Johannes Sixt

Am 21.08.2013 00:36, schrieb Stefan Beller:

I think I got all the suggestions except the
use of git_path/mkpathdup.
I replaced mkpathdup by mkpath where possible,
but it's still not perfect.
I'll wait for the dokumentation patch of Jonathan,
before changing all these occurences forth and back
again.


I trust Jonathan's judgement of how to use git_path, mkpath, and mkpathdup 
more than my own. So, please take my earlier comments in this regard with 
an appropriately large grain of salt.



Below there is just the diff against RFC PATCHv4,
however I'll send the whole patch as well.


Thanks, that is VERY helpful!

I'll comment here and have a look at the full patch later.


...
  int cmd_repack(int argc, const char **argv, const char *prefix) {



You should move the opening brace to the next line, which would then not 
be empty anymore.



...
@@ -217,34 +217,34 @@ int cmd_repack(int argc, const char **argv, const char 
*prefix) {
argv_array_push(&cmd_args, packtmp);

memset(&cmd, 0, sizeof(cmd));
-   cmd.argv = argv_array_detach(&cmd_args, NULL);
+   cmd.argv = cmd_args.argv;
cmd.git_cmd = 1;
cmd.out = -1;
cmd.no_stdin = 1;

-   if (run_command(&cmd))
+   if (start_command(&cmd))
return 1;


You should have an int ret here and use it like

ret = start_command(&cmd);
if (ret)
return ret;

to retain any exit codes from the sub-process. I know, the script didn't 
preserve it:


names=$(git pack-objects ...) || exit 1

but that was not idiomatic as it should have been written as

names=$(git pack-objects ...) || exit

to forward the failure exit code.



-   struct string_list names = STRING_LIST_INIT_DUP;
-   struct string_list rollback = STRING_LIST_INIT_DUP;
-
-   char line[1024];
-   int counter = 0;
-   FILE *out = xfdopen(cmd.out, "r");


Nice! I missed these decl-after-stmt in my earlier review.


+   count_packs = 0;
+   out = xfdopen(cmd.out, "r");
while (fgets(line, sizeof(line), out)) {
/* a line consists of 40 hex chars + '\n' */
-   assert(strlen(line) == 41);
+   if (strlen(line) != 41)
+   die("repack: Expecting 40 character sha1 lines only from 
pack-objects.");


I agree with Jonathan that you should use strbuf_getline() here.


line[40] = '\0';
string_list_append(&names, line);
-   counter++;
+   count_packs++;
}
-   if (!counter)
-   printf("Nothing new to pack.\n");
+   if (finish_command(&cmd))
+   return 1;


Same as above here:

ret = finish_command(&cmd);
if (ret)
return ret;

I would prefer to see

argv_array_clear(&cmd_args);

here, i.e., at the end of the current use rather than later at the 
beginning of the next use. (Ditto for the other uses of cmd_args.)



fclose(out);


This should happen before finish_command(). It doesn't matter if there are 
no errors, but if things go awry, closing the channel before 
finish_command() avoids deadlocks.




+   if (!count_packs && !quiet)
+   printf("Nothing new to pack.\n");
+
...
@@ -301,33 +299,33 @@ int cmd_repack(int argc, const char **argv, const char 
*prefix) {
for_each_string_list_item(item, &names) {
for (ext = 0; ext < 2; ext++) {
char *fname, *fname_old;
+   struct stat statbuffer;
fname = mkpathdup("%s/pack-%s%s", packdir, 
item->string, exts[ext]);
-   fname_old = mkpathdup("%s-%s%s", packtmp, item->string, 
exts[ext]);
-   stat(fname_old, &statbuffer);
-   statbuffer.st_mode &= ~S_IWUSR | ~S_IWGRP | ~S_IWOTH;
-   chmod(fname_old, statbuffer.st_mode);
+   fname_old = mkpath("%s-%s%s", packtmp, item->string, 
exts[ext]);
+   if (!stat(fname_old, &statbuffer)) {
+   statbuffer.st_mode &= ~S_IWUSR | ~S_IWGRP | 
~S_IWOTH;


This is still wrong: it should be one of

... &= ~S_IWUSR & ~S_IWGRP & ~S_IWOTH;
... &= ~(S_IWUSR | S_IWGRP | S_IWOTH);


+   chmod(fname_old, statbuffer.st_mode);
+   }
if (rename(fname_old, fname))
-   die("Could not rename packfile: %s -> %s", 
fname_old, fname);
+   die_errno(_("renaming '%s' failed"), fname_old);
free(fname);
-   free(fname_old);
}
}
...


Everything else looks OK. But as I said, mkpath() may have to be reverted 
to mkpathdup() as per Jonathans comments.


-- Hannes

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body 

Re: [PATCH] repack: rewrite the shell script in C.

2013-08-21 Thread Stefan Beller
On 08/21/2013 10:25 AM, Jonathan Nieder wrote:
>> +static int delta_base_offset = 0;
> 
> The "= 0" is automatic for statics without an initializer.  The
> prevailing style in git is to leave it out.
> 
> Behavior change: in the script, wasn't the default "true"?
> 

Yes, I was printing out the arguments of shell version and 
of the C version and tried to match the arguments.
I must have missconfigured the test repository where
I run these differential tests. 
Now that I test again, the --delta-base-offset option
shows up as default as it is documented.

Now fixing the rest of your annotations.

Stefan

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] repack: rewrite the shell script in C.

2013-08-21 Thread Stefan Beller
On 08/21/2013 10:49 AM, Matthieu Moy wrote:
>> +if (start_command(&cmd))
>> > +  return 1;
> A warning message would be welcome in addition to returning 1.
> 

Johannes Sixt proposes to retain the return value of
the sub process, which I'd agree on.
However why do we need a warning message here?

I'd expect the pack-objects to bring up the warning as
the stderr is untouched in the command invocation.
And we already passed either --quiet or not, so pack-objects
should know how to behave on its own.
Also it is a builtin command, so we do not need to check
if it is found or not, so I'd strongly rely on the error
and warning reporting from the underlying process, no?

Thanks,
Stefan




signature.asc
Description: OpenPGP digital signature


Re: [PATCH] repack: rewrite the shell script in C.

2013-08-21 Thread Stefan Beller
On 08/21/2013 10:49 AM, Matthieu Moy wrote:
> I tend to dislike these "set a variable and break twice" to exit nested
> loops. Using an auxiliary function, you could just do
> 
> int f()
> {
>   for_each {
>   for () {
>   ...
>   if ()
>   return 1;
>   ...
>   }
>   }
>   return 0;
> }
> 
> (Matter of taste, though. Some people may disagree)
> 
> A good side effect would be to move some code out of cmd_repack, which
> is rather long.

Thank you very much for the review, it helps me very much to focus
on the small details.

I intend to have the C code in this patch as close to the
shell version as possible. This goes both for functionality as
well as style/organisation within the file.

All the additional changes, such as this one
(Or in the previous mail, retaining the error code of subprocesses)
I'd like to put in small follow up patches changing just one thing
at a time.

But as these follow up changes heavily rely on the very first patch
I will first try to get that right, meaning accepted into pu.
Then I can send patches with these proposals such as making more
functions.

Thanks,
Stefan



signature.asc
Description: OpenPGP digital signature


Re: [PATCH] repack: rewrite the shell script in C.

2013-08-21 Thread Matthieu Moy
Stefan Beller  writes:

> On 08/21/2013 10:49 AM, Matthieu Moy wrote:
>>> +   if (start_command(&cmd))
>>> > + return 1;
>> A warning message would be welcome in addition to returning 1.
>> 
>
> Johannes Sixt proposes to retain the return value of
> the sub process, which I'd agree on.

Yes.

> I'd expect the pack-objects to bring up the warning as
> the stderr is untouched in the command invocation.

I was more thinking of weird cases like failure to fork or so. But
according to api-run-command.txt:

  . If a system call failed, errno is set and -1 is returned. A diagnostic
is printed.

So you actually don't need it. In this case, following Johannes's
suggestion, you'd return -1 from the main function, which is unusual but
AFAICT is OK.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] repack: rewrite the shell script in C.

2013-08-21 Thread Matthieu Moy
Stefan Beller  writes:

> But as these follow up changes heavily rely on the very first patch
> I will first try to get that right, meaning accepted into pu.
> Then I can send patches with these proposals such as making more
> functions.

I think it's better to get the style right before, to avoid doubling the
review effort (review a hard-to-review patch first, and then re-review a
style-fix one).

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Document the HTTP transport protocols

2013-08-21 Thread Nguyễn Thái Ngọc Duy
From: "Shawn O. Pearce" 

Signed-off-by: Shawn O. Pearce 
Revised-by: Tay Ray Chuan 
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 This is basically the version from Tay Ray Chuan [1] with one more
 proofreading round from me. My changes besides realignments are below.
 There are TODOs remain but I think the document is valuable as it is.

 On the topic, C Git's (maybe) violations on this spec are:
 
  - The client does not strip trailing slashes from $GIT_URL before
sending to the server, as described in section "URL Format".
 
  - The client does not check that HTTP status code is either 200 or
304 when receiving response in discovering references phase.
 
  - The client verifies the first 5 bytes against pattern
"^[0-9a-fA-F]{4}#" instead of "^[0-9a-f]{4}#" as described in
section "discovering references".

 [1] 
https://raw.github.com/rctay/git/feature/http-doc/Documentation/technical/http-protocol.txt

diff --git b/home/pclouds/t/http-protocol.txt 
a/Documentation/technical/http-protocol.txt
index 0b1c96c..8812bf6 100644
--- b/home/pclouds/t/http-protocol.txt
+++ a/Documentation/technical/http-protocol.txt
@@ -315,7 +315,8 @@ Servers SHOULD support all capabilities defined here.
 
 Clients MUST send at least one 'want' command in the request body.
 Clients MUST NOT reference an id in a 'want' command which did not
-appear in the response obtained through ref discovery.
+appear in the response obtained through ref discovery unless the
+server advertises capability "allow-tip-sha1-in-want".
 
   compute_request   =  want_list
have_list
@@ -330,31 +331,9 @@ appear in the response obtained through ref discovery.
 
   have_list =  *PKT-LINE("have" SP id LF)
 
-  command   =  create / delete / update
-  create=  zero-id SP new_id SP name
-  delete=  old_id SP zero-id SP name
-  update=  old_id SP new_id SP name
-
 TODO: Document this further.
 TODO: Don't use uppercase for variable names below.
 
-Capability include-tag
-~~
-
-When packing an object that an annotated tag points at, include the
-tag object too.  Clients can request this if they want to fetch
-tags, but don't know which tags they will need until after they
-receive the branch data.  By enabling include-tag an extra call
-to upload-pack can be avoided.
-
-Capability thin-pack
-
-
-When packing a deltified object the base is not included if the base
-is reachable from an object listed in the COMMON set by the client.
-This reduces the bandwidth required to transfer, but it does slightly
-increase processing time for the client to save the pack to disk.
-
 The Negotiation Algorithm
 ~
 The computation to select the minimal pack proceeds as follows
@@ -520,4 +499,5 @@ References
 
 link:http://www.ietf.org/rfc/rfc1738.txt[RFC 1738: Uniform Resource 
Locators (URL)]
 link:http://www.ietf.org/rfc/rfc2616.txt[RFC 2616: Hypertext Transfer 
Protocol -- HTTP/1.1]
-
+link:technical/pack-protocol.txt
+link:technical/protocol-capabilities.txt

 Documentation/technical/http-protocol.txt (new) | 503 
 1 file changed, 503 insertions(+)
 create mode 100644 Documentation/technical/http-protocol.txt

diff --git a/Documentation/technical/http-protocol.txt 
b/Documentation/technical/http-protocol.txt
new file mode 100644
index 000..8812bf6
--- /dev/null
+++ b/Documentation/technical/http-protocol.txt
@@ -0,0 +1,503 @@
+HTTP transfer protocols
+===
+
+Git supports two HTTP based transfer protocols.  A "dumb" protocol
+which requires only a standard HTTP server on the server end of the
+connection, and a "smart" protocol which requires a Git aware CGI
+(or server module).  This document describes both protocols.
+
+As a design feature smart clients can automatically upgrade "dumb"
+protocol URLs to smart URLs.  This permits all users to have the
+same published URL, and the peers automatically select the most
+efficient transport available to them.
+
+
+URL Format
+--
+
+URLs for Git repositories accessed by HTTP use the standard HTTP
+URL syntax documented by RFC 1738, so they are of the form:
+
+  http://:/?
+
+Within this documentation the placeholder $GIT_URL will stand for
+the http:// repository URL entered by the end-user.
+
+Servers SHOULD handle all requests to locations matching $GIT_URL, as
+both the "smart" and "dumb" HTTP protocols used by Git operate
+by appending additional path components onto the end of the user
+supplied $GIT_URL string.
+
+An example of a dumb client requesting for a loose object:
+
+  $GIT_URL: http://example.com:8080/git/repo.git
+  URL request:  
http://example.com:8080/git/repo.git/objects/d0/49f6c27a2244e1204

[PATCH v6 0/2] Fix IO >= 2GB on Mac, fixed typo

2013-08-21 Thread Steffen Prohaska
Fixed typo in comment.

Steffen Prohaska (2):
  xread, xwrite: Limit size of IO, fixing IO of 2GB and more on Mac OS X
  Revert "compate/clipped-write.c: large write(2) fails on Mac OS X/XNU"

 Makefile   |  8 
 compat/clipped-write.c | 13 -
 config.mak.uname   |  1 -
 git-compat-util.h  |  5 -
 t/t0021-conversion.sh  | 14 ++
 wrapper.c  | 12 
 6 files changed, 26 insertions(+), 27 deletions(-)
 delete mode 100644 compat/clipped-write.c

-- 
1.8.4.rc3.5.g4f480ff

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v5 2/2] Revert "compate/clipped-write.c: large write(2) fails on Mac OS X/XNU"

2013-08-21 Thread Steffen Prohaska
The previous commit introduced a size limit on IO chunks on all
platforms.  The compat clipped_write() is not needed anymore.

This reverts commit 6c642a878688adf46b226903858b53e2d31ac5c3.

Signed-off-by: Steffen Prohaska 
---
 Makefile   |  8 
 compat/clipped-write.c | 13 -
 config.mak.uname   |  1 -
 git-compat-util.h  |  5 -
 4 files changed, 27 deletions(-)
 delete mode 100644 compat/clipped-write.c

diff --git a/Makefile b/Makefile
index 3588ca1..4026211 100644
--- a/Makefile
+++ b/Makefile
@@ -69,9 +69,6 @@ all::
 # Define NO_MSGFMT_EXTENDED_OPTIONS if your implementation of msgfmt
 # doesn't support GNU extensions like --check and --statistics
 #
-# Define NEEDS_CLIPPED_WRITE if your write(2) cannot write more than
-# INT_MAX bytes at once (e.g. MacOS X).
-#
 # Define HAVE_PATHS_H if you have paths.h and want to use the default PATH
 # it specifies.
 #
@@ -1493,11 +1490,6 @@ ifndef NO_MSGFMT_EXTENDED_OPTIONS
MSGFMT += --check --statistics
 endif
 
-ifdef NEEDS_CLIPPED_WRITE
-   BASIC_CFLAGS += -DNEEDS_CLIPPED_WRITE
-   COMPAT_OBJS += compat/clipped-write.o
-endif
-
 ifneq (,$(XDL_FAST_HASH))
BASIC_CFLAGS += -DXDL_FAST_HASH
 endif
diff --git a/compat/clipped-write.c b/compat/clipped-write.c
deleted file mode 100644
index b8f98ff..000
--- a/compat/clipped-write.c
+++ /dev/null
@@ -1,13 +0,0 @@
-#include "../git-compat-util.h"
-#undef write
-
-/*
- * Version of write that will write at most INT_MAX bytes.
- * Workaround a xnu bug on Mac OS X
- */
-ssize_t clipped_write(int fildes, const void *buf, size_t nbyte)
-{
-   if (nbyte > INT_MAX)
-   nbyte = INT_MAX;
-   return write(fildes, buf, nbyte);
-}
diff --git a/config.mak.uname b/config.mak.uname
index b27f51d..7d61531 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -95,7 +95,6 @@ ifeq ($(uname_S),Darwin)
NO_MEMMEM = YesPlease
USE_ST_TIMESPEC = YesPlease
HAVE_DEV_TTY = YesPlease
-   NEEDS_CLIPPED_WRITE = YesPlease
COMPAT_OBJS += compat/precompose_utf8.o
BASIC_CFLAGS += -DPRECOMPOSE_UNICODE
 endif
diff --git a/git-compat-util.h b/git-compat-util.h
index 115cb1d..96d8881 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -185,11 +185,6 @@ typedef unsigned long uintptr_t;
 #define probe_utf8_pathname_composition(a,b)
 #endif
 
-#ifdef NEEDS_CLIPPED_WRITE
-ssize_t clipped_write(int fildes, const void *buf, size_t nbyte);
-#define write(x,y,z) clipped_write((x),(y),(z))
-#endif
-
 #ifdef MKDIR_WO_TRAILING_SLASH
 #define mkdir(a,b) compat_mkdir_wo_trailing_slash((a),(b))
 extern int compat_mkdir_wo_trailing_slash(const char*, mode_t);
-- 
1.8.4.rc3.5.g4f480ff

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v5 1/2] xread, xwrite: Limit size of IO, fixing IO of 2GB and more on Mac OS X

2013-08-21 Thread Steffen Prohaska
Previously, filtering 2GB or more through an external filter (see test)
failed on Mac OS X 10.8.4 (12E55) for a 64-bit executable with:

error: read from external filter cat failed
error: cannot feed the input to external filter cat
error: cat died of signal 13
error: external filter cat failed 141
error: external filter cat failed

The reason was that read() immediately returns with EINVAL if nbyte >=
2GB.  According to POSIX [1], if the value of nbyte passed to read() is
greater than SSIZE_MAX, the result is implementation-defined.  The write
function has the same restriction [2].  Since OS X still supports
running 32-bit executables, the 32-bit limit (SSIZE_MAX = INT_MAX
= 2GB - 1) seems to be also imposed on 64-bit executables under certain
conditions.  For write, the problem has been addressed earlier [6c642a].

This commit addresses the problem for read() and write() by limiting
size of IO chunks unconditionally on all platforms in xread() and
xwrite().  Large chunks only cause problems, like triggering the OS
X bug or causing latencies when killing the process.  Reasonably sized
smaller chunks have no negative impact on performance.

The compat wrapper clipped_write() introduced earlier [6c642a] is not
needed anymore.  It will be reverted in a separate commit.  The new test
catches read and write problems.

Note that 'git add' exits with 0 even if it prints filtering errors to
stderr.  The test, therefore, checks stderr.  'git add' should probably
be changed (sometime in another commit) to exit with nonzero if
filtering fails.  The test could then be changed to use test_must_fail.

Thanks to the following people for suggestions and testing:

Johannes Sixt 
John Keeping 
Jonathan Nieder 
Kyle J. McKay 
Linus Torvalds 
Torsten Bögershausen 

[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/read.html
[2] http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html

[6c642a] commit 6c642a878688adf46b226903858b53e2d31ac5c3
compate/clipped-write.c: large write(2) fails on Mac OS X/XNU

Signed-off-by: Steffen Prohaska 
---
 t/t0021-conversion.sh | 14 ++
 wrapper.c | 12 
 2 files changed, 26 insertions(+)

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index e50f0f7..b92e6cb 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -190,4 +190,18 @@ test_expect_success 'required filter clean failure' '
test_must_fail git add test.fc
 '
 
+test -n "$GIT_TEST_LONG" && test_set_prereq EXPENSIVE
+
+test_expect_success EXPENSIVE 'filter large file' '
+   git config filter.largefile.smudge cat &&
+   git config filter.largefile.clean cat &&
+   for i in $(test_seq 1 2048); do printf "%1048576d" 1; done >2GB &&
+   echo "2GB filter=largefile" >.gitattributes &&
+   git add 2GB 2>err &&
+   ! test -s err &&
+   rm -f 2GB &&
+   git checkout -- 2GB 2>err &&
+   ! test -s err
+'
+
 test_done
diff --git a/wrapper.c b/wrapper.c
index 6a015de..66cc727 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -131,6 +131,14 @@ void *xcalloc(size_t nmemb, size_t size)
 }
 
 /*
+ * Limit size of IO chunks, because huge chunks only cause pain.  OS X 64-bit
+ * is buggy, returning EINVAL if len >= INT_MAX; and even in the absense of
+ * bugs, large chunks can result in bad latencies when you decide to kill the
+ * process.
+ */
+#define MAX_IO_SIZE (8*1024*1024)
+
+/*
  * xread() is the same a read(), but it automatically restarts read()
  * operations with a recoverable error (EAGAIN and EINTR). xread()
  * DOES NOT GUARANTEE that "len" bytes is read even if the data is available.
@@ -138,6 +146,8 @@ void *xcalloc(size_t nmemb, size_t size)
 ssize_t xread(int fd, void *buf, size_t len)
 {
ssize_t nr;
+   if (len > MAX_IO_SIZE)
+   len = MAX_IO_SIZE;
while (1) {
nr = read(fd, buf, len);
if ((nr < 0) && (errno == EAGAIN || errno == EINTR))
@@ -154,6 +164,8 @@ ssize_t xread(int fd, void *buf, size_t len)
 ssize_t xwrite(int fd, const void *buf, size_t len)
 {
ssize_t nr;
+   if (len > MAX_IO_SIZE)
+   len = MAX_IO_SIZE;
while (1) {
nr = write(fd, buf, len);
if ((nr < 0) && (errno == EAGAIN || errno == EINTR))
-- 
1.8.4.rc3.5.g4f480ff

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6 0/2] Fix IO >= 2GB on Mac, fixed typo

2013-08-21 Thread Junio C Hamano
Steffen Prohaska  writes:

> Fixed typo in comment.

Thanks, and sorry for not being clear that I'll locally tweak before
queuing when I commented on v5 yesterday.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] repack: rewrite the shell script in C.

2013-08-21 Thread Stefan Beller
On 08/21/2013 10:25 AM, Jonathan Nieder wrote:
> Hi,
> 
> Stefan Beller wrote:
> 
>> [PATCH] repack: rewrite the shell script in C.
> 
> Thanks for your work so far.  This review will have mostly cosmetic
> notes.  Hopefully others can try it out to see if the actual behavior
> is good.

Thanks for all the reviews. I hope to have included every suggestion
so far or have send out mail discussing why not.

There have been quite a few changes since last round
because of so many reviews.

Here is a diff to the last sent patch, I'll also send
the whole patch on its one again.
Last time I forgot to label correctly with [RFC PATCHv5],
so the next patch should be v6.

Stefan

Changes since "[PATCH] repack: rewrite the shell script in C.":

--8<--
>From 3cda569cdcd1312679c0035d151515cba7dacc59 Mon Sep 17 00:00:00 2001
From: Stefan Beller 
Date: Wed, 21 Aug 2013 12:33:13 +0200
Subject: [PATCH 2/3] Changes to last round.

 * get_pack_filenames: directly check for .keep files
 * packdir is a global variable now
 * fix help string for parsing options.
 * reenable the delta-base-offset being turned on by default
 * rewrite remove_temporary_files(void), remove_redundant_pack(fname)
   to use more strbuf instead of using mkpath(dup)
 * beautifying the code (line length, empty lines)

Still on the todo list for this patch:
 * Inspect the code for unlink, rename and see if we
   need to deal with their return codes.
 * Check for datatypes (--window-memory could use ulong?)

Later:
 * Move parts of cmd_repack to extra functions
 * check if subprocesses are needed (update-server-info,
   prune-packed)
---
 builtin/repack.c | 191
++-
 1 file changed, 103 insertions(+), 88 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index 9fbe636..fb050c0 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -13,7 +13,9 @@
 #include "string-list.h"
 #include "argv-array.h"

-static int delta_base_offset = 0;
+/* enabled by default since 22c79eab (2008-06-25) */
+static int delta_base_offset = 1;
+char *packdir;

 static const char *const git_repack_usage[] = {
N_("git repack [options]"),
@@ -29,25 +31,39 @@ static int repack_config(const char *var, const char
*value, void *cb)
return git_default_config(var, value, cb);
 }

-static void remove_temporary_files() {
+/*
+ * Remove temporary $GIT_OBJECT_DIRECTORY/pack/.tmp-$$-pack-* files.
+ */
+static void remove_temporary_files(void)
+{
+   struct strbuf buf = STRBUF_INIT;
+   size_t dirlen, prefixlen;
DIR *dir;
struct dirent *e;
-   char *prefix, *path;

-   prefix = mkpathdup(".tmp-%d-pack", getpid());
-   path = mkpathdup("%s/pack", get_object_directory());
+   /* .git/objects/pack */
+   strbuf_addstr(&buf, get_object_directory());
+   strbuf_addstr(&buf, "/pack");
+   dir = opendir(buf.buf);
+   if (!dir) {
+   strbuf_release(&buf);
+   return;
+   }

-   dir = opendir(path);
-   while ((e = readdir(dir)) != NULL) {
-   if (!prefixcmp(e->d_name, prefix)) {
-   struct strbuf fname = STRBUF_INIT;
-   strbuf_addf(&fname, "%s/%s", path, e->d_name);
-   unlink(strbuf_detach(&fname, NULL));
-   }
+   /* .git/objects/pack/.tmp-$$-pack-* */
+   dirlen = buf.len + 1;
+   strbuf_addf(&buf, "/.tmp-%d-pack-", (int)getpid());
+   prefixlen = buf.len - dirlen;
+
+   while ((e = readdir(dir))) {
+   if (strncmp(e->d_name, buf.buf + dirlen, prefixlen))
+   continue;
+   strbuf_setlen(&buf, dirlen);
+   strbuf_addstr(&buf, e->d_name);
+   unlink(buf.buf);
}
-   free(prefix);
-   free(path);
closedir(dir);
+   strbuf_release(&buf);
 }

 static void remove_pack_on_signal(int signo)
@@ -57,52 +73,57 @@ static void remove_pack_on_signal(int signo)
raise(signo);
 }

-/*
- * Fills the filename list with all the files found in the pack directory
- * ending with .pack, without that extension.
- */
-void get_pack_filenames(char *packdir, struct string_list *fname_list)
+static void get_pack_filenames(struct string_list *fname_list)
 {
DIR *dir;
struct dirent *e;
-   char *path, *suffix, *fname;
+   char *fname;

-   path = mkpath("%s/pack", get_object_directory());
-   suffix = ".pack";
+   if (!(dir = opendir(packdir)))
+   return;

-   dir = opendir(path);
while ((e = readdir(dir)) != NULL) {
-   if (!suffixcmp(e->d_name, suffix)) {
-   size_t len = strlen(e->d_name) - strlen(suffix);
-   fname = xmemdupz(e->d_name, len);
+   if (suffixcmp(e->d_name, ".pack"))
+   continue;
+
+   size_t len = strlen(e->d_name) - strlen(".pack");
+   fname = xmemdupz(e->d_name, len);
+
+

[RFC PATCHv6 2/2] repack: retain the return value of pack-objects

2013-08-21 Thread Stefan Beller
During the review process of the previous commit (repack: rewrite the
shell script in C), Johannes Sixt proposed to retain any exit codes from
the sub-process, which makes it probably more obvious in case of failure.

As the commit before should behave as close to the original shell
script, the proposed change is put in this extra commit.
The infrastructure however was already setup in the previous commit.
(Having a local 'ret' variable)

Signed-off-by: Stefan Beller 
---
 builtin/repack.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index fb050c0..1f13e0d 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -231,7 +231,7 @@ int cmd_repack(int argc, const char **argv, const char 
*prefix)
 
ret = start_command(&cmd);
if (ret)
-   return 1;
+   return ret;
 
count_packs = 0;
out = xfdopen(cmd.out, "r");
@@ -245,7 +245,7 @@ int cmd_repack(int argc, const char **argv, const char 
*prefix)
fclose(out);
ret = finish_command(&cmd);
if (ret)
-   return 1;
+   return ret;
argv_array_clear(&cmd_args);
 
if (!count_packs && !quiet)
-- 
1.8.4.rc3.1.gc1ebd90

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCHv6 1/2] repack: rewrite the shell script in C

2013-08-21 Thread Stefan Beller
The motivation of this patch is to get closer to a goal of being
able to have a core subset of git functionality built in to git.
That would mean

 * people on Windows could get a copy of at least the core parts
   of Git without having to install a Unix-style shell

 * people deploying to servers don't have to rewrite the #! line
   or worry about the PATH and quality of installed POSIX
   utilities, if they are only using the built-in part written
   in C

This patch is meant to be mostly a literal translation of the
git-repack script; the intent is that later patches would start using
more library facilities, but this patch is meant to be as close to a
no-op as possible so it doesn't do that kind of thing.

Signed-off-by: Stefan Beller 
---
 Makefile|   2 +-
 builtin.h   |   1 +
 builtin/repack.c| 376 
 git-repack.sh => contrib/examples/git-repack.sh |   0
 git.c   |   1 +
 5 files changed, 379 insertions(+), 1 deletion(-)
 create mode 100644 builtin/repack.c
 rename git-repack.sh => contrib/examples/git-repack.sh (100%)

diff --git a/Makefile b/Makefile
index ef442eb..4ec5bbe 100644
--- a/Makefile
+++ b/Makefile
@@ -464,7 +464,6 @@ SCRIPT_SH += git-pull.sh
 SCRIPT_SH += git-quiltimport.sh
 SCRIPT_SH += git-rebase.sh
 SCRIPT_SH += git-remote-testgit.sh
-SCRIPT_SH += git-repack.sh
 SCRIPT_SH += git-request-pull.sh
 SCRIPT_SH += git-stash.sh
 SCRIPT_SH += git-submodule.sh
@@ -972,6 +971,7 @@ BUILTIN_OBJS += builtin/reflog.o
 BUILTIN_OBJS += builtin/remote.o
 BUILTIN_OBJS += builtin/remote-ext.o
 BUILTIN_OBJS += builtin/remote-fd.o
+BUILTIN_OBJS += builtin/repack.o
 BUILTIN_OBJS += builtin/replace.o
 BUILTIN_OBJS += builtin/rerere.o
 BUILTIN_OBJS += builtin/reset.o
diff --git a/builtin.h b/builtin.h
index 8afa2de..b56cf07 100644
--- a/builtin.h
+++ b/builtin.h
@@ -102,6 +102,7 @@ extern int cmd_reflog(int argc, const char **argv, const 
char *prefix);
 extern int cmd_remote(int argc, const char **argv, const char *prefix);
 extern int cmd_remote_ext(int argc, const char **argv, const char *prefix);
 extern int cmd_remote_fd(int argc, const char **argv, const char *prefix);
+extern int cmd_repack(int argc, const char **argv, const char *prefix);
 extern int cmd_repo_config(int argc, const char **argv, const char *prefix);
 extern int cmd_rerere(int argc, const char **argv, const char *prefix);
 extern int cmd_reset(int argc, const char **argv, const char *prefix);
diff --git a/builtin/repack.c b/builtin/repack.c
new file mode 100644
index 000..fb050c0
--- /dev/null
+++ b/builtin/repack.c
@@ -0,0 +1,376 @@
+/*
+ * The shell version was written by Linus Torvalds (2005) and many others.
+ * This is a translation into C by Stefan Beller (2013)
+ */
+
+#include "builtin.h"
+#include "cache.h"
+#include "dir.h"
+#include "parse-options.h"
+#include "run-command.h"
+#include "sigchain.h"
+#include "strbuf.h"
+#include "string-list.h"
+#include "argv-array.h"
+
+/* enabled by default since 22c79eab (2008-06-25) */
+static int delta_base_offset = 1;
+char *packdir;
+
+static const char *const git_repack_usage[] = {
+   N_("git repack [options]"),
+   NULL
+};
+
+static int repack_config(const char *var, const char *value, void *cb)
+{
+   if (!strcmp(var, "repack.usedeltabaseoffset")) {
+   delta_base_offset = git_config_bool(var, value);
+   return 0;
+   }
+   return git_default_config(var, value, cb);
+}
+
+/*
+ * Remove temporary $GIT_OBJECT_DIRECTORY/pack/.tmp-$$-pack-* files.
+ */
+static void remove_temporary_files(void)
+{
+   struct strbuf buf = STRBUF_INIT;
+   size_t dirlen, prefixlen;
+   DIR *dir;
+   struct dirent *e;
+
+   /* .git/objects/pack */
+   strbuf_addstr(&buf, get_object_directory());
+   strbuf_addstr(&buf, "/pack");
+   dir = opendir(buf.buf);
+   if (!dir) {
+   strbuf_release(&buf);
+   return;
+   }
+
+   /* .git/objects/pack/.tmp-$$-pack-* */
+   dirlen = buf.len + 1;
+   strbuf_addf(&buf, "/.tmp-%d-pack-", (int)getpid());
+   prefixlen = buf.len - dirlen;
+
+   while ((e = readdir(dir))) {
+   if (strncmp(e->d_name, buf.buf + dirlen, prefixlen))
+   continue;
+   strbuf_setlen(&buf, dirlen);
+   strbuf_addstr(&buf, e->d_name);
+   unlink(buf.buf);
+   }
+   closedir(dir);
+   strbuf_release(&buf);
+}
+
+static void remove_pack_on_signal(int signo)
+{
+   remove_temporary_files();
+   sigchain_pop(signo);
+   raise(signo);
+}
+
+static void get_pack_filenames(struct string_list *fname_list)
+{
+   DIR *dir;
+   struct dirent *e;
+   char *fname;
+
+   if (!(dir = opendir(packdir)))
+   return;
+
+   while ((e = readdir(dir)) != NULL) {
+   if (suffixcmp(

[PATCH] git-diff: Clarify operation when not inside a repository.

2013-08-21 Thread Dale R. Worley
Clarify documentation for git-diff:  State that when not inside a
repository, --no-index is implied (and thus two arguments are
mandatory).

Clarify error message from diff-no-index to inform user that CWD is
not inside a repository and thus two arguments are mandatory.

Signed-off-by: Dale Worley 
---


This clarification is to avoid a problem I ran into.  I executed 'git
diff' in the remote working tree of a repository, and not in the
repository directory itself.  Because of that, git-diff assumed
git-diff --no-index, and executed diff-no-index.  Since I hadn't
provided paths, diff-no-index produced an error message.
Unfortunately, the error message presupposes that the decision to
execute diff-no-index reflects the user's intention, thus leaving me
confused, as the error message is only:
usage: git diff [--no-index]  
and does not cover the case I intended.  This patch changes the
message to notify the user that he is getting --no-index semantics
because he is outside of a repository:
Not within a git repository:
usage: git diff [--no-index]  
The additional line is suppressed if the user specified --no-index.

The documentation is expanded to state that execution outside of a
repository forces --no-index behavior.  Previously, the manual implied
this but did not state it, making it easy for the user to overlook
that it's possible to run git-diff outside of a repository.

Dale


 Documentation/git-diff.txt |3 ++-
 diff-no-index.c|6 +-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt
index 78d6d50..9f74989 100644
--- a/Documentation/git-diff.txt
+++ b/Documentation/git-diff.txt
@@ -31,7 +31,8 @@ two blob objects, or changes between two files on disk.
 +
 If exactly two paths are given and at least one points outside
 the current repository, 'git diff' will compare the two files /
-directories. This behavior can be forced by --no-index.
+directories. This behavior can be forced by --no-index or by 
+executing 'git diff' outside of a working tree.
 
 'git diff' [--options] --cached [] [--] [...]::
 
diff --git a/diff-no-index.c b/diff-no-index.c
index e66fdf3..98c5f76 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -215,9 +215,13 @@ void diff_no_index(struct rev_info *revs,
 path_inside_repo(prefix, argv[i+1])))
return;
}
-   if (argc != i + 2)
+   if (argc != i + 2) {
+   if (!no_index) {
+   fprintf(stderr, "Not within a git repository:\n");
+   }
usagef("git diff %s  ",
   no_index ? "--no-index" : "[--no-index]");
+   }
 
diff_setup(&revs->diffopt);
for (i = 1; i < argc - 2; ) {
-- 
1.7.7.6

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] git-remote-mediawiki: reset private ref after non-dumb push

2013-08-21 Thread Junio C Hamano
Matthieu Moy  writes:

> Git-mediawiki's "dumb push" sends the local revisions to the remote wiki,
> but does not update the local metadata to reflect the push (hence, the
> next pull will have to re-import the exported revisions).
>
> The previous implementation was simply omitting the update to the private
> ref after a dumb push. This was broken by 664059fb62 (Felipe Contreras,
> Apr 17 2013, transport-helper: update remote helper namespace), which
> does an automatic update of the private ref (not just the
> remote-tracking) on push.
>
> This patch fixes git-remote-mediawiki to reset the private ref after the
> push is completed, cancelling the automatic update triggered by
> 664059fb62.
>
> Signed-off-by: Matthieu Moy 
> ---
> Just a resend of the RFC
> ( http://thread.gmane.org/gmane.comp.version-control.git/232224 ),
> which received no comment.
>
>  contrib/mw-to-git/git-remote-mediawiki.perl | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/contrib/mw-to-git/git-remote-mediawiki.perl 
> b/contrib/mw-to-git/git-remote-mediawiki.perl
> index f8d7d2c..13919ad 100755
> --- a/contrib/mw-to-git/git-remote-mediawiki.perl
> +++ b/contrib/mw-to-git/git-remote-mediawiki.perl
> @@ -53,6 +53,7 @@ if (@ARGV != 2) {
>  
>  my $remotename = $ARGV[0];
>  my $url = $ARGV[1];
> +my $reset_private_ref_to = undef;
>  
>  # Accept both space-separated and multiple keys in config file.
>  # Spaces should be written as _ anyway because we'll use chomp.
> @@ -161,6 +162,9 @@ sub parse_command {
>   my ($line) = @_;
>   my @cmd = split(/ /, $line);
>   if (!defined $cmd[0]) {
> + if ($reset_private_ref_to) {
> + run_git("update-ref -m \"Git-MediaWiki non-dumb push\" 
> refs/mediawiki/$remotename/master $reset_private_ref_to");
> + }

So reset-private-ref-to is recorded for a non-dumb push, but...

>   return 0;
>   }
>   if ($cmd[0] eq 'capabilities') {
> @@ -1209,9 +1213,10 @@ sub mw_push_revision {
>   die("Unknown error from mw_push_file()\n");
>   }
>   }
> - if (!$dumb_push) {
> + if ($dumb_push) {
> + $reset_private_ref_to = $remoteorigin_sha1;

... it is set for dumb-push?  I am confused.

> + } else {
>   run_git(qq(notes --ref=${remotename}/mediawiki add -f 
> -m "mediawiki_revision: ${mw_revision}" ${sha1_commit}));
> - run_git(qq(update-ref -m "Git-MediaWiki push" 
> refs/mediawiki/${remotename}/master ${sha1_commit} ${sha1_child}));
>   }
>   }
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] git-diff: Clarify operation when not inside a repository.

2013-08-21 Thread Junio C Hamano
wor...@alum.mit.edu (Dale R. Worley) writes:

> Unfortunately, the error message presupposes that the decision to
> execute diff-no-index reflects the user's intention, thus leaving me
> confused, as the error message is only:
> usage: git diff [--no-index]  
> and does not cover the case I intended.  This patch changes the
> message to notify the user that he is getting --no-index semantics
> because he is outside of a repository:
> Not within a git repository:
> usage: git diff [--no-index]  
> The additional line is suppressed if the user specified --no-index.

It makes perfect sense for your situation, I think.

Do we say "within" in other error messages for similar situations?
Many commands require you to be in a working tree---the ones marked
as NEED_WORK_TREE in git.c call setup.c::setup_work_tree() to do
this check---and the error message phrases "run in a work tree".  We
would want to use the matching phrasing here, too.

For that matter, as no_index variable knows we didn't get an
explicit "--no-index", we can be even more explicit, e.g.

fatal: If you want to compare two files outside a working
tree, use "git diff  ".

which hopefully will also clarify the consequence of the command,
i.e. compares two files that are _outside_ a working tree.

I am not sure which one is better, though.  Just a random thought
that came to my mind while reading your error message.

> The documentation is expanded to state that execution outside of a
> repository forces --no-index behavior.  Previously, the manual implied
> this but did not state it, making it easy for the user to overlook
> that it's possible to run git-diff outside of a repository.

I am not sure if "forced" is a good description here.  An explicit
"--no-index" does force the command to ignore the fact that the
command is run inside a working tree and compare two paths without
involving Git at all, but the behaviour you saw was to fall back to
the no-index hack instead of failing (the latter of which is a
logical but unfriendly thing to do, as Git is about data managed by
Git, and running Git command that wants a working tree without
having a working tree).  It feels that it is more like "Also, this
mode is used when the command is run outside a working tree" to me.

>  Documentation/git-diff.txt |3 ++-
>  diff-no-index.c|6 +-
>  2 files changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt
> index 78d6d50..9f74989 100644
> --- a/Documentation/git-diff.txt
> +++ b/Documentation/git-diff.txt
> @@ -31,7 +31,8 @@ two blob objects, or changes between two files on disk.
>  +
>  If exactly two paths are given and at least one points outside
>  the current repository, 'git diff' will compare the two files /
> -directories. This behavior can be forced by --no-index.
> +directories. This behavior can be forced by --no-index or by 
> +executing 'git diff' outside of a working tree.
>  
>  'git diff' [--options] --cached [] [--] [...]::
>  
> diff --git a/diff-no-index.c b/diff-no-index.c
> index e66fdf3..98c5f76 100644
> --- a/diff-no-index.c
> +++ b/diff-no-index.c
> @@ -215,9 +215,13 @@ void diff_no_index(struct rev_info *revs,
>path_inside_repo(prefix, argv[i+1])))
>   return;
>   }
> - if (argc != i + 2)
> + if (argc != i + 2) {
> + if (!no_index) {
> + fprintf(stderr, "Not within a git repository:\n");
> + }
>   usagef("git diff %s  ",
>  no_index ? "--no-index" : "[--no-index]");
> + }
>  
>   diff_setup(&revs->diffopt);
>   for (i = 1; i < argc - 2; ) {
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Document the HTTP transport protocols

2013-08-21 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy  writes:

>  This is basically the version from Tay Ray Chuan [1] with one more
>  proofreading round from me. My changes besides realignments are below.
>  There are TODOs remain but I think the document is valuable as it is.

Thanks.

>  On the topic, C Git's (maybe) violations on this spec are:
>  
>   - The client does not strip trailing slashes from $GIT_URL before
> sending to the server, as described in section "URL Format".
>  
>   - The client does not check that HTTP status code is either 200 or
> 304 when receiving response in discovering references phase.
>
>   - The client verifies the first 5 bytes against pattern
> "^[0-9a-fA-F]{4}#" instead of "^[0-9a-f]{4}#" as described in
> section "discovering references".

Perhaps add these to "small projects ideas"?  The last one may want
to be phrased a bit better, though.  I needed to read it twice to
realize that you are saying "the client accepts hexadecimal using
uppercase alphabets where it should not".
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] rebase --preserve-merges: ignore "merge.log" config

2013-08-21 Thread Ralf Thielow
When "merge.log" config is set, "rebase --preserve-merges"
will add the log lines to the message of the rebased merge
commit.

A rebase should not modify a commit message automatically.

Teach "git-rebase" to ignore that configuration by passing "--no-log"
to the git-merge call.

Signed-off-by: Ralf Thielow 
---
 git-rebase--interactive.sh|  3 ++-
 t/t3409-rebase-preserve-merges.sh | 25 +
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 83d6d46..4743c59 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -352,8 +352,9 @@ pick_one_preserving_merges () {
msg_content="$(commit_message $sha1)"
# No point in merging the first parent, that's HEAD
new_parents=${new_parents# $first_parent}
+   merge_args="--no-log --no-ff"
if ! do_with_author output eval \
-   'git merge --no-ff $strategy_args -m "$msg_content" 
$new_parents'
+   'git merge $merge_args $strategy_args -m "$msg_content" 
$new_parents'
then
printf "%s\n" "$msg_content" > 
"$GIT_DIR"/MERGE_MSG
die_with_patch $sha1 "Error redoing merge $sha1"
diff --git a/t/t3409-rebase-preserve-merges.sh 
b/t/t3409-rebase-preserve-merges.sh
index 2e0c364..2454811 100755
--- a/t/t3409-rebase-preserve-merges.sh
+++ b/t/t3409-rebase-preserve-merges.sh
@@ -28,6 +28,8 @@ export GIT_AUTHOR_EMAIL
 # \--A3<-- topic2
 #  \
 #   B2 <-- origin/topic
+#
+# Clone 4 (same as Clone 3)
 
 test_expect_success 'setup for merge-preserving rebase' \
'echo First > A &&
@@ -64,6 +66,16 @@ test_expect_success 'setup for merge-preserving rebase' \
git merge --no-ff topic2
) &&
 
+   git clone ./. clone4 &&
+   (
+   cd clone4 &&
+   git checkout -b topic2 origin/topic &&
+   echo Sixth > A &&
+   git commit -a -m "Modify A3" &&
+   git checkout -b topic origin/topic &&
+   git merge --no-ff topic2
+   ) &&
+
git checkout topic &&
echo Fourth >> B &&
git commit -a -m "Modify B2"
@@ -96,4 +108,17 @@ test_expect_success 'rebase -p preserves no-ff merges' '
)
 '
 
+test_expect_success 'rebase -p ignores merge.log config' '
+   (
+   cd clone4 &&
+   git fetch &&
+   git -c merge.log=1 rebase -p origin/topic &&
+   cat >expected <<-\EOF &&
+
+   EOF
+   git log --format="%b" -1 >current
+   test_cmp expected current
+   )
+'
+
 test_done
-- 
1.8.4.rc4.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 0/2] git-send-email: Message-ID caching

2013-08-21 Thread Rasmus Villemoes
This is a more polished attempt at implementing the message-id
caching. I apologize for the sloppiness (unrelated changes, unused
variables etc.) in the first version.

I have split the patch in two. The first half moves the choice-logic
inside ask(): If ask() decides to return the default immediately
(because the $term->{IN,OUT} checks fail), there's no reason to print
all the choices. In my first attempt, I handled this by passing a huge
prompt-string, but that made everything underlined (the prompt may be
emphasized in some other way on other terminals). Passing a different
valid_re and handling a numeric return is also slightly more
cumbersome for the caller than making ask() take care of it. There
might be other future uses of this 'choices' ability, but if the
message-id-caching is ultimately rejected, [1/2] can of course also be
dropped.

[2/2] is the new implementation. The two main changes are that old
entries are expunged when the file grows larger than, by default, 100
kB, and the old emails are scored according to a simple scheme (which
might need improvement). The input to the scoring is {first subject in
new batch, old subject, was the old email first in a batch?}. [*]

I also now simply store the unix time stamp instead of storing the
contents of the date header. This reduces processing time (no need to
parse the date header when reading the file), and eliminates the
Date::Parse dependency. The minor downside is that if the user has
changed time zone since the old email was sent, the date in the prompt
will not entirely match the Date:-header in the email that was sent.

I had to rearrange the existing code a little to make certain global
variables ($time, @files) contain the right thing at the right time,
but hopefully I haven't broken anything in the process.

[*] Suggestions for improving that heuristic are welcome. One thing
that might be worthwhile is to give words ending in a colon higher
weight.

Rasmus Villemoes (2):
  git-send-email: add optional 'choices' parameter to the ask sub
  git-send-email: Cache generated message-ids, use them when prompting

 git-send-email.perl |  144 ---
 1 file changed, 138 insertions(+), 6 deletions(-)


Thanks, Junio, for the comments. A few replies:

Junio C Hamano  writes:

> Rasmus Villemoes  writes:
>>  my %config_path_settings = (
>> @@ -311,6 +314,7 @@ my $rc = GetOptions("h" => \$help,
>>  "8bit-encoding=s" => \$auto_8bit_encoding,
>>  "compose-encoding=s" => \$compose_encoding,
>>  "force" => \$force,
>> +"msgid-cache-file=s" => \$msgid_cache_file,
>>   );
>
> Is there a standard, recommended location we suggest users to store
> this?  

I don't know. It is obviously a local, per-repository, thing. I don't
know enough about git's internals to know if something breaks if one
puts it in .git (say, ".git/msgid.cache"). If that's a bad idea, then
I think it should be in the root of the repository, and one would then
probably want to add it to .git/info/exclude.

If storing it under .git is possible, one could consider making the
option a boolean ('msgid-use-cache' ?) and always use
".git/msgid.cache".

>> +my @choices;
>> +if ($msgid_cache_file) {
>> +@choices = msgid_cache_getmatches();
>
> It is a bit strange that "filename is specified => we will find the
> latest 10" before seeing if we even check to see if that file
> exists.  I would have expected that a two-step "filename is given =>
> try to read it" and "if we did read something => give choices"
> process would be used.

I'm not sure I understand how this is different from what I was or am
doing. Sure, I don't do the test for existence before calling
msgid_cache_getmatches(), but that will just return an empty list if
the file does not exist. That will end up doing the same thing as if
no cache file was given.

> Also "getmatches" that does not take any clue from what the caller
> knows (the title of the series, for example) seems much less useful
> than ideal.

Fixed in the new version.

>> +msgid_cache_this($message_id, $subject, $date) if ($msgid_cache_file && 
>> !$dry_run);
>
> Is this caching each and every one, even for things like "[PATCH 23/47]"?

Yes, but now I remember whether it was the first or not. 

>> +msgid_cache_write() if $msgid_cache_file;
>
> Is this done under --dry-run?

Well, it was, but the msgid_cache_this() was not, so there was an
empty list of new entries. Of course, better to be explicit and safe,
so fixed.

>> +if (not open ($fh, '<', $msgid_cache_file)) {
>> +# A non-existing cache file is ok, but should we warn if errno 
>> != ENOENT?
>
> It should not be a warning but an informational message, "creating a
> new cachefile", when bootstrapping, no?

If so, it should probably be when writing the new file. What I'm
trying to say is that errno == ENOENT is ok (and expected), but errno
== EPERM or anything e

[PATCH 1/2] git-send-email: add optional 'choices' parameter to the ask sub

2013-08-21 Thread Rasmus Villemoes
Make it possible for callers of ask() to provide a list of
choices. Entering an appropriate integer chooses from that list,
otherwise the input is treated as usual.

Each choice can either be a single string, which is used both for the
prompt and for the return value, or a two-element array ref, where the
zeroth element is the choice and the first is used for the prompt.

Signed-off-by: Rasmus Villemoes 
---
 git-send-email.perl |   11 +++
 1 file changed, 11 insertions(+)

diff --git a/git-send-email.perl b/git-send-email.perl
index 2162478..ac3b02d 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -680,11 +680,18 @@ sub ask {
my $valid_re = $arg{valid_re};
my $default = $arg{default};
my $confirm_only = $arg{confirm_only};
+   my $choices = $arg{choices} || [];
my $resp;
my $i = 0;
return defined $default ? $default : undef
unless defined $term->IN and defined fileno($term->IN) and
   defined $term->OUT and defined fileno($term->OUT);
+   for (@$choices) {
+   printf "(%d) %s\n", $i++, ref($_) eq 'ARRAY' ? $_->[1] : $_;
+   }
+   printf "Enter 0-%d to choose from the above list\n", $i-1
+   if (@$choices);
+   $i = 0;
while ($i++ < 10) {
$resp = $term->readline($prompt);
if (!defined $resp) { # EOF
@@ -694,6 +701,10 @@ sub ask {
if ($resp eq '' and defined $default) {
return $default;
}
+   if (@$choices && $resp =~ m/^[0-9]+$/ && $resp < @$choices) {
+   my $c = $choices->[$resp];
+   return ref($c) eq 'ARRAY' ? $c->[0] : $c;
+   }
if (!defined $valid_re or $resp =~ /$valid_re/) {
return $resp;
}
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] git-send-email: Cache generated message-ids, use them when prompting

2013-08-21 Thread Rasmus Villemoes
Allow the user to specify a file (sendemail.msgidcachefile) in which
to store the message-ids generated by git-send-email, along with time
and subject information. When prompting for a Message-ID to be used in
In-Reply-To, that file can be used to generate a list of options.

When composing v2 or v3 of a patch or patch series, this avoids the
need to get one's MUA to display the Message-ID of the earlier email
(which is cumbersome in some MUAs) and then copy-paste that.

Listing all previously sent emails is useless, so currently only the
10 most "relevant" emails. "Relevant" is based on a simple scoring,
which might need to be revised: Count the words in the old subject
which also appear in the subject of the first email to be sent; add a
bonus if the old email was first in a batch (that is, [00/74] is more
likely to be relevant than [43/74]). Resort to comparing timestamps
(newer is more relevant) when the scores tie.

To limit disk usage, the oldest half of the cached entries are
expunged when the cache file exceeds sendemail.msgidcachemaxsize
(default 100kB). This also ensures that we will never have to read,
score, and sort 1000s of entries on each invocation of git-send-email.

Signed-off-by: Rasmus Villemoes 
---
 git-send-email.perl |  133 ---
 1 file changed, 127 insertions(+), 6 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index ac3b02d..5094267 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -203,6 +203,7 @@ my ($validate, $confirm);
 my (@suppress_cc);
 my ($auto_8bit_encoding);
 my ($compose_encoding);
+my ($msgid_cache_file, $msgid_cache_maxsize);
 
 my ($debug_net_smtp) = 0;  # Net::SMTP, see send_message()
 
@@ -237,6 +238,8 @@ my %config_settings = (
 "from" => \$sender,
 "assume8bitencoding" => \$auto_8bit_encoding,
 "composeencoding" => \$compose_encoding,
+"msgidcachefile" => \$msgid_cache_file,
+"msgidcachemaxsize" => \$msgid_cache_maxsize,
 );
 
 my %config_path_settings = (
@@ -796,11 +799,23 @@ sub expand_one_alias {
 @bcclist = expand_aliases(@bcclist);
 @bcclist = validate_address_list(sanitize_address_list(@bcclist));
 
+if ($compose && $compose > 0) {
+   @files = ($compose_filename . ".final", @files);
+}
+
 if ($thread && !defined $initial_reply_to && $prompting) {
+   my @choices = ();
+   if ($msgid_cache_file) {
+   my $first_subject = get_patch_subject($files[0]);
+   $first_subject =~ s/^GIT: //;
+   @choices = msgid_cache_getmatches($first_subject, 10);
+   @choices = map {[$_->{id}, sprintf "[%s] %s", 
format_2822_time($_->{epoch}), $_->{subject}]} @choices;
+   }
$initial_reply_to = ask(
"Message-ID to be used as In-Reply-To for the first email (if 
any)? ",
default => "",
-   valid_re => qr/\@.*\./, confirm_only => 1);
+   valid_re => qr/\@.*\./, confirm_only => 1,
+   choices => \@choices);
 }
 if (defined $initial_reply_to) {
$initial_reply_to =~ s/^\s* 0) {
-   @files = ($compose_filename . ".final", @files);
-}
-
 # Variables we set as part of the loop over files
 our ($message_id, %mail, $subject, $reply_to, $references, $message,
$needs_confirm, $message_num, $ask_default);
@@ -1136,7 +1147,7 @@ sub send_message {
my $to = join (",\n\t", @recipients);
@recipients = unique_email_list(@recipients,@cc,@bcclist);
@recipients = (map { extract_valid_address_or_die($_) } @recipients);
-   my $date = format_2822_time($time++);
+   my $date = format_2822_time($time);
my $gitversion = '@@GIT_VERSION@@';
if ($gitversion =~ m/..GIT_VERSION../) {
$gitversion = Git::version();
@@ -1477,6 +1488,11 @@ foreach my $t (@files) {
 
my $message_was_sent = send_message();
 
+   if ($message_was_sent && $msgid_cache_file && !$dry_run) {
+   msgid_cache_this($message_id, $message_num == 1 ? 1 : 0, , 
$time, $subject);
+   }
+   $time++;
+
# set up for the next message
if ($thread && $message_was_sent &&
($chain_reply_to || !defined $reply_to || length($reply_to) == 
0 ||
@@ -1521,6 +1537,8 @@ sub cleanup_compose_files {
 
 $smtp->quit if $smtp;
 
+msgid_cache_write() if $msgid_cache_file && !$dry_run;
+
 sub unique_email_list {
my %seen;
my @emails;
@@ -1569,3 +1587,106 @@ sub body_or_subject_has_nonascii {
}
return 0;
 }
+
+my @msgid_new_entries;
+sub msgid_cache_this {
+   my $msgid = shift;
+   my $first = shift;
+   my $epoch = shift;
+   my $subject = shift;
+   # Make sure there are no tabs which will confuse us, and save
+   # some valuable horizontal real-estate by removing redundant
+   # whitespace.
+   if ($subject) {
+   $subject =~ s/^\s+|\s+$//g;
+   $subject =~ s/\s+/ /g;
+   }
+   

[PATCH 0/3] t3404 incremental improvements

2013-08-21 Thread Eric Sunshine
This set of patches was meant to be a re-roll of [1] addressing Junio's
comments, however [1] graduated to 'next' before I found time to work on
it further, so these are instead incremental patches atop 'next'.

patch 1: address Junio's comment [2]

patch 2: address Junio's comment [3] with a sledge-hammer approach;
whether to accept this patch is a judgment call

patch 3: trivial cleanup which should make the test easier to comprehend
for future readers

[1]: http://thread.gmane.org/gmane.comp.version-control.git/232146
[2]: http://thread.gmane.org/gmane.comp.version-control.git/232146/focus=232159
[3]: http://thread.gmane.org/gmane.comp.version-control.git/232146/focus=232158

Eric Sunshine (3):
  t3404: preserve test_tick state across short SHA-1 collision test
  t3404: make tests more self-contained
  t3404: simplify short SHA-1 collision test

 t/t3404-rebase-interactive.sh | 77 ---
 1 file changed, 72 insertions(+), 5 deletions(-)

-- 
1.8.4.rc4.499.gfb33910

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] t3404: preserve test_tick state across short SHA-1 collision test

2013-08-21 Thread Eric Sunshine
The short SHA-1 collision test requires carefully crafted commits in
order to ensure a collision at rebase time. This involves managing state
which impacts the resulting SHA-1, including commit time. To accomplish
this, test_tick is set to a known state for the short SHA-1 collision
test.

Unfortunately, doing so throws away the existing state of test_tick,
which may be problematic for subsequently added tests. Fix this by
preserving the existing state of test_tick across the short SHA-1
collision test.

Signed-off-by: Eric Sunshine 
---
 t/t3404-rebase-interactive.sh | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index b65b774..6be97ba 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -994,17 +994,23 @@ test_expect_success 'short SHA-1 setup' '
test_when_finished "git checkout master" &&
git checkout --orphan collide &&
git rm -rf . &&
+   (
unset test_tick &&
test_commit collide1 collide &&
test_commit --notick collide2 collide &&
test_commit --notick "collide3 115158b5" collide collide3 collide3
+   )
 '
 
 test_expect_success 'short SHA-1 collide' '
test_when_finished "reset_rebase && git checkout master" &&
git checkout collide &&
+   (
+   unset test_tick &&
+   test_tick &&
FAKE_COMMIT_MESSAGE="collide2 815200e" \
FAKE_LINES="reword 1 2" git rebase -i HEAD~2
+   )
 '
 
 test_done
-- 
1.8.4.rc4.499.gfb33910

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] t3404: make tests more self-contained

2013-08-21 Thread Eric Sunshine
As its very first action, t3404 installs (via set_fake_editor) a
specialized $EDITOR which simplifies automated 'rebase -i' testing. Many
tests rely upon this setting, thus tests which need a different editor
must take extra care upon completion to restore $EDITOR in order to
avoid breaking following tests. This places extra burden upon such tests
and requires that they undesirably have extra knowledge about
surrounding tests. Ease this burden by having each test install the
$EDITOR it requires, rather than relying upon a global setting.

Signed-off-by: Eric Sunshine 
---
 t/t3404-rebase-interactive.sh | 67 +--
 1 file changed, 64 insertions(+), 3 deletions(-)

diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index 6be97ba..7d15c7a 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -29,8 +29,6 @@ Initial setup:
 
 . "$TEST_DIRECTORY"/lib-rebase.sh
 
-set_fake_editor
-
 # WARNING: Modifications to the initial repository can change the SHA ID used
 # in the expect2 file for the 'stop on conflicting pick' test.
 
@@ -72,6 +70,7 @@ export SHELL
 test_expect_success 'rebase -i with the exec command' '
git checkout master &&
(
+   set_fake_editor &&
FAKE_LINES="1 exec_>touch-one
2 exec_>touch-two exec_false exec_>touch-three
3 4 
exec_>\"touch-file__name_with_spaces\";_>touch-after-semicolon 5" &&
@@ -93,6 +92,7 @@ test_expect_success 'rebase -i with the exec command' '
 test_expect_success 'rebase -i with the exec command runs from tree root' '
git checkout master &&
mkdir subdir && (cd subdir &&
+   set_fake_editor &&
FAKE_LINES="1 exec_>touch-subdir" \
git rebase -i HEAD^
) &&
@@ -103,6 +103,7 @@ test_expect_success 'rebase -i with the exec command runs 
from tree root' '
 test_expect_success 'rebase -i with the exec command checks tree cleanness' '
git checkout master &&
(
+   set_fake_editor &&
FAKE_LINES="exec_echo_foo_>file1 1" &&
export FAKE_LINES &&
test_must_fail git rebase -i HEAD^
@@ -116,6 +117,7 @@ test_expect_success 'rebase -i with exec of inexistent 
command' '
git checkout master &&
test_when_finished "git rebase --abort" &&
(
+   set_fake_editor &&
FAKE_LINES="exec_this-command-does-not-exist 1" &&
export FAKE_LINES &&
test_must_fail git rebase -i HEAD^ >actual 2>&1
@@ -125,6 +127,7 @@ test_expect_success 'rebase -i with exec of inexistent 
command' '
 
 test_expect_success 'no changes are a nop' '
git checkout branch2 &&
+   set_fake_editor &&
git rebase -i F &&
test "$(git symbolic-ref -q HEAD)" = "refs/heads/branch2" &&
test $(git rev-parse I) = $(git rev-parse HEAD)
@@ -134,6 +137,7 @@ test_expect_success 'test the [branch] option' '
git checkout -b dead-end &&
git rm file6 &&
git commit -m "stop here" &&
+   set_fake_editor &&
git rebase -i F branch2 &&
test "$(git symbolic-ref -q HEAD)" = "refs/heads/branch2" &&
test $(git rev-parse I) = $(git rev-parse branch2) &&
@@ -142,6 +146,7 @@ test_expect_success 'test the [branch] option' '
 
 test_expect_success 'test --onto ' '
git checkout -b test-onto branch2 &&
+   set_fake_editor &&
git rebase -i --onto branch1 F &&
test "$(git symbolic-ref -q HEAD)" = "refs/heads/test-onto" &&
test $(git rev-parse HEAD^) = $(git rev-parse branch1) &&
@@ -151,6 +156,7 @@ test_expect_success 'test --onto ' '
 test_expect_success 'rebase on top of a non-conflicting commit' '
git checkout branch1 &&
git tag original-branch1 &&
+   set_fake_editor &&
git rebase -i branch2 &&
test file6 = $(git diff --name-only original-branch1) &&
test "$(git symbolic-ref -q HEAD)" = "refs/heads/branch1" &&
@@ -163,6 +169,7 @@ test_expect_success 'reflog for the branch shows state 
before rebase' '
 '
 
 test_expect_success 'exchange two commits' '
+   set_fake_editor &&
FAKE_LINES="2 1" git rebase -i HEAD~2 &&
test H = $(git cat-file commit HEAD^ | sed -ne \$p) &&
test G = $(git cat-file commit HEAD | sed -ne \$p)
@@ -188,6 +195,7 @@ EOF
 
 test_expect_success 'stop on conflicting pick' '
git tag new-branch1 &&
+   set_fake_editor &&
test_must_fail git rebase -i master &&
test "$(git rev-parse HEAD~3)" = "$(git rev-parse master)" &&
test_cmp expect .git/rebase-merge/patch &&
@@ -208,6 +216,7 @@ test_expect_success 'abort' '
 test_expect_success 'abort with error when new base cannot be checked out' '
git rm --cached file1 &&
git commit -m "remove file in base" &&
+   set_fake_editor &&
test_must_fail git rebase -i master > output 2>&1 &&
grep "The following untracked working tree files would be overwritten 
by che

[PATCH 3/3] t3404: simplify short SHA-1 collision test

2013-08-21 Thread Eric Sunshine
The short SHA-1 collision test employs two "magic" values in commit
messages to trigger an "ambiguous SHA-1 error" during 'rebase -i' -- one
for 'collide3' during setup, and one for 'collide2' at rebase time --
even though the collision can just as easily be triggered by a single
magic value at rebase time only. The setup-time 'collide3' magic value
serves no purpose [1], but can easily mislead readers into thinking that
it is somehow significant even though it is not. Drop this unneeded bit
of magic.

As a side-effect, this also simplifies the setup-time "test_commit
collide3" invocation, making it easier to read.

[1]: The magic value on 'collide3' gave it a short SHA-1 prefix of
"badbeef" which could be spotted easily in verbose output during
development of the test, but is otherwise not meaningful or helpful.

Signed-off-by: Eric Sunshine 
---
 t/t3404-rebase-interactive.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index 7d15c7a..50e22b1 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -1058,7 +1058,7 @@ test_expect_success 'short SHA-1 setup' '
unset test_tick &&
test_commit collide1 collide &&
test_commit --notick collide2 collide &&
-   test_commit --notick "collide3 115158b5" collide collide3 collide3
+   test_commit --notick collide3 collide
)
 '
 
@@ -1069,7 +1069,7 @@ test_expect_success 'short SHA-1 collide' '
unset test_tick &&
test_tick &&
set_fake_editor &&
-   FAKE_COMMIT_MESSAGE="collide2 815200e" \
+   FAKE_COMMIT_MESSAGE="collide2 ac4f2ee" \
FAKE_LINES="reword 1 2" git rebase -i HEAD~2
)
 '
-- 
1.8.4.rc4.499.gfb33910

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH] git-remote-mediawiki: reset private ref after non-dumb push

2013-08-21 Thread Felipe Contreras
On Tue, Aug 13, 2013 at 8:31 AM, Matthieu Moy  wrote:
>
> Git-mediawiki's "dumb push" sends the local revisions to the remote wiki,
> but does not update the local metadata to reflect the push (hence, the
> next pull will have to re-import the exported revisions).
>
> The previous implementation was simply omitting the update to the private
> ref after a dumb push. This was broken by 664059fb62 (Felipe Contreras,
> Apr 17 2013, transport-helper: update remote helper namespace), which
> does an automatic update of the private ref (not just the
> remote-tracking) on push.
>
> This patch fixes git-remote-mediawiki to reset the private ref after the
> push is completed, cancelling the automatic update triggered by
> 664059fb62.
>
> Signed-off-by: Matthieu Moy 
> ---
> Felipe: Is this the right fix for git-remote-mediawiki? Any better idea?

Why not keep track of the revisions yourself? You can have file where
you store which was the last revision that was fetched.

-- 
Felipe Contreras
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 1/2] xread, xwrite: Limit size of IO, fixing IO of 2GB and more on Mac OS X

2013-08-21 Thread Torsten Bögershausen
On 2013-08-20 08.43, Steffen Prohaska wrote:
[]
Thanks for V5. It was tested OK on my system here.
(And apologies for recommending a wrapper on top of a wrapper).

One question is left: 
As xread() is tolerant against EAGAIN and especially EINTR,
could it make sense to replace read() with xread() everywhere?

(The risk for getting EINTR is smaller when we only read a small amount
of data, but it is more on the safe side)

And s/write/xwrite/

/Torsten


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git access to Bzr repository fails for enwc

2013-08-21 Thread Felipe Contreras
On Thu, Aug 15, 2013 at 11:31 AM, Stefan Monnier
 wrote:
> I've had good success recently with the git->bzr bridge, but the
> following still fails.  This is on Debian with the git from "unstable".

Which version are you using? The latest version works fine here:

https://github.com/felipec/git/blob/fc/master/git-remote-bzr.py

-- 
Felipe Contreras
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] rebase --preserve-merges: ignore "merge.log" config

2013-08-21 Thread Eric Sunshine
On Wed, Aug 21, 2013 at 2:48 PM, Ralf Thielow  wrote:
> When "merge.log" config is set, "rebase --preserve-merges"
> will add the log lines to the message of the rebased merge
> commit.
>
> A rebase should not modify a commit message automatically.
>
> Teach "git-rebase" to ignore that configuration by passing "--no-log"
> to the git-merge call.
>
> Signed-off-by: Ralf Thielow 
> ---
> diff --git a/t/t3409-rebase-preserve-merges.sh 
> b/t/t3409-rebase-preserve-merges.sh
> index 2e0c364..2454811 100755
> --- a/t/t3409-rebase-preserve-merges.sh
> +++ b/t/t3409-rebase-preserve-merges.sh
> @@ -96,4 +108,17 @@ test_expect_success 'rebase -p preserves no-ff merges' '
> )
>  '
>
> +test_expect_success 'rebase -p ignores merge.log config' '
> +   (
> +   cd clone4 &&
> +   git fetch &&
> +   git -c merge.log=1 rebase -p origin/topic &&
> +   cat >expected <<-\EOF &&
> +
> +   EOF

This might be clearer with a simple 'echo' instead of 'cat' with heredoc:

  echo >expected &&

> +   git log --format="%b" -1 >current
> +   test_cmp expected current
> +   )
> +'
> +
>  test_done
> --
> 1.8.4.rc4.dirty
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] git-remote-mediawiki: reset private ref after non-dumb push

2013-08-21 Thread Matthieu Moy
Junio C Hamano  writes:

>> diff --git a/contrib/mw-to-git/git-remote-mediawiki.perl 
>> b/contrib/mw-to-git/git-remote-mediawiki.perl
>> index f8d7d2c..13919ad 100755
>> --- a/contrib/mw-to-git/git-remote-mediawiki.perl
>> +++ b/contrib/mw-to-git/git-remote-mediawiki.perl
>> @@ -53,6 +53,7 @@ if (@ARGV != 2) {
>>  
>>  my $remotename = $ARGV[0];
>>  my $url = $ARGV[1];
>> +my $reset_private_ref_to = undef;
>>  
>>  # Accept both space-separated and multiple keys in config file.
>>  # Spaces should be written as _ anyway because we'll use chomp.
>> @@ -161,6 +162,9 @@ sub parse_command {
>>  my ($line) = @_;
>>  my @cmd = split(/ /, $line);
>>  if (!defined $cmd[0]) {
>> +if ($reset_private_ref_to) {
>> +run_git("update-ref -m \"Git-MediaWiki non-dumb push\" 
>> refs/mediawiki/$remotename/master $reset_private_ref_to");
>> +}
>
> So reset-private-ref-to is recorded for a non-dumb push, but...

> ... it is set for dumb-push?  I am confused.

Oops, I'm the one who did the confusion indeed. It should be
s/non-dumb/dumb/ here and in the subject line.

Don't merge this one, I've fixed locally and will resend (this or
another fix, depending on the outcome of the discussion).

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


t3010 broken by 2eac2a4

2013-08-21 Thread Brian Gernhardt
With 2eac2a4: "ls-files -k: a directory only can be killed if the index has a 
non-directory" applied, t3010 fails test 3 "validate git ls-files -k output".  
It ends up missing the pathx/ju/nk file.

OS X 10.8.4
Xcode 4.6.3
clang "Apple LLVM version 4.2 (clang-425.0.28) (based on LLVM 3.2svn)" 

~~ Brian Gernhardt

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] t9902-completion.sh: old Bash still does not support array+=('') notation

2013-08-21 Thread Brandon Casey
From: Brandon Casey 

Old Bash (3.0) which is distributed with RHEL 4.X and other ancient
platforms that are still in wide use, does not understand the
array+=() notation.  Let's use an explicit assignment to the new array
element which works everywhere, like:

   array[${#array[@]}+1]=''

The right-hand side '' is not strictly necessary, but in this case
I think it is more clear.

Signed-off-by: Brandon Casey 
---
 t/t9902-completion.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
index 272a071..2d4beb5 100755
--- a/t/t9902-completion.sh
+++ b/t/t9902-completion.sh
@@ -69,7 +69,7 @@ run_completion ()
local -a COMPREPLY _words
local _cword
_words=( $1 )
-   test "${1: -1}" = ' ' && _words+=('')
+   test "${1: -1}" = ' ' && _words[${#_words[@]}+1]=''
(( _cword = ${#_words[@]} - 1 ))
__git_wrap__git_main && print_comp
 }
-- 
1.8.4.rc0.2.g6cf5c31


---
This email message is for the sole use of the intended recipient(s) and may 
contain
confidential information.  Any unauthorized review, use, disclosure or 
distribution
is prohibited.  If you are not the intended recipient, please contact the 
sender by
reply email and destroy all copies of the original message.
---
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] git-completion.bash: use correct Bash/Zsh array length syntax

2013-08-21 Thread Brandon Casey
From: Brandon Casey 

The syntax for retrieving the number of elements in an array is:

   ${#name[@]}

Signed-off-by: Brandon Casey 
---
 contrib/completion/git-completion.bash | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index 5da920e..e1b7313 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -2580,7 +2580,7 @@ if [[ -n ${ZSH_VERSION-} ]]; then
--*=*|*.) ;;
*) c="$c " ;;
esac
-   array[$#array+1]="$c"
+   array[${#array[@]}+1]="$c"
done
compset -P '*[=:]'
compadd -Q -S '' -p "${2-}" -a -- array && _ret=0
-- 
1.8.4.rc0.2.g6cf5c31


---
This email message is for the sole use of the intended recipient(s) and may 
contain
confidential information.  Any unauthorized review, use, disclosure or 
distribution
is prohibited.  If you are not the intended recipient, please contact the 
sender by
reply email and destroy all copies of the original message.
---
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC/PATCH 0/4] duplicate objects in packfiles

2013-08-21 Thread Jeff King
On Fri, Aug 16, 2013 at 11:01:38AM -0400, Jeff King wrote:

> That makes me inclined to teach index-pack to reject duplicate objects
> in a single pack in order to prevent denial-of-service attacks. We can
> potentially make them work in all code paths, but given that nobody
> should be doing this legitimately, rejecting the duplicates outright
> keeps our attack surface small, and nobody but attackers or users of
> broken implementations should care.

Here's a patch series in that direction:

  [1/4]: sha1-lookup: handle duplicate keys with GIT_USE_LOOKUP

This reproduces and fixes the sha1-lookup bug. We should do this no
matter what else we do.

  [2/4]: index-pack: optionally reject packs with duplicate objects

This adds a pack.indexDuplicates option so that sites receiving
packfiles from random folks on the internet can protect themselves from
the potential denial-of-service mentioned above. The default remains to
allow it.

  [3/4]: reject duplicates when indexing a packfile we created

This is a safety check for packs we generate. Optional, but I think it's
probably a good idea (and doesn't cost very much).

  [4/4]: index-pack: optionally skip duplicate packfile entries

I really wanted to have a "fix" mode where we could take in packs with
duplicate entries and just use them as-is. It's not correct to throw
away the duplicates (they may be bases in a delta cycle), but we could
maybe get by with simply not referencing them in the pack index.
Unfortunately, the pack reader does not like the index we generate; see
the patch for details and possible solutions (all of which are ugly).
And it only helps in a delta cycle when delta base offsets are used.

I had hoped to have a 5/4 flipping the default to "skip", since it would
potentially fix the infinite loop problem and wouldn't have a downside.
But since it doesn't work (and cannot fix the REF_DELTA case), it seems
like a bad idea.

Which leaves the open question: should the default for index-pack flip
to reject duplicates rather than allow? It seems like it would be worth
it to identify buggy packfiles before they move between repos. And in an
emergency, we have the config flag to be more permissive in case
somebody really needs to move the data via git.

Thoughts?

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] Revert "bash prompt: avoid command substitution when finalizing gitstring"

2013-08-21 Thread Brandon Casey
From: Brandon Casey 

This reverts commit 69a8141a5d81925b7e08cb228535e9ea4a7a02e3.

Old Bash (3.0) which is distributed with RHEL 4.X and other ancient
platforms that are still in wide use, does not have a printf that
supports -v.  Let's revert this patch and go back to using printf
in the traditional way.

Signed-off-by: Brandon Casey 
---
 contrib/completion/git-prompt.sh | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/contrib/completion/git-prompt.sh b/contrib/completion/git-prompt.sh
index a81ef5a..7698ec4 100644
--- a/contrib/completion/git-prompt.sh
+++ b/contrib/completion/git-prompt.sh
@@ -433,11 +433,7 @@ __git_ps1 ()
local gitstring="$c${b##refs/heads/}${f:+$z$f}$r$p"
 
if [ $pcmode = yes ]; then
-   if [[ -n ${ZSH_VERSION-} ]]; then
-   gitstring=$(printf -- "$printf_format" "$gitstring")
-   else
-   printf -v gitstring -- "$printf_format" "$gitstring"
-   fi
+   gitstring=$(printf -- "$printf_format" "$gitstring")
PS1="$ps1pc_start$gitstring$ps1pc_end"
else
printf -- "$printf_format" "$gitstring"
-- 
1.8.4.rc0.2.g6cf5c31


---
This email message is for the sole use of the intended recipient(s) and may 
contain
confidential information.  Any unauthorized review, use, disclosure or 
distribution
is prohibited.  If you are not the intended recipient, please contact the 
sender by
reply email and destroy all copies of the original message.
---
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/4] sha1-lookup: handle duplicate keys with GIT_USE_LOOKUP

2013-08-21 Thread Jeff King
The sha1_entry_pos function tries to be smart about
selecting the middle of a range for its binary search by
looking at the value differences between the "lo" and "hi"
constraints. However, it is unable to cope with entries with
duplicate keys in the sorted list.

We may hit a point in the search where both our "lo" and
"hi" point to the same key. In this case, the range of
values between our endpoints is 0, and trying to scale the
difference between our key and the endpoints over that range
is undefined (i.e., divide by zero). The current code
catches this with an "assert(lov < hiv)".

Moreover, after seeing that the first 20 byte of the key are
the same, we will try to establish a value from the 21st
byte. Which is nonsensical.

Instead, we can detect the case that we are in a run of
duplicates, and simply do a final comparison against any one
of them (since they are all the same, it does not matter
which). If the keys match, we have found our entry (or one
of them, anyway).  If not, then we know that we do not need
to look further, as we must be in a run of the duplicate
key.

Furthermore, we know that one of our endpoints must be
the edge of the run of duplicates. For example, given this
sequence:

 idx 0 1 2 3 4 5
 key A C C C C D

If we are searching for "B", we might hit the duplicate run
at lo=1, hi=3 (e.g., by first mi=3, then mi=0). But we can
never have lo > 1, because B < C. That is, if our key is
less than the run, we know that "lo" is the edge, but we can
say nothing of "hi". Similarly, if our key is greater than
the run, we know that "hi" is the edge, but we can say
nothing of "lo". But that is enough for us to return not
only "not found", but show the position at which we would
insert the new item.

Signed-off-by: Jeff King 
---
 sha1-lookup.c | 23 ++
 t/t5308-pack-detect-duplicates.sh | 97 +++
 2 files changed, 120 insertions(+)
 create mode 100755 t/t5308-pack-detect-duplicates.sh

diff --git a/sha1-lookup.c b/sha1-lookup.c
index c4dc55d..614cbb6 100644
--- a/sha1-lookup.c
+++ b/sha1-lookup.c
@@ -204,7 +204,30 @@ int sha1_entry_pos(const void *table,
 * byte 0 thru (ofs-1) are the same between
 * lo and hi; ofs is the first byte that is
 * different.
+*
+* If ofs==20, then no bytes are different,
+* meaning we have entries with duplicate
+* keys. We know that we are in a solid run
+* of this entry (because the entries are
+* sorted, and our lo and hi are the same,
+* there can be nothing but this single key
+* in between). So we can stop the search.
+* Either one of these entries is it (and
+* we do not care which), or we do not have
+* it.
 */
+   if (ofs == 20) {
+   mi = lo;
+   mi_key = base + elem_size * mi + key_offset;
+   cmp = memcmp(mi_key, key, 20);
+   if (!cmp)
+   return mi;
+   if (cmp < 0)
+   return -1 - hi;
+   else
+   return -1 - lo;
+   }
+
hiv = hi_key[ofs_0];
if (ofs_0 < 19)
hiv = (hiv << 8) | hi_key[ofs_0+1];
diff --git a/t/t5308-pack-detect-duplicates.sh 
b/t/t5308-pack-detect-duplicates.sh
new file mode 100755
index 000..4800379
--- /dev/null
+++ b/t/t5308-pack-detect-duplicates.sh
@@ -0,0 +1,97 @@
+#!/bin/sh
+
+test_description='handling of duplicate objects in incoming packfiles'
+. ./test-lib.sh
+
+# git will never intentionally create packfiles with
+# duplicate objects, so we have to construct them by hand.
+#
+# $1 is the number of times to duplicate each object (must be
+# less than 127 for simplicify of implementation).
+create_pack() {
+   {
+   # header magic
+   printf 'PACK' &&
+   # version 2
+   printf '\0\0\0\2' &&
+   # num of objects
+   printf '\0\0\0\'"$(printf "%o" $(($1*2)))" &&
+
+   for i in $(test_seq 1 "$1"); do
+   # blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
+   printf '\060\170\234\003\0\0\0\0\1' &&
+   # blob e68fe8129b546b101aee9510c5328e7f21ca1d18
+   printf '\062\170\234\143\267\3\0\0\116\0\106'
+   done
+   } >tmp.pack &&
+   # we store and cat the pack because we also have to output
+   # the sha1 pack trailer
+   cat tmp.pack

[PATCH 2/4] index-pack: optionally reject packs with duplicate objects

2013-08-21 Thread Jeff King
Git should never generate packs with duplicate objects.
However, we may see such packs due to bugs in Git or other
implementations (e.g., JGit had such a bug a few years ago).

In theory, such packs should not be a problem for us (we
will simply find one of the instances of the object when
looking in the pack). However, the JGit bug report mentioned
possible infinite loops during repacking due to cycles in
the delta chain.  Though this problem hasn't specifically
been reproduced on modern git, there is no reason not to be
careful with incoming packs, given that only buggy
implementations should be producing such packs, anyway.

This patch introduces the pack.indexDuplicates option to
allow or reject such packs from index-pack. The default
remains to allow it.

Signed-off-by: Jeff King 
---
 builtin/index-pack.c  | 10 ++
 pack-write.c  | 23 +++
 pack.h|  5 +
 t/t5308-pack-detect-duplicates.sh |  8 
 4 files changed, 46 insertions(+)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 9c1cfac..83fd3bb 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1369,6 +1369,16 @@ static int git_index_pack_config(const char *k, const 
char *v, void *cb)
 #endif
return 0;
}
+   if (!strcmp(k, "pack.indexduplicates")) {
+   int boolval = git_config_maybe_bool(k, v);
+   if (boolval > 0 || !strcmp(v, "ignore"))
+   opts->duplicates = WRITE_IDX_DUPLICATES_IGNORE;
+   else if (boolval == 0 || !strcmp(v, "reject"))
+   opts->duplicates = WRITE_IDX_DUPLICATES_REJECT;
+   else
+   die("unknown value for pack.indexduplicates: %s", v);
+   return 0;
+   }
return git_default_config(k, v, cb);
 }
 
diff --git a/pack-write.c b/pack-write.c
index ca9e63b..542b081 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -37,6 +37,19 @@ static int need_large_offset(off_t offset, const struct 
pack_idx_option *opts)
 sizeof(ofsval), cmp_uint32);
 }
 
+static void *find_duplicate(void *vbase, size_t n, size_t size,
+   int (*cmp)(const void *, const void *))
+{
+   unsigned char *base = vbase;
+   while (n > 1) {
+   if (!cmp(base, base + size))
+   return base;
+   base += size;
+   n--;
+   }
+   return NULL;
+}
+
 /*
  * On entry *sha1 contains the pack content SHA1 hash, on exit it is
  * the SHA1 hash of sorted object names. The objects array passed in
@@ -68,6 +81,16 @@ const char *write_idx_file(const char *index_name, struct 
pack_idx_entry **objec
else
sorted_by_sha = list = last = NULL;
 
+   if (opts->duplicates == WRITE_IDX_DUPLICATES_REJECT) {
+   struct pack_idx_entry **dup;
+
+   dup = find_duplicate(sorted_by_sha, nr_objects,
+sizeof(*sorted_by_sha), sha1_compare);
+   if (dup)
+   die("pack has duplicate entries for %s",
+   sha1_to_hex((*dup)->sha1));
+   }
+
if (opts->flags & WRITE_IDX_VERIFY) {
assert(index_name);
f = sha1fd_check(index_name);
diff --git a/pack.h b/pack.h
index aa6ee7d..cd4b536 100644
--- a/pack.h
+++ b/pack.h
@@ -44,6 +44,11 @@ struct pack_idx_option {
uint32_t version;
uint32_t off32_limit;
 
+   enum {
+   WRITE_IDX_DUPLICATES_IGNORE = 0,
+   WRITE_IDX_DUPLICATES_REJECT
+   } duplicates;
+
/*
 * List of offsets that would fit within off32_limit but
 * need to be written out as 64-bit entity for byte-for-byte
diff --git a/t/t5308-pack-detect-duplicates.sh 
b/t/t5308-pack-detect-duplicates.sh
index 4800379..0f2d928 100755
--- a/t/t5308-pack-detect-duplicates.sh
+++ b/t/t5308-pack-detect-duplicates.sh
@@ -94,4 +94,12 @@ test_expect_success 'lookup in duplicated pack 
(GIT_USE_LOOKUP)' '
test_cmp expect actual
 '
 
+test_expect_success 'index-pack can reject packs with duplicates' '
+   clear_packs &&
+   create_pack 2 >dups.pack &&
+   test_must_fail \
+   git -c pack.indexDuplicates=0 index-pack --stdin http://vger.kernel.org/majordomo-info.html


[PATCH 3/4] reject duplicates when indexing a packfile we created

2013-08-21 Thread Jeff King
The pack index-writing code is capable of detecting and
rejecting duplicate entries. This can be used with
index-pack to prevent buggy foreign packs from entering the
repository.  However, we can also be careful about the packs
we generate, by double-checking during the index write that
we do not have any duplicate objects. This should never
happen, but it serves as an additional check that we are not
generating such packfiles.

Signed-off-by: Jeff King 
---
This turns on rejection for everywhere _except_ a separately-run
index-pack. If we decide to turn it on there, too, it would make sense
to scrap this patch and just put it in reset_pack_idx_opts().

 builtin/pack-objects.c | 1 +
 bulk-checkin.c | 1 +
 fast-import.c  | 1 +
 3 files changed, 3 insertions(+)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index f069462..8da2a66 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2511,6 +2511,7 @@ int cmd_pack_objects(int argc, const char **argv, const 
char *prefix)
read_replace_refs = 0;
 
reset_pack_idx_option(&pack_idx_opts);
+   pack_idx_opts.duplicates = WRITE_IDX_DUPLICATES_REJECT;
git_config(git_pack_config, NULL);
if (!pack_compression_seen && core_compression_seen)
pack_compression_level = core_compression_level;
diff --git a/bulk-checkin.c b/bulk-checkin.c
index 6b0b6d4..bad191f 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -174,6 +174,7 @@ static void prepare_to_stream(struct bulk_checkin_state 
*state,
 
state->f = create_tmp_packfile(&state->pack_tmp_name);
reset_pack_idx_option(&state->pack_idx_opts);
+   state->pack_idx_opts.duplicates = WRITE_IDX_DUPLICATES_REJECT;
 
/* Pretend we are going to write only one object */
state->offset = write_pack_header(state->f, 1);
diff --git a/fast-import.c b/fast-import.c
index 23f625f..b7beab0 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -3360,6 +3360,7 @@ int main(int argc, char **argv)
 
setup_git_directory();
reset_pack_idx_option(&pack_idx_opts);
+   pack_idx_opts.duplicates = WRITE_IDX_DUPLICATES_REJECT;
git_config(git_pack_config, NULL);
if (!pack_compression_seen && core_compression_seen)
pack_compression_level = core_compression_level;
-- 
1.8.4.rc2.28.g6bb5f3f

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[DO NOT APPLY PATCH 4/4] index-pack: optionally skip duplicate packfile entries

2013-08-21 Thread Jeff King
When we are building the pack index, we can notice that
there are duplicate objects, pick one "winner" instance, and
mention the object only once in the index (mapped to the
winner's offset).

This has the effect that the duplicate packfile entries are
never found by lookup. The data still exists in the
packfile, though, and can be used as a delta base if delta
base offsets are used. If delta refs are used, then it is
possible that some deltas may be broken.

Unfortunately, this scheme does not quite work, as the pack
reader checks that the index and packfile claim the same
number of objects, and will refuse to open such a packfile.

We have a few options:

  1. Loosen the check in the reader. This drops a
 potentially useful sanity check on the data, and it
 won't work for any other implementations (including
 older versions of git).

  2. Loosen the check only if a special flag is found in the
 index indicating that we removed duplicates. This means
 that we only lose the safety check in the rare case
 that duplicates are found. But there is actually no
 place in the index to put such a flag, and in addition
 no other implementation would understand our flag.

  3. Instead of reducing the numnber of objects in the
 index, include "dummy" entries using the null sha1 or a
 similar sentinel value (and pointing to some invalid
 offset). This should work, but is awfully hacky (and
 will probably create havoc as we will now claim to have
 the null sha1, but will throw errors if you actually
 look at it).

Signed-off-by: Jeff King 
---
I think this line of "fixing" should probably be scrapped. Truly fixing
it and covering the REF_DELTA case would involve magic in the actual
object lookups (we would have to detect cycles and find the "other"
object that is the real base). And it's probably just not worth the
effort.

 builtin/index-pack.c  |  2 ++
 pack-write.c  | 30 ++
 pack.h|  3 ++-
 t/t5308-pack-detect-duplicates.sh |  8 
 4 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 83fd3bb..1dadd56 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1375,6 +1375,8 @@ static int git_index_pack_config(const char *k, const 
char *v, void *cb)
opts->duplicates = WRITE_IDX_DUPLICATES_IGNORE;
else if (boolval == 0 || !strcmp(v, "reject"))
opts->duplicates = WRITE_IDX_DUPLICATES_REJECT;
+   else if (!strcmp(v, "skip"))
+   opts->duplicates = WRITE_IDX_DUPLICATES_SKIP;
else
die("unknown value for pack.indexduplicates: %s", v);
return 0;
diff --git a/pack-write.c b/pack-write.c
index 542b081..657da2a 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -50,6 +50,27 @@ static void *find_duplicate(void *vbase, size_t n, size_t 
size,
return NULL;
 }
 
+static size_t remove_duplicates(void *base, size_t n, size_t size,
+   int (*cmp)(const void *, const void *))
+{
+   unsigned char *from = base, *to = base;
+
+   from += size;
+   to += size;
+   n--;
+
+   while (n > 0) {
+   if (cmp(from, from - size)) {
+   if (to != from)
+   memcpy(to, from, size);
+   to += size;
+   }
+   from += size;
+   n--;
+   }
+   return (to - (unsigned char *)base) / size;
+}
+
 /*
  * On entry *sha1 contains the pack content SHA1 hash, on exit it is
  * the SHA1 hash of sorted object names. The objects array passed in
@@ -89,6 +110,15 @@ const char *write_idx_file(const char *index_name, struct 
pack_idx_entry **objec
if (dup)
die("pack has duplicate entries for %s",
sha1_to_hex((*dup)->sha1));
+   } else if (opts->duplicates == WRITE_IDX_DUPLICATES_SKIP) {
+   int nr_unique = remove_duplicates(sorted_by_sha,
+ nr_objects,
+ sizeof(*sorted_by_sha),
+ sha1_compare);
+   if (nr_unique != nr_objects) {
+   nr_objects = nr_unique;
+   last = sorted_by_sha + nr_objects;
+   }
}
 
if (opts->flags & WRITE_IDX_VERIFY) {
diff --git a/pack.h b/pack.h
index cd4b536..3017ea4 100644
--- a/pack.h
+++ b/pack.h
@@ -46,7 +46,8 @@ struct pack_idx_option {
 
enum {
WRITE_IDX_DUPLICATES_IGNORE = 0,
-   WRITE_IDX_DUPLICATES_REJECT
+   WRITE_IDX_DUPLICATES_REJECT,
+   WRITE_IDX_DUPLICATES_SKIP
} duplicates;
 
/*
diff --git a/t/t5308-pack-detect-du

Re: [RFC PATCHv6 1/2] repack: rewrite the shell script in C

2013-08-21 Thread Junio C Hamano
Stefan Beller  writes:

> The motivation of this patch is to get closer to a goal of being
> able to have a core subset of git functionality built in to git.
> That would mean
>
>  * people on Windows could get a copy of at least the core parts
>of Git without having to install a Unix-style shell
>
>  * people deploying to servers don't have to rewrite the #! line
>or worry about the PATH and quality of installed POSIX
>utilities, if they are only using the built-in part written
>in C

I am not sure what is meant by the latter.  Rewriting #! is part of
any scripted Porcelain done by the top-level Makefile, and I do not
think we have seen any problem reports on it.

As to "quality of ... utilities", I think the real issue some people
in the thread had was not about "deploying to servers" but about
installing in a minimalistic chrooted environment where standard
tools may be lacking.

> diff --git a/builtin/repack.c b/builtin/repack.c
> new file mode 100644
> index 000..fb050c0
> --- /dev/null
> +++ b/builtin/repack.c
> @@ -0,0 +1,376 @@
> +/*
> + * The shell version was written by Linus Torvalds (2005) and many others.
> + * This is a translation into C by Stefan Beller (2013)
> + */

I am not sure if we want to record "ownership" in the code like
this; it will go stale over time.

> +#include "builtin.h"
> +#include "cache.h"
> +#include "dir.h"
> +#include "parse-options.h"
> +#include "run-command.h"
> +#include "sigchain.h"
> +#include "strbuf.h"
> +#include "string-list.h"
> +#include "argv-array.h"
> +
> +/* enabled by default since 22c79eab (2008-06-25) */

It may be of some value that by default --delta-base-offset is used,
but that can be read from the initialization.  Do we need this
comment?

> +static int delta_base_offset = 1;
> +char *packdir;

Does this have to be global?

> +static const char *const git_repack_usage[] = {
> + N_("git repack [options]"),
> + NULL
> +};
> +
> +static int repack_config(const char *var, const char *value, void *cb)
> +{
> + if (!strcmp(var, "repack.usedeltabaseoffset")) {
> + delta_base_offset = git_config_bool(var, value);
> + return 0;
> + }
> + return git_default_config(var, value, cb);
> +}
> +
> +/*
> + * Remove temporary $GIT_OBJECT_DIRECTORY/pack/.tmp-$$-pack-* files.
> + */
> +static void remove_temporary_files(void)
> +{
> + struct strbuf buf = STRBUF_INIT;
> + size_t dirlen, prefixlen;
> + DIR *dir;
> + struct dirent *e;
> +
> + /* .git/objects/pack */

We can read what is in there from two strbuf calls without comment.

> + strbuf_addstr(&buf, get_object_directory());
> + strbuf_addstr(&buf, "/pack");

More importantly, you already know what this directory and what
packtmp prefix are.

Also, you can keep &buf empty until opendir() succeeds.

> + dir = opendir(buf.buf);
> + if (!dir) {
> + strbuf_release(&buf);
> + return;
> + }
> +
> + /* .git/objects/pack/.tmp-$$-pack-* */
> + dirlen = buf.len + 1;

Likewise; it is a good idea to document what "dirlen" points at,
though.

> + strbuf_addf(&buf, "/.tmp-%d-pack-", (int)getpid());
> + prefixlen = buf.len - dirlen;

So in summary:

dir = opendir(packdir);
if (!dir)
return;

strbuf_addf(&buf, "%s-", packtmp);

/* Point at the slash at the end of ".../objects/pack/" */
dirlen = strlen(packdir) + 1;
/* Point at the dash at the end of ".../.tmp-%d-pack-" */
prefixlen = buf.len - dirlen;

You would need to move the initialization of packdir and packtmp
before sigchain_push() in cmd_repack() if you were to do this.

> + while ((e = readdir(dir))) {
> + if (strncmp(e->d_name, buf.buf + dirlen, prefixlen))
> + continue;
> + strbuf_setlen(&buf, dirlen);
> + strbuf_addstr(&buf, e->d_name);
> + unlink(buf.buf);

This unlink(2) could fail, but there is not much we could do here.

> + }
> + closedir(dir);
> + strbuf_release(&buf);
> +}
> +
> +static void remove_pack_on_signal(int signo)
> +{
> + remove_temporary_files();
> + sigchain_pop(signo);
> + raise(signo);
> +}
> +
> +static void get_pack_filenames(struct string_list *fname_list)
> +{
> + DIR *dir;
> + struct dirent *e;
> + char *fname;
> +
> + if (!(dir = opendir(packdir)))
> + return;
> +
> + while ((e = readdir(dir)) != NULL) {
> + if (suffixcmp(e->d_name, ".pack"))
> + continue;

We may want to tighten this to ignore cruft that does not match

/^pack-[0-9a-f]{40}\.pack$/

in a later patch, but this is a faithful rewrite from the original.

> + size_t len = strlen(e->d_name) - strlen(".pack");

decl-after-stmt.

> + fname = xmemdupz(e->d_name, len);
> +
> + if (!file_exists(mkpath("%s/%s.keep", packdir, fname)))
> + string_lis

Re: [RFC/PATCH] git-remote-mediawiki: reset private ref after non-dumb push

2013-08-21 Thread Matthieu Moy
Felipe Contreras  writes:

> On Tue, Aug 13, 2013 at 8:31 AM, Matthieu Moy  wrote:
>
>> Felipe: Is this the right fix for git-remote-mediawiki? Any better idea?
>
> Why not keep track of the revisions yourself? You can have file where
> you store which was the last revision that was fetched.

I don't really understand the point of the "private namespace" anymore I
guess. Why do we have both refs/remotes/$remote and
refs/$foreign_vcs/$remote, if they are always kept in sync?

Keeping the last imported revision in a separate file would be possible,
but then we'd have information about the remote in one file plus two
refs, and I don't understand why we need to split the information in so
many places. A ref seemed the right tool to store a revision.

(These are genuine questions, you have far more experience than me with
remote-helpers)

Thanks,

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t3010 broken by 2eac2a4

2013-08-21 Thread Junio C Hamano
Brian Gernhardt  writes:

> With 2eac2a4: "ls-files -k: a directory only can be killed if the index has a 
> non-directory" applied, t3010 fails test 3 "validate git ls-files -k output". 
>  It ends up missing the pathx/ju/nk file.
>
> OS X 10.8.4
> Xcode 4.6.3
> clang "Apple LLVM version 4.2 (clang-425.0.28) (based on LLVM 3.2svn)" 

Very interesting, as it obviously does not reproduce for me.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] Revert "bash prompt: avoid command substitution when finalizing gitstring"

2013-08-21 Thread Junio C Hamano
Brandon Casey  writes:

> From: Brandon Casey 
>
> This reverts commit 69a8141a5d81925b7e08cb228535e9ea4a7a02e3.
>
> Old Bash (3.0) which is distributed with RHEL 4.X and other ancient
> platforms that are still in wide use, does not have a printf that
> supports -v.  Let's revert this patch and go back to using printf
> in the traditional way.
>
> Signed-off-by: Brandon Casey 
> ---

Is this something you can detect at load-time once, store the result
in a private variable and then switch on it at runtime, something
along the lines of...


# on load...
printf -v __git_printf_supports_v -- "%s" yes >/dev/null 2>&1

...

if test "${__git_printf_supports_v}" = yes
then
printf -v gitstring -- "$printf_format" "$gitstring"
else
gitstring=$(printf -- "$printf_format" "$gitstring")
fi


>  contrib/completion/git-prompt.sh | 6 +-
>  1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/contrib/completion/git-prompt.sh 
> b/contrib/completion/git-prompt.sh
> index a81ef5a..7698ec4 100644
> --- a/contrib/completion/git-prompt.sh
> +++ b/contrib/completion/git-prompt.sh
> @@ -433,11 +433,7 @@ __git_ps1 ()
>   local gitstring="$c${b##refs/heads/}${f:+$z$f}$r$p"
>  
>   if [ $pcmode = yes ]; then
> - if [[ -n ${ZSH_VERSION-} ]]; then
> - gitstring=$(printf -- "$printf_format" "$gitstring")
> - else
> - printf -v gitstring -- "$printf_format" "$gitstring"
> - fi
> + gitstring=$(printf -- "$printf_format" "$gitstring")
>   PS1="$ps1pc_start$gitstring$ps1pc_end"
>   else
>   printf -- "$printf_format" "$gitstring"
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCHv6 1/2] repack: rewrite the shell script in C

2013-08-21 Thread Matthieu Moy
Junio C Hamano  writes:

> Stefan Beller  writes:
>
>> The motivation of this patch is to get closer to a goal of being
>> able to have a core subset of git functionality built in to git.
>> That would mean
>>
>>  * people on Windows could get a copy of at least the core parts
>>of Git without having to install a Unix-style shell
>>
>>  * people deploying to servers don't have to rewrite the #! line
>>or worry about the PATH and quality of installed POSIX
>>utilities, if they are only using the built-in part written
>>in C
>
> I am not sure what is meant by the latter.  Rewriting #! is part of
> any scripted Porcelain done by the top-level Makefile, and I do not
> think we have seen any problem reports on it.

I think the case of a server with exotic OS and totally broken /bin/sh
would also benefit from this (the user won't have to find a non-broken
sh and point SHELL_PATH to it). I have no concrete example though.

>> +size_t len = strlen(e->d_name) - strlen(".pack");
>
> decl-after-stmt.

Stefan: you can make sure this does not happen again by adding

CFLAGS += -Wdeclaration-after-statement

in config.mak.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Document the HTTP transport protocols

2013-08-21 Thread Jeff King
On Wed, Aug 21, 2013 at 08:45:13PM +0700, Nguyen Thai Ngoc Duy wrote:

>  On the topic, C Git's (maybe) violations on this spec are:
>  
>   - The client does not strip trailing slashes from $GIT_URL before
> sending to the server, as described in section "URL Format".

Yeah. We get the basic gist right by not adding an extra "/" if there is
already a trailing slash (so you do not have http://host/path//info/refs";).
But we do not go out of our way to remove multiple slashes that the user
hands out (either at the end or in the middle of the URL). I doubt that
it matters in practice.

>   - The client does not check that HTTP status code is either 200 or
> 304 when receiving response in discovering references phase.

We rely on curl's CURLOPT_FAILONERROR to handle errors. And curl handles
redirects internally. So yes, we could get a "204" or something weird,
but it would almost certainly not pass the other checks (proper
content-type, starting with pkt-line, etc). I doubt it's a problem in
practice.

We also handle 401 these days, which is not in the document, but
obviously makes sense to do (ditto for 407, but I cannot remember if we
actually handle that or not; there were patches, but I think they may
have been dropped).

>   - The client verifies the first 5 bytes against pattern
> "^[0-9a-fA-F]{4}#" instead of "^[0-9a-f]{4}#" as described in
> section "discovering references".

I think this could be counted as "be liberal in what you accept",
although I do not know offhand of any implementations that use
uppercase. But if it is not true pkt-line we would figure it out pretty
quickly anyway.

> [...]

I read through the rest and did not see anything inaccurate. Thanks for
working on this.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] Revert "bash prompt: avoid command substitution when finalizing gitstring"

2013-08-21 Thread Brandon Casey
On Wed, Aug 21, 2013 at 2:47 PM, Junio C Hamano  wrote:
> Brandon Casey  writes:
>
>> From: Brandon Casey 
>>
>> This reverts commit 69a8141a5d81925b7e08cb228535e9ea4a7a02e3.
>>
>> Old Bash (3.0) which is distributed with RHEL 4.X and other ancient
>> platforms that are still in wide use, does not have a printf that
>> supports -v.  Let's revert this patch and go back to using printf
>> in the traditional way.
>>
>> Signed-off-by: Brandon Casey 
>> ---
>
> Is this something you can detect at load-time once, store the result
> in a private variable and then switch on it at runtime, something
> along the lines of...
>
>
> # on load...
> printf -v __git_printf_supports_v -- "%s" yes >/dev/null 2>&1
>
> ...
>
> if test "${__git_printf_supports_v}" = yes
> then
> printf -v gitstring -- "$printf_format" "$gitstring"
> else
> gitstring=$(printf -- "$printf_format" "$gitstring")
> fi

Yes, that appears to work.

-Brandon


>>  contrib/completion/git-prompt.sh | 6 +-
>>  1 file changed, 1 insertion(+), 5 deletions(-)
>>
>> diff --git a/contrib/completion/git-prompt.sh 
>> b/contrib/completion/git-prompt.sh
>> index a81ef5a..7698ec4 100644
>> --- a/contrib/completion/git-prompt.sh
>> +++ b/contrib/completion/git-prompt.sh
>> @@ -433,11 +433,7 @@ __git_ps1 ()
>>   local gitstring="$c${b##refs/heads/}${f:+$z$f}$r$p"
>>
>>   if [ $pcmode = yes ]; then
>> - if [[ -n ${ZSH_VERSION-} ]]; then
>> - gitstring=$(printf -- "$printf_format" "$gitstring")
>> - else
>> - printf -v gitstring -- "$printf_format" "$gitstring"
>> - fi
>> + gitstring=$(printf -- "$printf_format" "$gitstring")
>>   PS1="$ps1pc_start$gitstring$ps1pc_end"
>>   else
>>   printf -- "$printf_format" "$gitstring"
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCHv6 1/2] repack: rewrite the shell script in C

2013-08-21 Thread Stefan Beller
On 08/21/2013 10:56 PM, Junio C Hamano wrote:
> Stefan Beller  writes:
> 
>> The motivation of this patch is to get closer to a goal of being
>> able to have a core subset of git functionality built in to git.
>> That would mean
>>
>>  * people on Windows could get a copy of at least the core parts
>>of Git without having to install a Unix-style shell
>>
>>  * people deploying to servers don't have to rewrite the #! line
>>or worry about the PATH and quality of installed POSIX
>>utilities, if they are only using the built-in part written
>>in C
> 
> I am not sure what is meant by the latter.  Rewriting #! is part of
> any scripted Porcelain done by the top-level Makefile, and I do not
> think we have seen any problem reports on it.
> 
> As to "quality of ... utilities", I think the real issue some people
> in the thread had was not about "deploying to servers" but about
> installing in a minimalistic chrooted environment where standard
> tools may be lacking.
> 
>> diff --git a/builtin/repack.c b/builtin/repack.c
>> new file mode 100644
>> index 000..fb050c0
>> --- /dev/null
>> +++ b/builtin/repack.c
>> @@ -0,0 +1,376 @@
>> +/*
>> + * The shell version was written by Linus Torvalds (2005) and many others.
>> + * This is a translation into C by Stefan Beller (2013)
>> + */
> 
> I am not sure if we want to record "ownership" in the code like
> this; it will go stale over time.

I'll remove it. Initially I put it there as I found similar 
comments in other files as well.


>> +static int delta_base_offset = 1;
>> +char *packdir;
> 
> Does this have to be global?

We could pass it to all the functions, making it not global.
I'd be ok with that for the functions get_pack_filenames 
and remove_redundant_pack, but we also need to know
packdir in remove_temporary_files which is called from
the signal handler remove_pack_on_signal.

As the path is pretty obvious (get_object_directory() + "/pack"),
we could however also construct it again in the signal handler.


> So in summary:
> 
>   dir = opendir(packdir);
> if (!dir)
>   return;
> 
>   strbuf_addf(&buf, "%s-", packtmp);

packtmp is not yet a global variable, but could be passed to 
to this function. Currently we're reconstructing it here.

> 
> /* Point at the slash at the end of ".../objects/pack/" */
>   dirlen = strlen(packdir) + 1;
> /* Point at the dash at the end of ".../.tmp-%d-pack-" */
> prefixlen = buf.len - dirlen;
> 
> You would need to move the initialization of packdir and packtmp
> before sigchain_push() in cmd_repack() if you were to do this.

Ah ok, I'll do so.

>> +
>> +if (!file_exists(mkpath("%s/%s.keep", packdir, fname)))
>> +string_list_append_nodup(fname_list, fname);
> 
> mental note: this is getting names of non-kept packs, not all packs.

I should document that. ;)


>> +while (strbuf_getline(&line, out, '\n') != EOF) {
>> +if (line.len != 40)
>> +die("repack: Expecting 40 character sha1 lines only 
>> from pack-objects.");
>> +strbuf_addstr(&line, "");
> 
> What is this addstr() about?

According to the documentation of strbufs, we cannot assume to have sane 
strings, but anything. Adding an empty string however will make sure to
add a NUL-terminated string to the buffer, no?

In a previous roll of this patch, which operated on char* line,
there was just line[40] = '\0'; // replacing '\n' by '\0'
to have it sane in the string list.


> 
>> +string_list_append(&names, line.buf);
>> +count_packs++;
> 
> It probably is more in line with our naming convention to call this
> nr_packs, num_packs, etc.  "count_packs" sounds more like a boolean
> that instructs the code to either count or not bother counting,
> which this thing is not.

This is something subtle, but important to know. Thanks, will be fixed in
the reroll.


>> +
>> +if (rename(fname, fname_old)) {
>> +failed = 1;
>> +break;
> 
> "break"-ing from here leaks fname_old.  As the only out-of-line call
> file_exists() is just a thin wrapper around lstat(), I think it is
> fine not to pathdup the fname_old here.

fixed

I'd really appreciate, if there was documentation on these functions.
(When is mkpath safe? What is better in which situation: mkpath or strbufs?)
Maybe I could start doing it (but only those functions I used so far,
there are many more in cache.h)

> 
>> +}
>> +string_list_append_nodup(&rollback, fname);
>> +free(fname);
> 
> This looks bad, doesn't it?  append_nodup() lets &rollback string
> list to take the ownership of the piece of memory pointed at by
> fname, but then you free it here, no?
> 
> If you initialize &rollback with INIT_NODUP, you would not have to
> call append_nodup().

Removed the free.
Having rollback initialized with NODUP and then not

Re: [RFC/PATCH 0/4] duplicate objects in packfiles

2013-08-21 Thread Junio C Hamano
Jeff King  writes:

> Which leaves the open question: should the default for index-pack flip
> to reject duplicates rather than allow?

I'd say so.

In am emergency, people with broken packfiles can feed them to
unpack-objects ;-)

More seriously, an alternative emergency mode may be an option to
unpack-objects that does not unpack objects but streams to a new
pack or something to resurret the objects that are salvageable from
a corrupt packfile.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Dokumenting api-paths.txt

2013-08-21 Thread Stefan Beller
On 08/20/2013 11:59 PM, Jonathan Nieder wrote:
> Stefan Beller wrote:
 On 08/20/2013 03:31 PM, Johannes Sixt wrote:
> Stefan Beller wrote:
> 
>> +packdir = mkpathdup("%s/pack", get_object_directory());
>> +packtmp = mkpathdup("%s/.tmp-%d-pack", packdir, getpid());
>
> Should this not be
>
> packdir = xstrdup(git_path("pack"));
> packtmp = xstrdup(git_path("pack/.tmp-%d-pack", getpid()));
> [...]
>> So if I have 
>>  packdir = xstrdup(git_path("pack"));
>>  ...
>>  path = git_path("%s/%s", packdir, filename)
>>
>> This produces something as:
>> .git/.git/objects/pack/.tmp-13199-pack-c59c5758ef159b272f6ab10cb9fadee443966e71.idx
>> definitely having one .git too much.
> 
> The version with get_object_directory() was right.  The object
> directory is not even necessarily under .git/, since it can be
> overridden using the GIT_OBJECT_DIRECTORY envvar.
> 
>> Also interesting to add would be that git_path operates in the
>> .git/objects directory?
> 
> git_path is for resolving paths within GIT_DIR, such as
> git_path("config") and git_path("COMMIT_EDITMSG").
> 
> Jonathan
> 

Before we're doing double work, I just wrote down my understanding
so far. Feel free to tweak it, or remove obvious parts.

Thanks,
Stefan


---
path API


The functions described in this document are meant to be
used when dealing with pathes in the filesystem. The functions
are just for the string manipulations of the pathes, none of
the functions touches the actual filesystem.


`mkpath`::
The parameters are in printf format. This function can be
used to construct short-lived filename strings. It is meant
to be used for direct use in system functions such as
dir(mkpath("%s/pack", get_objects_directory())).
The return value is a pointer to such a sanitized filename
string, but it resides in a static buffer, so it will
be overwritten by the next call to mkpath (or other functions?)
This function only does string handling. It doesn't actually
change anything on the filesystem. (This is not Gits mkdir -p)

`mkpathdup`::
The same as mkpath, but the memory is duplicated into a new
buffer, so it is not short-lived, but stays as long as the
caller doesn't free the memory, which the caller is supposed
to do.

`xstrdup`::
Duplicates the given string, making the caller responsible
to free the return value. Basically the same as strdup(2)
with errorhandling.

I am not sure if this belongs into the path api documentation,
but it's not documented anywhere else.

`git_path`::
git_path is for resolving paths within GIT_DIR, such as
git_path("config") and git_path("COMMIT_EDITMSG").
This is similar to mkpath, returning a pointer to a static
buffer, which may be overwritten soon.

`git_pathdup`::
The same as git_path, but creating a new buffer. The caller
is responsible to free the returned buffer.


`git_path_submodule`::

`mksnpath`::

`git_snpath`::

`sha1_file_name`::
Returns the filename to a given sha1 value within
the objects directory.

`sha1_pack_name`::

`sha1_pack_index_name`::




signature.asc
Description: OpenPGP digital signature


Re: [PATCH] rebase --preserve-merges: ignore "merge.log" config

2013-08-21 Thread Junio C Hamano
Ralf Thielow  writes:

> When "merge.log" config is set, "rebase --preserve-merges"
> will add the log lines to the message of the rebased merge
> commit.
>
> A rebase should not modify a commit message automatically.
>
> Teach "git-rebase" to ignore that configuration by passing "--no-log"
> to the git-merge call.
>
> Signed-off-by: Ralf Thielow 
> ---

Thanks; will queue with the following squashed-in.

 t/t3409-rebase-preserve-merges.sh | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/t/t3409-rebase-preserve-merges.sh 
b/t/t3409-rebase-preserve-merges.sh
index 2454811..8c251c5 100755
--- a/t/t3409-rebase-preserve-merges.sh
+++ b/t/t3409-rebase-preserve-merges.sh
@@ -113,10 +113,8 @@ test_expect_success 'rebase -p ignores merge.log config' '
cd clone4 &&
git fetch &&
git -c merge.log=1 rebase -p origin/topic &&
-   cat >expected <<-\EOF &&
-
-   EOF
-   git log --format="%b" -1 >current
+   echo >expected &&
+   git log --format="%b" -1 >current &&
test_cmp expected current
)
 '
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCHv6 1/2] repack: rewrite the shell script in C

2013-08-21 Thread Junio C Hamano
Stefan Beller  writes:

>>> +static int delta_base_offset = 1;
>>> +char *packdir;
>> 
>> Does this have to be global?
>
> We could pass it to all the functions, making it not global.

Sorry for being unclear; I meant "not static".  It is perfectly fine
for this to be a file-scope static.

>>> +
>>> +   if (!file_exists(mkpath("%s/%s.keep", packdir, fname)))
>>> +   string_list_append_nodup(fname_list, fname);
>> 
>> mental note: this is getting names of non-kept packs, not all packs.
>
> I should document that. ;)

Rather, consider giving the function a better name, perhaps?

>>> +   while (strbuf_getline(&line, out, '\n') != EOF) {
>>> +   if (line.len != 40)
>>> +   die("repack: Expecting 40 character sha1 lines only 
>>> from pack-objects.");
>>> +   strbuf_addstr(&line, "");
>> 
>> What is this addstr() about?
>
> According to the documentation of strbufs, we cannot assume to have sane 
> strings, but anything.

Sorry, I do not get this.  What is a sane string and what is an
insane string?  sb->buf[sb-len] is always terminated with a NUL
when strbuf_getline() returns success, isn't it?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCHv6 1/2] repack: rewrite the shell script in C

2013-08-21 Thread Stefan Beller
On 08/22/2013 12:50 AM, Junio C Hamano wrote:
> Stefan Beller  writes:
> 
 +static int delta_base_offset = 1;
 +char *packdir;
>>>
>>> Does this have to be global?
>>
>> We could pass it to all the functions, making it not global.
> 
> Sorry for being unclear; I meant "not static".  It is perfectly fine
> for this to be a file-scope static.

No need to be sorry! I am sleepy, and may missunderstand even clear
messages. I'll change it to static of course.

> 
 +
 +  if (!file_exists(mkpath("%s/%s.keep", packdir, fname)))
 +  string_list_append_nodup(fname_list, fname);
>>>
>>> mental note: this is getting names of non-kept packs, not all packs.
>>
>> I should document that. ;)
> 
> Rather, consider giving the function a better name, perhaps?

What about one of:
get_non_kept_pack_filenames
get_prunable_pack_filenames
get_remove_candidate_pack_filenames

> 
 +  while (strbuf_getline(&line, out, '\n') != EOF) {
 +  if (line.len != 40)
 +  die("repack: Expecting 40 character sha1 lines only 
 from pack-objects.");
 +  strbuf_addstr(&line, "");
>>>
>>> What is this addstr() about?
>>
>> According to the documentation of strbufs, we cannot assume to have sane 
>> strings, but anything.
> 
> Sorry, I do not get this.  What is a sane string and what is an
> insane string?  sb->buf[sb-len] is always terminated with a NUL
> when strbuf_getline() returns success, isn't it?
> 

I should read the strbuf documentation again. Thanks for pointing it
out. I'll remove the strbuf_addstr(&line, "");

Thanks for your patience in the reviews,
Stefan



signature.asc
Description: OpenPGP digital signature


Re: [DO NOT APPLY PATCH 4/4] index-pack: optionally skip duplicate packfile entries

2013-08-21 Thread Junio C Hamano
Jeff King  writes:

> When we are building the pack index, we can notice that
> there are duplicate objects, pick one "winner" instance, and
> mention the object only once in the index (mapped to the
> winner's offset).
>
> This has the effect that the duplicate packfile entries are
> never found by lookup. The data still exists in the
> packfile, though, and can be used as a delta base if delta
> base offsets are used. If delta refs are used, then it is
> possible that some deltas may be broken.

I do not understand the last bit.  If two copies of an object exist
but you have only one slot for the object in the index, and another
object names it as its base with ref-delta, then reconstituting it
should work just fine---whichever representation of the base object
is recorded in the .idx, that first needs to be reconstituted before
the delta is applied to it, and both copies should yield identical
contents for the delta base object, no?

In any case, ejecting one from the pack .idx would not help in the
presense of either broken or malicious packer that reuses delta too
aggressively.  Suppose you have objects A and B and somehow manage
to create a cycle of deltas, A names B as its delta base and B names
A as its delta base.  The packer may notice its mistake and then add
another copy of A as a base object.  The pack contains two copies of
A (one is delta based on B, the other is full) and B (delta against
A).

If B refers to the copy of A that is delta against B using ofs-delta,
fixing the .idx file will have no effect.  read_packed_sha1(B) will
read the delta data of B, finds the offset to start reading the data
for A which was excised from the .idx and unless that codepath is
updated to consult revindex (i.e. you mark one copy of A in the .pack
as bad, and when B refers to that bad copy of A via ofs-delta, you
check the offset against revindex to get the object name of A and go
to the good copy of A), you will never finish reading B because
reading the bad copy of A will lead you to first reconstitute B.

> I think this line of "fixing" should probably be scrapped.

I tend to agree.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3] t3404 incremental improvements

2013-08-21 Thread Junio C Hamano
Eric Sunshine  writes:

> This set of patches was meant to be a re-roll of [1] addressing Junio's
> comments, however [1] graduated to 'next' before I found time to work on
> it further, so these are instead incremental patches atop 'next'.

Just FYI, 'next' will be rewound once the upcoming release is done,
so we have a chance to rewind and squash.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] t3404: make tests more self-contained

2013-08-21 Thread Junio C Hamano
Eric Sunshine  writes:

> As its very first action, t3404 installs (via set_fake_editor) a
> specialized $EDITOR which simplifies automated 'rebase -i' testing. Many
> tests rely upon this setting, thus tests which need a different editor
> must take extra care upon completion to restore $EDITOR in order to
> avoid breaking following tests. This places extra burden upon such tests
> and requires that they undesirably have extra knowledge about
> surrounding tests. Ease this burden by having each test install the
> $EDITOR it requires, rather than relying upon a global setting.
>
> Signed-off-by: Eric Sunshine 
> ---

Makes sense.  Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 2/3] branch: mark missing tracking branch as gone

2013-08-21 Thread Jiang Xin
2013/8/21 Matthieu Moy :
> Jiang Xin  writes:
>
>> $ git status
>> # On branch topic
>> # Your branch is based on 'topicbase', but the upstream is gone.
>> #   (use "git branch --unset-upstream" to fixup)
>
> Sorry, I didn't follow closely the previous discussions. I'm not sure
> "gone" is right either, since the user may just have configured an
> upstream that does not exist and never existed. Perhaps "absent" would
> be better.
>
> Just a thought, shouldn't block the patch.

Thank you for following this, and offering better statements. I will
make another reroll after the end of my business trip this week.

-- 
Jiang Xin
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 0/2] git-send-email: Message-ID caching

2013-08-21 Thread Junio C Hamano
Rasmus Villemoes  writes:

>> Rasmus Villemoes  writes:
>>>  my %config_path_settings = (
>>> @@ -311,6 +314,7 @@ my $rc = GetOptions("h" => \$help,
>>> "8bit-encoding=s" => \$auto_8bit_encoding,
>>> "compose-encoding=s" => \$compose_encoding,
>>> "force" => \$force,
>>> +   "msgid-cache-file=s" => \$msgid_cache_file,
>>>  );
>>
>> Is there a standard, recommended location we suggest users to store
>> this?  
>
> I don't know. It is obviously a local, per-repository, thing. I don't
> know enough about git's internals to know if something breaks if one
> puts it in .git (say, ".git/msgid.cache").

I think $GIT_DIR is OK, when we _know_ we are in a Git controlled
directory.  "git send-email" can however be invoked in a random
directory that is _not_ a Git controlled directory, though.

In any case, if we were to store it inside $GIT_DIR, I'd prefer to
have "send-email" somewhere in the name of the file, as there are
other Git programs that deal with things that have "msgid" (notably,
"am") that will not have anything to do with this file.

> If storing it under .git is possible, one could consider making the
> option a boolean ('msgid-use-cache' ?) and always use
> ".git/msgid.cache".

Another possibility is to have it in the output directory specified
via the "format-patch -o $dir" option.  When you are rerolling a
series multiple times, you will only look at the message ID from the
previous round; you do not even need to look at old messages in an
unrelated topic.

I could imagine that

git send-email $dir/0*.txt

can notice that these input files are all in the same $dir
directory, check to see if $dir/message-id file exists, read it to
offer it as the suggested initial-reply-to.  Similarly, when sending
the _first_ message in such an invocation, it can just write the
generated message-id to that file.  Then we need no choices.  It is
sufficient to just keep a single message-id of the first message in
the previous round and offer it as a possible initial-reply-to in a
Yes/No question.

Just a random thought.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] Revert "bash prompt: avoid command substitution when finalizing gitstring"

2013-08-21 Thread Junio C Hamano
Brandon Casey  writes:

> On Wed, Aug 21, 2013 at 2:47 PM, Junio C Hamano  wrote:
>> Brandon Casey  writes:
>>
>>> From: Brandon Casey 
>>>
>>> This reverts commit 69a8141a5d81925b7e08cb228535e9ea4a7a02e3.
>>>
>>> Old Bash (3.0) which is distributed with RHEL 4.X and other ancient
>>> platforms that are still in wide use, does not have a printf that
>>> supports -v.  Let's revert this patch and go back to using printf
>>> in the traditional way.
>>>
>>> Signed-off-by: Brandon Casey 
>>> ---
>>
>> Is this something you can detect at load-time once, store the result
>> in a private variable and then switch on it at runtime, something
>> along the lines of...
>>
>>
>> # on load...
>> printf -v __git_printf_supports_v -- "%s" yes >/dev/null 2>&1
>>
>> ...
>>
>> if test "${__git_printf_supports_v}" = yes
>> then
>> printf -v gitstring -- "$printf_format" "$gitstring"
>> else
>> gitstring=$(printf -- "$printf_format" "$gitstring")
>> fi
>
> Yes, that appears to work.

A real patch needs to be a bit more careful, though.  The variable
needs to be cleared before all of the above, and the testing would
want to consider that the variable may not be set (i.e. use
"${var-}" when checking).

Thanks.

> -Brandon
>
>
>>>  contrib/completion/git-prompt.sh | 6 +-
>>>  1 file changed, 1 insertion(+), 5 deletions(-)
>>>
>>> diff --git a/contrib/completion/git-prompt.sh 
>>> b/contrib/completion/git-prompt.sh
>>> index a81ef5a..7698ec4 100644
>>> --- a/contrib/completion/git-prompt.sh
>>> +++ b/contrib/completion/git-prompt.sh
>>> @@ -433,11 +433,7 @@ __git_ps1 ()
>>>   local gitstring="$c${b##refs/heads/}${f:+$z$f}$r$p"
>>>
>>>   if [ $pcmode = yes ]; then
>>> - if [[ -n ${ZSH_VERSION-} ]]; then
>>> - gitstring=$(printf -- "$printf_format" "$gitstring")
>>> - else
>>> - printf -v gitstring -- "$printf_format" "$gitstring"
>>> - fi
>>> + gitstring=$(printf -- "$printf_format" "$gitstring")
>>>   PS1="$ps1pc_start$gitstring$ps1pc_end"
>>>   else
>>>   printf -- "$printf_format" "$gitstring"
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] Revert "bash prompt: avoid command substitution when finalizing gitstring"

2013-08-21 Thread Brandon Casey
On Wed, Aug 21, 2013 at 5:22 PM, Junio C Hamano  wrote:
> Brandon Casey  writes:
>
>> On Wed, Aug 21, 2013 at 2:47 PM, Junio C Hamano  wrote:

>>> # on load...
>>> printf -v __git_printf_supports_v -- "%s" yes >/dev/null 2>&1
>>>
>>> ...
>>>
>>> if test "${__git_printf_supports_v}" = yes
>>> then
>>> printf -v gitstring -- "$printf_format" "$gitstring"
>>> else
>>> gitstring=$(printf -- "$printf_format" "$gitstring")
>>> fi
>>
>> Yes, that appears to work.
>
> A real patch needs to be a bit more careful, though.  The variable
> needs to be cleared before all of the above,

Agreed.

> and the testing would
> want to consider that the variable may not be set (i.e. use
> "${var-}" when checking).

Why is "${var-}" necessary?  Wouldn't that be equivalent to "${var}"
or "$var"?  We obviously wouldn't want to do 'if test $var = yes', but
I would have thought it was sufficient to wrap the variable
dereference in quotes as your original did.

-Brandon
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [DO NOT APPLY PATCH 4/4] index-pack: optionally skip duplicate packfile entries

2013-08-21 Thread Jeff King
On Wed, Aug 21, 2013 at 04:20:07PM -0700, Junio C Hamano wrote:

> I do not understand the last bit.  If two copies of an object exist
> but you have only one slot for the object in the index, and another
> object names it as its base with ref-delta, then reconstituting it
> should work just fine---whichever representation of the base object
> is recorded in the .idx, that first needs to be reconstituted before
> the delta is applied to it, and both copies should yield identical
> contents for the delta base object, no?
> 
> In any case, ejecting one from the pack .idx would not help in the
> presense of either broken or malicious packer that reuses delta too
> aggressively.  Suppose you have objects A and B and somehow manage
> to create a cycle of deltas, A names B as its delta base and B names
> A as its delta base.  The packer may notice its mistake and then add
> another copy of A as a base object.  The pack contains two copies of
> A (one is delta based on B, the other is full) and B (delta against
> A).

Yes, that is the potentially problematic scenario...

> If B refers to the copy of A that is delta against B using ofs-delta,
> fixing the .idx file will have no effect.  read_packed_sha1(B) will
> read the delta data of B, finds the offset to start reading the data
> for A which was excised from the .idx and unless that codepath is
> updated to consult revindex (i.e. you mark one copy of A in the .pack
> as bad, and when B refers to that bad copy of A via ofs-delta, you
> check the offset against revindex to get the object name of A and go
> to the good copy of A), you will never finish reading B because
> reading the bad copy of A will lead you to first reconstitute B.

Yes. There are two levels of brokenness here.

One is that the pack has a delta base offset for B that leads to the
"wrong" A, creating a cycle. This is an utterly broken pack and cannot
be fixed automatically (you do not even know the name of the cycled A
because you cannot constitute it to find its name and realize that it
has a duplicate!). But we should notice this during index-pack, because
we cannot reconstruct the object.

Another situation is that your delta base points to the "right" A. You
can reconstruct either A or B, because although there is a duplicate,
there are no cycles in the delta chain (i.e., the chain does not care
about object name, only offset, and offsets point only one way).

So we do not have a problem at all with reconstructing the object data.
We do have two ways we might access A from the same pack. For regular
object lookup, that is probably not a big deal. For operations like
repacking that look more closely at the on-disk representation, I do not
know. We may mark A as "has a delta" or "does not have a delta" at one
point in the code, but then later find the other copy of A which does
not match. I did not trace through all of pack-objects to find whether
that is possible, or what bugs it might cause. But indexing only one
copy as the "master" means that we will always get a consistent view of
the object (from that particular pack, at least).

Now consider REF_DELTA. We lookup the base object by name, so we may get
a different answer depending on how it is indexed. E.g., index-pack
keeps an internal hash table, whereas later callers will use the .idx
file. When looking at the .idx file, we may use a regular binary search,
or sha1_entry_pos. The exact order of the search may even change from
git version to git version.

So you may have a situation where index-pack finds the "right" A and
properly reconstructs the file while creating the .idx, and thinks all
is well. But later lookups may find the "wrong" A, and fail. In other
words, we cannot trust that data that makes it through index-pack is not
corrupted (and it is index-pack's output that lets receive-pack commit
to the client that we have their objects, so we want to be reasonably
sure we have uncorrupted copies at that point).

Choosing a single "master" A to go in the .idx means we will get a
consistent view of A once the .idx is generated. But it may not be the
right one, and it may not be the one we used to check that we can
resolve A and B in the first place.

The right fix for that situation is to keep both entries in the .idx,
detect delta cycles, then look up extra copies of A, but being aware
that you already found one copy in the .idx and that sha1_entry_pos (or
the vanilla binary search) should return any other copies, if they
exist. I do not think we detect delta cycles at all (though I didn't
check), but certainly we do not have a way to tell sha1_entry_pos any
different information so that it will find the "other" A.

So I think it is a recoverable state, but it is a non-trivial bit of
code for something that should never happen. Rejecting on push/fetch via
index-pack is simple and easy, and we can deal with the fallout if and
when it ever actually happens.

> > I think this line of "fixing" should probably be scrapped.
> 
> I tend 

[PATCH] contrib/git-prompt.sh: handle missing 'printf -v' more gracefully

2013-08-21 Thread Brandon Casey
From: Brandon Casey 

Old Bash (3.0) which is distributed with RHEL 4.X and other ancient
platforms that are still in wide use, do not have a printf that
supports -v.  Neither does Zsh (which is already handled in the code).

As suggested by Junio, let's test whether printf supports the -v
option and store the result.  Then later, we can use it to
determine whether 'printf -v' can be used, or whether printf
must be called in a subshell.

Signed-off-by: Brandon Casey 
---

This replaces [PATCH 3/3] Revert "bash prompt: avoid command substitution
when finalizing gitstring".

This may or may not need to be updated to use "${var-}" depending on
your response to my other email, but this seems sufficient.

-Brandon

 contrib/completion/git-prompt.sh | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/contrib/completion/git-prompt.sh b/contrib/completion/git-prompt.sh
index a81ef5a..639888a 100644
--- a/contrib/completion/git-prompt.sh
+++ b/contrib/completion/git-prompt.sh
@@ -84,6 +84,10 @@
 # the colored output of "git status -sb" and are available only when
 # using __git_ps1 for PROMPT_COMMAND or precmd.
 
+# check whether printf supports -v
+__git_printf_supports_v=
+printf -v __git_printf_supports_v -- '%s' yes >/dev/null 2>&1
+
 # stores the divergence from upstream in $p
 # used by GIT_PS1_SHOWUPSTREAM
 __git_ps1_show_upstream ()
@@ -433,10 +437,10 @@ __git_ps1 ()
local gitstring="$c${b##refs/heads/}${f:+$z$f}$r$p"
 
if [ $pcmode = yes ]; then
-   if [[ -n ${ZSH_VERSION-} ]]; then
-   gitstring=$(printf -- "$printf_format" "$gitstring")
-   else
+   if test "$__git_printf_supports_v" = yes; then
printf -v gitstring -- "$printf_format" "$gitstring"
+   else
+   gitstring=$(printf -- "$printf_format" "$gitstring")
fi
PS1="$ps1pc_start$gitstring$ps1pc_end"
else
-- 
1.8.4.rc0.2.g6cf5c31


---
This email message is for the sole use of the intended recipient(s) and may 
contain
confidential information.  Any unauthorized review, use, disclosure or 
distribution
is prohibited.  If you are not the intended recipient, please contact the 
sender by
reply email and destroy all copies of the original message.
---
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git access to Bzr repository fails for enwc

2013-08-21 Thread Stefan Monnier
>> I've had good success recently with the git->bzr bridge, but the
>> following still fails.  This is on Debian with the git from "unstable".
> Which version are you using?

The one that comes in Debian unstable.

> The latest version works fine here:
> https://github.com/felipec/git/blob/fc/master/git-remote-bzr.py

Indeed, that works, thank you,


Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] Revert "bash prompt: avoid command substitution when finalizing gitstring"

2013-08-21 Thread Junio C Hamano
Brandon Casey  writes:

> Why is "${var-}" necessary?  Wouldn't that be equivalent to "${var}"
> or "$var"?

set -u
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] contrib/git-prompt.sh: handle missing 'printf -v' more gracefully

2013-08-21 Thread Brandon Casey
From: Brandon Casey 

Old Bash (3.0) which is distributed with RHEL 4.X and other ancient
platforms that are still in wide use, do not have a printf that
supports -v.  Neither does Zsh (which is already handled in the code).

As suggested by Junio, let's test whether printf supports the -v
option and store the result.  Then later, we can use it to
determine whether 'printf -v' can be used, or whether printf
must be called in a subshell.

Signed-off-by: Brandon Casey 
---


On 8/21/2013 6:27 PM, Junio C Hamano wrote:> Brandon Casey  
writes:
>
>> Why is "${var-}" necessary?  Wouldn't that be equivalent to "${var}"
>> or "$var"?
>
> set -u

Ah.  Thanks.  Updated.  Also minor tweak to use [ ] instead of test ...
to conform with the rest of the script.

-Brandon


 contrib/completion/git-prompt.sh | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/contrib/completion/git-prompt.sh b/contrib/completion/git-prompt.sh
index a81ef5a..ca7fb35 100644
--- a/contrib/completion/git-prompt.sh
+++ b/contrib/completion/git-prompt.sh
@@ -84,6 +84,10 @@
 # the colored output of "git status -sb" and are available only when
 # using __git_ps1 for PROMPT_COMMAND or precmd.
 
+# check whether printf supports -v
+__git_printf_supports_v=
+printf -v __git_printf_supports_v -- '%s' yes >/dev/null 2>&1
+
 # stores the divergence from upstream in $p
 # used by GIT_PS1_SHOWUPSTREAM
 __git_ps1_show_upstream ()
@@ -433,10 +437,10 @@ __git_ps1 ()
local gitstring="$c${b##refs/heads/}${f:+$z$f}$r$p"
 
if [ $pcmode = yes ]; then
-   if [[ -n ${ZSH_VERSION-} ]]; then
-   gitstring=$(printf -- "$printf_format" "$gitstring")
-   else
+   if [ "${__git_printf_supports_v-}" = yes ]; then
printf -v gitstring -- "$printf_format" "$gitstring"
+   else
+   gitstring=$(printf -- "$printf_format" "$gitstring")
fi
PS1="$ps1pc_start$gitstring$ps1pc_end"
else
-- 
1.8.4.rc0.2.g6cf5c31


---
This email message is for the sole use of the intended recipient(s) and may 
contain
confidential information.  Any unauthorized review, use, disclosure or 
distribution
is prohibited.  If you are not the intended recipient, please contact the 
sender by
reply email and destroy all copies of the original message.
---
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html