date:20180206

Re: [PATCH 5/8] rebase: introduce the --recreate-merges option

2018-02-06 Thread Johannes Sixt


Am 07.02.2018 um 07:16 schrieb Sergey Organov:

Johannes Schindelin  writes:

[...]


+--recreate-merges::
+   Recreate merge commits instead of flattening the history by replaying
+   merges. Merge conflict resolutions or manual amendments to merge
+   commits are not preserved.


I wonder why you guys still hold on replaying "merge-the-operation"
instead of replaying "merge-the-result"? The latter, the merge commit
itself, no matter how exactly it was created in the first place, is the
most valuable thing git keeps about the merge, and you silently drop it
entirely! OTOH, git keeps almost no information about
"merge-the-operation", so it's virtually impossible to reliably replay
the operation automatically, and yet you try to.


Very well put. I share your concerns.

-- Hannes


IMHO that was severe mistake in the original --preserve-merges, and you
bring with you to this new --recreate-merges... It's sad. Even more sad
as solution is already known for years:

 bc00341838a8faddcd101da9e746902994eef38a
 Author: Johannes Sixt 
 Date:   Sun Jun 16 15:50:42 2013 +0200
 
 rebase -p --first-parent: redo merge by cherry-picking first-parent change


and it works like a charm.

-- Sergey

Re: [PATCH 5/8] rebase: introduce the --recreate-merges option

2018-02-06 Thread Jacob Keller

On Tue, Feb 6, 2018 at 10:16 PM, Sergey Organov  wrote:
> Johannes Schindelin  writes:
>
> [...]
>
>> +--recreate-merges::
>> + Recreate merge commits instead of flattening the history by replaying
>> + merges. Merge conflict resolutions or manual amendments to merge
>> + commits are not preserved.
>
> I wonder why you guys still hold on replaying "merge-the-operation"
> instead of replaying "merge-the-result"? The latter, the merge commit
> itself, no matter how exactly it was created in the first place, is the
> most valuable thing git keeps about the merge, and you silently drop it
> entirely! OTOH, git keeps almost no information about
> "merge-the-operation", so it's virtually impossible to reliably replay
> the operation automatically, and yet you try to.
>

I'm not sure I follow what you mean here?

You mean that you'd want this to actually attempt to re-create the
original merge including conflict resolutions by taking the contents
of the result?

How do you handle if that result has conflicts? What UX do you present
to the user to handle such conflicts? I don't think the normal 3-way
conflicts would even be possible in this case?

Thanks,
Jake

> IMHO that was severe mistake in the original --preserve-merges, and you
> bring with you to this new --recreate-merges... It's sad. Even more sad
> as solution is already known for years:
>
> bc00341838a8faddcd101da9e746902994eef38a
> Author: Johannes Sixt 
> Date:   Sun Jun 16 15:50:42 2013 +0200
>
> rebase -p --first-parent: redo merge by cherry-picking first-parent 
> change
>
> and it works like a charm.
>
> -- Sergey
>

Re: [PATCH] t0050: remove the unused $test_case variable

2018-02-06 Thread Johannes Sixt


Am 07.02.2018 um 00:13 schrieb Ævar Arnfjörð Bjarmason:

The $test_case variable hasn't been used since
decd3c0c28 ("t0050-*.sh: mark the rename (case change) test as
passing", 2014-11-28) when its last user went away.

Let's remove the "say" as well, since it's obvious from subsequent
output that we're testing on a case sensitive filesystem.


Am I misunderstanding the message? I think it reports properties of the 
test environment. And the tests do run on case-insensitive filesystems. 
IMO, the message should be kept.




Signed-off-by: Ævar Arnfjörð Bjarmason 
---
  t/t0050-filesystem.sh | 8 
  1 file changed, 8 deletions(-)

diff --git a/t/t0050-filesystem.sh b/t/t0050-filesystem.sh
index b29d749bb7..606ffddd92 100755
--- a/t/t0050-filesystem.sh
+++ b/t/t0050-filesystem.sh
@@ -7,14 +7,6 @@ test_description='Various filesystem issues'
  auml=$(printf '\303\244')
  aumlcdiar=$(printf '\141\314\210')
  
-if test_have_prereq CASE_INSENSITIVE_FS

-then
-   say "will test on a case insensitive filesystem"
-   test_case=test_expect_failure
-else
-   test_case=test_expect_success
-fi
-
  if test_have_prereq UTF8_NFD_TO_NFC
  then
say "will test on a unicode corrupting filesystem"

Good day

2018-02-06 Thread Mikail Bari

Greetings Sir/Madam,

I am introducing to you our Funding Programme.

National Investment Corporation (NIC) is a company driven by ambitious
vision that turned our name into a global brand. A company leading the
way in Financing and Capital Investment within United Arab Emirates,
Middle East Region and now expanding worldwide.

We offer outsourcing of fund services representing over $US18 billion
(Eighteen Billion United States Dollars) in committed capital for
various investment opportunities in UAE and around the globe and we
wish to re-invest this fund by putting it into the management of
private businesses and corporations with good business ideas that can
generate a minimum of 10% ROI per annum over 5 -10 years duration.

You are invited to participate in our Entrepreneur Capitalization
Program. Our funds are set aside for business entities and companies
within UAE and overseas with international investment engagements and
it attracts only 3.5% interest with part equity position of 5 to 10
year hold.

Applications are accepted from business executives and companies with
proven business records in search of fund for expansion or for capital
investment.

For further enquiries:

Mikail Bari(Investment Manager)
National Investment Corporation (NIC)
Marina Mall, Al Marina, Abu Dhabi
United Arab Emirates.

Re: [PATCH/RFC v5 7/7] Careful with CRLF when using e.g. UTF-16 for working-tree-encoding

2018-02-06 Thread Torsten Bögershausen

On Fri, Feb 02, 2018 at 11:17:04AM -0800, Junio C Hamano wrote:
> Torsten Bögershausen  writes:
> 
> > There are 2 opposite opionions/user expectations here:
> >
> > a) They are binary in the working tree, so git should leave the line endings
> >as is. (Unless specified otherwise in the .attributes file)
> > ...
> > b) They are text files in the index. Git will convert line endings
> >if core.autocrlf is true (or the .gitattributes file specifies "-text")
> 
> I sense that you seem to be focusing on the distinction between "in
> the working tree" vs "in the index" while contrasting.  The "binary
> vs text" in your "binary in wt, text in index" is based on the
> default heuristics without any input from end-users or the project
> that uses Git that happens to contain such files.  If the users and
> the project that uses Git want to treat contents in a path as text,
> it is text even when it is (re-)encoded to UTF-16, no?
> 
> Such files may be (mis)classified as binary with the default
> heuristics when there is no help from what is written in the
> .gitattributes file, but here we are talking about the case where
> the user explicitly tells us it is in UTF-16, right?  Is there such a
> thing as UTF-16 binary?

I don't think so, by definiton UTF-16 is ment to be text.
(this means that git ls-files --eol needs some update, I can have a look)

Do we agree that UTF-16 is text ?
If yes, could Git assume that the "text" attribute is set when
working-tree-encoding is set ?

I would even go a step further and demand that the user -must- make a decision
about the line endings for working-tree-encoded files:
working-tree-encoding=UTF-16 # illegal, die()
working-tree-encoding=UTF-16 text=auto   # illegal, die()
working-tree-encoding=UTF-16 -text   # no eol conversion
working-tree-encoding=UTF-16 text# eol according to core.eol
working-tree-encoding=UTF-16 text eol=lf # LF
working-tree-encoding=UTF-16 text eol=crlf   # CRLF

What do you think ?

Re: [PATCH 5/8] rebase: introduce the --recreate-merges option

2018-02-06 Thread Sergey Organov

Johannes Schindelin  writes:

[...]

> +--recreate-merges::
> + Recreate merge commits instead of flattening the history by replaying
> + merges. Merge conflict resolutions or manual amendments to merge
> + commits are not preserved.

I wonder why you guys still hold on replaying "merge-the-operation"
instead of replaying "merge-the-result"? The latter, the merge commit
itself, no matter how exactly it was created in the first place, is the
most valuable thing git keeps about the merge, and you silently drop it
entirely! OTOH, git keeps almost no information about
"merge-the-operation", so it's virtually impossible to reliably replay
the operation automatically, and yet you try to.

IMHO that was severe mistake in the original --preserve-merges, and you
bring with you to this new --recreate-merges... It's sad. Even more sad
as solution is already known for years:

bc00341838a8faddcd101da9e746902994eef38a
Author: Johannes Sixt 
Date:   Sun Jun 16 15:50:42 2013 +0200

rebase -p --first-parent: redo merge by cherry-picking first-parent 
change

and it works like a charm.

-- Sergey

nabídka pujcky

2018-02-06 Thread James Lewandowski

Hledali jste možnosti financování pro vaše podnikání, nový nákup domu, 
výstavbu, pujcku na nemovitosti, refinancování, konsolidaci dluhu, osobní nebo 
obchodní úcel? Vítejte v budoucnosti! Financování nám ulehcilo. Kontaktujte 
nás, protože nabízíme naši financní službu za nízkou a cenove dostupnou 
úrokovou sazbu ve výši 3% za dlouhou a krátkou dobu pujcky. Zúcastnený žadatel 
by se mel obrátit na nás pro další postupy pri získávání úveru na adrese 
lewandowskijame...@gmail.com

[PATCH v3 07/35] connect: convert get_remote_heads to use struct packet_reader

2018-02-06 Thread Brandon Williams

In order to allow for better control flow when protocol_v2 is introduced
convert 'get_remote_heads()' to use 'struct packet_reader' to read
packet lines.  This enables a client to be able to peek the first line
of a server's response (without consuming it) in order to determine the
protocol version its speaking and then passing control to the
appropriate handler.

This is needed because the initial response from a server speaking
protocol_v0 includes the first ref, while subsequent protocol versions
respond with a version line.  We want to be able to read this first line
without consuming the first ref sent in the protocol_v0 case so that the
protocol version the server is speaking can be determined outside of
'get_remote_heads()' in a future patch.

Signed-off-by: Brandon Williams 
---
 connect.c | 174 ++
 1 file changed, 96 insertions(+), 78 deletions(-)

diff --git a/connect.c b/connect.c
index c3a014c5b..00e90075c 100644
--- a/connect.c
+++ b/connect.c
@@ -48,6 +48,12 @@ int check_ref_type(const struct ref *ref, int flags)
 
 static void die_initial_contact(int unexpected)
 {
+   /*
+* A hang-up after seeing some response from the other end
+* means that it is unexpected, as we know the other end is
+* willing to talk to us.  A hang-up before seeing any
+* response does not necessarily mean an ACL problem, though.
+*/
if (unexpected)
die(_("The remote end hung up upon initial contact"));
else
@@ -56,6 +62,41 @@ static void die_initial_contact(int unexpected)
  "and the repository exists."));
 }
 
+static enum protocol_version discover_version(struct packet_reader *reader)
+{
+   enum protocol_version version = protocol_unknown_version;
+
+   /*
+* Peek the first line of the server's response to
+* determine the protocol version the server is speaking.
+*/
+   switch (packet_reader_peek(reader)) {
+   case PACKET_READ_EOF:
+   die_initial_contact(0);
+   case PACKET_READ_FLUSH:
+   case PACKET_READ_DELIM:
+   version = protocol_v0;
+   break;
+   case PACKET_READ_NORMAL:
+   version = determine_protocol_version_client(reader->line);
+   break;
+   }
+
+   /* Maybe process capabilities here, at least for v2 */
+   switch (version) {
+   case protocol_v1:
+   /* Read the peeked version line */
+   packet_reader_read(reader);
+   break;
+   case protocol_v0:
+   break;
+   case protocol_unknown_version:
+   die("unknown protocol version: '%s'\n", reader->line);
+   }
+
+   return version;
+}
+
 static void parse_one_symref_info(struct string_list *symref, const char *val, 
int len)
 {
char *sym, *target;
@@ -109,60 +150,21 @@ static void annotate_refs_with_symref_info(struct ref 
*ref)
string_list_clear(&symref, 0);
 }
 
-/*
- * Read one line of a server's ref advertisement into packet_buffer.
- */
-static int read_remote_ref(int in, char **src_buf, size_t *src_len,
-  int *responded)
+static void process_capabilities(const char *line, int *len)
 {
-   int len = packet_read(in, src_buf, src_len,
- packet_buffer, sizeof(packet_buffer),
- PACKET_READ_GENTLE_ON_EOF |
- PACKET_READ_CHOMP_NEWLINE);
-   const char *arg;
-   if (len < 0)
-   die_initial_contact(*responded);
-   if (len > 4 && skip_prefix(packet_buffer, "ERR ", &arg))
-   die("remote error: %s", arg);
-
-   *responded = 1;
-
-   return len;
-}
-
-#define EXPECTING_PROTOCOL_VERSION 0
-#define EXPECTING_FIRST_REF 1
-#define EXPECTING_REF 2
-#define EXPECTING_SHALLOW 3
-
-/* Returns 1 if packet_buffer is a protocol version pkt-line, 0 otherwise. */
-static int process_protocol_version(void)
-{
-   switch (determine_protocol_version_client(packet_buffer)) {
-   case protocol_v1:
-   return 1;
-   case protocol_v0:
-   return 0;
-   default:
-   die("server is speaking an unknown protocol");
-   }
-}
-
-static void process_capabilities(int *len)
-{
-   int nul_location = strlen(packet_buffer);
+   int nul_location = strlen(line);
if (nul_location == *len)
return;
-   server_capabilities = xstrdup(packet_buffer + nul_location + 1);
+   server_capabilities = xstrdup(line + nul_location + 1);
*len = nul_location;
 }
 
-static int process_dummy_ref(void)
+static int process_dummy_ref(const char *line)
 {
struct object_id oid;
const char *name;
 
-   if (parse_oid_hex(packet_buffer, &oid, &name))
+   if (parse_oid_hex(line, &oid, &name))
return 0;
if (*name != ' ')
retur

[PATCH v3 20/35] upload-pack: introduce fetch server command

2018-02-06 Thread Brandon Williams

Introduce the 'fetch' server command.

Signed-off-by: Brandon Williams 
---
 Documentation/technical/protocol-v2.txt | 127 +++
 serve.c |   2 +
 t/t5701-git-serve.sh|   1 +
 upload-pack.c   | 281 
 upload-pack.h   |   5 +
 5 files changed, 416 insertions(+)

diff --git a/Documentation/technical/protocol-v2.txt 
b/Documentation/technical/protocol-v2.txt
index ef81df868..4d5096dae 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -144,3 +144,130 @@ The output of ls-refs is as follows:
 ref-attribute = (symref | peeled)
 symref = "symref-target:" symref-target
 peeled = "peeled:" obj-id
+
+ fetch
+---
+
+`fetch` is the command used to fetch a packfile in v2.  It can be looked
+at as a modified version of the v1 fetch where the ref-advertisement is
+stripped out (since the `ls-refs` command fills that role) and the
+message format is tweaked to eliminate redundancies and permit easy
+addition of future extensions.
+
+Additional features not supported in the base command will be advertised
+as the value of the command in the capability advertisement in the form
+of a space separated list of features, e.g.  "=
+".
+
+A `fetch` request can take the following parameters wrapped in
+packet-lines:
+
+want 
+   Indicates to the server an object which the client wants to
+   retrieve.
+
+have 
+   Indicates to the server an object which the client has locally.
+   This allows the server to make a packfile which only contains
+   the objects that the client needs. Multiple 'have' lines can be
+   supplied.
+
+done
+   Indicates to the server that negotiation should terminate (or
+   not even begin if performing a clone) and that the server should
+   use the information supplied in the request to construct the
+   packfile.
+
+thin-pack
+   Request that a thin pack be sent, which is a pack with deltas
+   which reference base objects not contained within the pack (but
+   are known to exist at the receiving end). This can reduce the
+   network traffic significantly, but it requires the receiving end
+   to know how to "thicken" these packs by adding the missing bases
+   to the pack.
+
+no-progress
+   Request that progress information that would normally be sent on
+   side-band channel 2, during the packfile transfer, should not be
+   sent.  However, the side-band channel 3 is still used for error
+   responses.
+
+include-tag
+   Request that annotated tags should be sent if the objects they
+   point to are being sent.
+
+ofs-delta
+   Indicate that the client understands PACKv2 with delta referring
+   to its base by position in pack rather than by an oid.  That is,
+   they can read OBJ_OFS_DELTA (ake type 6) in a packfile.
+
+The response of `fetch` is broken into a number of sections separated by
+delimiter packets (0001), with each section beginning with its section
+header.
+
+output = *section
+section = (acknowledgments | packfile)
+ (flush-pkt | delim-pkt)
+
+acknowledgments = PKT-LINE("acknowledgments" LF)
+ *(ready | nak | ack)
+ready = PKT-LINE("ready" LF)
+nak = PKT-LINE("NAK" LF)
+ack = PKT-LINE("ACK" SP obj-id LF)
+
+packfile = PKT-LINE("packfile" LF)
+  [PACKFILE]
+
+
+acknowledgments section
+   * Always begins with the section header "acknowledgments"
+
+   * The server will respond with "NAK" if none of the object ids sent
+ as have lines were common.
+
+   * The server will respond with "ACK obj-id" for all of the
+ object ids sent as have lines which are common.
+
+   * A response cannot have both "ACK" lines as well as a "NAK"
+ line.
+
+   * The server will respond with a "ready" line indicating that
+ the server has found an acceptable common base and is ready to
+ make and send a packfile (which will be found in the packfile
+ section of the same response)
+
+   * If the client determines that it is finished with negotiations
+ by sending a "done" line, the acknowledgments sections can be
+ omitted from the server's response as an optimization.
+
+   * If the server has found a suitable cut point and has decided
+ to send a "ready" line, then the server can decide to (as an
+ optimization) omit any "ACK" lines it would have sent during
+ its response.  This is because the server will have already
+ determined the objects it plans to send to the client and no
+ further negotiation is needed.
+
+
+packfile section
+   * Always begins with the section header "packfile"
+
+   * The transmission of the packfile begins immediately after

[PATCH v3 15/35] transport: convert get_refs_list to take a list of ref patterns

2018-02-06 Thread Brandon Williams

Convert the 'struct transport' virtual function 'get_refs_list()' to
optionally take an argv_array of ref patterns.  When communicating with
a server using protocol v2 these ref patterns can be sent when
requesting a listing of their refs allowing the server to filter the
refs it sends based on the sent patterns.

Signed-off-by: Brandon Williams 
---
 transport-helper.c   |  5 +++--
 transport-internal.h |  4 +++-
 transport.c  | 16 +---
 3 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/transport-helper.c b/transport-helper.c
index 508015023..4c334b5ee 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1026,7 +1026,8 @@ static int has_attribute(const char *attrs, const char 
*attr) {
}
 }
 
-static struct ref *get_refs_list(struct transport *transport, int for_push)
+static struct ref *get_refs_list(struct transport *transport, int for_push,
+const struct argv_array *ref_patterns)
 {
struct helper_data *data = transport->data;
struct child_process *helper;
@@ -1039,7 +1040,7 @@ static struct ref *get_refs_list(struct transport 
*transport, int for_push)
 
if (process_connect(transport, for_push)) {
do_take_over(transport);
-   return transport->vtable->get_refs_list(transport, for_push);
+   return transport->vtable->get_refs_list(transport, for_push, 
ref_patterns);
}
 
if (data->push && for_push)
diff --git a/transport-internal.h b/transport-internal.h
index 3c1a29d72..a67657ce3 100644
--- a/transport-internal.h
+++ b/transport-internal.h
@@ -3,6 +3,7 @@
 
 struct ref;
 struct transport;
+struct argv_array;
 
 struct transport_vtable {
/**
@@ -21,7 +22,8 @@ struct transport_vtable {
 * the ref without a huge amount of effort, it should store it
 * in the ref's old_sha1 field; otherwise it should be all 0.
 **/
-   struct ref *(*get_refs_list)(struct transport *transport, int for_push);
+   struct ref *(*get_refs_list)(struct transport *transport, int for_push,
+const struct argv_array *ref_patterns);
 
/**
 * Fetch the objects for the given refs. Note that this gets
diff --git a/transport.c b/transport.c
index ffc6b2614..c54a44630 100644
--- a/transport.c
+++ b/transport.c
@@ -72,7 +72,7 @@ struct bundle_transport_data {
struct bundle_header header;
 };
 
-static struct ref *get_refs_from_bundle(struct transport *transport, int 
for_push)
+static struct ref *get_refs_from_bundle(struct transport *transport, int 
for_push, const struct argv_array *ref_patterns)
 {
struct bundle_transport_data *data = transport->data;
struct ref *result = NULL;
@@ -189,7 +189,8 @@ static int connect_setup(struct transport *transport, int 
for_push)
return 0;
 }
 
-static struct ref *get_refs_via_connect(struct transport *transport, int 
for_push)
+static struct ref *get_refs_via_connect(struct transport *transport, int 
for_push,
+   const struct argv_array *ref_patterns)
 {
struct git_transport_data *data = transport->data;
struct ref *refs = NULL;
@@ -204,7 +205,8 @@ static struct ref *get_refs_via_connect(struct transport 
*transport, int for_pus
data->version = discover_version(&reader);
switch (data->version) {
case protocol_v2:
-   get_remote_refs(data->fd[1], &reader, &refs, for_push, NULL);
+   get_remote_refs(data->fd[1], &reader, &refs, for_push,
+   ref_patterns);
break;
case protocol_v1:
case protocol_v0:
@@ -250,7 +252,7 @@ static int fetch_refs_via_pack(struct transport *transport,
args.update_shallow = data->options.update_shallow;
 
if (!data->got_remote_heads)
-   refs_tmp = get_refs_via_connect(transport, 0);
+   refs_tmp = get_refs_via_connect(transport, 0, NULL);
 
switch (data->version) {
case protocol_v2:
@@ -568,7 +570,7 @@ static int git_transport_push(struct transport *transport, 
struct ref *remote_re
int ret = 0;
 
if (!data->got_remote_heads)
-   get_refs_via_connect(transport, 1);
+   get_refs_via_connect(transport, 1, NULL);
 
memset(&args, 0, sizeof(args));
args.send_mirror = !!(flags & TRANSPORT_PUSH_MIRROR);
@@ -1028,7 +1030,7 @@ int transport_push(struct transport *transport,
if (check_push_refs(local_refs, refspec_nr, refspec) < 0)
return -1;
 
-   remote_refs = transport->vtable->get_refs_list(transport, 1);
+   remote_refs = transport->vtable->get_refs_list(transport, 1, 
NULL);
 
if (flags & TRANSPORT_PUSH_ALL)
match_flags |= MATCH_REFS_ALL;
@@ -1137,7 +1139,7 @@ int transport_push(struct transport *transport,
 const struct ref *tra

[PATCH v3 23/35] fetch-pack: support shallow requests

2018-02-06 Thread Brandon Williams

Enable shallow clones and deepen requests using protocol version 2 if
the server 'fetch' command supports the 'shallow' feature.

Signed-off-by: Brandon Williams 
---
 connect.c| 22 +++
 connect.h|  2 ++
 fetch-pack.c | 69 +++-
 3 files changed, 92 insertions(+), 1 deletion(-)

diff --git a/connect.c b/connect.c
index 7cb1f1df7..9577528f3 100644
--- a/connect.c
+++ b/connect.c
@@ -82,6 +82,28 @@ int server_supports_v2(const char *c, int die_on_error)
return 0;
 }
 
+int server_supports_feature(const char *c, const char *feature,
+   int die_on_error)
+{
+   int i;
+
+   for (i = 0; i < server_capabilities_v2.argc; i++) {
+   const char *out;
+   if (skip_prefix(server_capabilities_v2.argv[i], c, &out) &&
+   (!*out || *(out++) == '=')) {
+   if (parse_feature_request(out, feature))
+   return 1;
+   else
+   break;
+   }
+   }
+
+   if (die_on_error)
+   die("server doesn't support feature '%s'", feature);
+
+   return 0;
+}
+
 static void process_capabilities_v2(struct packet_reader *reader)
 {
while (packet_reader_read(reader) == PACKET_READ_NORMAL)
diff --git a/connect.h b/connect.h
index 8898d4495..0e69c6709 100644
--- a/connect.h
+++ b/connect.h
@@ -17,5 +17,7 @@ struct packet_reader;
 extern enum protocol_version discover_version(struct packet_reader *reader);
 
 extern int server_supports_v2(const char *c, int die_on_error);
+extern int server_supports_feature(const char *c, const char *feature,
+  int die_on_error);
 
 #endif
diff --git a/fetch-pack.c b/fetch-pack.c
index 4fb5805dd..c0807e219 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1008,6 +1008,26 @@ static struct ref *do_fetch_pack(struct fetch_pack_args 
*args,
return ref;
 }
 
+static void add_shallow_requests(struct strbuf *req_buf,
+const struct fetch_pack_args *args)
+{
+   if (is_repository_shallow())
+   write_shallow_commits(req_buf, 1, NULL);
+   if (args->depth > 0)
+   packet_buf_write(req_buf, "deepen %d", args->depth);
+   if (args->deepen_since) {
+   timestamp_t max_age = approxidate(args->deepen_since);
+   packet_buf_write(req_buf, "deepen-since %"PRItime, max_age);
+   }
+   if (args->deepen_not) {
+   int i;
+   for (i = 0; i < args->deepen_not->nr; i++) {
+   struct string_list_item *s = args->deepen_not->items + 
i;
+   packet_buf_write(req_buf, "deepen-not %s", s->string);
+   }
+   }
+}
+
 static void add_wants(const struct ref *wants, struct strbuf *req_buf)
 {
for ( ; wants ; wants = wants->next) {
@@ -1090,6 +1110,10 @@ static int send_fetch_request(int fd_out, const struct 
fetch_pack_args *args,
if (prefer_ofs_delta)
packet_buf_write(&req_buf, "ofs-delta");
 
+   /* Add shallow-info and deepen request */
+   if (server_supports_feature("fetch", "shallow", 1))
+   add_shallow_requests(&req_buf, args);
+
/* add wants */
add_wants(wants, &req_buf);
 
@@ -1119,7 +1143,7 @@ static int process_section_header(struct packet_reader 
*reader,
int ret;
 
if (packet_reader_peek(reader) != PACKET_READ_NORMAL)
-   die("error reading packet");
+   die("error reading section header '%s'", section);
 
ret = !strcmp(reader->line, section);
 
@@ -1174,6 +1198,43 @@ static int process_acks(struct packet_reader *reader, 
struct oidset *common)
return received_ready ? 2 : (received_ack ? 1 : 0);
 }
 
+static void receive_shallow_info(struct fetch_pack_args *args,
+struct packet_reader *reader)
+{
+   process_section_header(reader, "shallow-info", 0);
+   while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+   const char *arg;
+   struct object_id oid;
+
+   if (skip_prefix(reader->line, "shallow ", &arg)) {
+   if (get_oid_hex(arg, &oid))
+   die(_("invalid shallow line: %s"), 
reader->line);
+   register_shallow(&oid);
+   continue;
+   }
+   if (skip_prefix(reader->line, "unshallow ", &arg)) {
+   if (get_oid_hex(arg, &oid))
+   die(_("invalid unshallow line: %s"), 
reader->line);
+   if (!lookup_object(oid.hash))
+   die(_("object not found: %s"), reader->line);
+   /* make sure that it is parsed as shallow */
+   if (!parse_object(&oid))
+   die

[PATCH v3 24/35] connect: refactor git_connect to only get the protocol version once

2018-02-06 Thread Brandon Williams

Instead of having each builtin transport asking for which protocol
version the user has configured in 'protocol.version' by calling
`get_protocol_version_config()` multiple times, factor this logic out
so there is just a single call at the beginning of `git_connect()`.

This will be helpful in the next patch where we can have centralized
logic which determines if we need to request a different protocol
version than what the user has configured.

Signed-off-by: Brandon Williams 
---
 connect.c | 27 +++
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/connect.c b/connect.c
index 9577528f3..dbf4def65 100644
--- a/connect.c
+++ b/connect.c
@@ -1029,6 +1029,7 @@ static enum ssh_variant determine_ssh_variant(const char 
*ssh_command,
  */
 static struct child_process *git_connect_git(int fd[2], char *hostandport,
 const char *path, const char *prog,
+enum protocol_version version,
 int flags)
 {
struct child_process *conn;
@@ -1067,10 +1068,10 @@ static struct child_process *git_connect_git(int fd[2], 
char *hostandport,
target_host, 0);
 
/* If using a new version put that stuff here after a second null byte 
*/
-   if (get_protocol_version_config() > 0) {
+   if (version > 0) {
strbuf_addch(&request, '\0');
strbuf_addf(&request, "version=%d%c",
-   get_protocol_version_config(), '\0');
+   version, '\0');
}
 
packet_write(fd[1], request.buf, request.len);
@@ -1086,14 +1087,14 @@ static struct child_process *git_connect_git(int fd[2], 
char *hostandport,
  */
 static void push_ssh_options(struct argv_array *args, struct argv_array *env,
 enum ssh_variant variant, const char *port,
-int flags)
+enum protocol_version version, int flags)
 {
if (variant == VARIANT_SSH &&
-   get_protocol_version_config() > 0) {
+   version > 0) {
argv_array_push(args, "-o");
argv_array_push(args, "SendEnv=" GIT_PROTOCOL_ENVIRONMENT);
argv_array_pushf(env, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
-get_protocol_version_config());
+version);
}
 
if (flags & CONNECT_IPV4) {
@@ -1146,7 +1147,8 @@ static void push_ssh_options(struct argv_array *args, 
struct argv_array *env,
 
 /* Prepare a child_process for use by Git's SSH-tunneled transport. */
 static void fill_ssh_args(struct child_process *conn, const char *ssh_host,
- const char *port, int flags)
+ const char *port, enum protocol_version version,
+ int flags)
 {
const char *ssh;
enum ssh_variant variant;
@@ -1180,14 +1182,14 @@ static void fill_ssh_args(struct child_process *conn, 
const char *ssh_host,
argv_array_push(&detect.args, ssh);
argv_array_push(&detect.args, "-G");
push_ssh_options(&detect.args, &detect.env_array,
-VARIANT_SSH, port, flags);
+VARIANT_SSH, port, version, flags);
argv_array_push(&detect.args, ssh_host);
 
variant = run_command(&detect) ? VARIANT_SIMPLE : VARIANT_SSH;
}
 
argv_array_push(&conn->args, ssh);
-   push_ssh_options(&conn->args, &conn->env_array, variant, port, flags);
+   push_ssh_options(&conn->args, &conn->env_array, variant, port, version, 
flags);
argv_array_push(&conn->args, ssh_host);
 }
 
@@ -1208,6 +1210,7 @@ struct child_process *git_connect(int fd[2], const char 
*url,
char *hostandport, *path;
struct child_process *conn;
enum protocol protocol;
+   enum protocol_version version = get_protocol_version_config();
 
/* Without this we cannot rely on waitpid() to tell
 * what happened to our children.
@@ -1222,7 +1225,7 @@ struct child_process *git_connect(int fd[2], const char 
*url,
printf("Diag: path=%s\n", path ? path : "NULL");
conn = NULL;
} else if (protocol == PROTO_GIT) {
-   conn = git_connect_git(fd, hostandport, path, prog, flags);
+   conn = git_connect_git(fd, hostandport, path, prog, version, 
flags);
} else {
struct strbuf cmd = STRBUF_INIT;
const char *const *var;
@@ -1265,12 +1268,12 @@ struct child_process *git_connect(int fd[2], const char 
*url,
strbuf_release(&cmd);
return NULL;
}
-   fill_ssh_args(conn, ssh_host, port, flags);
+   fill_ssh_args(conn,

[PATCH v3 16/35] transport: convert transport_get_remote_refs to take a list of ref patterns

2018-02-06 Thread Brandon Williams

Convert 'transport_get_remote_refs()' to optionally take a list of ref
patterns.

Signed-off-by: Brandon Williams 
---
 builtin/clone.c | 2 +-
 builtin/fetch.c | 4 ++--
 builtin/ls-remote.c | 2 +-
 builtin/remote.c| 2 +-
 transport.c | 7 +--
 transport.h | 3 ++-
 6 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 284651797..6e77d993f 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1121,7 +1121,7 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
if (transport->smart_options && !deepen)
transport->smart_options->check_self_contained_and_connected = 
1;
 
-   refs = transport_get_remote_refs(transport);
+   refs = transport_get_remote_refs(transport, NULL);
 
if (refs) {
mapped_refs = wanted_peer_refs(refs, refspec);
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 7bbcd26fa..850382f55 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -250,7 +250,7 @@ static void find_non_local_tags(struct transport *transport,
struct string_list_item *item = NULL;
 
for_each_ref(add_existing, &existing_refs);
-   for (ref = transport_get_remote_refs(transport); ref; ref = ref->next) {
+   for (ref = transport_get_remote_refs(transport, NULL); ref; ref = 
ref->next) {
if (!starts_with(ref->name, "refs/tags/"))
continue;
 
@@ -336,7 +336,7 @@ static struct ref *get_ref_map(struct transport *transport,
/* opportunistically-updated references: */
struct ref *orefs = NULL, **oref_tail = &orefs;
 
-   const struct ref *remote_refs = transport_get_remote_refs(transport);
+   const struct ref *remote_refs = transport_get_remote_refs(transport, 
NULL);
 
if (refspec_count) {
struct refspec *fetch_refspec;
diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index c4be98ab9..c6e9847c5 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -96,7 +96,7 @@ int cmd_ls_remote(int argc, const char **argv, const char 
*prefix)
if (uploadpack != NULL)
transport_set_option(transport, TRANS_OPT_UPLOADPACK, 
uploadpack);
 
-   ref = transport_get_remote_refs(transport);
+   ref = transport_get_remote_refs(transport, NULL);
if (transport_disconnect(transport))
return 1;
 
diff --git a/builtin/remote.c b/builtin/remote.c
index d95bf904c..d0b6ff6e2 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -862,7 +862,7 @@ static int get_remote_ref_states(const char *name,
if (query) {
transport = transport_get(states->remote, 
states->remote->url_nr > 0 ?
states->remote->url[0] : NULL);
-   remote_refs = transport_get_remote_refs(transport);
+   remote_refs = transport_get_remote_refs(transport, NULL);
transport_disconnect(transport);
 
states->queried = 1;
diff --git a/transport.c b/transport.c
index c54a44630..dfc603b36 100644
--- a/transport.c
+++ b/transport.c
@@ -1136,10 +1136,13 @@ int transport_push(struct transport *transport,
return 1;
 }
 
-const struct ref *transport_get_remote_refs(struct transport *transport)
+const struct ref *transport_get_remote_refs(struct transport *transport,
+   const struct argv_array 
*ref_patterns)
 {
if (!transport->got_remote_refs) {
-   transport->remote_refs = 
transport->vtable->get_refs_list(transport, 0, NULL);
+   transport->remote_refs =
+   transport->vtable->get_refs_list(transport, 0,
+ref_patterns);
transport->got_remote_refs = 1;
}
 
diff --git a/transport.h b/transport.h
index 731c78b67..4b656f315 100644
--- a/transport.h
+++ b/transport.h
@@ -178,7 +178,8 @@ int transport_push(struct transport *connection,
   int refspec_nr, const char **refspec, int flags,
   unsigned int * reject_reasons);
 
-const struct ref *transport_get_remote_refs(struct transport *transport);
+const struct ref *transport_get_remote_refs(struct transport *transport,
+   const struct argv_array 
*ref_patterns);
 
 int transport_fetch_refs(struct transport *transport, struct ref *refs);
 void transport_unlock_pack(struct transport *transport);
-- 
2.16.0.rc1.238.g530d649a79-goog

[PATCH v3 19/35] push: pass ref patterns when pushing

2018-02-06 Thread Brandon Williams

Construct a list of ref patterns to be passed to 'get_refs_list()' from
the refspec to be used during the push.  This list of ref patterns will
be used to allow the server to filter the ref advertisement when
communicating using protocol v2.

Signed-off-by: Brandon Williams 
---
 transport.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/transport.c b/transport.c
index dfc603b36..6ea3905e3 100644
--- a/transport.c
+++ b/transport.c
@@ -1026,11 +1026,26 @@ int transport_push(struct transport *transport,
int porcelain = flags & TRANSPORT_PUSH_PORCELAIN;
int pretend = flags & TRANSPORT_PUSH_DRY_RUN;
int push_ret, ret, err;
+   struct refspec *tmp_rs;
+   struct argv_array ref_patterns = ARGV_ARRAY_INIT;
+   int i;
 
if (check_push_refs(local_refs, refspec_nr, refspec) < 0)
return -1;
 
-   remote_refs = transport->vtable->get_refs_list(transport, 1, 
NULL);
+   tmp_rs = parse_push_refspec(refspec_nr, refspec);
+   for (i = 0; i < refspec_nr; i++) {
+   if (tmp_rs[i].dst)
+   argv_array_push(&ref_patterns, tmp_rs[i].dst);
+   else if (tmp_rs[i].src && !tmp_rs[i].exact_sha1)
+   argv_array_push(&ref_patterns, tmp_rs[i].src);
+   }
+
+   remote_refs = transport->vtable->get_refs_list(transport, 1,
+  &ref_patterns);
+
+   argv_array_clear(&ref_patterns);
+   free_refspec(refspec_nr, tmp_rs);
 
if (flags & TRANSPORT_PUSH_ALL)
match_flags |= MATCH_REFS_ALL;
-- 
2.16.0.rc1.238.g530d649a79-goog

[PATCH v3 27/35] transport-helper: refactor process_connect_service

2018-02-06 Thread Brandon Williams

A future patch will need to take advantage of the logic which runs and
processes the response of the connect command on a remote helper so
factor out this logic from 'process_connect_service()' and place it into
a helper function 'run_connect()'.

Signed-off-by: Brandon Williams 
---
 transport-helper.c | 67 +++---
 1 file changed, 38 insertions(+), 29 deletions(-)

diff --git a/transport-helper.c b/transport-helper.c
index d72155768..c032a2a87 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -545,14 +545,13 @@ static int fetch_with_import(struct transport *transport,
return 0;
 }
 
-static int process_connect_service(struct transport *transport,
-  const char *name, const char *exec)
+static int run_connect(struct transport *transport, struct strbuf *cmdbuf)
 {
struct helper_data *data = transport->data;
-   struct strbuf cmdbuf = STRBUF_INIT;
-   struct child_process *helper;
-   int r, duped, ret = 0;
+   int ret = 0;
+   int duped;
FILE *input;
+   struct child_process *helper;
 
helper = get_helper(transport);
 
@@ -568,44 +567,54 @@ static int process_connect_service(struct transport 
*transport,
input = xfdopen(duped, "r");
setvbuf(input, NULL, _IONBF, 0);
 
+   sendline(data, cmdbuf);
+   if (recvline_fh(input, cmdbuf))
+   exit(128);
+
+   if (!strcmp(cmdbuf->buf, "")) {
+   data->no_disconnect_req = 1;
+   if (debug)
+   fprintf(stderr, "Debug: Smart transport connection "
+   "ready.\n");
+   ret = 1;
+   } else if (!strcmp(cmdbuf->buf, "fallback")) {
+   if (debug)
+   fprintf(stderr, "Debug: Falling back to dumb "
+   "transport.\n");
+   } else {
+   die("Unknown response to connect: %s",
+   cmdbuf->buf);
+   }
+
+   fclose(input);
+   return ret;
+}
+
+static int process_connect_service(struct transport *transport,
+  const char *name, const char *exec)
+{
+   struct helper_data *data = transport->data;
+   struct strbuf cmdbuf = STRBUF_INIT;
+   int ret = 0;
+
/*
 * Handle --upload-pack and friends. This is fire and forget...
 * just warn if it fails.
 */
if (strcmp(name, exec)) {
-   r = set_helper_option(transport, "servpath", exec);
+   int r = set_helper_option(transport, "servpath", exec);
if (r > 0)
warning("Setting remote service path not supported by 
protocol.");
else if (r < 0)
warning("Invalid remote service path.");
}
 
-   if (data->connect)
+   if (data->connect) {
strbuf_addf(&cmdbuf, "connect %s\n", name);
-   else
-   goto exit;
-
-   sendline(data, &cmdbuf);
-   if (recvline_fh(input, &cmdbuf))
-   exit(128);
-
-   if (!strcmp(cmdbuf.buf, "")) {
-   data->no_disconnect_req = 1;
-   if (debug)
-   fprintf(stderr, "Debug: Smart transport connection "
-   "ready.\n");
-   ret = 1;
-   } else if (!strcmp(cmdbuf.buf, "fallback")) {
-   if (debug)
-   fprintf(stderr, "Debug: Falling back to dumb "
-   "transport.\n");
-   } else
-   die("Unknown response to connect: %s",
-   cmdbuf.buf);
+   ret = run_connect(transport, &cmdbuf);
+   }
 
-exit:
strbuf_release(&cmdbuf);
-   fclose(input);
return ret;
 }
 
-- 
2.16.0.rc1.238.g530d649a79-goog

[PATCH v3 14/35] connect: request remote refs using v2

2018-02-06 Thread Brandon Williams

Teach the client to be able to request a remote's refs using protocol
v2.  This is done by having a client issue a 'ls-refs' request to a v2
server.

Signed-off-by: Brandon Williams 
---
 builtin/upload-pack.c  |  10 ++--
 connect.c  | 123 -
 connect.h  |   2 +
 remote.h   |   4 ++
 t/t5702-protocol-v2.sh |  53 +
 transport.c|   2 +-
 6 files changed, 187 insertions(+), 7 deletions(-)
 create mode 100755 t/t5702-protocol-v2.sh

diff --git a/builtin/upload-pack.c b/builtin/upload-pack.c
index 8d53e9794..a757df8da 100644
--- a/builtin/upload-pack.c
+++ b/builtin/upload-pack.c
@@ -5,6 +5,7 @@
 #include "parse-options.h"
 #include "protocol.h"
 #include "upload-pack.h"
+#include "serve.h"
 
 static const char * const upload_pack_usage[] = {
N_("git upload-pack [] "),
@@ -16,6 +17,7 @@ int cmd_upload_pack(int argc, const char **argv, const char 
*prefix)
const char *dir;
int strict = 0;
struct upload_pack_options opts = { 0 };
+   struct serve_options serve_opts = SERVE_OPTIONS_INIT;
struct option options[] = {
OPT_BOOL(0, "stateless-rpc", &opts.stateless_rpc,
 N_("quit after a single request/response exchange")),
@@ -48,11 +50,9 @@ int cmd_upload_pack(int argc, const char **argv, const char 
*prefix)
 
switch (determine_protocol_version_server()) {
case protocol_v2:
-   /*
-* fetch support for protocol v2 has not been implemented yet,
-* so ignore the request to use v2 and fallback to using v0.
-*/
-   upload_pack(&opts);
+   serve_opts.advertise_capabilities = opts.advertise_refs;
+   serve_opts.stateless_rpc = opts.stateless_rpc;
+   serve(&serve_opts);
break;
case protocol_v1:
/*
diff --git a/connect.c b/connect.c
index f2157a821..7cb1f1df7 100644
--- a/connect.c
+++ b/connect.c
@@ -12,9 +12,11 @@
 #include "sha1-array.h"
 #include "transport.h"
 #include "strbuf.h"
+#include "version.h"
 #include "protocol.h"
 
 static char *server_capabilities;
+static struct argv_array server_capabilities_v2 = ARGV_ARRAY_INIT;
 static const char *parse_feature_value(const char *, const char *, int *);
 
 static int check_ref(const char *name, unsigned int flags)
@@ -62,6 +64,33 @@ static void die_initial_contact(int unexpected)
  "and the repository exists."));
 }
 
+/* Checks if the server supports the capability 'c' */
+int server_supports_v2(const char *c, int die_on_error)
+{
+   int i;
+
+   for (i = 0; i < server_capabilities_v2.argc; i++) {
+   const char *out;
+   if (skip_prefix(server_capabilities_v2.argv[i], c, &out) &&
+   (!*out || *out == '='))
+   return 1;
+   }
+
+   if (die_on_error)
+   die("server doesn't support '%s'", c);
+
+   return 0;
+}
+
+static void process_capabilities_v2(struct packet_reader *reader)
+{
+   while (packet_reader_read(reader) == PACKET_READ_NORMAL)
+   argv_array_push(&server_capabilities_v2, reader->line);
+
+   if (reader->status != PACKET_READ_FLUSH)
+   die("protocol error");
+}
+
 enum protocol_version discover_version(struct packet_reader *reader)
 {
enum protocol_version version = protocol_unknown_version;
@@ -85,7 +114,7 @@ enum protocol_version discover_version(struct packet_reader 
*reader)
/* Maybe process capabilities here, at least for v2 */
switch (version) {
case protocol_v2:
-   die("support for protocol v2 not implemented yet");
+   process_capabilities_v2(reader);
break;
case protocol_v1:
/* Read the peeked version line */
@@ -293,6 +322,98 @@ struct ref **get_remote_heads(struct packet_reader *reader,
return list;
 }
 
+static int process_ref_v2(const char *line, struct ref ***list)
+{
+   int ret = 1;
+   int i = 0;
+   struct object_id old_oid;
+   struct ref *ref;
+   struct string_list line_sections = STRING_LIST_INIT_DUP;
+
+   if (string_list_split(&line_sections, line, ' ', -1) < 2) {
+   ret = 0;
+   goto out;
+   }
+
+   if (get_oid_hex(line_sections.items[i++].string, &old_oid)) {
+   ret = 0;
+   goto out;
+   }
+
+   ref = alloc_ref(line_sections.items[i++].string);
+
+   oidcpy(&ref->old_oid, &old_oid);
+   **list = ref;
+   *list = &ref->next;
+
+   for (; i < line_sections.nr; i++) {
+   const char *arg = line_sections.items[i].string;
+   if (skip_prefix(arg, "symref-target:", &arg))
+   ref->symref = xstrdup(arg);
+
+   if (skip_prefix(arg, "peeled:", &arg)) {
+

[PATCH v3 30/35] remote-curl: create copy of the service name

2018-02-06 Thread Brandon Williams

Make a copy of the service name being requested instead of relying on
the buffer pointed to by the passed in 'const char *' to remain
unchanged.

Signed-off-by: Brandon Williams 
---
 remote-curl.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/remote-curl.c b/remote-curl.c
index dae8a4a48..4086aa733 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -165,7 +165,7 @@ static int set_option(const char *name, const char *value)
 }
 
 struct discovery {
-   const char *service;
+   char *service;
char *buf_alloc;
char *buf;
size_t len;
@@ -257,6 +257,7 @@ static void free_discovery(struct discovery *d)
free(d->shallow.oid);
free(d->buf_alloc);
free_refs(d->refs);
+   free(d->service);
free(d);
}
 }
@@ -343,7 +344,7 @@ static struct discovery *discover_refs(const char *service, 
int for_push)
warning(_("redirecting to %s"), url.buf);
 
last= xcalloc(1, sizeof(*last_discovery));
-   last->service = service;
+   last->service = xstrdup(service);
last->buf_alloc = strbuf_detach(&buffer, &last->len);
last->buf = last->buf_alloc;
 
-- 
2.16.0.rc1.238.g530d649a79-goog

[PATCH v3 29/35] pkt-line: add packet_buf_write_len function

2018-02-06 Thread Brandon Williams

Add the 'packet_buf_write_len()' function which allows for writing an
arbitrary length buffer into a 'struct strbuf' and formatting it in
packet-line format.

Signed-off-by: Brandon Williams 
---
 pkt-line.c | 16 
 pkt-line.h |  1 +
 2 files changed, 17 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index 726e109ca..5a8a17ecc 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -215,6 +215,22 @@ void packet_buf_write(struct strbuf *buf, const char *fmt, 
...)
va_end(args);
 }
 
+void packet_buf_write_len(struct strbuf *buf, const char *data, size_t len)
+{
+   size_t orig_len, n;
+
+   orig_len = buf->len;
+   strbuf_addstr(buf, "");
+   strbuf_add(buf, data, len);
+   n = buf->len - orig_len;
+
+   if (n > LARGE_PACKET_MAX)
+   die("protocol error: impossibly long line");
+
+   set_packet_header(&buf->buf[orig_len], n);
+   packet_trace(buf->buf + orig_len + 4, n - 4, 1);
+}
+
 int write_packetized_from_fd(int fd_in, int fd_out)
 {
static char buf[LARGE_PACKET_DATA_MAX];
diff --git a/pkt-line.h b/pkt-line.h
index 16fe8bdbf..63724d4bf 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -26,6 +26,7 @@ void packet_buf_flush(struct strbuf *buf);
 void packet_buf_delim(struct strbuf *buf);
 void packet_write(int fd_out, const char *buf, size_t size);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) 
__attribute__((format (printf, 2, 3)));
+void packet_buf_write_len(struct strbuf *buf, const char *data, size_t len);
 int packet_flush_gently(int fd);
 int packet_write_fmt_gently(int fd, const char *fmt, ...) 
__attribute__((format (printf, 2, 3)));
 int write_packetized_from_fd(int fd_in, int fd_out);
-- 
2.16.0.rc1.238.g530d649a79-goog

[PATCH v3 32/35] http: allow providing extra headers for http requests

2018-02-06 Thread Brandon Williams

Add a way for callers to request that extra headers be included when
making http requests.

Signed-off-by: Brandon Williams 
---
 http.c | 8 
 http.h | 2 ++
 2 files changed, 10 insertions(+)

diff --git a/http.c b/http.c
index 597771271..e1757d62b 100644
--- a/http.c
+++ b/http.c
@@ -1723,6 +1723,14 @@ static int http_request(const char *url,
 
headers = curl_slist_append(headers, buf.buf);
 
+   /* Add additional headers here */
+   if (options && options->extra_headers) {
+   const struct string_list_item *item;
+   for_each_string_list_item(item, options->extra_headers) {
+   headers = curl_slist_append(headers, item->string);
+   }
+   }
+
curl_easy_setopt(slot->curl, CURLOPT_URL, url);
curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers);
curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "gzip");
diff --git a/http.h b/http.h
index f7bd3b26b..a113915c7 100644
--- a/http.h
+++ b/http.h
@@ -172,6 +172,8 @@ struct http_get_options {
 * for details.
 */
struct strbuf *base_url;
+
+   struct string_list *extra_headers;
 };
 
 /* Return values for http_get_*() */
-- 
2.16.0.rc1.238.g530d649a79-goog

[PATCH v3 35/35] remote-curl: don't request v2 when pushing

2018-02-06 Thread Brandon Williams

In order to be able to ship protocol v2 with only supporting fetch, we
need clients to not issue a request to use protocol v2 when pushing
(since the client currently doesn't know how to push using protocol v2).
This allows a client to have protocol v2 configured in
`protocol.version` and take advantage of using v2 for fetch and falling
back to using v0 when pushing while v2 for push is being designed.

We could run into issues if we didn't fall back to protocol v2 when
pushing right now.  This is because currently a server will ignore a request to
use v2 when contacting the 'receive-pack' endpoint and fall back to
using v0, but when push v2 is rolled out to servers, the 'receive-pack'
endpoint will start responding using v2.  So we don't want to get into a
state where a client is requesting to push with v2 before they actually
know how to push using v2.

Signed-off-by: Brandon Williams 
---
 remote-curl.c  | 11 ++-
 t/t5702-protocol-v2.sh | 23 +++
 2 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/remote-curl.c b/remote-curl.c
index af431b658..c39b6ece6 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -320,6 +320,7 @@ static struct discovery *discover_refs(const char *service, 
int for_push)
struct discovery *last = last_discovery;
int http_ret, maybe_smart = 0;
struct http_get_options http_options;
+   enum protocol_version version = get_protocol_version_config();
 
if (last && !strcmp(service, last->service))
return last;
@@ -336,8 +337,16 @@ static struct discovery *discover_refs(const char 
*service, int for_push)
strbuf_addf(&refs_url, "service=%s", service);
}
 
+   /*
+* NEEDSWORK: If we are trying to use protocol v2 and we are planning
+* to perform a push, then fallback to v0 since the client doesn't know
+* how to push yet using v2.
+*/
+   if (version == protocol_v2 && !strcmp("git-receive-pack", service))
+   version = protocol_v0;
+
/* Add the extra Git-Protocol header */
-   if (get_protocol_http_header(get_protocol_version_config(), 
&protocol_header))
+   if (get_protocol_http_header(version, &protocol_header))
string_list_append(&extra_headers, protocol_header.buf);
 
memset(&http_options, 0, sizeof(http_options));
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index c2c39fe0c..14d589a7f 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -211,6 +211,29 @@ test_expect_success 'fetch with http:// using protocol v2' 
'
grep "git< version 2" log
 '
 
+test_expect_success 'push with http:// and a config of v2 does not request v2' 
'
+   # Till v2 for push is designed, make sure that if a client has
+   # protocol.version configured to use v2, that the client instead falls
+   # back and uses v0.
+
+   test_commit -C http_child three &&
+
+   # Push to another branch, as the target repository has the
+   # master branch checked out and we cannot push into it.
+   GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=1 \
+   push origin HEAD:client_branch && 2>log &&
+
+   git -C http_child log -1 --format=%s >actual &&
+   git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s 
client_branch >expect &&
+   test_cmp expect actual &&
+
+   # Client didnt request to use protocol v2
+   ! grep "Git-Protocol: version=2" log &&
+   # Server didnt respond using protocol v2
+   ! grep "git< version 2" log
+'
+
+
 stop_httpd
 
 test_done
-- 
2.16.0.rc1.238.g530d649a79-goog

[PATCH v3 31/35] remote-curl: store the protocol version the server responded with

2018-02-06 Thread Brandon Williams

Store the protocol version the server responded with when performing
discovery.  This will be used in a future patch to either change the
'Git-Protocol' header sent in subsequent requests or to determine if a
client needs to fallback to using a different protocol version.

Signed-off-by: Brandon Williams 
---
 remote-curl.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/remote-curl.c b/remote-curl.c
index 4086aa733..c54035843 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -171,6 +171,7 @@ struct discovery {
size_t len;
struct ref *refs;
struct oid_array shallow;
+   enum protocol_version version;
unsigned proto_git : 1;
 };
 static struct discovery *last_discovery;
@@ -184,7 +185,8 @@ static struct ref *parse_git_refs(struct discovery *heads, 
int for_push)
   PACKET_READ_CHOMP_NEWLINE |
   PACKET_READ_GENTLE_ON_EOF);
 
-   switch (discover_version(&reader)) {
+   heads->version = discover_version(&reader);
+   switch (heads->version) {
case protocol_v2:
die("support for protocol v2 not implemented yet");
break;
-- 
2.16.0.rc1.238.g530d649a79-goog

[PATCH v3 22/35] upload-pack: support shallow requests

2018-02-06 Thread Brandon Williams

Add the 'shallow' feature to the protocol version 2 command 'fetch'
which indicates that the server supports shallow clients and deepen
requets.

Signed-off-by: Brandon Williams 
---
 Documentation/technical/protocol-v2.txt |  67 +++-
 serve.c |   2 +-
 t/t5701-git-serve.sh|   2 +-
 upload-pack.c   | 138 +++-
 upload-pack.h   |   3 +
 5 files changed, 173 insertions(+), 39 deletions(-)

diff --git a/Documentation/technical/protocol-v2.txt 
b/Documentation/technical/protocol-v2.txt
index 4d5096dae..fedeb6b77 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -201,12 +201,42 @@ packet-lines:
to its base by position in pack rather than by an oid.  That is,
they can read OBJ_OFS_DELTA (ake type 6) in a packfile.
 
+shallow 
+   A client must notify the server of all objects for which it only
+   has shallow copies of (meaning that it doesn't have the parents
+   of a commit) by supplying a 'shallow ' line for each such
+   object so that the serve is aware of the limitations of the
+   client's history.
+
+deepen 
+   Request that the fetch/clone should be shallow having a commit depth of
+relative to the remote side.
+
+deepen-relative
+   Requests that the semantics of the "deepen" command be changed
+   to indicate that the depth requested is relative to the clients
+   current shallow boundary, instead of relative to the remote
+   refs.
+
+deepen-since 
+   Requests that the shallow clone/fetch should be cut at a
+   specific time, instead of depth.  Internally it's equivalent of
+   doing "rev-list --max-age=". Cannot be used with
+   "deepen".
+
+deepen-not 
+   Requests that the shallow clone/fetch should be cut at a
+   specific revision specified by '', instead of a depth.
+   Internally it's equivalent of doing "rev-list --not ".
+   Cannot be used with "deepen", but can be used with
+   "deepen-since".
+
 The response of `fetch` is broken into a number of sections separated by
 delimiter packets (0001), with each section beginning with its section
 header.
 
 output = *section
-section = (acknowledgments | packfile)
+section = (acknowledgments | shallow-info | packfile)
  (flush-pkt | delim-pkt)
 
 acknowledgments = PKT-LINE("acknowledgments" LF)
@@ -215,6 +245,11 @@ header.
 nak = PKT-LINE("NAK" LF)
 ack = PKT-LINE("ACK" SP obj-id LF)
 
+shallow-info = PKT-LINE("shallow-info" LF)
+  *PKT-LINE((shallow | unshallow) LF)
+shallow = "shallow" SP obj-id
+unshallow = "unshallow" SP obj-id
+
 packfile = PKT-LINE("packfile" LF)
   [PACKFILE]
 
@@ -247,6 +282,36 @@ header.
  determined the objects it plans to send to the client and no
  further negotiation is needed.
 
+
+shallow-info section
+   If the client has requested a shallow fetch/clone, a shallow
+   client requests a fetch or the server is shallow then the
+   server's response may include a shallow-info section.  The
+   shallow-info section will be include if (due to one of the above
+   conditions) the server needs to inform the client of any shallow
+   boundaries or adjustments to the clients already existing
+   shallow boundaries.
+
+   * Always begins with the section header "shallow-info"
+
+   * If a positive depth is requested, the server will compute the
+ set of commits which are no deeper than the desired depth.
+
+   * The server sends a "shallow obj-id" line for each commit whose
+ parents will not be sent in the following packfile.
+
+   * The server sends an "unshallow obj-id" line for each commit
+ which the client has indicated is shallow, but is no longer
+ shallow as a result of the fetch (due to its parents being
+ sent in the following packfile).
+
+   * The server MUST NOT send any "unshallow" lines for anything
+ which the client has not indicated was shallow as a part of
+ its request.
+
+   * This section is only included if a packfile section is also
+ included in the response.
+
 
 packfile section
* Always begins with the section header "packfile"
diff --git a/serve.c b/serve.c
index 05cc434cf..c3e58c1e7 100644
--- a/serve.c
+++ b/serve.c
@@ -53,7 +53,7 @@ struct protocol_capability {
 static struct protocol_capability capabilities[] = {
{ "agent", agent_advertise, NULL },
{ "ls-refs", always_advertise, ls_refs },
-   { "fetch", always_advertise, upload_pack_v2 },
+   { "fetch", upload_pack_advertise, upload_pack_v2 },
 };
 
 static void advertise_capabilities(void)
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index 202cb782d..491adc693 100755

[PATCH v3 34/35] remote-curl: implement stateless-connect command

2018-02-06 Thread Brandon Williams

Teach remote-curl the 'stateless-connect' command which is used to
establish a stateless connection with servers which support protocol
version 2.  This allows remote-curl to act as a proxy, allowing the git
client to communicate natively with a remote end, simply using
remote-curl as a pass through to convert requests to http.

Signed-off-by: Brandon Williams 
---
 remote-curl.c  | 187 -
 t/t5702-protocol-v2.sh |  41 +++
 2 files changed, 227 insertions(+), 1 deletion(-)

diff --git a/remote-curl.c b/remote-curl.c
index b4e9db85b..af431b658 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -188,7 +188,10 @@ static struct ref *parse_git_refs(struct discovery *heads, 
int for_push)
heads->version = discover_version(&reader);
switch (heads->version) {
case protocol_v2:
-   die("support for protocol v2 not implemented yet");
+   /*
+* Do nothing.  Client should run 'stateless-connect' and
+* request the refs themselves.
+*/
break;
case protocol_v1:
case protocol_v0:
@@ -1082,6 +1085,184 @@ static void parse_push(struct strbuf *buf)
free(specs);
 }
 
+struct proxy_state {
+   char *service_name;
+   char *service_url;
+   struct curl_slist *headers;
+   struct strbuf request_buffer;
+   int in;
+   int out;
+   struct packet_reader reader;
+   size_t pos;
+   int seen_flush;
+};
+
+static void proxy_state_init(struct proxy_state *p, const char *service_name,
+enum protocol_version version)
+{
+   struct strbuf buf = STRBUF_INIT;
+
+   memset(p, 0, sizeof(*p));
+   p->service_name = xstrdup(service_name);
+
+   p->in = 0;
+   p->out = 1;
+   strbuf_init(&p->request_buffer, 0);
+
+   strbuf_addf(&buf, "%s%s", url.buf, p->service_name);
+   p->service_url = strbuf_detach(&buf, NULL);
+
+   p->headers = http_copy_default_headers();
+
+   strbuf_addf(&buf, "Content-Type: application/x-%s-request", 
p->service_name);
+   p->headers = curl_slist_append(p->headers, buf.buf);
+   strbuf_reset(&buf);
+
+   strbuf_addf(&buf, "Accept: application/x-%s-result", p->service_name);
+   p->headers = curl_slist_append(p->headers, buf.buf);
+   strbuf_reset(&buf);
+
+   p->headers = curl_slist_append(p->headers, "Transfer-Encoding: 
chunked");
+
+   /* Add the Git-Protocol header */
+   if (get_protocol_http_header(version, &buf))
+   p->headers = curl_slist_append(p->headers, buf.buf);
+
+   packet_reader_init(&p->reader, p->in, NULL, 0,
+  PACKET_READ_GENTLE_ON_EOF);
+
+   strbuf_release(&buf);
+}
+
+static void proxy_state_clear(struct proxy_state *p)
+{
+   free(p->service_name);
+   free(p->service_url);
+   curl_slist_free_all(p->headers);
+   strbuf_release(&p->request_buffer);
+}
+
+static size_t proxy_in(char *buffer, size_t eltsize,
+  size_t nmemb, void *userdata)
+{
+   size_t max = eltsize * nmemb;
+   struct proxy_state *p = userdata;
+   size_t avail = p->request_buffer.len - p->pos;
+
+   if (!avail) {
+   if (p->seen_flush) {
+   p->seen_flush = 0;
+   return 0;
+   }
+
+   strbuf_reset(&p->request_buffer);
+   switch (packet_reader_read(&p->reader)) {
+   case PACKET_READ_EOF:
+   die("unexpected EOF when reading from parent process");
+   case PACKET_READ_NORMAL:
+   packet_buf_write_len(&p->request_buffer, p->reader.line,
+p->reader.pktlen);
+   break;
+   case PACKET_READ_DELIM:
+   packet_buf_delim(&p->request_buffer);
+   break;
+   case PACKET_READ_FLUSH:
+   packet_buf_flush(&p->request_buffer);
+   p->seen_flush = 1;
+   break;
+   }
+   p->pos = 0;
+   avail = p->request_buffer.len;
+   }
+
+   if (max < avail)
+   avail = max;
+   memcpy(buffer, p->request_buffer.buf + p->pos, avail);
+   p->pos += avail;
+   return avail;
+}
+
+static size_t proxy_out(char *buffer, size_t eltsize,
+   size_t nmemb, void *userdata)
+{
+   size_t size = eltsize * nmemb;
+   struct proxy_state *p = userdata;
+
+   write_or_die(p->out, buffer, size);
+   return size;
+}
+
+static int proxy_post(struct proxy_state *p)
+{
+   struct active_request_slot *slot;
+   int err;
+
+   slot = get_active_slot();
+
+   curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
+   curl_easy_setopt(slot->curl, CURLOPT_POST, 1);
+   curl_easy_setopt(slot->curl,

[PATCH v3 25/35] connect: don't request v2 when pushing

2018-02-06 Thread Brandon Williams

In order to be able to ship protocol v2 with only supporting fetch, we
need clients to not issue a request to use protocol v2 when pushing
(since the client currently doesn't know how to push using protocol v2).
This allows a client to have protocol v2 configured in
`protocol.version` and take advantage of using v2 for fetch and falling
back to using v0 when pushing while v2 for push is being designed.

We could run into issues if we didn't fall back to protocol v2 when
pushing right now.  This is because currently a server will ignore a request to
use v2 when contacting the 'receive-pack' endpoint and fall back to
using v0, but when push v2 is rolled out to servers, the 'receive-pack'
endpoint will start responding using v2.  So we don't want to get into a
state where a client is requesting to push with v2 before they actually
know how to push using v2.

Signed-off-by: Brandon Williams 
---
 connect.c  |  8 
 t/t5702-protocol-v2.sh | 22 ++
 2 files changed, 30 insertions(+)

diff --git a/connect.c b/connect.c
index dbf4def65..37a6a8935 100644
--- a/connect.c
+++ b/connect.c
@@ -1212,6 +1212,14 @@ struct child_process *git_connect(int fd[2], const char 
*url,
enum protocol protocol;
enum protocol_version version = get_protocol_version_config();
 
+   /*
+* NEEDSWORK: If we are trying to use protocol v2 and we are planning
+* to perform a push, then fallback to v0 since the client doesn't know
+* how to push yet using v2.
+*/
+   if (version == protocol_v2 && !strcmp("git-receive-pack", prog))
+   version = protocol_v0;
+
/* Without this we cannot rely on waitpid() to tell
 * what happened to our children.
 */
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 16304d1c8..60e43bcf5 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -82,6 +82,28 @@ test_expect_success 'pull with git:// using protocol v2' '
grep "fetch< version 2" log
 '
 
+test_expect_success 'push with git:// and a config of v2 does not request v2' '
+   # Till v2 for push is designed, make sure that if a client has
+   # protocol.version configured to use v2, that the client instead falls
+   # back and uses v0.
+
+   test_commit -C daemon_child three &&
+
+   # Push to another branch, as the target repository has the
+   # master branch checked out and we cannot push into it.
+   GIT_TRACE_PACKET=1 git -C daemon_child -c protocol.version=2 \
+   push origin HEAD:client_branch 2>log &&
+
+   git -C daemon_child log -1 --format=%s >actual &&
+   git -C "$daemon_parent" log -1 --format=%s client_branch >expect &&
+   test_cmp expect actual &&
+
+   # Client requested to use protocol v2
+   ! grep "push> .*\\\0\\\0version=2\\\0$" log &&
+   # Server responded using protocol v2
+   ! grep "push< version 2" log
+'
+
 stop_git_daemon
 
 # Test protocol v2 with 'file://' transport
-- 
2.16.0.rc1.238.g530d649a79-goog

[PATCH v3 28/35] transport-helper: introduce stateless-connect

2018-02-06 Thread Brandon Williams

Introduce the transport-helper capability 'stateless-connect'.  This
capability indicates that the transport-helper can be requested to run
the 'stateless-connect' command which should attempt to make a
stateless connection with a remote end.  Once established, the
connection can be used by the git client to communicate with
the remote end natively in a stateless-rpc manner as supported by
protocol v2.  This means that the client must send everything the server
needs in a single request as the client must not assume any
state-storing on the part of the server or transport.

If a stateless connection cannot be established then the remote-helper
will respond in the same manner as the 'connect' command indicating that
the client should fallback to using the dumb remote-helper commands.

Signed-off-by: Brandon Williams 
---
 transport-helper.c | 8 
 transport.c| 1 +
 transport.h| 6 ++
 3 files changed, 15 insertions(+)

diff --git a/transport-helper.c b/transport-helper.c
index c032a2a87..82eb57c4a 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -26,6 +26,7 @@ struct helper_data {
option : 1,
push : 1,
connect : 1,
+   stateless_connect : 1,
signed_tags : 1,
check_connectivity : 1,
no_disconnect_req : 1,
@@ -188,6 +189,8 @@ static struct child_process *get_helper(struct transport 
*transport)
refspecs[refspec_nr++] = xstrdup(arg);
} else if (!strcmp(capname, "connect")) {
data->connect = 1;
+   } else if (!strcmp(capname, "stateless-connect")) {
+   data->stateless_connect = 1;
} else if (!strcmp(capname, "signed-tags")) {
data->signed_tags = 1;
} else if (skip_prefix(capname, "export-marks ", &arg)) {
@@ -612,6 +615,11 @@ static int process_connect_service(struct transport 
*transport,
if (data->connect) {
strbuf_addf(&cmdbuf, "connect %s\n", name);
ret = run_connect(transport, &cmdbuf);
+   } else if (data->stateless_connect) {
+   strbuf_addf(&cmdbuf, "stateless-connect %s\n", name);
+   ret = run_connect(transport, &cmdbuf);
+   if (ret)
+   transport->stateless_rpc = 1;
}
 
strbuf_release(&cmdbuf);
diff --git a/transport.c b/transport.c
index c275f46ed..9125174f7 100644
--- a/transport.c
+++ b/transport.c
@@ -250,6 +250,7 @@ static int fetch_refs_via_pack(struct transport *transport,
data->options.check_self_contained_and_connected;
args.cloning = transport->cloning;
args.update_shallow = data->options.update_shallow;
+   args.stateless_rpc = transport->stateless_rpc;
 
if (!data->got_remote_heads)
refs_tmp = get_refs_via_connect(transport, 0, NULL);
diff --git a/transport.h b/transport.h
index 4b656f315..9eac809ee 100644
--- a/transport.h
+++ b/transport.h
@@ -55,6 +55,12 @@ struct transport {
 */
unsigned cloning : 1;
 
+   /*
+* Indicates that the transport is connected via a half-duplex
+* connection and should operate in stateless-rpc mode.
+*/
+   unsigned stateless_rpc : 1;
+
/*
 * These strings will be passed to the {pre, post}-receive hook,
 * on the remote side, if both sides support the push options 
capability.
-- 
2.16.0.rc1.238.g530d649a79-goog

[PATCH v3 21/35] fetch-pack: perform a fetch using v2

2018-02-06 Thread Brandon Williams

When communicating with a v2 server, perform a fetch by requesting the
'fetch' command.

Signed-off-by: Brandon Williams 
---
 builtin/fetch-pack.c   |   2 +-
 fetch-pack.c   | 252 -
 fetch-pack.h   |   4 +-
 t/t5702-protocol-v2.sh |  84 +
 transport.c|   7 +-
 5 files changed, 342 insertions(+), 7 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index f492e8abd..867dd3cc7 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -213,7 +213,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char 
*prefix)
}
 
ref = fetch_pack(&args, fd, conn, ref, dest, sought, nr_sought,
-&shallow, pack_lockfile_ptr);
+&shallow, pack_lockfile_ptr, protocol_v0);
if (pack_lockfile) {
printf("lock %s\n", pack_lockfile);
fflush(stdout);
diff --git a/fetch-pack.c b/fetch-pack.c
index 9f6b07ad9..4fb5805dd 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1008,6 +1008,247 @@ static struct ref *do_fetch_pack(struct fetch_pack_args 
*args,
return ref;
 }
 
+static void add_wants(const struct ref *wants, struct strbuf *req_buf)
+{
+   for ( ; wants ; wants = wants->next) {
+   const struct object_id *remote = &wants->old_oid;
+   const char *remote_hex;
+   struct object *o;
+
+   /*
+* If that object is complete (i.e. it is an ancestor of a
+* local ref), we tell them we have it but do not have to
+* tell them about its ancestors, which they already know
+* about.
+*
+* We use lookup_object here because we are only
+* interested in the case we *know* the object is
+* reachable and we have already scanned it.
+*/
+   if (((o = lookup_object(remote->hash)) != NULL) &&
+   (o->flags & COMPLETE)) {
+   continue;
+   }
+
+   remote_hex = oid_to_hex(remote);
+   packet_buf_write(req_buf, "want %s\n", remote_hex);
+   }
+}
+
+static void add_common(struct strbuf *req_buf, struct oidset *common)
+{
+   struct oidset_iter iter;
+   const struct object_id *oid;
+   oidset_iter_init(common, &iter);
+
+   while ((oid = oidset_iter_next(&iter))) {
+   packet_buf_write(req_buf, "have %s\n", oid_to_hex(oid));
+   }
+}
+
+static int add_haves(struct strbuf *req_buf, int *in_vain)
+{
+   int ret = 0;
+   int haves_added = 0;
+   const struct object_id *oid;
+
+   while ((oid = get_rev())) {
+   packet_buf_write(req_buf, "have %s\n", oid_to_hex(oid));
+   if (++haves_added >= INITIAL_FLUSH)
+   break;
+   };
+
+   *in_vain += haves_added;
+   if (!haves_added || *in_vain >= MAX_IN_VAIN) {
+   /* Send Done */
+   packet_buf_write(req_buf, "done\n");
+   ret = 1;
+   }
+
+   return ret;
+}
+
+static int send_fetch_request(int fd_out, const struct fetch_pack_args *args,
+ const struct ref *wants, struct oidset *common,
+ int *in_vain)
+{
+   int ret = 0;
+   struct strbuf req_buf = STRBUF_INIT;
+
+   if (server_supports_v2("fetch", 1))
+   packet_buf_write(&req_buf, "command=fetch");
+   if (server_supports_v2("agent", 0))
+   packet_buf_write(&req_buf, "agent=%s", 
git_user_agent_sanitized());
+
+   packet_buf_delim(&req_buf);
+   if (args->use_thin_pack)
+   packet_buf_write(&req_buf, "thin-pack");
+   if (args->no_progress)
+   packet_buf_write(&req_buf, "no-progress");
+   if (args->include_tag)
+   packet_buf_write(&req_buf, "include-tag");
+   if (prefer_ofs_delta)
+   packet_buf_write(&req_buf, "ofs-delta");
+
+   /* add wants */
+   add_wants(wants, &req_buf);
+
+   /* Add all of the common commits we've found in previous rounds */
+   add_common(&req_buf, common);
+
+   /* Add initial haves */
+   ret = add_haves(&req_buf, in_vain);
+
+   /* Send request */
+   packet_buf_flush(&req_buf);
+   write_or_die(fd_out, req_buf.buf, req_buf.len);
+
+   strbuf_release(&req_buf);
+   return ret;
+}
+
+/*
+ * Processes a section header in a server's response and checks if it matches
+ * `section`.  If the value of `peek` is 1, the header line will be peeked (and
+ * not consumed); if 0, the line will be consumed and the function will die if
+ * the section header doesn't match what was expected.
+ */
+static int process_section_header(struct packet_reader *reader,
+ const char *section, int peek)
+{
+   int ret;
+
+   if (packet_reader_peek(reader) !

[PATCH v3 33/35] http: don't always add Git-Protocol header

2018-02-06 Thread Brandon Williams

Instead of always sending the Git-Protocol header with the configured
version with every http request, explicitly send it when discovering
refs and then only send it on subsequent http requests if the server
understood the version requested.

Signed-off-by: Brandon Williams 
---
 http.c| 17 -
 remote-curl.c | 33 +
 2 files changed, 33 insertions(+), 17 deletions(-)

diff --git a/http.c b/http.c
index e1757d62b..8f1129ac7 100644
--- a/http.c
+++ b/http.c
@@ -904,21 +904,6 @@ static void set_from_env(const char **var, const char 
*envname)
*var = val;
 }
 
-static void protocol_http_header(void)
-{
-   if (get_protocol_version_config() > 0) {
-   struct strbuf protocol_header = STRBUF_INIT;
-
-   strbuf_addf(&protocol_header, GIT_PROTOCOL_HEADER ": 
version=%d",
-   get_protocol_version_config());
-
-
-   extra_http_headers = curl_slist_append(extra_http_headers,
-  protocol_header.buf);
-   strbuf_release(&protocol_header);
-   }
-}
-
 void http_init(struct remote *remote, const char *url, int proactive_auth)
 {
char *low_speed_limit;
@@ -949,8 +934,6 @@ void http_init(struct remote *remote, const char *url, int 
proactive_auth)
if (remote)
var_override(&http_proxy_authmethod, 
remote->http_proxy_authmethod);
 
-   protocol_http_header();
-
pragma_header = curl_slist_append(http_copy_default_headers(),
"Pragma: no-cache");
no_pragma_header = curl_slist_append(http_copy_default_headers(),
diff --git a/remote-curl.c b/remote-curl.c
index c54035843..b4e9db85b 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -291,6 +291,19 @@ static int show_http_message(struct strbuf *type, struct 
strbuf *charset,
return 0;
 }
 
+static int get_protocol_http_header(enum protocol_version version,
+   struct strbuf *header)
+{
+   if (version > 0) {
+   strbuf_addf(header, GIT_PROTOCOL_HEADER ": version=%d",
+   version);
+
+   return 1;
+   }
+
+   return 0;
+}
+
 static struct discovery *discover_refs(const char *service, int for_push)
 {
struct strbuf exp = STRBUF_INIT;
@@ -299,6 +312,8 @@ static struct discovery *discover_refs(const char *service, 
int for_push)
struct strbuf buffer = STRBUF_INIT;
struct strbuf refs_url = STRBUF_INIT;
struct strbuf effective_url = STRBUF_INIT;
+   struct strbuf protocol_header = STRBUF_INIT;
+   struct string_list extra_headers = STRING_LIST_INIT_DUP;
struct discovery *last = last_discovery;
int http_ret, maybe_smart = 0;
struct http_get_options http_options;
@@ -318,11 +333,16 @@ static struct discovery *discover_refs(const char 
*service, int for_push)
strbuf_addf(&refs_url, "service=%s", service);
}
 
+   /* Add the extra Git-Protocol header */
+   if (get_protocol_http_header(get_protocol_version_config(), 
&protocol_header))
+   string_list_append(&extra_headers, protocol_header.buf);
+
memset(&http_options, 0, sizeof(http_options));
http_options.content_type = &type;
http_options.charset = &charset;
http_options.effective_url = &effective_url;
http_options.base_url = &url;
+   http_options.extra_headers = &extra_headers;
http_options.initial_request = 1;
http_options.no_cache = 1;
http_options.keep_error = 1;
@@ -389,6 +409,8 @@ static struct discovery *discover_refs(const char *service, 
int for_push)
strbuf_release(&charset);
strbuf_release(&effective_url);
strbuf_release(&buffer);
+   strbuf_release(&protocol_header);
+   string_list_clear(&extra_headers, 0);
last_discovery = last;
return last;
 }
@@ -425,6 +447,7 @@ struct rpc_state {
char *service_url;
char *hdr_content_type;
char *hdr_accept;
+   char *protocol_header;
char *buf;
size_t alloc;
size_t len;
@@ -611,6 +634,10 @@ static int post_rpc(struct rpc_state *rpc)
headers = curl_slist_append(headers, needs_100_continue ?
"Expect: 100-continue" : "Expect:");
 
+   /* Add the extra Git-Protocol header */
+   if (rpc->protocol_header)
+   headers = curl_slist_append(headers, rpc->protocol_header);
+
 retry:
slot = get_active_slot();
 
@@ -751,6 +778,11 @@ static int rpc_service(struct rpc_state *rpc, struct 
discovery *heads)
strbuf_addf(&buf, "Accept: application/x-%s-result", svc);
rpc->hdr_accept = strbuf_detach(&buf, NULL);
 
+   if (get_protocol_http_header(heads->version, &buf))
+   rpc->protocol_header = strbuf_detach(&buf, NULL);
+   else
+   rpc->protocol_header = NULL;
+
whil

[PATCH v3 26/35] transport-helper: remove name parameter

2018-02-06 Thread Brandon Williams

Commit 266f1fdfa (transport-helper: be quiet on read errors from
helpers, 2013-06-21) removed a call to 'die()' which printed the name of
the remote helper passed in to the 'recvline_fh()' function using the
'name' parameter.  Once the call to 'die()' was removed the parameter
was no longer necessary but wasn't removed.  Clean up 'recvline_fh()'
parameter list by removing the 'name' parameter.

Signed-off-by: Brandon Williams 
---
 transport-helper.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/transport-helper.c b/transport-helper.c
index 4c334b5ee..d72155768 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -49,7 +49,7 @@ static void sendline(struct helper_data *helper, struct 
strbuf *buffer)
die_errno("Full write to remote helper failed");
 }
 
-static int recvline_fh(FILE *helper, struct strbuf *buffer, const char *name)
+static int recvline_fh(FILE *helper, struct strbuf *buffer)
 {
strbuf_reset(buffer);
if (debug)
@@ -67,7 +67,7 @@ static int recvline_fh(FILE *helper, struct strbuf *buffer, 
const char *name)
 
 static int recvline(struct helper_data *helper, struct strbuf *buffer)
 {
-   return recvline_fh(helper->out, buffer, helper->name);
+   return recvline_fh(helper->out, buffer);
 }
 
 static void write_constant(int fd, const char *str)
@@ -586,7 +586,7 @@ static int process_connect_service(struct transport 
*transport,
goto exit;
 
sendline(data, &cmdbuf);
-   if (recvline_fh(input, &cmdbuf, name))
+   if (recvline_fh(input, &cmdbuf))
exit(128);
 
if (!strcmp(cmdbuf.buf, "")) {
-- 
2.16.0.rc1.238.g530d649a79-goog

[PATCH v3 08/35] connect: discover protocol version outside of get_remote_heads

2018-02-06 Thread Brandon Williams

In order to prepare for the addition of protocol_v2 push the protocol
version discovery outside of 'get_remote_heads()'.  This will allow for
keeping the logic for processing the reference advertisement for
protocol_v1 and protocol_v0 separate from the logic for protocol_v2.

Signed-off-by: Brandon Williams 
---
 builtin/fetch-pack.c | 16 +++-
 builtin/send-pack.c  | 17 +++--
 connect.c| 27 ++-
 connect.h|  3 +++
 remote-curl.c| 20 ++--
 remote.h |  5 +++--
 transport.c  | 24 +++-
 7 files changed, 83 insertions(+), 29 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 366b9d13f..85d4faf76 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -4,6 +4,7 @@
 #include "remote.h"
 #include "connect.h"
 #include "sha1-array.h"
+#include "protocol.h"
 
 static const char fetch_pack_usage[] =
 "git fetch-pack [--all] [--stdin] [--quiet | -q] [--keep | -k] [--thin] "
@@ -52,6 +53,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char 
*prefix)
struct fetch_pack_args args;
struct oid_array shallow = OID_ARRAY_INIT;
struct string_list deepen_not = STRING_LIST_INIT_DUP;
+   struct packet_reader reader;
 
packet_trace_identity("fetch-pack");
 
@@ -193,7 +195,19 @@ int cmd_fetch_pack(int argc, const char **argv, const char 
*prefix)
if (!conn)
return args.diag_url ? 0 : 1;
}
-   get_remote_heads(fd[0], NULL, 0, &ref, 0, NULL, &shallow);
+
+   packet_reader_init(&reader, fd[0], NULL, 0,
+  PACKET_READ_CHOMP_NEWLINE |
+  PACKET_READ_GENTLE_ON_EOF);
+
+   switch (discover_version(&reader)) {
+   case protocol_v1:
+   case protocol_v0:
+   get_remote_heads(&reader, &ref, 0, NULL, &shallow);
+   break;
+   case protocol_unknown_version:
+   BUG("unknown protocol version");
+   }
 
ref = fetch_pack(&args, fd, conn, ref, dest, sought, nr_sought,
 &shallow, pack_lockfile_ptr);
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index fc4f0bb5f..83cb125a6 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -14,6 +14,7 @@
 #include "sha1-array.h"
 #include "gpg-interface.h"
 #include "gettext.h"
+#include "protocol.h"
 
 static const char * const send_pack_usage[] = {
N_("git send-pack [--all | --mirror] [--dry-run] [--force] "
@@ -154,6 +155,7 @@ int cmd_send_pack(int argc, const char **argv, const char 
*prefix)
int progress = -1;
int from_stdin = 0;
struct push_cas_option cas = {0};
+   struct packet_reader reader;
 
struct option options[] = {
OPT__VERBOSITY(&verbose),
@@ -256,8 +258,19 @@ int cmd_send_pack(int argc, const char **argv, const char 
*prefix)
args.verbose ? CONNECT_VERBOSE : 0);
}
 
-   get_remote_heads(fd[0], NULL, 0, &remote_refs, REF_NORMAL,
-&extra_have, &shallow);
+   packet_reader_init(&reader, fd[0], NULL, 0,
+  PACKET_READ_CHOMP_NEWLINE |
+  PACKET_READ_GENTLE_ON_EOF);
+
+   switch (discover_version(&reader)) {
+   case protocol_v1:
+   case protocol_v0:
+   get_remote_heads(&reader, &remote_refs, REF_NORMAL,
+&extra_have, &shallow);
+   break;
+   case protocol_unknown_version:
+   BUG("unknown protocol version");
+   }
 
transport_verify_remote_names(nr_refspecs, refspecs);
 
diff --git a/connect.c b/connect.c
index 00e90075c..db3c9d24c 100644
--- a/connect.c
+++ b/connect.c
@@ -62,7 +62,7 @@ static void die_initial_contact(int unexpected)
  "and the repository exists."));
 }
 
-static enum protocol_version discover_version(struct packet_reader *reader)
+enum protocol_version discover_version(struct packet_reader *reader)
 {
enum protocol_version version = protocol_unknown_version;
 
@@ -234,7 +234,7 @@ enum get_remote_heads_state {
 /*
  * Read all the refs from the other end
  */
-struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
+struct ref **get_remote_heads(struct packet_reader *reader,
  struct ref **list, unsigned int flags,
  struct oid_array *extra_have,
  struct oid_array *shallow_points)
@@ -242,24 +242,17 @@ struct ref **get_remote_heads(int in, char *src_buf, 
size_t src_len,
struct ref **orig_list = list;
int len = 0;
enum get_remote_heads_state state = EXPECTING_FIRST_REF;
-   struct packet_reader reader;
const char *arg;
 
-   packet_reader_init(&reader, in, src_buf, src_len,
-  PACKET_READ_CHOMP_NEWLINE |
-

[PATCH v3 05/35] upload-pack: factor out processing lines

2018-02-06 Thread Brandon Williams

Factor out the logic for processing shallow, deepen, deepen_since, and
deepen_not lines into their own functions to simplify the
'receive_needs()' function in addition to making it easier to reuse some
of this logic when implementing protocol_v2.

Signed-off-by: Brandon Williams 
---
 upload-pack.c | 113 ++
 1 file changed, 74 insertions(+), 39 deletions(-)

diff --git a/upload-pack.c b/upload-pack.c
index 2ad73a98b..1e8a9e1ca 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -724,6 +724,75 @@ static void deepen_by_rev_list(int ac, const char **av,
packet_flush(1);
 }
 
+static int process_shallow(const char *line, struct object_array *shallows)
+{
+   const char *arg;
+   if (skip_prefix(line, "shallow ", &arg)) {
+   struct object_id oid;
+   struct object *object;
+   if (get_oid_hex(arg, &oid))
+   die("invalid shallow line: %s", line);
+   object = parse_object(&oid);
+   if (!object)
+   return 1;
+   if (object->type != OBJ_COMMIT)
+   die("invalid shallow object %s", oid_to_hex(&oid));
+   if (!(object->flags & CLIENT_SHALLOW)) {
+   object->flags |= CLIENT_SHALLOW;
+   add_object_array(object, NULL, shallows);
+   }
+   return 1;
+   }
+
+   return 0;
+}
+
+static int process_deepen(const char *line, int *depth)
+{
+   const char *arg;
+   if (skip_prefix(line, "deepen ", &arg)) {
+   char *end = NULL;
+   *depth = (int)strtol(arg, &end, 0);
+   if (!end || *end || *depth <= 0)
+   die("Invalid deepen: %s", line);
+   return 1;
+   }
+
+   return 0;
+}
+
+static int process_deepen_since(const char *line, timestamp_t *deepen_since, 
int *deepen_rev_list)
+{
+   const char *arg;
+   if (skip_prefix(line, "deepen-since ", &arg)) {
+   char *end = NULL;
+   *deepen_since = parse_timestamp(arg, &end, 0);
+   if (!end || *end || !deepen_since ||
+   /* revisions.c's max_age -1 is special */
+   *deepen_since == -1)
+   die("Invalid deepen-since: %s", line);
+   *deepen_rev_list = 1;
+   return 1;
+   }
+   return 0;
+}
+
+static int process_deepen_not(const char *line, struct string_list 
*deepen_not, int *deepen_rev_list)
+{
+   const char *arg;
+   if (skip_prefix(line, "deepen-not ", &arg)) {
+   char *ref = NULL;
+   struct object_id oid;
+   if (expand_ref(arg, strlen(arg), &oid, &ref) != 1)
+   die("git upload-pack: ambiguous deepen-not: %s", line);
+   string_list_append(deepen_not, ref);
+   free(ref);
+   *deepen_rev_list = 1;
+   return 1;
+   }
+   return 0;
+}
+
 static void receive_needs(void)
 {
struct object_array shallows = OBJECT_ARRAY_INIT;
@@ -745,49 +814,15 @@ static void receive_needs(void)
if (!line)
break;
 
-   if (skip_prefix(line, "shallow ", &arg)) {
-   struct object_id oid;
-   struct object *object;
-   if (get_oid_hex(arg, &oid))
-   die("invalid shallow line: %s", line);
-   object = parse_object(&oid);
-   if (!object)
-   continue;
-   if (object->type != OBJ_COMMIT)
-   die("invalid shallow object %s", 
oid_to_hex(&oid));
-   if (!(object->flags & CLIENT_SHALLOW)) {
-   object->flags |= CLIENT_SHALLOW;
-   add_object_array(object, NULL, &shallows);
-   }
+   if (process_shallow(line, &shallows))
continue;
-   }
-   if (skip_prefix(line, "deepen ", &arg)) {
-   char *end = NULL;
-   depth = strtol(arg, &end, 0);
-   if (!end || *end || depth <= 0)
-   die("Invalid deepen: %s", line);
+   if (process_deepen(line, &depth))
continue;
-   }
-   if (skip_prefix(line, "deepen-since ", &arg)) {
-   char *end = NULL;
-   deepen_since = parse_timestamp(arg, &end, 0);
-   if (!end || *end || !deepen_since ||
-   /* revisions.c's max_age -1 is special */
-   deepen_since == -1)
-   die("Invalid deepen-since: %s", line);
-   deepen_rev_list = 1;
+   if

[PATCH v3 18/35] fetch: pass ref patterns when fetching

2018-02-06 Thread Brandon Williams

Construct a list of ref patterns to be passed to
'transport_get_remote_refs()' from the refspec to be used during the
fetch.  This list of ref patterns will be used to allow the server to
filter the ref advertisement when communicating using protocol v2.

Signed-off-by: Brandon Williams 
---
 builtin/fetch.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 850382f55..8128450bf 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -332,11 +332,21 @@ static struct ref *get_ref_map(struct transport 
*transport,
struct ref *rm;
struct ref *ref_map = NULL;
struct ref **tail = &ref_map;
+   struct argv_array ref_patterns = ARGV_ARRAY_INIT;
 
/* opportunistically-updated references: */
struct ref *orefs = NULL, **oref_tail = &orefs;
 
-   const struct ref *remote_refs = transport_get_remote_refs(transport, 
NULL);
+   const struct ref *remote_refs;
+
+   for (i = 0; i < refspec_count; i++) {
+   if (!refspecs[i].exact_sha1)
+   argv_array_push(&ref_patterns, refspecs[i].src);
+   }
+
+   remote_refs = transport_get_remote_refs(transport, &ref_patterns);
+
+   argv_array_clear(&ref_patterns);
 
if (refspec_count) {
struct refspec *fetch_refspec;
-- 
2.16.0.rc1.238.g530d649a79-goog

[PATCH v3 03/35] pkt-line: add delim packet support

2018-02-06 Thread Brandon Williams

One of the design goals of protocol-v2 is to improve the semantics of
flush packets.  Currently in protocol-v1, flush packets are used both to
indicate a break in a list of packet lines as well as an indication that
one side has finished speaking.  This makes it particularly difficult
to implement proxies as a proxy would need to completely understand git
protocol instead of simply looking for a flush packet.

To do this, introduce the special deliminator packet '0001'.  A delim
packet can then be used as a deliminator between lists of packet lines
while flush packets can be reserved to indicate the end of a response.

Signed-off-by: Brandon Williams 
---
 pkt-line.c | 17 +
 pkt-line.h |  3 +++
 2 files changed, 20 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index 4fc9ad4b0..726e109ca 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -91,6 +91,12 @@ void packet_flush(int fd)
write_or_die(fd, "", 4);
 }
 
+void packet_delim(int fd)
+{
+   packet_trace("0001", 4, 1);
+   write_or_die(fd, "0001", 4);
+}
+
 int packet_flush_gently(int fd)
 {
packet_trace("", 4, 1);
@@ -105,6 +111,12 @@ void packet_buf_flush(struct strbuf *buf)
strbuf_add(buf, "", 4);
 }
 
+void packet_buf_delim(struct strbuf *buf)
+{
+   packet_trace("0001", 4, 1);
+   strbuf_add(buf, "0001", 4);
+}
+
 static void set_packet_header(char *buf, const int size)
 {
static char hexchar[] = "0123456789abcdef";
@@ -297,6 +309,9 @@ enum packet_read_status packet_read_with_status(int fd, 
char **src_buffer, size_
} else if (!len) {
packet_trace("", 4, 0);
return PACKET_READ_FLUSH;
+   } else if (len == 1) {
+   packet_trace("0001", 4, 0);
+   return PACKET_READ_DELIM;
} else if (len < 4) {
die("protocol error: bad line length %d", len);
}
@@ -333,6 +348,7 @@ int packet_read(int fd, char **src_buffer, size_t *src_len,
break;
case PACKET_READ_NORMAL:
break;
+   case PACKET_READ_DELIM:
case PACKET_READ_FLUSH:
pktlen = 0;
break;
@@ -445,6 +461,7 @@ enum packet_read_status packet_reader_read(struct 
packet_reader *reader)
case PACKET_READ_NORMAL:
reader->line = reader->buffer;
break;
+   case PACKET_READ_DELIM:
case PACKET_READ_FLUSH:
reader->pktlen = 0;
reader->line = NULL;
diff --git a/pkt-line.h b/pkt-line.h
index 7d9f0e537..16fe8bdbf 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -20,8 +20,10 @@
  * side can't, we stay with pure read/write interfaces.
  */
 void packet_flush(int fd);
+void packet_delim(int fd);
 void packet_write_fmt(int fd, const char *fmt, ...) __attribute__((format 
(printf, 2, 3)));
 void packet_buf_flush(struct strbuf *buf);
+void packet_buf_delim(struct strbuf *buf);
 void packet_write(int fd_out, const char *buf, size_t size);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) 
__attribute__((format (printf, 2, 3)));
 int packet_flush_gently(int fd);
@@ -75,6 +77,7 @@ enum packet_read_status {
PACKET_READ_EOF = -1,
PACKET_READ_NORMAL,
PACKET_READ_FLUSH,
+   PACKET_READ_DELIM,
 };
 enum packet_read_status packet_read_with_status(int fd, char **src_buffer, 
size_t *src_len,
char *buffer, unsigned size, 
int *pktlen,
-- 
2.16.0.rc1.238.g530d649a79-goog

[PATCH v3 09/35] transport: store protocol version

2018-02-06 Thread Brandon Williams

Once protocol_v2 is introduced requesting a fetch or a push will need to
be handled differently depending on the protocol version.  Store the
protocol version the server is speaking in 'struct git_transport_data'
and use it to determine what to do in the case of a fetch or a push.

Signed-off-by: Brandon Williams 
---
 transport.c | 35 ++-
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/transport.c b/transport.c
index 63c3dbab9..2378dcb38 100644
--- a/transport.c
+++ b/transport.c
@@ -118,6 +118,7 @@ struct git_transport_data {
struct child_process *conn;
int fd[2];
unsigned got_remote_heads : 1;
+   enum protocol_version version;
struct oid_array extra_have;
struct oid_array shallow;
 };
@@ -200,7 +201,8 @@ static struct ref *get_refs_via_connect(struct transport 
*transport, int for_pus
   PACKET_READ_CHOMP_NEWLINE |
   PACKET_READ_GENTLE_ON_EOF);
 
-   switch (discover_version(&reader)) {
+   data->version = discover_version(&reader);
+   switch (data->version) {
case protocol_v1:
case protocol_v0:
get_remote_heads(&reader, &refs,
@@ -221,7 +223,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 {
int ret = 0;
struct git_transport_data *data = transport->data;
-   struct ref *refs;
+   struct ref *refs = NULL;
char *dest = xstrdup(transport->url);
struct fetch_pack_args args;
struct ref *refs_tmp = NULL;
@@ -247,10 +249,18 @@ static int fetch_refs_via_pack(struct transport 
*transport,
if (!data->got_remote_heads)
refs_tmp = get_refs_via_connect(transport, 0);
 
-   refs = fetch_pack(&args, data->fd, data->conn,
- refs_tmp ? refs_tmp : transport->remote_refs,
- dest, to_fetch, nr_heads, &data->shallow,
- &transport->pack_lockfile);
+   switch (data->version) {
+   case protocol_v1:
+   case protocol_v0:
+   refs = fetch_pack(&args, data->fd, data->conn,
+ refs_tmp ? refs_tmp : transport->remote_refs,
+ dest, to_fetch, nr_heads, &data->shallow,
+ &transport->pack_lockfile);
+   break;
+   case protocol_unknown_version:
+   BUG("unknown protocol version");
+   }
+
close(data->fd[0]);
close(data->fd[1]);
if (finish_connect(data->conn))
@@ -549,7 +559,7 @@ static int git_transport_push(struct transport *transport, 
struct ref *remote_re
 {
struct git_transport_data *data = transport->data;
struct send_pack_args args;
-   int ret;
+   int ret = 0;
 
if (!data->got_remote_heads)
get_refs_via_connect(transport, 1);
@@ -574,8 +584,15 @@ static int git_transport_push(struct transport *transport, 
struct ref *remote_re
else
args.push_cert = SEND_PACK_PUSH_CERT_NEVER;
 
-   ret = send_pack(&args, data->fd, data->conn, remote_refs,
-   &data->extra_have);
+   switch (data->version) {
+   case protocol_v1:
+   case protocol_v0:
+   ret = send_pack(&args, data->fd, data->conn, remote_refs,
+   &data->extra_have);
+   break;
+   case protocol_unknown_version:
+   BUG("unknown protocol version");
+   }
 
close(data->fd[1]);
close(data->fd[0]);
-- 
2.16.0.rc1.238.g530d649a79-goog

[PATCH v3 11/35] test-pkt-line: introduce a packet-line test helper

2018-02-06 Thread Brandon Williams

Introduce a packet-line test helper which can either pack or unpack an
input stream into packet-lines and writes out the result to stdout.

Signed-off-by: Brandon Williams 
---
 Makefile |  1 +
 t/helper/test-pkt-line.c | 64 
 2 files changed, 65 insertions(+)
 create mode 100644 t/helper/test-pkt-line.c

diff --git a/Makefile b/Makefile
index b7ccc05fa..3b849c060 100644
--- a/Makefile
+++ b/Makefile
@@ -669,6 +669,7 @@ TEST_PROGRAMS_NEED_X += test-mktemp
 TEST_PROGRAMS_NEED_X += test-online-cpus
 TEST_PROGRAMS_NEED_X += test-parse-options
 TEST_PROGRAMS_NEED_X += test-path-utils
+TEST_PROGRAMS_NEED_X += test-pkt-line
 TEST_PROGRAMS_NEED_X += test-prio-queue
 TEST_PROGRAMS_NEED_X += test-read-cache
 TEST_PROGRAMS_NEED_X += test-write-cache
diff --git a/t/helper/test-pkt-line.c b/t/helper/test-pkt-line.c
new file mode 100644
index 0..0f19e53c7
--- /dev/null
+++ b/t/helper/test-pkt-line.c
@@ -0,0 +1,64 @@
+#include "pkt-line.h"
+
+static void pack_line(const char *line)
+{
+   if (!strcmp(line, "") || !strcmp(line, "\n"))
+   packet_flush(1);
+   else if (!strcmp(line, "0001") || !strcmp(line, "0001\n"))
+   packet_delim(1);
+   else
+   packet_write_fmt(1, "%s", line);
+}
+
+static void pack(int argc, const char **argv)
+{
+   if (argc) { /* read from argv */
+   int i;
+   for (i = 0; i < argc; i++)
+   pack_line(argv[i]);
+   } else { /* read from stdin */
+   char line[LARGE_PACKET_MAX];
+   while (fgets(line, sizeof(line), stdin)) {
+   pack_line(line);
+   }
+   }
+}
+
+static void unpack(void)
+{
+   struct packet_reader reader;
+   packet_reader_init(&reader, 0, NULL, 0,
+  PACKET_READ_GENTLE_ON_EOF |
+  PACKET_READ_CHOMP_NEWLINE);
+
+   while (packet_reader_read(&reader) != PACKET_READ_EOF) {
+   switch (reader.status) {
+   case PACKET_READ_EOF:
+   break;
+   case PACKET_READ_NORMAL:
+   printf("%s\n", reader.line);
+   break;
+   case PACKET_READ_FLUSH:
+   printf("\n");
+   break;
+   case PACKET_READ_DELIM:
+   printf("0001\n");
+   break;
+   }
+   }
+}
+
+int cmd_main(int argc, const char **argv)
+{
+   if (argc < 2)
+   die("too few arguments");
+
+   if (!strcmp(argv[1], "pack"))
+   pack(argc - 2, argv + 2);
+   else if (!strcmp(argv[1], "unpack"))
+   unpack();
+   else
+   die("invalid argument '%s'", argv[1]);
+
+   return 0;
+}
-- 
2.16.0.rc1.238.g530d649a79-goog

[PATCH v3 13/35] ls-refs: introduce ls-refs server command

2018-02-06 Thread Brandon Williams

Introduce the ls-refs server command.  In protocol v2, the ls-refs
command is used to request the ref advertisement from the server.  Since
it is a command which can be requested (as opposed to mandatory in v1),
a client can sent a number of parameters in its request to limit the ref
advertisement based on provided ref-patterns.

Signed-off-by: Brandon Williams 
---
 Documentation/technical/protocol-v2.txt |  32 +
 Makefile|   1 +
 ls-refs.c   |  96 ++
 ls-refs.h   |   9 +++
 serve.c |   8 +++
 t/t5701-git-serve.sh| 115 
 6 files changed, 261 insertions(+)
 create mode 100644 ls-refs.c
 create mode 100644 ls-refs.h

diff --git a/Documentation/technical/protocol-v2.txt 
b/Documentation/technical/protocol-v2.txt
index f87372f9b..ef81df868 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -112,3 +112,35 @@ printable ASCII characters except space (i.e., the byte 
range 32 < x <
 "git/1.8.3.1"). The agent strings are purely informative for statistics
 and debugging purposes, and MUST NOT be used to programmatically assume
 the presence or absence of particular features.
+
+ ls-refs
+-
+
+`ls-refs` is the command used to request a reference advertisement in v2.
+Unlike the current reference advertisement, ls-refs takes in parameters
+which can be used to limit the refs sent from the server.
+
+Additional features not supported in the base command will be advertised
+as the value of the command in the capability advertisement in the form
+of a space separated list of features, e.g.  "=
+".
+
+ls-refs takes in the following parameters wrapped in packet-lines:
+
+symrefs
+   In addition to the object pointed by it, show the underlying ref
+   pointed by it when showing a symbolic ref.
+peel
+   Show peeled tags.
+ref-pattern 
+   When specified, only references matching the one of the provided
+   patterns are displayed.
+
+The output of ls-refs is as follows:
+
+output = *ref
+flush-pkt
+ref = PKT-LINE(obj-id SP refname *(SP ref-attribute) LF)
+ref-attribute = (symref | peeled)
+symref = "symref-target:" symref-target
+peeled = "peeled:" obj-id
diff --git a/Makefile b/Makefile
index 18c255428..e50927cfb 100644
--- a/Makefile
+++ b/Makefile
@@ -825,6 +825,7 @@ LIB_OBJS += list-objects-filter-options.o
 LIB_OBJS += ll-merge.o
 LIB_OBJS += lockfile.o
 LIB_OBJS += log-tree.o
+LIB_OBJS += ls-refs.o
 LIB_OBJS += mailinfo.o
 LIB_OBJS += mailmap.o
 LIB_OBJS += match-trees.o
diff --git a/ls-refs.c b/ls-refs.c
new file mode 100644
index 0..70682b4f7
--- /dev/null
+++ b/ls-refs.c
@@ -0,0 +1,96 @@
+#include "cache.h"
+#include "repository.h"
+#include "refs.h"
+#include "remote.h"
+#include "argv-array.h"
+#include "ls-refs.h"
+#include "pkt-line.h"
+
+struct ls_refs_data {
+   unsigned peel;
+   unsigned symrefs;
+   struct argv_array patterns;
+};
+
+/*
+ * Check if one of the patterns matches the tail part of the ref.
+ * If no patterns were provided, all refs match.
+ */
+static int ref_match(const struct argv_array *patterns, const char *refname)
+{
+   char *pathbuf;
+   int i;
+
+   if (!patterns->argc)
+   return 1; /* no restriction */
+
+   pathbuf = xstrfmt("/%s", refname);
+   for (i = 0; i < patterns->argc; i++) {
+   if (!wildmatch(patterns->argv[i], pathbuf, 0)) {
+   free(pathbuf);
+   return 1;
+   }
+   }
+   free(pathbuf);
+   return 0;
+}
+
+static int send_ref(const char *refname, const struct object_id *oid,
+   int flag, void *cb_data)
+{
+   struct ls_refs_data *data = cb_data;
+   const char *refname_nons = strip_namespace(refname);
+   struct strbuf refline = STRBUF_INIT;
+
+   if (!ref_match(&data->patterns, refname))
+   return 0;
+
+   strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
+   if (data->symrefs && flag & REF_ISSYMREF) {
+   struct object_id unused;
+   const char *symref_target = resolve_ref_unsafe(refname, 0,
+  &unused,
+  &flag);
+
+   if (!symref_target)
+   die("'%s' is a symref but it is not?", refname);
+
+   strbuf_addf(&refline, " symref-target:%s", symref_target);
+   }
+
+   if (data->peel) {
+   struct object_id peeled;
+   if (!peel_ref(refname, &peeled))
+   strbuf_addf(&refline, " peeled:%s", 
oid_to_hex(&peeled));
+   }
+
+   strbuf_addch(&refline, '\n');
+   packet_write(1, refline.buf, refline.len);
+
+

[PATCH v3 10/35] protocol: introduce enum protocol_version value protocol_v2

2018-02-06 Thread Brandon Williams

Introduce protocol_v2, a new value for 'enum protocol_version'.
Subsequent patches will fill in the implementation of protocol_v2.

Signed-off-by: Brandon Williams 
---
 builtin/fetch-pack.c   | 3 +++
 builtin/receive-pack.c | 6 ++
 builtin/send-pack.c| 3 +++
 builtin/upload-pack.c  | 7 +++
 connect.c  | 3 +++
 protocol.c | 2 ++
 protocol.h | 1 +
 remote-curl.c  | 3 +++
 transport.c| 9 +
 9 files changed, 37 insertions(+)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 85d4faf76..f492e8abd 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -201,6 +201,9 @@ int cmd_fetch_pack(int argc, const char **argv, const char 
*prefix)
   PACKET_READ_GENTLE_ON_EOF);
 
switch (discover_version(&reader)) {
+   case protocol_v2:
+   die("support for protocol v2 not implemented yet");
+   break;
case protocol_v1:
case protocol_v0:
get_remote_heads(&reader, &ref, 0, NULL, &shallow);
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index b7ce7c7f5..3656e94fd 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1963,6 +1963,12 @@ int cmd_receive_pack(int argc, const char **argv, const 
char *prefix)
unpack_limit = receive_unpack_limit;
 
switch (determine_protocol_version_server()) {
+   case protocol_v2:
+   /*
+* push support for protocol v2 has not been implemented yet,
+* so ignore the request to use v2 and fallback to using v0.
+*/
+   break;
case protocol_v1:
/*
 * v1 is just the original protocol with a version string,
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index 83cb125a6..b5427f75e 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -263,6 +263,9 @@ int cmd_send_pack(int argc, const char **argv, const char 
*prefix)
   PACKET_READ_GENTLE_ON_EOF);
 
switch (discover_version(&reader)) {
+   case protocol_v2:
+   die("support for protocol v2 not implemented yet");
+   break;
case protocol_v1:
case protocol_v0:
get_remote_heads(&reader, &remote_refs, REF_NORMAL,
diff --git a/builtin/upload-pack.c b/builtin/upload-pack.c
index 2cb5cb35b..8d53e9794 100644
--- a/builtin/upload-pack.c
+++ b/builtin/upload-pack.c
@@ -47,6 +47,13 @@ int cmd_upload_pack(int argc, const char **argv, const char 
*prefix)
die("'%s' does not appear to be a git repository", dir);
 
switch (determine_protocol_version_server()) {
+   case protocol_v2:
+   /*
+* fetch support for protocol v2 has not been implemented yet,
+* so ignore the request to use v2 and fallback to using v0.
+*/
+   upload_pack(&opts);
+   break;
case protocol_v1:
/*
 * v1 is just the original protocol with a version string,
diff --git a/connect.c b/connect.c
index db3c9d24c..f2157a821 100644
--- a/connect.c
+++ b/connect.c
@@ -84,6 +84,9 @@ enum protocol_version discover_version(struct packet_reader 
*reader)
 
/* Maybe process capabilities here, at least for v2 */
switch (version) {
+   case protocol_v2:
+   die("support for protocol v2 not implemented yet");
+   break;
case protocol_v1:
/* Read the peeked version line */
packet_reader_read(reader);
diff --git a/protocol.c b/protocol.c
index 43012b7eb..5e636785d 100644
--- a/protocol.c
+++ b/protocol.c
@@ -8,6 +8,8 @@ static enum protocol_version parse_protocol_version(const char 
*value)
return protocol_v0;
else if (!strcmp(value, "1"))
return protocol_v1;
+   else if (!strcmp(value, "2"))
+   return protocol_v2;
else
return protocol_unknown_version;
 }
diff --git a/protocol.h b/protocol.h
index 1b2bc94a8..2ad35e433 100644
--- a/protocol.h
+++ b/protocol.h
@@ -5,6 +5,7 @@ enum protocol_version {
protocol_unknown_version = -1,
protocol_v0 = 0,
protocol_v1 = 1,
+   protocol_v2 = 2,
 };
 
 /*
diff --git a/remote-curl.c b/remote-curl.c
index 9f6d07683..dae8a4a48 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -185,6 +185,9 @@ static struct ref *parse_git_refs(struct discovery *heads, 
int for_push)
   PACKET_READ_GENTLE_ON_EOF);
 
switch (discover_version(&reader)) {
+   case protocol_v2:
+   die("support for protocol v2 not implemented yet");
+   break;
case protocol_v1:
case protocol_v0:
get_remote_heads(&reader, &list, for_push ? REF_NORMAL : 0,
diff --git a/transport.c b/transport.c
index 2378dcb38..83d9dd1df 100644
--- a/

[PATCH v3 06/35] transport: use get_refs_via_connect to get refs

2018-02-06 Thread Brandon Williams

Remove code duplication and use the existing 'get_refs_via_connect()'
function to retrieve a remote's heads in 'fetch_refs_via_pack()' and
'git_transport_push()'.

Signed-off-by: Brandon Williams 
---
 transport.c | 18 --
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/transport.c b/transport.c
index fc802260f..8e8779096 100644
--- a/transport.c
+++ b/transport.c
@@ -230,12 +230,8 @@ static int fetch_refs_via_pack(struct transport *transport,
args.cloning = transport->cloning;
args.update_shallow = data->options.update_shallow;
 
-   if (!data->got_remote_heads) {
-   connect_setup(transport, 0);
-   get_remote_heads(data->fd[0], NULL, 0, &refs_tmp, 0,
-NULL, &data->shallow);
-   data->got_remote_heads = 1;
-   }
+   if (!data->got_remote_heads)
+   refs_tmp = get_refs_via_connect(transport, 0);
 
refs = fetch_pack(&args, data->fd, data->conn,
  refs_tmp ? refs_tmp : transport->remote_refs,
@@ -541,14 +537,8 @@ static int git_transport_push(struct transport *transport, 
struct ref *remote_re
struct send_pack_args args;
int ret;
 
-   if (!data->got_remote_heads) {
-   struct ref *tmp_refs;
-   connect_setup(transport, 1);
-
-   get_remote_heads(data->fd[0], NULL, 0, &tmp_refs, REF_NORMAL,
-NULL, &data->shallow);
-   data->got_remote_heads = 1;
-   }
+   if (!data->got_remote_heads)
+   get_refs_via_connect(transport, 1);
 
memset(&args, 0, sizeof(args));
args.send_mirror = !!(flags & TRANSPORT_PUSH_MIRROR);
-- 
2.16.0.rc1.238.g530d649a79-goog

[PATCH v3 17/35] ls-remote: pass ref patterns when requesting a remote's refs

2018-02-06 Thread Brandon Williams

Construct an argv_array of the ref patterns supplied via the command
line and pass them to 'transport_get_remote_refs()' to be used when
communicating protocol v2 so that the server can limit the ref
advertisement based on the supplied patterns.

Signed-off-by: Brandon Williams 
---
 builtin/ls-remote.c|  7 +--
 t/t5702-protocol-v2.sh | 16 
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index c6e9847c5..caf1051f3 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -43,6 +43,7 @@ int cmd_ls_remote(int argc, const char **argv, const char 
*prefix)
int show_symref_target = 0;
const char *uploadpack = NULL;
const char **pattern = NULL;
+   struct argv_array ref_patterns = ARGV_ARRAY_INIT;
 
struct remote *remote;
struct transport *transport;
@@ -74,8 +75,10 @@ int cmd_ls_remote(int argc, const char **argv, const char 
*prefix)
if (argc > 1) {
int i;
pattern = xcalloc(argc, sizeof(const char *));
-   for (i = 1; i < argc; i++)
+   for (i = 1; i < argc; i++) {
pattern[i - 1] = xstrfmt("*/%s", argv[i]);
+   argv_array_push(&ref_patterns, argv[i]);
+   }
}
 
remote = remote_get(dest);
@@ -96,7 +99,7 @@ int cmd_ls_remote(int argc, const char **argv, const char 
*prefix)
if (uploadpack != NULL)
transport_set_option(transport, TRANS_OPT_UPLOADPACK, 
uploadpack);
 
-   ref = transport_get_remote_refs(transport, NULL);
+   ref = transport_get_remote_refs(transport, &ref_patterns);
if (transport_disconnect(transport))
return 1;
 
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 1e42b5588..a33ff6597 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -30,6 +30,14 @@ test_expect_success 'list refs with git:// using protocol 
v2' '
test_cmp actual expect
 '
 
+test_expect_success 'ref advertisment is filtered with ls-remote using 
protocol v2' '
+   GIT_TRACE_PACKET=1 git -c protocol.version=2 \
+   ls-remote "$GIT_DAEMON_URL/parent" master 2>log &&
+
+   grep "ref-pattern master" log &&
+   ! grep "refs/tags/" log
+'
+
 stop_git_daemon
 
 # Test protocol v2 with 'file://' transport
@@ -50,4 +58,12 @@ test_expect_success 'list refs with file:// using protocol 
v2' '
test_cmp actual expect
 '
 
+test_expect_success 'ref advertisment is filtered with ls-remote using 
protocol v2' '
+   GIT_TRACE_PACKET=1 git -c protocol.version=2 \
+   ls-remote "file://$(pwd)/file_parent" master 2>log &&
+
+   grep "ref-pattern master" log &&
+   ! grep "refs/tags/" log
+'
+
 test_done
-- 
2.16.0.rc1.238.g530d649a79-goog

[PATCH v3 12/35] serve: introduce git-serve

2018-02-06 Thread Brandon Williams

Introduce git-serve, the base server for protocol version 2.

Protocol version 2 is intended to be a replacement for Git's current
wire protocol.  The intention is that it will be a simpler, less
wasteful protocol which can evolve over time.

Protocol version 2 improves upon version 1 by eliminating the initial
ref advertisement.  In its place a server will export a list of
capabilities and commands which it supports in a capability
advertisement.  A client can then request that a particular command be
executed by providing a number of capabilities and command specific
parameters.  At the completion of a command, a client can request that
another command be executed or can terminate the connection by sending a
flush packet.

Signed-off-by: Brandon Williams 
---
 .gitignore  |   1 +
 Documentation/technical/protocol-v2.txt | 114 +++
 Makefile|   2 +
 builtin.h   |   1 +
 builtin/serve.c |  30 
 git.c   |   1 +
 serve.c | 250 
 serve.h |  15 ++
 t/t5701-git-serve.sh|  60 
 9 files changed, 474 insertions(+)
 create mode 100644 Documentation/technical/protocol-v2.txt
 create mode 100644 builtin/serve.c
 create mode 100644 serve.c
 create mode 100644 serve.h
 create mode 100755 t/t5701-git-serve.sh

diff --git a/.gitignore b/.gitignore
index 833ef3b0b..2d0450c26 100644
--- a/.gitignore
+++ b/.gitignore
@@ -140,6 +140,7 @@
 /git-rm
 /git-send-email
 /git-send-pack
+/git-serve
 /git-sh-i18n
 /git-sh-i18n--envsubst
 /git-sh-setup
diff --git a/Documentation/technical/protocol-v2.txt 
b/Documentation/technical/protocol-v2.txt
new file mode 100644
index 0..f87372f9b
--- /dev/null
+++ b/Documentation/technical/protocol-v2.txt
@@ -0,0 +1,114 @@
+ Git Wire Protocol, Version 2
+==
+
+This document presents a specification for a version 2 of Git's wire
+protocol.  Protocol v2 will improve upon v1 in the following ways:
+
+  * Instead of multiple service names, multiple commands will be
+supported by a single service
+  * Easily extendable as capabilities are moved into their own section
+of the protocol, no longer being hidden behind a NUL byte and
+limited by the size of a pkt-line (as there will be a single
+capability per pkt-line)
+  * Separate out other information hidden behind NUL bytes (e.g. agent
+string as a capability and symrefs can be requested using 'ls-refs')
+  * Reference advertisement will be omitted unless explicitly requested
+  * ls-refs command to explicitly request some refs
+  * Designed with http and stateless-rpc in mind.  With clear flush
+semantics the http remote helper can simply act as a proxy.
+
+ Detailed Design
+=
+
+A client can request to speak protocol v2 by sending `version=2` in the
+side-channel `GIT_PROTOCOL` in the initial request to the server.
+
+In protocol v2 communication is command oriented.  When first contacting a
+server a list of capabilities will advertised.  Some of these capabilities
+will be commands which a client can request be executed.  Once a command
+has completed, a client can reuse the connection and request that other
+commands be executed.
+
+ Special Packets
+-
+
+In protocol v2 these special packets will have the following semantics:
+
+  * '' Flush Packet (flush-pkt) - indicates the end of a message
+  * '0001' Delimiter Packet (delim-pkt) - separates sections of a message
+
+ Capability Advertisement
+--
+
+A server which decides to communicate (based on a request from a client)
+using protocol version 2, notifies the client by sending a version string
+in its initial response followed by an advertisement of its capabilities.
+Each capability is a key with an optional value.  Clients must ignore all
+unknown keys.  Semantics of unknown values are left to the definition of
+each key.  Some capabilities will describe commands which can be requested
+to be executed by the client.
+
+capability-advertisement = protocol-version
+  capability-list
+  flush-pkt
+
+protocol-version = PKT-LINE("version 2" LF)
+capability-list = *capability
+capability = PKT-LINE(key[=value] LF)
+
+key = 1*CHAR
+value = 1*CHAR
+CHAR = 1*(ALPHA / DIGIT / "-" / "_")
+
+A client then responds to select the command it wants with any particular
+capabilities or arguments.  There is then an optional section where the
+client can provide any command specific parameters or queries.
+
+command-request = command
+ capability-list
+ (command-args)
+ flush-pkt
+command = PKT-LINE("command=" key LF)
+command-args = delim-pkt
+

[PATCH v3 02/35] pkt-line: introduce struct packet_reader

2018-02-06 Thread Brandon Williams

Sometimes it is advantageous to be able to peek the next packet line
without consuming it (e.g. to be able to determine the protocol version
a server is speaking).  In order to do that introduce 'struct
packet_reader' which is an abstraction around the normal packet reading
logic.  This enables a caller to be able to peek a single line at a time
using 'packet_reader_peek()' and having a caller consume a line by
calling 'packet_reader_read()'.

Signed-off-by: Brandon Williams 
---
 pkt-line.c | 59 +++
 pkt-line.h | 58 ++
 2 files changed, 117 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index af0d2430f..4fc9ad4b0 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -406,3 +406,62 @@ ssize_t read_packetized_to_strbuf(int fd_in, struct strbuf 
*sb_out)
}
return sb_out->len - orig_len;
 }
+
+/* Packet Reader Functions */
+void packet_reader_init(struct packet_reader *reader, int fd,
+   char *src_buffer, size_t src_len,
+   int options)
+{
+   memset(reader, 0, sizeof(*reader));
+
+   reader->fd = fd;
+   reader->src_buffer = src_buffer;
+   reader->src_len = src_len;
+   reader->buffer = packet_buffer;
+   reader->buffer_size = sizeof(packet_buffer);
+   reader->options = options;
+}
+
+enum packet_read_status packet_reader_read(struct packet_reader *reader)
+{
+   if (reader->line_peeked) {
+   reader->line_peeked = 0;
+   return reader->status;
+   }
+
+   reader->status = packet_read_with_status(reader->fd,
+&reader->src_buffer,
+&reader->src_len,
+reader->buffer,
+reader->buffer_size,
+&reader->pktlen,
+reader->options);
+
+   switch (reader->status) {
+   case PACKET_READ_EOF:
+   reader->pktlen = -1;
+   reader->line = NULL;
+   break;
+   case PACKET_READ_NORMAL:
+   reader->line = reader->buffer;
+   break;
+   case PACKET_READ_FLUSH:
+   reader->pktlen = 0;
+   reader->line = NULL;
+   break;
+   }
+
+   return reader->status;
+}
+
+enum packet_read_status packet_reader_peek(struct packet_reader *reader)
+{
+   /* Only allow peeking a single line */
+   if (reader->line_peeked)
+   return reader->status;
+
+   /* Peek a line by reading it and setting peeked flag */
+   packet_reader_read(reader);
+   reader->line_peeked = 1;
+   return reader->status;
+}
diff --git a/pkt-line.h b/pkt-line.h
index 06c468927..7d9f0e537 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -111,6 +111,64 @@ char *packet_read_line_buf(char **src_buf, size_t 
*src_len, int *size);
  */
 ssize_t read_packetized_to_strbuf(int fd_in, struct strbuf *sb_out);
 
+struct packet_reader {
+   /* source file descriptor */
+   int fd;
+
+   /* source buffer and its size */
+   char *src_buffer;
+   size_t src_len;
+
+   /* buffer that pkt-lines are read into and its size */
+   char *buffer;
+   unsigned buffer_size;
+
+   /* options to be used during reads */
+   int options;
+
+   /* status of the last read */
+   enum packet_read_status status;
+
+   /* length of data read during the last read */
+   int pktlen;
+
+   /* the last line read */
+   const char *line;
+
+   /* indicates if a line has been peeked */
+   int line_peeked;
+};
+
+/*
+ * Initialize a 'struct packet_reader' object which is an
+ * abstraction around the 'packet_read_with_status()' function.
+ */
+extern void packet_reader_init(struct packet_reader *reader, int fd,
+  char *src_buffer, size_t src_len,
+  int options);
+
+/*
+ * Perform a packet read and return the status of the read.
+ * The values of 'pktlen' and 'line' are updated based on the status of the
+ * read as follows:
+ *
+ * PACKET_READ_ERROR: 'pktlen' is set to '-1' and 'line' is set to NULL
+ * PACKET_READ_NORMAL: 'pktlen' is set to the number of bytes read
+ *'line' is set to point at the read line
+ * PACKET_READ_FLUSH: 'pktlen' is set to '0' and 'line' is set to NULL
+ */
+extern enum packet_read_status packet_reader_read(struct packet_reader 
*reader);
+
+/*
+ * Peek the next packet line without consuming it and return the status.
+ * The next call to 'packet_reader_read()' will perform a read of the same line
+ * that was peeked, consuming the line.
+ *
+ * Peeking multiple times without calling 'packet_reader_read()' will return
+ * the same result.
+ */
+extern enum packet_read_status packe

[PATCH v3 01/35] pkt-line: introduce packet_read_with_status

2018-02-06 Thread Brandon Williams

The current pkt-line API encodes the status of a pkt-line read in the
length of the read content.  An error is indicated with '-1', a flush
with '0' (which can be confusing since a return value of '0' can also
indicate an empty pkt-line), and a positive integer for the length of
the read content otherwise.  This doesn't leave much room for allowing
the addition of additional special packets in the future.

To solve this introduce 'packet_read_with_status()' which reads a packet
and returns the status of the read encoded as an 'enum packet_status'
type.  This allows for easily identifying between special and normal
packets as well as errors.  It also enables easily adding a new special
packet in the future.

Signed-off-by: Brandon Williams 
---
 pkt-line.c | 57 +++--
 pkt-line.h | 15 +++
 2 files changed, 58 insertions(+), 14 deletions(-)

diff --git a/pkt-line.c b/pkt-line.c
index 2827ca772..af0d2430f 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -280,28 +280,33 @@ static int packet_length(const char *linelen)
return (val < 0) ? val : (val << 8) | hex2chr(linelen + 2);
 }
 
-int packet_read(int fd, char **src_buf, size_t *src_len,
-   char *buffer, unsigned size, int options)
+enum packet_read_status packet_read_with_status(int fd, char **src_buffer, 
size_t *src_len,
+   char *buffer, unsigned size, 
int *pktlen,
+   int options)
 {
-   int len, ret;
+   int len;
char linelen[4];
 
-   ret = get_packet_data(fd, src_buf, src_len, linelen, 4, options);
-   if (ret < 0)
-   return ret;
+   if (get_packet_data(fd, src_buffer, src_len, linelen, 4, options) < 0)
+   return PACKET_READ_EOF;
+
len = packet_length(linelen);
-   if (len < 0)
+
+   if (len < 0) {
die("protocol error: bad line length character: %.4s", linelen);
-   if (!len) {
+   } else if (!len) {
packet_trace("", 4, 0);
-   return 0;
+   return PACKET_READ_FLUSH;
+   } else if (len < 4) {
+   die("protocol error: bad line length %d", len);
}
+
len -= 4;
-   if (len >= size)
+   if ((unsigned)len >= size)
die("protocol error: bad line length %d", len);
-   ret = get_packet_data(fd, src_buf, src_len, buffer, len, options);
-   if (ret < 0)
-   return ret;
+
+   if (get_packet_data(fd, src_buffer, src_len, buffer, len, options) < 0)
+   return PACKET_READ_EOF;
 
if ((options & PACKET_READ_CHOMP_NEWLINE) &&
len && buffer[len-1] == '\n')
@@ -309,7 +314,31 @@ int packet_read(int fd, char **src_buf, size_t *src_len,
 
buffer[len] = 0;
packet_trace(buffer, len, 0);
-   return len;
+   *pktlen = len;
+   return PACKET_READ_NORMAL;
+}
+
+int packet_read(int fd, char **src_buffer, size_t *src_len,
+   char *buffer, unsigned size, int options)
+{
+   enum packet_read_status status;
+   int pktlen;
+
+   status = packet_read_with_status(fd, src_buffer, src_len,
+buffer, size, &pktlen,
+options);
+   switch (status) {
+   case PACKET_READ_EOF:
+   pktlen = -1;
+   break;
+   case PACKET_READ_NORMAL:
+   break;
+   case PACKET_READ_FLUSH:
+   pktlen = 0;
+   break;
+   }
+
+   return pktlen;
 }
 
 static char *packet_read_line_generic(int fd,
diff --git a/pkt-line.h b/pkt-line.h
index 3dad583e2..06c468927 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -65,6 +65,21 @@ int write_packetized_from_buf(const char *src_in, size_t 
len, int fd_out);
 int packet_read(int fd, char **src_buffer, size_t *src_len, char
*buffer, unsigned size, int options);
 
+/*
+ * Read a packetized line into a buffer like the 'packet_read()' function but
+ * returns an 'enum packet_read_status' which indicates the status of the read.
+ * The number of bytes read will be assigined to *pktlen if the status of the
+ * read was 'PACKET_READ_NORMAL'.
+ */
+enum packet_read_status {
+   PACKET_READ_EOF = -1,
+   PACKET_READ_NORMAL,
+   PACKET_READ_FLUSH,
+};
+enum packet_read_status packet_read_with_status(int fd, char **src_buffer, 
size_t *src_len,
+   char *buffer, unsigned size, 
int *pktlen,
+   int options);
+
 /*
  * Convenience wrapper for packet_read that is not gentle, and sets the
  * CHOMP_NEWLINE option. The return value is NULL for a flush packet,
-- 
2.16.0.rc1.238.g530d649a79-goog

[PATCH v3 04/35] upload-pack: convert to a builtin

2018-02-06 Thread Brandon Williams

In order to allow for code sharing with the server-side of fetch in
protocol-v2 convert upload-pack to be a builtin.

Signed-off-by: Brandon Williams 
---
 Makefile  |   3 +-
 builtin.h |   1 +
 builtin/upload-pack.c |  67 +++
 git.c |   1 +
 upload-pack.c | 107 --
 upload-pack.h |  13 ++
 6 files changed, 109 insertions(+), 83 deletions(-)
 create mode 100644 builtin/upload-pack.c
 create mode 100644 upload-pack.h

diff --git a/Makefile b/Makefile
index 1a9b23b67..b7ccc05fa 100644
--- a/Makefile
+++ b/Makefile
@@ -639,7 +639,6 @@ PROGRAM_OBJS += imap-send.o
 PROGRAM_OBJS += sh-i18n--envsubst.o
 PROGRAM_OBJS += shell.o
 PROGRAM_OBJS += show-index.o
-PROGRAM_OBJS += upload-pack.o
 PROGRAM_OBJS += remote-testsvn.o
 
 # Binary suffix, set to .exe for Windows builds
@@ -909,6 +908,7 @@ LIB_OBJS += tree-diff.o
 LIB_OBJS += tree.o
 LIB_OBJS += tree-walk.o
 LIB_OBJS += unpack-trees.o
+LIB_OBJS += upload-pack.o
 LIB_OBJS += url.o
 LIB_OBJS += urlmatch.o
 LIB_OBJS += usage.o
@@ -1026,6 +1026,7 @@ BUILTIN_OBJS += builtin/update-index.o
 BUILTIN_OBJS += builtin/update-ref.o
 BUILTIN_OBJS += builtin/update-server-info.o
 BUILTIN_OBJS += builtin/upload-archive.o
+BUILTIN_OBJS += builtin/upload-pack.o
 BUILTIN_OBJS += builtin/var.o
 BUILTIN_OBJS += builtin/verify-commit.o
 BUILTIN_OBJS += builtin/verify-pack.o
diff --git a/builtin.h b/builtin.h
index 42378f3aa..f332a1257 100644
--- a/builtin.h
+++ b/builtin.h
@@ -231,6 +231,7 @@ extern int cmd_update_ref(int argc, const char **argv, 
const char *prefix);
 extern int cmd_update_server_info(int argc, const char **argv, const char 
*prefix);
 extern int cmd_upload_archive(int argc, const char **argv, const char *prefix);
 extern int cmd_upload_archive_writer(int argc, const char **argv, const char 
*prefix);
+extern int cmd_upload_pack(int argc, const char **argv, const char *prefix);
 extern int cmd_var(int argc, const char **argv, const char *prefix);
 extern int cmd_verify_commit(int argc, const char **argv, const char *prefix);
 extern int cmd_verify_tag(int argc, const char **argv, const char *prefix);
diff --git a/builtin/upload-pack.c b/builtin/upload-pack.c
new file mode 100644
index 0..2cb5cb35b
--- /dev/null
+++ b/builtin/upload-pack.c
@@ -0,0 +1,67 @@
+#include "cache.h"
+#include "builtin.h"
+#include "exec_cmd.h"
+#include "pkt-line.h"
+#include "parse-options.h"
+#include "protocol.h"
+#include "upload-pack.h"
+
+static const char * const upload_pack_usage[] = {
+   N_("git upload-pack [] "),
+   NULL
+};
+
+int cmd_upload_pack(int argc, const char **argv, const char *prefix)
+{
+   const char *dir;
+   int strict = 0;
+   struct upload_pack_options opts = { 0 };
+   struct option options[] = {
+   OPT_BOOL(0, "stateless-rpc", &opts.stateless_rpc,
+N_("quit after a single request/response exchange")),
+   OPT_BOOL(0, "advertise-refs", &opts.advertise_refs,
+N_("exit immediately after initial ref 
advertisement")),
+   OPT_BOOL(0, "strict", &strict,
+N_("do not try /.git/ if  is no 
Git directory")),
+   OPT_INTEGER(0, "timeout", &opts.timeout,
+   N_("interrupt transfer after  seconds of 
inactivity")),
+   OPT_END()
+   };
+
+   packet_trace_identity("upload-pack");
+   check_replace_refs = 0;
+
+   argc = parse_options(argc, argv, NULL, options, upload_pack_usage, 0);
+
+   if (argc != 1)
+   usage_with_options(upload_pack_usage, options);
+
+   if (opts.timeout)
+   opts.daemon_mode = 1;
+
+   setup_path();
+
+   dir = argv[0];
+
+   if (!enter_repo(dir, strict))
+   die("'%s' does not appear to be a git repository", dir);
+
+   switch (determine_protocol_version_server()) {
+   case protocol_v1:
+   /*
+* v1 is just the original protocol with a version string,
+* so just fall through after writing the version string.
+*/
+   if (opts.advertise_refs || !opts.stateless_rpc)
+   packet_write_fmt(1, "version 1\n");
+
+   /* fallthrough */
+   case protocol_v0:
+   upload_pack(&opts);
+   break;
+   case protocol_unknown_version:
+   BUG("unknown protocol version");
+   }
+
+   return 0;
+}
diff --git a/git.c b/git.c
index c870b9719..f71073dc8 100644
--- a/git.c
+++ b/git.c
@@ -478,6 +478,7 @@ static struct cmd_struct commands[] = {
{ "update-server-info", cmd_update_server_info, RUN_SETUP },
{ "upload-archive", cmd_upload_archive },
{ "upload-archive--writer", cmd_upload_archive_writer },
+   { "upload-pack", cmd_upload_pack },
{ "var", cmd_var, RUN_SETUP_GENTLY },
{

[PATCH v3 00/35] protocol version 2

2018-02-06 Thread Brandon Williams

Changes in v3:
 * There were some comments about how the protocol should be designed
   stateless first.  I've made this change and instead of having to
   supply the `stateless-rpc=true` capability to force stateless
   behavior, the protocol just requires all commands to be stateless.
 
 * Added some patches towards the end of the series to force the client
   to not request to use protocol v2 when pushing (even if configured to
   use v2).  This is to ease the roll-out process of a push command in
   protocol v2.  This way when servers gain the ability to accept
   pushing in v2 (and they start responding using v2 when requests are
   sent to the git-receive-pack endpoint) that clients who still don't
   understand how to push using v2 won't request to use v2 and then die
   when they recognize that the server does indeed know how to accept a
   push under v2.

 * I implemented the `shallow` feature for fetch.  This feature
   encapsulates the existing functionality of all the shallow/deepen
   capabilities in v0.  So now a server can process shallow requests.

 * Various other small tweaks that I can't remember :)

After all of that I think the series is in a pretty good state, baring
any more critical reviewing feedback.

Thanks!

Brandon Williams (35):
  pkt-line: introduce packet_read_with_status
  pkt-line: introduce struct packet_reader
  pkt-line: add delim packet support
  upload-pack: convert to a builtin
  upload-pack: factor out processing lines
  transport: use get_refs_via_connect to get refs
  connect: convert get_remote_heads to use struct packet_reader
  connect: discover protocol version outside of get_remote_heads
  transport: store protocol version
  protocol: introduce enum protocol_version value protocol_v2
  test-pkt-line: introduce a packet-line test helper
  serve: introduce git-serve
  ls-refs: introduce ls-refs server command
  connect: request remote refs using v2
  transport: convert get_refs_list to take a list of ref patterns
  transport: convert transport_get_remote_refs to take a list of ref
patterns
  ls-remote: pass ref patterns when requesting a remote's refs
  fetch: pass ref patterns when fetching
  push: pass ref patterns when pushing
  upload-pack: introduce fetch server command
  fetch-pack: perform a fetch using v2
  upload-pack: support shallow requests
  fetch-pack: support shallow requests
  connect: refactor git_connect to only get the protocol version once
  connect: don't request v2 when pushing
  transport-helper: remove name parameter
  transport-helper: refactor process_connect_service
  transport-helper: introduce stateless-connect
  pkt-line: add packet_buf_write_len function
  remote-curl: create copy of the service name
  remote-curl: store the protocol version the server responded with
  http: allow providing extra headers for http requests
  http: don't always add Git-Protocol header
  remote-curl: implement stateless-connect command
  remote-curl: don't request v2 when pushing

 .gitignore  |   1 +
 Documentation/technical/protocol-v2.txt | 338 +
 Makefile|   7 +-
 builtin.h   |   2 +
 builtin/clone.c |   2 +-
 builtin/fetch-pack.c|  21 +-
 builtin/fetch.c |  14 +-
 builtin/ls-remote.c |   7 +-
 builtin/receive-pack.c  |   6 +
 builtin/remote.c|   2 +-
 builtin/send-pack.c |  20 +-
 builtin/serve.c |  30 ++
 builtin/upload-pack.c   |  74 
 connect.c   | 352 +-
 connect.h   |   7 +
 fetch-pack.c| 319 +++-
 fetch-pack.h|   4 +-
 git.c   |   2 +
 http.c  |  25 +-
 http.h  |   2 +
 ls-refs.c   |  96 +
 ls-refs.h   |   9 +
 pkt-line.c  | 149 +++-
 pkt-line.h  |  77 
 protocol.c  |   2 +
 protocol.h  |   1 +
 remote-curl.c   | 257 -
 remote.h|   9 +-
 serve.c | 260 +
 serve.h |  15 +
 t/helper/test-pkt-line.c|  64 
 t/t5701-git-serve.sh| 176 +
 t/t5702-protocol-v2.sh  | 239 
 transport-helper.c  |  84 +++--
 transport-internal.h|   4 +-
 transport.c | 116 --
 transport.h |   9 +-
 upload-pack.c   | 625 +++

Re: [PATCH v2 00/27] protocol version 2

2018-02-06 Thread Brandon Williams

On 01/31, Derrick Stolee wrote:
> Sorry for chiming in with mostly nitpicks so late since sending this
> version. Mostly, I tried to read it to see if I could understand the scope
> of the patch and how this code worked before. It looks very polished, so I
> the nits were the best I could do.
> 
> On 1/25/2018 6:58 PM, Brandon Williams wrote:
> > Changes in v2:
> >   * Added documentation for fetch
> >   * changes #defines for state variables to be enums
> >   * couple code changes to pkt-line functions and documentation
> >   * Added unit tests for the git-serve binary as well as for ls-refs
> 
> I'm a fan of more unit-level testing, and I think that will be more
> important as we go on with these multiple configuration options.
> 
> > Areas for improvement
> >   * Push isn't implemented, right now this is ok because if v2 is requested 
> > the
> > server can just default to v0.  Before this can be merged we may want to
> > change how the client request a new protocol, and not allow for sending
> > "version=2" when pushing even though the user has it configured.  Or 
> > maybe
> > its fine to just have an older client who doesn't understand how to push
> > (and request v2) to die if the server tries to speak v2 at it.
> > 
> > Fixing this essentially would just require piping through a bit more
> > information to the function which ultimately runs connect (for both 
> > builtins
> > and remote-curl)
> 
> Definitely save push for a later patch. Getting 'fetch' online did require
> 'ls-refs' at the same time. Future reviews will be easier when adding one
> command at a time.
> 
> > 
> >   * I want to make sure that the docs are well written before this gets 
> > merged
> > so I'm hoping that someone can do a through review on the docs 
> > themselves to
> > make sure they are clear.
> 
> I made a comment in the docs about the architectural changes. While I think
> a discussion on that topic would be valuable, I'm not sure that's the point
> of the document (i.e. documenting what v2 does versus selling the value of
> the patch). I thought the docs were clear for how the commands work.
> 
> >   * Right now there is a capability 'stateless-rpc' which essentially makes 
> > sure
> > that a server command completes after a single round (this is to make 
> > sure
> > http works cleanly).  After talking with some folks it may make more 
> > sense
> > to just have v2 be stateless in nature so that all commands terminate 
> > after
> > a single round trip.  This makes things a bit easier if a server wants 
> > to
> > have ssh just be a proxy for http.
> > 
> > One potential thing would be to flip this so that by default the 
> > protocol is
> > stateless and if a server/command has a state-full mode that can be
> > implemented as a capability at a later point.  Thoughts?
> 
> At minimum, all commands should be designed with a "stateless first"
> philosophy since a large number of users communicate via HTTP[S] and any
> decisions that make stateless communication painful should be rejected.

I agree with this and my next version will run with this philosophy in
mind (v2 will be stateless by default).

> 
> >   * Shallow repositories and shallow clones aren't supported yet.  I'm 
> > working
> > on it and it can be either added to v2 by default if people think it 
> > needs
> > to be in there from the start, or we can add it as a capability at a 
> > later
> > point.
> 
> I'm happy to say the following:
> 
> 1. Shallow repositories should not be used for servers, since they cannot
> service all requests.
> 
> 2. Since v2 has easy capability features, I'm happy to leave shallow for
> later. We will want to verify that a shallow clone command reverts to v1.
> 
> 
> I fetched bw/protocol-v2 with tip 13c70148, built, set 'protocol.version=2'
> in the config, and tested fetches against GitHub and VSTS just as a
> compatibility test. Everything worked just fine.
> 
> Is there an easy way to test the existing test suite for clone and fetch
> using protocol v2 to make sure there are no regressions with
> protocol.version=2 in the config?

Yes there already exist interop tests for testing the addition of
requesting a new protocol at //t/interop/i5700-protocol-transition.sh

> 
> Thanks,
> -Stolee

-- 
Brandon Williams

Re: [PATCH 12/26] ls-refs: introduce ls-refs server command

2018-02-06 Thread Brandon Williams

On 02/01, Jeff Hostetler wrote:
> 
> 
> On 1/2/2018 7:18 PM, Brandon Williams wrote:
> > Introduce the ls-refs server command.  In protocol v2, the ls-refs
> > command is used to request the ref advertisement from the server.  Since
> > it is a command which can be requested (as opposed to mandatory in v1),
> > a client can sent a number of parameters in its request to limit the ref
> > advertisement based on provided ref-patterns.
> > 
> > Signed-off-by: Brandon Williams 
> > ---
> >   Documentation/technical/protocol-v2.txt | 26 +
> >   Makefile|  1 +
> >   ls-refs.c   | 97 
> > +
> >   ls-refs.h   |  9 +++
> >   serve.c |  2 +
> >   5 files changed, 135 insertions(+)
> >   create mode 100644 ls-refs.c
> >   create mode 100644 ls-refs.h
> > 
> > diff --git a/Documentation/technical/protocol-v2.txt 
> > b/Documentation/technical/protocol-v2.txt
> > index b87ba3816..5f4d0e719 100644
> > --- a/Documentation/technical/protocol-v2.txt
> > +++ b/Documentation/technical/protocol-v2.txt
> > @@ -89,3 +89,29 @@ terminate the connection.
> >   Commands are the core actions that a client wants to perform (fetch, push,
> >   etc).  Each command will be provided with a list capabilities and
> >   arguments as requested by a client.
> > +
> > + Ls-refs
> > +-
> > +
> > +Ls-refs is the command used to request a reference advertisement in v2.
> > +Unlike the current reference advertisement, ls-refs takes in parameters
> > +which can be used to limit the refs sent from the server.
> > +
> > +Ls-ref takes in the following parameters wraped in packet-lines:
> > +
> > +  symrefs: In addition to the object pointed by it, show the underlying
> > +  ref pointed by it when showing a symbolic ref.
> > +  peel: Show peeled tags.
> > +  ref-pattern : When specified, only references matching the
> > +given patterns are displayed.
> > +
> > +The output of ls-refs is as follows:
> > +
> > +output = *ref
> > +flush-pkt
> > +ref = PKT-LINE((tip | peeled) LF)
> > +tip = obj-id SP refname (SP symref-target)
> > +peeled = obj-id SP refname "^{}"
> > +
> > +symref = PKT-LINE("symref" SP symbolic-ref SP resolved-ref LF)
> > +shallow = PKT-LINE("shallow" SP obj-id LF)
> 
> Do you want to talk about ordering requirements on this?
> I think packed-refs has one, but I'm not sure it matters here
> where the client or server sorts it.
> 
> Are there any provisions for compressing the renames, like in the
> reftable spec or in index-v4 ?

Not currently but it would be rather easy to just add a feature to
ls-refs to transmit the resultant list of refs into something like
reftable.  So this is something that can be added later.

> 
> It doesn't need to be in the initial version.  Just asking.  We could
> always add a "ls-refs-2" command that builds upon this.
> 
> Thanks,
> Jeff

-- 
Brandon Williams

Re: BUG: fetch in certain repo always gives "did not send all necessary objects"

2018-02-06 Thread Elijah Newren

On Tue, Feb 6, 2018 at 3:20 PM, Stefan Beller  wrote:
> On Tue, Feb 6, 2018 at 3:04 PM, Elijah Newren  wrote:
>
>>
>> Does anyone have an idea what may have happened here or how to avoid it?
>
> According to Peff this got fixed
> https://public-inbox.org/git/20171020031630.44zvzh3d2vlhg...@sigill.intra.peff.net/
> and but you've had a corrupted repo from back when you were using an older
> version of Git.
>
> Did that repo exist before d0c39a49cc was rolled out? Then we can keep that
> hypothesis of "left-over corruption" as Peff put it.

I'm somewhat confused by this explanation.  That precise commit is the
one I bisected to that _caused_ the fetch to fail.  Also, there might
be one important difference here -- in the link you provide, it
suggests that you had a corrupted working directory that made use of a
now gc'ed commit.  In the case I was able to dig into, we did not.
(There was a left-over .git/worktree/ that had a now gc'ed
commit, but no working directory that used it.)

I suspect you mean that there was another previous bug that induced
corruption, that this commit fixed that other bug, while also
introducing this new bug that makes folks' clones unusable because the
error doesn't provide enough information for users to know how to fix.
It took me hours to figure it out, after users ran out of ideas and
came and asked me for help.  (Maybe if I was familiar with worktree,
and knew they had been using it, then I might have guessed that "HEAD"
meant "not your actual HEAD but the HEAD of the vestige of some other
worktree").

Does anyone have pointers about what might be doable in terms of
providing a more useful error message to allow users to recover?
And/or ideas of what steps could cause corruption so I can send out a
PSA to help users avoid it?

If not, I'll try to dig more, but I thought I'd ask others familiar
with this area.

Re: [RFC PATCH 000/194] Moving global state into the repository object

2018-02-06 Thread Stefan Beller

It took some time to do fixes in such a long series, mostl of my time spent in
rebasing and being extra careful about selecting the right commit to edit.
Based on feedback so far I have queued the changes below.
The changes are also available at 
https://github.com/stefanbeller/git/tree/object-store-v2
I do not plan on sending out this series today, as I would want
to wait a couple of days between resending for such a large series.

I think I addressed all issues raised, except finishing the memleak
issues, which I'll continue working on.

Stefan

diff --git c/object-store.h w/object-store.h
index 62763b8412..a6051ccb73 100644
--- c/object-store.h
+++ w/object-store.h
@@ -60,6 +60,8 @@ struct raw_object_store {
 #define RAW_OBJECT_STORE_INIT \
{ NULL, MRU_INIT, ALTERNATES_INIT, { NULL, 0, 0, 0 }, 0, 0, 0 }
 
+extern void raw_object_store_clear(struct raw_object_store *o);
+
 struct packed_git {
struct packed_git *next;
struct pack_window *windows;
diff --git c/object.c w/object.c
index 2fe5fac3ce..f13f9d97f4 100644
--- c/object.c
+++ w/object.c
@@ -440,3 +440,20 @@ void clear_object_flags(unsigned flags)
obj->flags &= ~flags;
}
 }
+
+
+void raw_object_store_clear(struct raw_object_store *o)
+{
+   /* TODO: free alt_odb_list/tail */
+   /* TODO: clear packed_git, packed_git_mru */
+}
+
+void object_parser_clear(struct object_parser *o)
+{
+   /*
+* TOOD free objects in o->obj_hash.
+*
+* As objects are allocated in slabs (see alloc.c), we do
+* not need to free each object, but each slab instead.
+*/
+}
diff --git c/object.h w/object.h
index 900f1b6611..0b42b09322 100644
--- c/object.h
+++ w/object.h
@@ -43,6 +43,8 @@ extern struct alloc_state the_repository_object_state;
&the_repository_object_state, \
0 }
 
+extern void object_parser_clear(struct object_parser *o);
+
 struct object_list {
struct object *item;
struct object_list *next;
diff --git c/repository.c w/repository.c
index 64fb6d8b34..d5ea158b26 100644
--- c/repository.c
+++ w/repository.c
@@ -141,6 +141,9 @@ static void repo_clear(struct repository *repo)
FREE_AND_NULL(repo->worktree);
FREE_AND_NULL(repo->submodule_prefix);
 
+   raw_object_store_clear(&repo->objects);
+   object_parser_clear(&repo->parsed_objects);
+
if (repo->config) {
git_configset_clear(repo->config);
FREE_AND_NULL(repo->config);
diff --git c/submodule.c w/submodule.c
index 9bd337ce99..c9634f84ef 100644
--- c/submodule.c
+++ w/submodule.c
@@ -494,10 +494,7 @@ static int open_submodule(struct repository *out, const 
char *path)
 {
struct strbuf sb = STRBUF_INIT;
 
-   if (submodule_to_gitdir(&sb, path))
-   return -1;
-
-   if (repo_init(out, sb.buf, NULL)) {
+   if (submodule_to_gitdir(&sb, path) || repo_init(out, sb.buf, NULL)) {
strbuf_release(&sb);
return -1;
}
diff --git c/t/t5531-deep-submodule-push.sh w/t/t5531-deep-submodule-push.sh
index 8b2aa5a0f4..39cb2c1c34 100755
--- c/t/t5531-deep-submodule-push.sh
+++ w/t/t5531-deep-submodule-push.sh
@@ -308,22 +308,6 @@ test_expect_success 'submodule entry pointing at a tag is 
error' '
test_i18ngrep "is a tag, not a commit" err
 '
 
-test_expect_success 'replace ref does not interfere with submodule access' '
-   test_commit -C work/gar/bage one &&
-   test_commit -C work/gar/bage two &&
-   git -C work/gar/bage reset HEAD^^ &&
-   git -C work/gar/bage replace two one &&
-   test_when_finished "git -C work/gar/bage replace -d two" &&
-
-   test_commit -C work/gar/bage three &&
-   git -C work add gar/bage &&
-   git -C work commit -m "advance submodule" &&
-
-   git -C work push --recurse-submodules=on-demand ../pub.git master 2>err 
&&
-   ! grep error err &&
-   ! grep fatal err
-'
-
 test_expect_success 'push fails if recurse submodules option passed as yes' '
(
cd work/gar/bage &&

Re: BUG: fetch in certain repo always gives "did not send all necessary objects"

2018-02-06 Thread Stefan Beller

On Tue, Feb 6, 2018 at 3:04 PM, Elijah Newren  wrote:

>
> Does anyone have an idea what may have happened here or how to avoid it?

According to Peff this got fixed
https://public-inbox.org/git/20171020031630.44zvzh3d2vlhg...@sigill.intra.peff.net/
and but you've had a corrupted repo from back when you were using an older
version of Git.

Did that repo exist before d0c39a49cc was rolled out? Then we can keep that
hypothesis of "left-over corruption" as Peff put it.

[PATCH] t0050: remove the unused $test_case variable

2018-02-06 Thread Ævar Arnfjörð Bjarmason

The $test_case variable hasn't been used since
decd3c0c28 ("t0050-*.sh: mark the rename (case change) test as
passing", 2014-11-28) when its last user went away.

Let's remove the "say" as well, since it's obvious from subsequent
output that we're testing on a case sensitive filesystem.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t0050-filesystem.sh | 8 
 1 file changed, 8 deletions(-)

diff --git a/t/t0050-filesystem.sh b/t/t0050-filesystem.sh
index b29d749bb7..606ffddd92 100755
--- a/t/t0050-filesystem.sh
+++ b/t/t0050-filesystem.sh
@@ -7,14 +7,6 @@ test_description='Various filesystem issues'
 auml=$(printf '\303\244')
 aumlcdiar=$(printf '\141\314\210')
 
-if test_have_prereq CASE_INSENSITIVE_FS
-then
-   say "will test on a case insensitive filesystem"
-   test_case=test_expect_failure
-else
-   test_case=test_expect_success
-fi
-
 if test_have_prereq UTF8_NFD_TO_NFC
 then
say "will test on a unicode corrupting filesystem"
-- 
2.15.1.424.g9478a66081

Re: [PATCH v1 1/5] submodule foreach: correct '$path' in nested submodules from a subdirectory

2018-02-06 Thread Stefan Beller

On Tue, Feb 6, 2018 at 2:54 PM, Jonathan Tan  wrote:
> On Fri,  2 Feb 2018 10:27:41 +0530
> Prathamesh Chavan  wrote:
>
>> When running 'git submodule foreach' from a subdirectory of your
>
> Add "--recursive".
>
>> repository, nested submodules get a bogus value for $sm_path:
>
> Maybe call it $path for now, since $sm_path starts to be recommended
> only in patches after this one.
>
>> For a submodule 'sub' that contains a nested submodule 'nested',
>> running 'git -C dir submodule foreach echo $path' would report
>
> Add "from the root of the superproject", maybe?

This command is run from the root, though the
"-C dir" should indicate that the git command runs from the subdirectory.
Not sure how much slang this is, or if it can be made easier to understand
by writing

  cd dir && git submodule foreach --recursive echo $path

but adding the "from root" part sounds like a clarification nevertheless.

>> path='../nested' for the nested submodule. The first part '../' is
>> derived from the logic computing the relative path from $pwd to the
>> root of the superproject. The second part is the submodule path inside
>> the submodule. This value is of little use and is hard to document.
>>
>> There are two different possible solutions that have more value:
>> (a) The path value is documented as the path from the toplevel of the
>> superproject to the mount point of the submodule.
>> In this case we would want to have path='sub/nested'.
>>
>> (b) As Ramsay noticed the documented value is wrong. For the non-nested
>> case the path is equal to the relative path from $pwd to the
>> submodules working directory. When following this model,
>> the expected value would be path='../sub/nested'.
>
> A third solution is to use "nested" - that is, the name of the submodule
> directory relative to its superproject. (It's currently documented as
> "the name of the submodule directory relative to the superproject".)
> Having said that, (b) is probably better.

Oh, so the nested would just report "nested/" as that is the path
from its superproject to its location. The value does not change depending
on where the command is invoked, or whether it is an actual nested or direct
submodule? The latter part sounds like a slight modification of (a), but the
former part sounds like a completely new version (c).

BUG: fetch in certain repo always gives "did not send all necessary objects"

2018-02-06 Thread Elijah Newren

[I cannot share the local repo and had to modify output of commands
slightly to redact hostnames, branch names, etc.  I think I haven't
messed anything up, but it's possible...]

Two people in the last week have come to me after running into a case
where they could not update their repo because any call to git fetch
or git pull would error out with

  fatal: bad object HEAD
  error: gerrit.server:reponame did not send all necessary objects"

which apparently has been reported before at both of:

https://public-inbox.org/git/cae5ih78nll6uhkpobvfea9xqzutc1xppvgjnayth9fj0ryf...@mail.gmail.com/
and

https://public-inbox.org/git/cagz79kyp0z1g_h3nwfmshrawhmbocik5lepuxkj0nveebri...@mail.gmail.com/

but without a resolution and it appears to have been difficult to
reproduce.  I can reproduce reliably with a copy of one of their
repos, and have dug in a little bit and may be able to provide more
info.  The pieces I've found that look relevant:

* Both users were using git-2.16.1 (suggesting they update and try new
things often)
* "git fsck --no-dangling" reports no problem on either client or server.
* The error bisects to commit d0c39a49cc ("revision.c: --all adds HEAD
from all worktrees", 2017-08-23).
* The developer has 3 directories in use corresponding to different
worktrees, but four entries under .git/worktrees, one of which appears
that it was once in use but no longer is
* The unused entry under .git/worktrees has a detached HEAD that
points to an object that doesn't exist; I suspect it was garbage
collected.

More details about the last two items:

$ ls ..
master/  1.0-release/  2.0-release/  3.0-release/

$ for i in ../*/.git; do echo $i $(cat $i/.git 2>&1); done
../master/.git cat: ../master/.git: Is a directory
../1.0-release/.git gitdir: /clone/master/.git/worktrees/1.0-release
../2.0-release/.git gitdir: /clone/master/.git/worktrees/2.0-release
../3.0-release/.git gitdir: /clone/master/.git/worktrees/3.0-release1

$ for i in .git/worktrees/*/gitdir; do echo $i $(cat $i); done
.git/worktrees/1.0-release/gitdir /clone/1.0-release/.git
.git/worktrees/2.0-release/gitdir /clone/2.0-release/.git
.git/worktrees/3.0-release1/gitdir /clone/3.0-release/.git
.git/worktrees/3.0-release/gitdir /clone/3.0-release/.git

$ (cd .git && for i in worktrees/*/HEAD; do echo $i $(git rev-parse
$i^{object}); done)
worktrees/1.0-release/HEAD c483c6806fe19c2fa9bcda892e0ddb470f8e9d65
worktrees/2.0-release/HEAD ab4c699ec9d1eab143c7e8ef51a0a6c451fcd4ea
worktrees/3.0-release1/HEAD 55e52e24b24b2f81484de9c6ea7d4520238c5fd5
fatal: ambiguous argument 'worktrees/3.0-release/HEAD^{object}':
unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git  [...] -- [...]'
worktrees/3.0-release/HEAD worktrees/3.0-release/HEAD^{object}


So, it appears both worktrees/3.0-release and worktrees/3.0-release1
believe they are used by /clone/3.0-release/, but /clone/3.0-release
uses worktree-3.0-release1.  That leaves worktrees/3.0-release unused,
and in fact its HEAD is some sha1 that does not exist in the repo --
possibly GC'ed.


Also, after the investigation above and based on the hint in one of
the other email threads linked above, we tried 'git worktree prune'
but it didn't remove the unused directory under .git/worktree (perhaps
because it thought something was using it?).  The fetch bug didn't go
away until doing an 'rm -rf .git/worktree/3.0-release'.

I have no experience prior to today with git worktree, and the
developer in question doesn't remember how things got to the current
state.  It was all a while ago.


Does anyone have an idea what may have happened here or how to avoid it?

Re: [PATCH v1 1/5] submodule foreach: correct '$path' in nested submodules from a subdirectory

2018-02-06 Thread Jonathan Tan

On Tue, 6 Feb 2018 14:54:06 -0800
Jonathan Tan  wrote:

> > There are two different possible solutions that have more value:
> > (a) The path value is documented as the path from the toplevel of the
> > superproject to the mount point of the submodule.
> > In this case we would want to have path='sub/nested'.
> > 
> > (b) As Ramsay noticed the documented value is wrong. For the non-nested
> > case the path is equal to the relative path from $pwd to the
> > submodules working directory. When following this model,
> > the expected value would be path='../sub/nested'.
>
> A third solution is to use "nested" - that is, the name of the submodule
> directory relative to its superproject. (It's currently documented as
> "the name of the submodule directory relative to the superproject".)
> Having said that, (b) is probably better.

[snip]

> > +cat >expect < > +Entering '../nested1'
> > +$pwd/clone2-nested1-nested1-$nested1sha1
> > +Entering '../nested1/nested2'
> > +$pwd/clone2/nested1-nested2-nested2-$nested2sha1
> > +Entering '../nested1/nested2/nested3'
> > +$pwd/clone2/nested1/nested2-nested3-nested3-$nested3sha1
> > +Entering '../nested1/nested2/nested3/submodule'
> > +$pwd/clone2/nested1/nested2/nested3-submodule-submodule-$submodulesha1
> > +Entering '../sub1'
> > +$pwd/clone2-foo1-sub1-$sub1sha1
> > +Entering '../sub2'
> > +$pwd/clone2-foo2-sub2-$sub2sha1
> > +Entering '../sub3'
> > +$pwd/clone2-foo3-sub3-$sub3sha1
> > +EOF
> > +
> > +test_expect_success 'test "submodule foreach --recursive" from 
> > subdirectory' '
> > +   (
> > +   cd clone2/untracked &&
> > +   git submodule foreach --recursive "echo 
> > \$toplevel-\$name-\$sm_path-\$sha1" >../../actual
> > +   ) &&
> > +   test_i18ncmp expect actual
> > +'

Wait a minute...this seems like you're using my "third solution". If we
were using either (a) or (b), $sm_path would contain slashes in the case
of nested submodules, right?

RE: git: CVE-2018-1000021: client prints server sent ANSI escape codes to the terminal, allowing for unverified messages to potentially execute arbitrary commands

2018-02-06 Thread Randall S. Becker

On February 5, 2018 3:43 PM, Jonathan Nieder wrote:
> 
> Salvatore Bonaccorso wrote[1]:
> 
> > the following vulnerability was published for git.
> >
> > CVE-2018-121[0]:
> > |client prints server sent ANSI escape codes to the terminal, allowing
> > |for unverified messages to potentially execute arbitrary commands
> >
> > Creating this bug to track the issue in the BTS. Apparently the CVE
> > was sssigned without notifying/discussing it with upstream, at least
> > according to [1].
> >
> > If you fix the vulnerability please also make sure to include the CVE
> > (Common Vulnerabilities & Exposures) id in your changelog entry.
> >
> > For further information see:
> >
> > [0] https://security-tracker.debian.org/tracker/CVE-2018-121
> > https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-121
> > [1] https://bugzilla.novell.com/show_bug.cgi?id=1079389#c1
> 
> Thanks.  Upstream was notified about this and we dropped the ball on
> passing it on to a more public forum.  Sorry about that.
> 
> I'd be interested in your advice on this.  There are cases where the user
may
> *want* ANSI escape codes to be passed through without change and other
> cases where the user doesn't want that.  Commands like "git diff" pass
their
> output through a pager by default, which itself may or may not sanitize
the
> output.
> 
> In other words, there are multiple components at play:
> 
>  1. A terminal.  IMHO, it is completely inexcusable these days for a
> terminal to allow arbitrary code execution by writing output to
> it.  If bugs of that kind still exist, I think we should fix them
> (and perhaps even make it a requirement in Debian policy to make
> the expectations clear for new terminals).
> 
> That said, for defense in depth, it can be useful to also guard
> against this kind of issue in other components.  In particular:
> 
>  2. A pager.  Are there clear guidelines for what it is safe and not
> safe for a pager to write to a terminal?
> 
> "less -R" tries to only allow ANSI "color" escape sequences
> through but I wouldn't be surprised if there are some cases it
> misses.
> 
>  3. Output formats.  Some git commands are designed for scripting
> and do not have a sensible way to sanitize their output without
> breaking scripts.  Fortunately, in the case of "git diff", git
> has a notion of a "binary patch" where everything is sanitized,
> at the cost of the output being unreadable to a human (email-safe
> characters but not something that a human can read at a glance).
> So if we know what sequences to avoid writing to stdout, then we
> can treat files with those sequences as binary.
> 
> Pointers welcome.

One possible (albeit brute force) approach, in dealing with the specifics of
this CVE, may be to explicitly translate ESC-] into BLANK-], leaving a
potential attack visible but ineffective. This only addresses the attack
vector documented in the particular CVE but it can be done efficiently. The
sequence does not appear significant in ANSI - the CVE documents the xterm
situation.  Checking very old termcap, the impact would be on unfiltering
emulations derived (this is a sample) from nec 5520, freedom 100, Sun
workstations sun-s/-e-s, fortune, etc. Based on the seemingly limited use of
this sequence, having a config item may be overkill, but it could be set
enabled by default.

What I don't know - and it's not explicitly in the CVE - is just how many
other terminal types with similar vulnerabilities are out there, but I'm
suspecting it's larger than one would guess - mostly, it seems like this
particular sequence is intended to be used for writing status line output
(line 25?) instead of sticking it in a prompt. This can be used prettifies a
lengthy bash prompt to display the current branch and repository at the
bottom of the screen instead of in the inline prompt, but that's the user's
choice and not something git has to deal with. There were some green-screen
terminals with other weird ESC sequences back in the day that could really
get into trouble with this, including loading/executing programs in terminal
memory via output - really. I'm sure it seemed like a good idea at the time,
but I can see how it could have been used for evil.

A more general solution might be to permit the configuration of a list of
blocked character sequences and apply those as a filter. Something like
core.filter-mask="\E]", "\EA".

Just my $0.02 ramblings.

Cheers,
Randall

-- Brief whoami:
 NonStop developer since approximately 2112884442
 UNIX developer since approximately 421664400
-- In my real life, I talk too much.

Re: [PATCH v1 1/5] submodule foreach: correct '$path' in nested submodules from a subdirectory

2018-02-06 Thread Jonathan Tan

On Fri,  2 Feb 2018 10:27:41 +0530
Prathamesh Chavan  wrote:

> When running 'git submodule foreach' from a subdirectory of your

Add "--recursive".

> repository, nested submodules get a bogus value for $sm_path:

Maybe call it $path for now, since $sm_path starts to be recommended
only in patches after this one.

> For a submodule 'sub' that contains a nested submodule 'nested',
> running 'git -C dir submodule foreach echo $path' would report

Add "from the root of the superproject", maybe?

> path='../nested' for the nested submodule. The first part '../' is
> derived from the logic computing the relative path from $pwd to the
> root of the superproject. The second part is the submodule path inside
> the submodule. This value is of little use and is hard to document.
> 
> There are two different possible solutions that have more value:
> (a) The path value is documented as the path from the toplevel of the
> superproject to the mount point of the submodule.
> In this case we would want to have path='sub/nested'.
> 
> (b) As Ramsay noticed the documented value is wrong. For the non-nested
> case the path is equal to the relative path from $pwd to the
> submodules working directory. When following this model,
> the expected value would be path='../sub/nested'.

A third solution is to use "nested" - that is, the name of the submodule
directory relative to its superproject. (It's currently documented as
"the name of the submodule directory relative to the superproject".)
Having said that, (b) is probably better.

> Both groups can be found in the wild.  The author has no data if one group
> outweighs the other by large margin, and offending each one seems equally
> bad at first.  However in the authors imagination it is better to go with
> (a) as running from a sub directory sounds like it is carried out
> by a human rather than by some automation task.  With a human on
> the keyboard the feedback loop is short and the changed behavior can be
> adapted to quickly unlike some automation that can break silently.

Thanks - this is a good analysis.

>  git-submodule.sh |  1 -
>  t/t7407-submodule-foreach.sh | 36 ++--

I think the documentation should be changed too - $path is the name of
the submodule directory relative to the current directory?

Re: [PATCH 076/194] push: add test showing bad interaction of replace refs and submodules

2018-02-06 Thread Stefan Beller

On Mon, Feb 5, 2018 at 6:20 PM, brian m. carlson
 wrote:
> On Mon, Feb 05, 2018 at 03:55:37PM -0800, Stefan Beller wrote:
>> The ref subsystem has not been migrated yet to access the object store
>> via passed in repository objects. As a result replace when the object store
>> tries to access replace refs in a repository other than the_repository
>> it produces errors:
>>
>>   error: refs/replace/3afabef75c627b8943bcae86837abc7c32fe does not 
>> point to a valid object!
>>
>> Add a test demonstrating this failure.
>>
>> Signed-off-by: Jonathan Nieder 
>> Signed-off-by: Stefan Beller 
>>
>> squash! push: add test showing bad interaction of replace refs and submodules
>>
>> replace-objects: evaluate replacement refs without using the object store
>>
>> Pass DO_FOR_EACH_INCLUDE_BROKEN when iterating over replacement refs
>> so that the iteration does not require opening the named objects from
>> the object store. This avoids a dependency cycle between object access
>> and replace ref iteration.
>>
>> Moreover the ref subsystem has not been migrated yet to access the object
>> store via passed in repository objects.  As a result, without this patch
>> when the object store tries to access replace refs in a repository other
>> than the_repository it produces errors:
>>
>>error: refs/replace/3afabef75c627b8943bcae86837abc7c32fe does not 
>> point to a valid object!
>>
>> Signed-off-by: Jonathan Nieder 
>> Signed-off-by: Stefan Beller 
>
> It appears you have multiple independent commit messages here.

I will drop this patch; it appears as if it was targeted to be part of
006f3f28af (replace-objects: evaluate replacement refs without
using the object store, 2017-09-12), which landed.

We can revive this test outside of this long series if we feel inclined.

Thanks for spotting!
Stefan

Please Its Very Urgent!

2018-02-06 Thread Mrs.Etta Kayser

Hello Dear,
Please forgive me for stressing you with my predicaments as I know that 
this letter may come to you as big surprise.Truly, I came across your E-
mail from my personal search afterward I decided to email you directly 
believing that you will be honest to fulfill my final wish before i 
die.

Meanwhile, I am Mrs. Etta Kayser, 62 years old, from America, and I am 
suffering from a long time cancer and from all indication my condition 
is really deteriorating as my doctors have confirmed and courageously 
advised me that I may not live beyond two months from now for the 
reason that my tumor has reached a critical stage which has defiled all 
forms of medical treatment. I am a registered nurse by profession while 
my husband was dealing on Gold Dust and Gold Dory Bars in Burkina Faso 
till his sudden death the year 2008 then I took over his business till 
date.

Actually, I have a deposit sum of four million five hundred thousand US 
dollars [$4,500,000.00] with one of the leading bank in Burkina Faso 
but unfortunately I cannot visit the bank since I’m critically sick and 
powerless to do anything myself but my bank account officer advised me 
to assign any of my trustworthy relative, friends or partner with 
authorization letter to stand as the recipient of my money but 
sorrowfully my only beloved daughter died at the age of 19 years and I 
don’t have any reliable relative and no other child.

Therefore, I want you to receive the money and take 50% to take care of 
yourself and family while 50% should be use basically on humanitarian 
purposes mostly to orphanages home, Motherless babies home, less 
privileged and disable citizens and widows around the world.

So, i humbly request you to kindly contact my resident lawyer, 
honorable David Richard through this telephone, [+22675329619] for my 
banking records and more details.


Good Day 

Yours Faithfully, 
Mrs.Etta Kayser

Re: An option to ignore submodules in stash push?

2018-02-06 Thread Stefan Beller

On Tue, Feb 6, 2018 at 7:56 AM, Robert Dailey  wrote:
> I haven't seen such an option, but I think it would be nice to be able
> to ignore submodules when creating a stash. When I stash changes in a
> directory, almost always I intend to only stash real files, not
> modified submodules. When I pop the stash later, it gets disrupted due
> to submodule conflicts. To avoid getting the conflicts in the first
> place, it would be nice to somehow specify:
>
> $ git stash push --no-submodules -- MyDirectory/
>
> Would this make sense?

What kind of submodule conflicts arise?

I remember a recent bugfix with apparent submodules, which
were not touched.

https://public-inbox.org/git/cabpp-bhdrw_daesic3xk7kc3jmgkenqupqf69opbvyhrkbh...@mail.gmail.com

https://github.com/git/git/commit/c641ca67072946f95f87e7b21f13f3d4e73701e3
which is included in 2.16.1

But this is me taking a wild guess, can you say more about your use case
and what the actual problem is (and what the expected behavior is, favorably
as a script) ?

Thanks,
Stefan

[no subject]

2018-02-06 Thread Mario Ernesto Gerala




You have been awarded a donation of $350,000 USD please reply this email for 
more info : sungla...@gmail.com

[PATCH] files-backend: unlock packed store only if locked

2018-02-06 Thread Jonathan Tan

In commit 42c7f7ff9685 ("commit_packed_refs(): remove call to
`packed_refs_unlock()`", 2017-06-23), a call to packed_refs_unlock() was
added to files_initial_transaction_commit() in order to compensate for
removing that call from commit_packed_refs(). However, that call was
added in the cleanup section, which is run even if the packed_ref_store
was never locked (which happens if an error occurs earlier in the
function).

Create a new cleanup goto target which runs packed_refs_unlock(), and
ensure that only goto statements after a successful invocation of
packed_refs_lock() jump there.

Signed-off-by: Jonathan Tan 
---
I noticed this when one of our servers sent duplicate refs in its ref
advertisement (noticed through GIT_TRACE_PACKET). With this change (and
before the aforementioned commit 42c7f7ff9685), the error message is
"fatal: multiple updates for ref '' not allowed", which gives a
bigger clue to the problem. Currently, it is "fatal: BUG:
packed_refs_unlock() called when not locked".

(I couldn't replicate this problem in C Git.)
---
 refs/files-backend.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index f75d960e1..89bc5584a 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2931,13 +2931,14 @@ static int files_initial_transaction_commit(struct 
ref_store *ref_store,
 
if (initial_ref_transaction_commit(packed_transaction, err)) {
ret = TRANSACTION_GENERIC_ERROR;
-   goto cleanup;
+   goto locked_cleanup;
}
 
+locked_cleanup:
+   packed_refs_unlock(refs->packed_ref_store);
 cleanup:
if (packed_transaction)
ref_transaction_free(packed_transaction);
-   packed_refs_unlock(refs->packed_ref_store);
transaction->state = REF_TRANSACTION_CLOSED;
string_list_clear(&affected_refnames, 0);
return ret;
-- 
2.16.0.rc1.238.g530d649a79-goog

Re: [PATCH 075/194] fetch, push: do not use submodule as alternate in has_commits check

2018-02-06 Thread Stefan Beller

On Mon, Feb 5, 2018 at 8:20 PM, Eric Sunshine  wrote:

> Or just combine these two error cases:
>
> if (submodule_to_gitdir(...) || repo_init(...)) {
> strbuf_release(...);
> return -1;
> }

will fix, thanks!

Re: Are concurrent git operations on the same repo safe?

2018-02-06 Thread Ian Norton

Sure, my office is still devoid of power, i'll have to get back to you
to be more precise but I was using a morally equivalent process to
https://gitlab.com/inorton/git-multi-sync

cd repo
python -m gitmultisync --update-submodules

where I had a superproject containing 5-6 submodules, some of which
were quite large (1-2Gb history)

Afterwards, I had trouble doing "git pull -r" in my submodules to pull
in newer changes.

On 6 February 2018 at 20:10, Jonathan Nieder  wrote:
> Ian Norton wrote:
>
>>  Specifically I'm trying to speed up "git
>> submodule update" by doing several at the same time.
>
> Can you say more about this?  E.g. how can I reproduce your experience?
> Is there a script I can run?
>
> Thanks,
> Jonathan

Re: [RFC PATCH 000/194] Moving global state into the repository object

2018-02-06 Thread Stefan Beller

> Any suggestions welcome!

Eric repeatedly points out leaking memory.

As of today we do not care about memory leaking as it is cleaned
up at the end of the program anyway, for example the objects
hash table is never cleared.

In a resend I will put the infrastructure in place to free the memory via
adding

  raw_object_store_clear(struct raw_object_store *)
  object_parser_clear(struct object_parser *)
  ref_store_clear(struct ref_store *)

and calling these functions on repo destruction. The functions
itself will be empty code-wise and contain TODO comments listing
all variables that need care.

Follow up patches can figure out what is best to do, such as closing
the memleaks. As repo_clear is not called for the_repository
we'd even keep the property of relying on fast cleanup by the OS.

Thanks,
Stefan

Re: "git branch" issue in 2.16.1

2018-02-06 Thread Paul Smith

On Tue, 2018-02-06 at 11:49 -0800, Jason Racey wrote:
> After upgrading git from 2.16.0 to 2.16.1 (via Homebrew - I’m on
> macOS) I noticed that the “git branch” command appears to display the
> branch listing in something similar to a vi editor - though not quite
> the same. I don’t know the technical term for this state. You can’t
> actually edit the output of the command, but you’re in a state where
> you have to type “q” to exit and then the list disappears. It’s very
> inconvenient and it doesn’t seem like it was by design. I’m using zsh
> in iTerm2 if that helps. Thanks.

I think you mean that you're in the pager (less(1), most likely). 
Many/most Git commands that can generate a large amount of output (git
log, git diff, git show, etc.) will automatically send the output to a
pager so you can scroll through it easily.

The man page for git branch says:

  CONFIGURATION
   pager.branch is only respected when listing branches, i.e., when --list
   is used or implied. The default is to use a pager. See git-config(1).

So, if you never want to use the pager for git branch output you can
configure the pager.branch option to set it always off.

Or you can use "git branch | cat" so that stdout is not a terminal :).

Re: Are concurrent git operations on the same repo safe?

2018-02-06 Thread Jonathan Nieder

Ian Norton wrote:

>  Specifically I'm trying to speed up "git
> submodule update" by doing several at the same time.

Can you say more about this?  E.g. how can I reproduce your experience?
Is there a script I can run?

Thanks,
Jonathan

Re: "git branch" issue in 2.16.1

2018-02-06 Thread Stefan Beller

On Tue, Feb 6, 2018 at 11:57 AM, Todd Zullinger  wrote:
> Hi Jason,
>
> Jason Racey wrote:
>> After upgrading git from 2.16.0 to 2.16.1 (via Homebrew -
>> I’m on macOS) I noticed that the “git branch” command
>> appears to display the branch listing in something similar
>> to a vi editor - though not quite the same. I don’t know
>> the technical term for this state. You can’t actually edit
>> the output of the command, but you’re in a state where you
>> have to type “q” to exit and then the list disappears.
>> It’s very inconvenient and it doesn’t seem like it was by
>> design. I’m using zsh in iTerm2 if that helps. Thanks.
>
> In 2.16.0 `git branch --list` is sent to a pager by default.
> (Without arguments, --list is the default, so this applies
> to `git branch`).
>
> You can set pager.branch to false to disable this in the
> config, or use git --no-pager branch to do so for a single
> invocation.
>
> I can't say why you're seeing this with 2.16.1 and not
> 2.16.0, but I'm not familiar with homebrew, so perhaps
> something didn't work as intended in 2.16.0.
>

Maybe the number of branches changed since then?
As the pager only comes to life when the output fills
more than your screen. Quick workarounds:
* buy a bigger screen
* have fewer branches.

:-)

Stefan

Re: [PATCH 3/7] worktree move: new command

2018-02-06 Thread Martin Ågren

On 6 February 2018 at 03:13, Jeff King  wrote:
> On Mon, Feb 05, 2018 at 08:28:10PM +0700, Duy Nguyen wrote:
>> I learned SANITIZE=leak today! It not only catches this but also "dst".
>>
>> Jeff is there any ongoing effort to make the test suite pass with
>> SANITIZE=leak? My t2038 passed, so I went ahead with the full test
>> suite and saw so many failures. I did see in your original mails that
>> you focused on t and t0001 only..
>
> Yeah, I did those two scripts to try to prove to myself that the
> approach was good. But I haven't really pushed it any further.
>
> Martin Ågren (cc'd) did some follow-up work, but I think we still have a
> long way to go.

Agreed. :-)

> My hope is that people who are interested in
> leak-checking their new code can run some specific script they're
> interested in, and maybe fix up one or two nearby bits while they're
> there (either by fixing a leak, or just annotating via UNLEAK). Then we
> can slowly converge on correctness. :)

Yeah. My experience is that it's easy -- or was for me, anyway -- to
get carried away trying to fix all "related" leaks to the one you
started with, since there are so many dimensions to search in. Two
obvious aspects are "leaks nearby" and "leaks using the same API". This
is not to suggest that the situation is horrible. It's just that if you
pick a leak at random and there is a fair number of "clusters" of
leaks, chances are good you'll find yourself in such a cluster.

Addressing a leak without worrying too much about how it generalizes
would still help. ;-)

Martin

Re: "git branch" issue in 2.16.1

2018-02-06 Thread Todd Zullinger

Hi Jason,

Jason Racey wrote:
> After upgrading git from 2.16.0 to 2.16.1 (via Homebrew -
> I’m on macOS) I noticed that the “git branch” command
> appears to display the branch listing in something similar
> to a vi editor - though not quite the same. I don’t know
> the technical term for this state. You can’t actually edit
> the output of the command, but you’re in a state where you
> have to type “q” to exit and then the list disappears.
> It’s very inconvenient and it doesn’t seem like it was by
> design. I’m using zsh in iTerm2 if that helps. Thanks.

In 2.16.0 `git branch --list` is sent to a pager by default.
(Without arguments, --list is the default, so this applies
to `git branch`).

You can set pager.branch to false to disable this in the
config, or use git --no-pager branch to do so for a single
invocation.

I can't say why you're seeing this with 2.16.1 and not
2.16.0, but I'm not familiar with homebrew, so perhaps
something didn't work as intended in 2.16.0.

-- 
Todd
~~
Historian, n. A broad-gauge gossip.
-- Ambrose Bierce, "The Devil's Dictionary"

"git branch" issue in 2.16.1

2018-02-06 Thread Jason Racey

After upgrading git from 2.16.0 to 2.16.1 (via Homebrew - I’m on macOS) I 
noticed that the “git branch” command appears to display the branch listing in 
something similar to a vi editor - though not quite the same. I don’t know the 
technical term for this state. You can’t actually edit the output of the 
command, but you’re in a state where you have to type “q” to exit and then the 
list disappears. It’s very inconvenient and it doesn’t seem like it was by 
design. I’m using zsh in iTerm2 if that helps. Thanks.

- Jason

Re: [PATCH v2 05/14] commit-graph: implement git-commit-graph --write

2018-02-06 Thread Derrick Stolee


On 2/5/2018 1:48 PM, Junio C Hamano wrote:

Jeff King  writes:


The big advantage of your scheme is that you can update the graph index
without repacking. The traditional advice has been that you should
always do a full repack during a gc (since it gives the most delta
opportunities). So metadata like reachability bitmaps were happy to tied
to packs, since you're repacking anyway during a gc. But my
understanding is that this doesn't really fly with the Windows
repository, where it's simply so big that you never obtain a single
pack, and just pass around slices of history in pack format.

So I think I'm OK with the direction here of keeping metadata caches
separate from the pack storage.

OK.  I guess that the topology information surviving repacking is a
reason good enough to keep this separate from pack files, and I
agree with your "If they're not tied to packs,..." below, too.

Thanks.


If they're not tied to packs, then I think having a separate builtin
like this is the best approach. It gives you a plumbing command to
experiment with, and it can easily be called from git-gc.

-Peff


Thanks for all the advice here. In addition to the many cleanups that 
were suggested, I'm going to take a try at the "subcommand" approach. 
I'll use git-submodule--helper and git-remote as models for my 
implementation.


Thanks,
-Stolee

Re: Missing git options

2018-02-06 Thread Jonathan Nieder

Hi,

Martin Häcker wrote:
>> Am 06.02.2018 um 01:43 schrieb brian m. carlson 
>> :

>> I think this is likely to cause problems.  Many people use git log with
>> --pretty to format commit hashes or messages into other programs.  I'm
>> aware of multiple tools that will simply break if --graph or --patch
>> become the default.  Requiring people to retrofit their tools to use
>> --no-graph or --no-patch is likely to be a burden.
>
> While I share your concern, this is something that will completely
> freeze development of the git tui which I cannot imagine is wanted.

On the contrary, we take things on a case by case basis.  Brian
described a cost to your proposed change and you described a benefit;
the next step would be to figure out a way to accomplish what you're
aiming to do in a way that maximizes the benefit and minimizes the
cost.

In other words, it is not that we never change default output formats,
but we do care a lot about helping existing users.  One way of doing
that is to provide more convenient machine-readable versions of some
commands so that it is obvious to script writers what to use.

As Stefan Beller mentioned, "git log" is in interesting case, in that
the scripting commands exist:

 - "git rev-list" to list commits satisfying some criteria
 - "git diff-tree --stdin" to show information about those commits

But these commands were not sufficiently discoverable and easy to use,
so people end up using "git log" for scripting anyway.  We provide

 - "git log --format" to produce log output with a fixed format, and
   in particular

 - "git log --format=raw" to print raw objects

It's perfectly normal and common to change what "git log
--format=medium" (the default format) prints, as long as we take into
account the effect on existing users.  In particular:

 - Some scripts do not care about the *exact* format from "git log"
   but are using the output for a particular purpose (e.g. to generate
   release notes that they distribute along with compiler output).
   If these scripts start _sometimes_ using --graph instead based on
   configuration, this would be disruptive to some users.

 - Making --format suppress `log.graph` configuration feels confusing
   and hard to explain to me.  --graph --format= would mean something
   different from -c log.graph=true --format=, except when  =
   medium, or "--format=medium" would mean something different from not
   specifying --format at all even though medium is the default format.

In that context, it feels simpler to me to use aliases:

[alias]
l = log --oneline --graph --patch

That way, it does not confuse scripts, and even better, it is easier
to type, as "git l".

I would be happy to see some documentation pointing users to the
alias feature for this purpose.

As an example of changing the default "git log" format, see the
history in "git log --grep=decorate".  I am all for that kind of
change, but I don't think --graph or --patch falls into this category.

Thanks and hope that helps,
Jonathan

Re: [PATCH 092/194] object: move grafts to object parser

2018-02-06 Thread Stefan Beller

On Mon, Feb 5, 2018 at 8:07 PM, Eric Sunshine  wrote:
> On Mon, Feb 5, 2018 at 6:55 PM, Stefan Beller  wrote:
>> Grafts are only meaningful in the context of a single repository.
>> Therefore they cannot be global.
>>
>> Signed-off-by: Stefan Beller 
>> Signed-off-by: Jonathan Nieder 
>> ---
>> diff --git a/commit.c b/commit.c
>> @@ -121,20 +120,22 @@ int register_commit_graft(struct commit_graft *graft, 
>> int ignore_dups)
>> if (ignore_dups)
>> free(graft);
>> else {
>> -   free(commit_graft[pos]);
>> -   commit_graft[pos] = graft;
>> +   free(the_repository->parsed_objects.grafts[pos]);
>> +   the_repository->parsed_objects.grafts[pos] = graft;
>> diff --git a/object.h b/object.h
>> @@ -4,9 +4,13 @@
>>  struct object_parser {
>> struct object **obj_hash;
>> int nr_objs, obj_hash_size;
>> +
>> +   /* parent substitutions from .git/info/grafts and .git/shallow */
>> +   struct commit_graft **grafts;
>> +   int grafts_alloc, grafts_nr;
>>  };
>
> Do items moved to object_parser need to get freed when object_parser
> itself is freed? Is that happening in some other patch?

No, this patch supposedly should free the memory via
repo_free (which either does it itself or calls some specific free
for the grafts)

Thanks for finding the mem leak!
Stefan

Re: [PATCH 059/194] refs: store the main ref store inside the repository struct

2018-02-06 Thread Stefan Beller

On Mon, Feb 5, 2018 at 8:27 PM, Eric Sunshine  wrote:
> On Mon, Feb 5, 2018 at 6:55 PM, Stefan Beller  wrote:
>> diff --git a/refs.c b/refs.c
>> @@ -1609,9 +1609,6 @@ static struct ref_store_hash_entry 
>> *alloc_ref_store_hash_entry(
>> -/* A pointer to the ref_store for the main repository: */
>> -static struct ref_store *main_ref_store;
>> diff --git a/repository.h b/repository.h
>> @@ -33,6 +33,11 @@ struct repository {
>>  */
>> struct object_store objects;
>>
>> +   /*
>> +* The store in which the refs are hold.
>> +*/
>> +   struct ref_store *main_ref_store;
>
> Do items moved to the 'repository' structure need to be freed when the
> 'repository' itself is freed? Is that being done by a different patch?
> If so, it would ease review burden for the freeing to happen in the
> same patch in which the item is moved to the 'repository'.

There are two cases:

* the_repository
  In the_repository we'll mostly have the globals as they exist
  (the_index, the_hash_algo, the_ref_store) and just as before
  these globals will not be freed at the end of the program.
* arbitrary repos:
  For arbitrary repos, we usually need to allocate memory in repo_init
  and clear it out in repo_free/clear

This patch is incomplete and is leaking the main ref store for arbitrary repos.

Thanks for spotting the mem leak!
Stefan

Re: [PATCH 042/194] object-store: move alternates API to new alternates.h

2018-02-06 Thread Stefan Beller

On Mon, Feb 5, 2018 at 5:44 PM, brian m. carlson
 wrote:
> On Mon, Feb 05, 2018 at 03:55:03PM -0800, Stefan Beller wrote:
>> From: Jonathan Nieder 
>>
>> This should make these functions easier to find and object-store.h
>> less overwhelming to read.
>>
>> Signed-off-by: Stefan Beller 
>> Signed-off-by: Jonathan Nieder 
>> ---
>>  alternates.h| 68 
>> +
>>  builtin/clone.c |  1 +
>>  builtin/count-objects.c |  1 +
>>  builtin/fsck.c  |  3 +-
>>  builtin/grep.c  |  1 +
>>  builtin/submodule--helper.c |  1 +
>>  cache.h | 52 --
>>  object-store.h  | 16 +--
>>  packfile.c  |  3 +-
>>  sha1_file.c | 23 +++
>>  sha1_name.c |  3 +-
>>  submodule.c |  1 +
>>  t/helper/test-ref-store.c   |  1 +
>>  tmp-objdir.c|  1 +
>>  transport.c |  1 +
>>  15 files changed, 102 insertions(+), 74 deletions(-)
>>  create mode 100644 alternates.h
>>
>> diff --git a/alternates.h b/alternates.h
>> new file mode 100644
>> index 00..df5dc67e2e
>> --- /dev/null
>> +++ b/alternates.h
>> @@ -0,0 +1,68 @@
>> +#ifndef ALTERNATES_H
>> +#define ALTERNATES_H
>> +
>> +#include "strbuf.h"
>> +#include "sha1-array.h"
>> +
>> +struct alternates {
>> + struct alternate_object_database *list;
>> + struct alternate_object_database **tail;
>> +};
>> +#define ALTERNATES_INIT { NULL, NULL }
>
> I was surprised to find that this patch not only moves the alternates
> API to a new location, but introduces this struct.  I certainly think
> the struct is a good idea, but it should probably go in a separate
> patch, or at least get a mention in the commit message.

ok, I'll fix the commit message.

Thanks,
Stefan

Re: [PATCH 042/194] object-store: move alternates API to new alternates.h

2018-02-06 Thread Stefan Beller

On Mon, Feb 5, 2018 at 8:52 PM, Eric Sunshine  wrote:
> On Mon, Feb 5, 2018 at 6:55 PM, Stefan Beller  wrote:
>> This should make these functions easier to find and object-store.h
>> less overwhelming to read.
>
> I think you mean: s/object-store.h/cache.h/

Probably both.

At the end of the series the object-store.h has grown a bit, such that
we'd want to have specific things outside of it. Alternates are a good choice
as they are coherent on their own.

I have given up on cache.h and its readability, but moving things out of there
helps of course. And that is what the patch does.
So I'll change that.

Thanks,
Stefan

Re: [PATCH 025/194] object-store: allow prepare_alt_odb to handle arbitrary repositories

2018-02-06 Thread Stefan Beller

On Mon, Feb 5, 2018 at 5:19 PM, brian m. carlson
 wrote:
> On Mon, Feb 05, 2018 at 03:54:46PM -0800, Stefan Beller wrote:
>> @@ -434,12 +433,12 @@ static int link_alt_odb_entry_the_repository(const 
>> char *entry,
>>   ent = alloc_alt_odb(pathbuf.buf);
>>
>>   /* add the alternate entry */
>> - *the_repository->objects.alt_odb_tail = ent;
>> - the_repository->objects.alt_odb_tail = &(ent->next);
>> + *r->objects.alt_odb_tail = ent;
>> + r->objects.alt_odb_tail = &(ent->next);
>>   ent->next = NULL;
>
> I'm sure I'm missing something obvious, but it's not clear to me that
> this transformation is correct.  Could you perhaps say a few words about
> why it is?

This is a pretty open ended question, so I'll give it a try:

* ent is a local variable that is newly allocated using `alloc_alt_odb`.
  `alloc_alt_odb` has no hidden dependencies on a specific repository,
  it uses FLEX_ALLOC_STR, which is defined in cache.h as a wrapper
  around xcalloc/memcpy

* Before this patch we always used the_repository->objects.alt_odb_tail,
  but with this patch there is no reference "alt_odb.tail" of the_repository,
  but only of a given arbitrary repository.

  Usually we convert only one function at a time. (Chose that function,
  which calls only already converted functions, because then passing in
  r instead of the_repository still compiles correctly)

  The additional messiness comes from a cycle:
  read_info_alternates
-> link_alt_odb_entries
  -> link_alt_odb_entry
-> read_info_alternates

  That is why we cannot just convert one function, but we have to convert
  the whole strongly connected component in this graph of functions.
  This is why this patch is so messy and touches multiple functions at once.

* While I hope this helps, I assume you want me to repeat this or other
  hints in the commit message, too.

Thanks,
Stefan

Re: Are concurrent git operations on the same repo safe?

2018-02-06 Thread Stefan Beller

On Tue, Feb 6, 2018 at 2:16 AM, Duy Nguyen  wrote:
> On Tue, Feb 6, 2018 at 3:16 AM, Ian Norton  wrote:
>> Hi all,
>>
>> I'm generally used to the idea that if a thing is not marked as
>> "thread-safe" then it isn't thread safe, but I thought I'd ask anyway
>> to be sure.
>>
>> Is it safe for me to do several operations with git concurrently on
>> the same local repo?
>
> Off the top of my head, object database access (e.g. things in
> .git/objects), refs updates (.git/refs) and .git/index should handle
> concurrent operations fine (or in the worst case you get "some
> operation is ongoing, aborted" and you need to try again but not
> corruption or anything else). I think we generally try to make it safe
> concurrently.
>
>> Specifically I'm trying to speed up "git
>> submodule update" by doing several at the same time.  I've noticed
>> some odd side effects afterwards though when trying to commit changes
>> to my super project.
>
> submodule is a hot area with lots of development lately I think,
> perhaps you're seeing some bugs... CCing at least one submodule
> person...
>
>> Apologies if this is answered elsewhere, my google-foo is weak today.
>>
>> Many thanks
>>
>> Ian

"git submodule update" has the network part parallelized,
but not the local part IIRC. (That is cloning/fetching submodules
can be done with "-j ", but the local checkout is still serial for
UX purposes, i.e. it wants to stop at the first conflict and only
have one conflict at a time)

Which odd side effects do you see?
I'd be curious to see if that is a bug in the code or documentation.

Thanks,
Stefan

Re: [PATCH v3 00/11] document & test fetch pruning & add fetch.pruneTags

2018-02-06 Thread Ævar Arnfjörð Bjarmason


On Tue, Jan 23 2018, Ævar Arnfjörð Bjarmason jotted:

> I'm now just skipping quoting things like +refs/... on the
> command-line, which as grepping the rest of the test suite shows is
> fine, this eliminated the need for "fetch tests: double quote a
> variable for interpolation" so I've ejected it.

There's a segfault bug in 11/11, which wasn't found because the test
suite doesn't test `git fetch ` just `git fetch ` and this
is handled differently.

I'll send a fix soon, but don't merge this down from pu for now.

In order to test for that I brought that cmdline quoting patch back, I
can't find a better way to do that, and in addition I have this similar
WIP patch:

diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index 2f5bd966be..8fe4f3c13b 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -549,13 +549,39 @@ set_config_tristate () {
 }

 test_configured_prune () {
+   test_configured_prune_guts "$@" "name"
+   test_configured_prune_guts "$@" "link"
+}
+
+test_configured_prune_guts () {
fetch_prune=$1
remote_origin_prune=$2
expected_branch=$3
expected_tag=$4
cmdline=$5
-
-   test_expect_success "prune fetch.prune=$1 remote.origin.prune=$2${5:+ 
$5}; branch:$3 tag:$4" '
+   mode=$6
+
+   if ! test -e prune-guts-setup
+   then
+   test_expect_success 'prune_guts setup' '
+   git -C one config remote.origin.url >one.remote-url &&
+   git -C one config remote.origin.fetch >one.remote-fetch 
&&
+   touch prune-guts-setup
+   '
+   fi
+
+   if test "$mode" = 'link'
+   then
+   remote_url="file://$(cat one.remote-url)"
+   remote_fetch="$(cat one.remote-fetch)"
+   cmdline_setup="\"$remote_url\" \"$remote_fetch\""
+   if test "$cmdline" != ""
+   then
+   cmdline=$(printf "%s" "$cmdline" | sed -e 's! origin! 
"'"$remote_url"'"!g')
+   fi
+   fi
+
+   test_expect_success "$mode prune fetch.prune=$1 
remote.origin.prune=$2${5:+ $5}; branch:$3 tag:$4" '
# make sure a newbranch is there in . and also in one
git branch -f newbranch &&
git tag -f newtag &&
@@ -563,7 +589,7 @@ test_configured_prune () {
cd one &&
test_unconfig fetch.prune &&
test_unconfig remote.origin.prune &&
-   git fetch &&
+   git fetch '"$cmdline_setup"' &&
git rev-parse --verify refs/remotes/origin/newbranch &&
git rev-parse --verify refs/tags/newtag
) &&

It'll be amended a bit more, but the general idea is there, because of
how this whole quoting mess looks like I have to resort to the above
hack outside of the test setup.

Re: [ANNOUNCE] Git for Windows 2.16.1(3)

2018-02-06 Thread Johannes Schindelin

Dear Git for Windows SDK users,

[This mail only concerns the *SDK* of Git for Windows, i.e. the build
environment to work on, and contribute to, Git for Windows.]

On Tue, 6 Feb 2018, Johannes Schindelin wrote:

> Changes since Git for Windows v2.16.1(2) (February 2nd 2018)
> 
> New Features
> 
>   * Git for Windows' SDK packages are now hosted on Azure Blobs, fixing
> part of issue #1479.

To clarify what this means for any developer who wants to update or
install a Git for Windows SDK, here is some guidance.

# Fixing existing SDKs

You can fix your existing SDK (if you installed it successfully before
Friday, February 2nd 2018) by downloading either

- 
https://wingit.blob.core.windows.net/x86-64/git-extra-1.1.241.91af289c6-1-x86_64.pkg.tar.xz
  for 64-bit SDKs, or
- 
https://wingit.blob.core.windows.net/x86-64/git-extra-1.1.241.91af289c6-1-i686.pkg.tar.xz
  for 32-bit SDKs,

and then executing `pacman -U `. This will have the main
effect of updating your `/etc/pacman.conf` so that a subsequent `pacman
-Syu` will find the new home of Git for Windows' Pacman repositories.

# Initializing a new SDK

If you did not yet have a working SDK, the [installer as per Git for
Windows' home page](https://gitforwindows.org/#download-sdk) will not work
(I hope that we get all of this sorted out this week).

The idea is to support a completely new, more robust method of obtaining a
working SDK:

```cmd
git clone --depth 1 https://github.com/git-for-windows/git-sdk-64
```

(or `.../git-sdk-32` for 32-bit setups) and then call `git-bash.exe` in
the top-level directory of the checkout.

(These repositories are essentially synchronized by a scheduled job
running on Visual Studio Team Services, and the set of installed packages
was chosen so as to support the most common operations developer would
want to perform in a Git for Windows SDK.)

This does already work!

But you still have to clone https://github.com/git-for-windows/git
manually for the time being (preferred checkout location: /usr/src/git),
and maybe also https://github.com/git-for-windows/build-extra.

## Future work

I plan on doing a couple more things in the git-sdk-* repositories, and
every bit of help with those tasks will be greatly appreciated:

- add a `README.md` explaining the purpose, the most common operations,
  context and design
- modify

[`/etc/profile.d/git-sdk.sh`](https://github.com/git-for-windows/build-extra/blob/master/git-extra/git-sdk.sh)
  so that it auto-initializes the `git`, `build-extra`, `MINGW-packages` and
  `MSYS2-packages` worktrees in `/usr/src`: they should only be `git
  init`'ed and `git add remote origin ...`, but *not* fetching anything yet.
  They can then be initialized on demand by e.g. `git -C /usr/src/git pull
  origin master`
- ensure that a regular clone works, especially 32-bit, where some of the
  files seem to be committed with incorrect line endings
- modify the git-sdk-installer to use the new method instead
- maybe add an `/etc/motd` that describes how to build stuff, how to
  bundle a custom installer, how to keep the SDK up-to-date, etc. Or maybe
  a helper. Dunno.

This blurb was copy-edited from
https://github.com/git-for-windows/git/issues/1479#issuecomment-363452826

For updates on the Git for Windows SDK front, please follow that ticket!

Thanks,
Johannes

An option to ignore submodules in stash push?

2018-02-06 Thread Robert Dailey

I haven't seen such an option, but I think it would be nice to be able
to ignore submodules when creating a stash. When I stash changes in a
directory, almost always I intend to only stash real files, not
modified submodules. When I pop the stash later, it gets disrupted due
to submodule conflicts. To avoid getting the conflicts in the first
place, it would be nice to somehow specify:

$ git stash push --no-submodules -- MyDirectory/

Would this make sense?

[ANNOUNCE] Git for Windows 2.16.1(3)

2018-02-06 Thread Johannes Schindelin

Dear Git users,

It is my pleasure to announce that Git for Windows 2.16.1(3) is available from:

https://git-for-windows.github.io/

Changes since Git for Windows v2.16.1(2) (February 2nd 2018)

New Features

  * Git for Windows' SDK packages are now hosted on Azure Blobs, fixing
part of issue #1479.
  * Comes with perl-Net-SSLeay v1.84.

Bug Fixes

  * When http.sslBackend is not configured (e.g. in portable Git or
MinGit), fetch/push operations no longer crash.
  * On Windows 7 and older, Git for Windows v2.16.1(2) was no longer
able to spawn any processes (e.g. during fetch/clone). This
regression has been fixed.
  * The Perl upgrade in v2.16.1(2) broke git send-email; This has been
fixed by updating the Net-SSLeay Perl module.

Filename | SHA-256
 | ---
Git-2.16.1.3-64-bit.exe | 
27cec5113e2572d22336150afec337fc94627b330ebeec2a61b8bb447a6301ea
Git-2.16.1.3-32-bit.exe | 
ad1fca7ac33326378b8be187d4595f50c72bd8ad00b277807fc091c16bd9796c
PortableGit-2.16.1.3-64-bit.7z.exe | 
bcb3ca27ae151313ab369d9f39b018129035d7fa018fa0df364424b6e22be4d3
PortableGit-2.16.1.3-32-bit.7z.exe | 
ae4cf93c391c66698856b8b38ba3a42e7fcc7141f61531562134f72b1c014f36
MinGit-2.16.1.3-64-bit.zip | 
a3ee17301ef563349f9936e68beb996c747079afc883e4e40615931e032905fd
MinGit-2.16.1.3-32-bit.zip | 
c46c6697906a1a85f784c7cf6e0385689ae3105bc8fe99b20c3a70ac73ab8e9e
MinGit-2.16.1.3-busybox-64-bit.zip | 
630bb6d79c9b0f64c81046208d00a41edb2540bd99a2626c5fdaf0550cdb1a9e
MinGit-2.16.1.3-busybox-32-bit.zip | 
3ff6f9e936d837d6f319bdb1ea4f0fcc9ff6bc203f9de1987538313686574f1b
Git-2.16.1.3-64-bit.tar.bz2 | 
c7c4f9a9205e85977475698732484c6524d8b381f03ceaf04f35bd6f89bce389
Git-2.16.1.3-32-bit.tar.bz2 | 
8ed49c8008728c70e82436d8c1cafb23cb2ee1f7b01d2e97a7e626d2438a0cd1

Ciao,
Johannes

Re: [PATCH v2] rebase: add --allow-empty-message option

2018-02-06 Thread Johannes Schindelin

Hi Genki,

On Sun, 4 Feb 2018, Genki Sky wrote:

> This option allows commits with empty commit messages to be rebased,
> matching the same option in git-commit and git-cherry-pick. While empty
> log messages are frowned upon, sometimes one finds them in older
> repositories (e.g. translated from another VCS [0]), or have other
> reasons for desiring them. The option is available in git-commit and
> git-cherry-pick, so it is natural to make other git tools play nicely
> with them. Adding this as an option allows the default to be "give the
> user a chance to fix", while not interrupting the user's workflow
> otherwise [1].
> 
>   [0]: https://stackoverflow.com/q/8542304
>   [1]: https://public-inbox.org/git/7vd33afqjh@alter.siamese.dyndns.org/
> 
> To implement this, add a new --allow-empty-message flag. Then propagate
> it to all calls of 'git commit', 'git cherry-pick', and 'git rebase--helper'
> within the rebase scripts.
> 
> Signed-off-by: Genki Sky 
> ---
> 
> Notes:
> 
>   Thanks for the initial feedback, here are the changes from [v1]:
>   - Made my commit message include the main motivations inline.
>   - Moved new tests to t/t3405-rebase-malformed.sh, which has the
> relevant test description: "rebase should handle arbitrary git
> message".
>   - Accordingly re-used existing test setup.
>   - Minimized tests to just one for git-rebase--interactive.sh and one
> for git-rebase--merge.sh. First, one test per file keeps things
> focused while getting most benefit (other code within same file is
> likely to be noticed by modifiers). And, while git-rebase--am.sh
> does have one cherry-pick, it is only for a special case with
> --keep-empty. So git-rebase--am.sh otherwise doesn't have this
> empty-message issue.
> 
>   In general, there was a concern of over-testing, and over-checking
>   implementation details. So, this time, I erred on the conservative
>   side.
> 
>   [v1]: 
> https://public-inbox.org/git/9d0414b869f21f38b81f92ee0619fd76410cbcfc.1517552392.git@genki.is/t/

Very nice. I looked over the patch (sorry, I have too little time to test
this thoroughly, but then, it is the custom on this here mailing list to
just review the patch as per the mail) and it looks very good to me.

Junio, if you like, please add a "Reviewed-by:" line for me.

Thanks!
Johannes

Re: [PATCH v2 11/14] commit: integrate commit graph with commit parsing

2018-02-06 Thread Derrick Stolee


On 2/1/2018 8:51 PM, Jonathan Tan wrote:

On Tue, 30 Jan 2018 16:39:40 -0500
Derrick Stolee  wrote:


+/* global storage */
+struct commit_graph *commit_graph = 0;

NULL, not 0.


+static int bsearch_graph(struct commit_graph *g, struct object_id *oid, 
uint32_t *pos)
+{
+   uint32_t last, first = 0;
+
+   if (oid->hash[0])
+   first = ntohl(*(uint32_t*)(g->chunk_oid_fanout + 4 * 
(oid->hash[0] - 1)));
+   last = ntohl(*(uint32_t*)(g->chunk_oid_fanout + 4 * oid->hash[0]));
+
+   while (first < last) {
+   uint32_t mid = first + (last - first) / 2;
+   const unsigned char *current;
+   int cmp;
+
+   current = g->chunk_oid_lookup + g->hdr->hash_len * mid;
+   cmp = hashcmp(oid->hash, current);
+   if (!cmp) {
+   *pos = mid;
+   return 1;
+   }
+   if (cmp > 0) {
+   first = mid + 1;
+   continue;
+   }
+   last = mid;
+   }
+
+   *pos = first;
+   return 0;
+}

This would be better in sha1-lookup.c, something like the reverse of commit
f1068efefe6d ("sha1_file: drop experimental GIT_USE_LOOKUP search",
2017-08-09), except that it can be done using a simple binary search.


I rebased my patch onto your binary search patch, so I'll use that in 
the future.





+static int full_parse_commit(struct commit *item, struct commit_graph *g,
+uint32_t pos, const unsigned char *commit_data)
+{
+   struct object_id oid;
+   struct commit *new_parent;
+   uint32_t new_parent_pos;
+   uint32_t *parent_data_ptr;
+   uint64_t date_low, date_high;
+   struct commit_list **pptr;
+
+   item->object.parsed = 1;
+   item->graph_pos = pos;
+
+   hashcpy(oid.hash, commit_data);
+   item->tree = lookup_tree(&oid);
+
+   date_high = ntohl(*(uint32_t*)(commit_data + g->hdr->hash_len + 8)) & 
0x3;
+   date_low = ntohl(*(uint32_t*)(commit_data + g->hdr->hash_len + 12));
+   item->date = (timestamp_t)((date_high << 32) | date_low);
+
+   pptr = &item->parents;
+
+   new_parent_pos = ntohl(*(uint32_t*)(commit_data + g->hdr->hash_len));
+   if (new_parent_pos == GRAPH_PARENT_NONE)
+   return 1;
+   get_nth_commit_oid(g, new_parent_pos, &oid);
+   new_parent = lookup_commit(&oid);
+   if (new_parent) {
+   new_parent->graph_pos = new_parent_pos;
+   pptr = &commit_list_insert(new_parent, pptr)->next;
+   } else {
+   die("could not find commit %s", oid_to_hex(&oid));
+   }
+
+   new_parent_pos = ntohl(*(uint32_t*)(commit_data + g->hdr->hash_len + 
4));
+   if (new_parent_pos == GRAPH_PARENT_NONE)
+   return 1;
+   if (!(new_parent_pos & GRAPH_LARGE_EDGES_NEEDED)) {
+   get_nth_commit_oid(g, new_parent_pos, &oid);
+   new_parent = lookup_commit(&oid);
+   if (new_parent) {
+   new_parent->graph_pos = new_parent_pos;
+   pptr = &commit_list_insert(new_parent, pptr)->next;
+   } else
+   die("could not find commit %s", oid_to_hex(&oid));
+   return 1;
+   }
+
+   parent_data_ptr = (uint32_t*)(g->chunk_large_edges + 4 * 
(new_parent_pos ^ GRAPH_LARGE_EDGES_NEEDED));
+   do {
+   new_parent_pos = ntohl(*parent_data_ptr);
+
+   get_nth_commit_oid(g, new_parent_pos & GRAPH_EDGE_LAST_MASK, 
&oid);
+   new_parent = lookup_commit(&oid);
+   if (new_parent) {
+   new_parent->graph_pos = new_parent_pos & 
GRAPH_EDGE_LAST_MASK;
+   pptr = &commit_list_insert(new_parent, pptr)->next;
+   } else
+   die("could not find commit %s", oid_to_hex(&oid));
+   parent_data_ptr++;
+   } while (!(new_parent_pos & GRAPH_LAST_EDGE));
+
+   return 1;
+}

The part that converts  into 
seems to be duplicated 3 times. Refactor into a function?


Will do.




+/**
+ * Fill 'item' to contain all information that would be parsed by 
parse_commit_buffer.
+ */
+static int fill_commit_in_graph(struct commit *item, struct commit_graph *g, 
uint32_t pos)
+{
+   uint32_t new_parent_pos;
+   uint32_t *parent_data_ptr;
+   const unsigned char *commit_data = g->chunk_commit_data + 
(g->hdr->hash_len + 16) * pos;
+
+   new_parent_pos = ntohl(*(uint32_t*)(commit_data + g->hdr->hash_len));
+
+   if (new_parent_pos == GRAPH_PARENT_MISSING)
+   return 0;
+
+   if (new_parent_pos == GRAPH_PARENT_NONE)
+   return full_parse_commit(item, g, pos, commit_data);
+
+   new_parent_pos = ntohl(*(uint32_t*)(commit_data + g->hdr->hash_len + 
4));
+
+   if (new_parent_pos == GRAPH_PARENT_MISSING)
+   return 0;
+   if (!(new_parent_pos & GRAP

Re: [PATCH v2 00/12] object_id part 11 (the_hash_algo)

2018-02-06 Thread Junio C Hamano

"brian m. carlson"  writes:

> On Fri, Feb 02, 2018 at 11:46:03AM -0800, Junio C Hamano wrote:
>> Thanks for working on this.  All changes looked sensible (even
>> though I spotted one nit in the original, which was moved as-is,
>> which does not count as a "change" ;-)).
>
> I forgot to ask: do you want a reroll which fixes this, or just a
> follow-up patch on top to fix the comment?

I do not think it is in scope of the series, so a follow-up would be
sufficient.

Thanks.

Re: [PATCH v1] dir.c: don't flag the index as dirty for changes to the untracked cache

2018-02-06 Thread Junio C Hamano

Duy Nguyen  writes:

> Please don't do that, at least not this way. cache_changed mask should
> reflect all dirty parts in .git/index. If UNTR extension is not marked
> updated, it's legit to just skip generating/writing it down (e.g. if I
> kept the old UNTR extension from the last time I read .git/index
> around in memory)

Excellent point.  Thanks for mentioning this (it crossed my mind
when I responded but I forgot to metion in my message).

Re: Cloned repository has file changes -> bug?

2018-02-06 Thread Junio C Hamano

Jeff King  writes:

> I'm not entirely convinced it's worth all of this effort, but I think it
> would be _possible_ at least.

I thought that the original poster wants to have a knob that the
project can ask its participants to enable in their clones of the
repository that wars this situation when they "git add".  When a
projects considers that macOS users must dictate the world order
because its source must be checked out on case insensitive HFS+, it
may be sensible to force participants on other platforms whose
filesystem does not guarantee that two paths that "normalize" to the
same string will not enter in the project history to spend
application cycles to do so instead.

That admittedly is quite a macOS centric view, and it is not a very
pleasant one, as it means almost all individual _applications_ on
other platforms that conceivably could be used to work on a project
macOS folks would want to be treated as first-class citizens would
need to learn about the peculiar filesystem limitation of one
particular platform they otherwise may not necessarily care about.

Re: Cloned repository has file changes -> bug?

2018-02-06 Thread Filip Jorissen

I’m not git expert but, from a user point of view, the following would make 
sense. When adding a file, git could check whether a different file is already 
in the repository with the same name (case-insensitive check). Then simply 
report that this may be a mistake and request to use ‘git add -f’ to force the 
‘add’. Perhaps this coincides with what was suggested in earlier replies. From 
my point of view the check should be enabled by default, but it could be 
optional too. 

Filip



> Op 6 feb. 2018, om 15:14 heeft Jeff King  het volgende 
> geschreven:
> 
> On Tue, Feb 06, 2018 at 02:24:25PM +0100, Ævar Arnfjörð Bjarmason wrote:
> 
>> 3) Such hooks slow down pushes, especially on big repos, you can
>>optimize things a bit (e.g. only look in the same directories), but
>>pathologically you end up needing to compare the cross-product of
>>changed files v.s. all existing files for each changed file.
> 
> I think you could just complain about any tree that contains entries
> that have duplicate entries after normalization. I.e.:
> 
>  git rev-list --objects $new --not $old |
>  awk '{print $1}' |
>  git cat-file --batch-check='%(objecttype) %(objectname)' |
>  awk '/^tree/ {print $2}'|
>  while read tree; do
>   dups=$(git ls-tree $tree | cut -f 2- | tr A-Z a-z | sort | uniq -d)
>   test -z "$dups" || echo "$tree has duplicates: $dups"
>  done
> 
> That gives reasonable algorithmic complexity, but of course the shell
> implementation is horrific. One could imagine that this could be
> implemented as part of fsck_tree(), though, which is already reading
> through all the entries (unfortunately it requires auxiliary storage
> linear with the size of a given tree object, but that's not too bad).
> 
> But it would probably need:
> 
>  1. To be enabled as an optional fsck warning, possibly even defaulting
> to "ignore".
> 
>  2. That "tr" could be any arbitrary transformation. Case-folding is
> the obvious one, but in theory you could match the normalization
> behavior of certain popular filesystems.
> 
> I'm not entirely convinced it's worth all of this effort, but I think it
> would be _possible_ at least.
> 
> -Peff

Re: Cloned repository has file changes -> bug?

2018-02-06 Thread Jeff King

On Tue, Feb 06, 2018 at 02:24:25PM +0100, Ævar Arnfjörð Bjarmason wrote:

>  3) Such hooks slow down pushes, especially on big repos, you can
> optimize things a bit (e.g. only look in the same directories), but
> pathologically you end up needing to compare the cross-product of
> changed files v.s. all existing files for each changed file.

I think you could just complain about any tree that contains entries
that have duplicate entries after normalization. I.e.:

  git rev-list --objects $new --not $old |
  awk '{print $1}' |
  git cat-file --batch-check='%(objecttype) %(objectname)' |
  awk '/^tree/ {print $2}'|
  while read tree; do
dups=$(git ls-tree $tree | cut -f 2- | tr A-Z a-z | sort | uniq -d)
test -z "$dups" || echo "$tree has duplicates: $dups"
  done

That gives reasonable algorithmic complexity, but of course the shell
implementation is horrific. One could imagine that this could be
implemented as part of fsck_tree(), though, which is already reading
through all the entries (unfortunately it requires auxiliary storage
linear with the size of a given tree object, but that's not too bad).

But it would probably need:

  1. To be enabled as an optional fsck warning, possibly even defaulting
 to "ignore".

  2. That "tr" could be any arbitrary transformation. Case-folding is
 the obvious one, but in theory you could match the normalization
 behavior of certain popular filesystems.

I'm not entirely convinced it's worth all of this effort, but I think it
would be _possible_ at least.

-Peff

Re: [PATCH 025/194] object-store: allow prepare_alt_odb to handle arbitrary repositories

2018-02-06 Thread Derrick Stolee


On 2/5/2018 8:19 PM, brian m. carlson wrote:

On Mon, Feb 05, 2018 at 03:54:46PM -0800, Stefan Beller wrote:

@@ -434,12 +433,12 @@ static int link_alt_odb_entry_the_repository(const char 
*entry,
ent = alloc_alt_odb(pathbuf.buf);
  
  	/* add the alternate entry */

-   *the_repository->objects.alt_odb_tail = ent;
-   the_repository->objects.alt_odb_tail = &(ent->next);
+   *r->objects.alt_odb_tail = ent;
+   r->objects.alt_odb_tail = &(ent->next);
ent->next = NULL;

I'm sure I'm missing something obvious, but it's not clear to me that
this transformation is correct.  Could you perhaps say a few words about
why it is?


I believe the reason it is correct is due to this change, higher up:

+static void read_info_alternates(struct repository *r,
+const char *relative_base,
+int depth);

Now the method takes a 'struct repository' pointer (which at this moment will 
always be the_repository) but it allows callers to use a different repository, 
when they are ready.

Thanks,
-Stolee

Re: Cloned repository has file changes -> bug?

2018-02-06 Thread Ævar Arnfjörð Bjarmason

On Tue, Feb 06 2018, Filip Jorissen jotted:

> Hi all,
>
> Thank you for your quick responses. I was able to resolve the problem based 
> on your feedback!
>
> Based on this experience, I would like to suggest that git is somehow able to 
> avoid these problems by doing a case check itself rather than relying on the 
> host OS for this?

I think it would make a lot of sense for git to ship with some standard
library of hooks for people to enable that covered common cases like
that.

But it's important to realize that:

 1) That's all this would be, this check would just be something you can
do via a pre-receive hook now (or in this case, that big hosting
providers like Github could offer as a pre-receive hook).

 2) Case collisions are just a subset of these issues, if you look at my
cacbzzx4qh-w3yevoaw8rxaofrjpfgerlwpbhetro2tv-fgr...@mail.gmail.com
here on-list you'll see an issue relating to unicode normalizing
that triggers a similar problem.

 3) Such hooks slow down pushes, especially on big repos, you can
optimize things a bit (e.g. only look in the same directories), but
pathologically you end up needing to compare the cross-product of
changed files v.s. all existing files for each changed file.

 4) It's not something hosting providers like Github, Gitlab etc. can
just enable, because some people want it like this. There's a lot of
repos that have e.g. case collisions or other problems that only
manifest on some OS-specific limitation (e.g. paths that are too
long for VMS), and none of the users care because they don't use the
repo on those systems.

So it's not something that can be enabled by default, even for the
most common case of case collisions.

Right now you're probably best off setting up a CI check on Github that
checks whether someone commits a colliding file, and poking Github about
maybe providing some opt-in feature to check for this.

I think the most performant way of checking this is:

git --icase-pathspecs ls-files -- 

Which should never emit more lines than you give to it, you'd run that
either for the tip of the push, or if you want to be more thorough for
each pushed commit.

If someone can think of a better way to check for this I'd love to hear
about it, I'll probably add a check like this to some of our internal
repos soon. We semi-regularly have Mac users up in arms because someone
pushed a case-colliding file on Linux.

>> Op 28 jan. 2018, om 08:57 heeft Torsten Bögershausen  het 
>> volgende geschreven:
>>
>> On Sat, Jan 27, 2018 at 08:59:50PM +0100, Ævar Arnfjörð Bjarmason wrote:
>>>
>>> On Sat, Jan 27 2018, Filip Jorissen jotted:
>>>
 I think our git repository is bugged. The reason why I say this is the
 following. When cloning the repository, the newly cloned repository
 immediately has file changes[...].
>>>
>>> If you run this:
>>>
>>>git ls-files | tr '[:upper:]' '[:lower:]' | sort | uniq -D | grep '^'
>>>
>>> You'll see that the reason is that you have files that differ only in
>>> case.
>>>
>>> You are using a Mac, and Macs by default think that files that are
>>> different binary strings are the same file, since they don't consider
>>> case to be relevant. The file FOO, foo and FoO and fOo are all the same
>>> file as far as your Mac is concerned, but would be 4 different files on
>>> Linux.
>>>
 How can I fix the repository?
>>>
>>> You could check it out on a OS that considers files that differ in case
>>> to be different files, e.g. on Linux, move them around, push it, and new
>>> clones should work on your Mac.
>>>
>>> Alternatively I hear that you can create a loopback case-sensitive FS
>>> image on Macs.
>>
>> You can even fix the repo locally.
>> There are 2 files with uppercase/lowercase collisions.
>> I show you how to fix one off these, the other one goes similar.
>> After that, do a commit and a push and pull request.
>>
>>
>>
>> Changes not staged for commit:
>>  (use "git add ..." to update what will be committed)
>>  (use "git checkout -- ..." to discard changes in working directory)
>>
>>modified:   
>> IDEAS/Resources/ReferenceResults/Dymola/IDEAS_Fluid_HeatExchangers_GroundHeatExchangers_Borefield_Examples_MultipleBoreholesWithHeatPump.txt
>>modified:   
>> IDEAS/Resources/ReferenceResults/Dymola/IDEAS_Utilities_Psychrometrics_Functions_Examples_saturationPressure.txt
>>
>> no changes added to commit (use "git add" and/or "git commit -a")
>> user@mac:/tmp/IDEAS> git ls-files -s | grep -i 
>> IDEAS/Resources/ReferenceResults/Dymola/IDEAS_Fluid_HeatExchangers_GroundHeatExchangers_Borefield_Examples_MultipleBoreholesWithHeatPump.txt
>> 100644 f56cfcf14aa4b53dfc5ecfb488366f721c94c8e2 0   
>> IDEAS/Resources/ReferenceResults/Dymola/IDEAS_Fluid_HeatExchangers_GroundHeatExchangers_Borefield_Examples_MultipleBoreholesWithHeatPump.txt
>> 100644 e345e1372111d034b4c5a1c75eb791340b93f55e 0   
>> IDEAS/Resources/ReferenceResults/Dymola/ideas_Fluid_H

coucou j'ai envie de te connaitre

2018-02-06 Thread Marco Martha

Réponds moi vite
Bisou

Martha

Re: Cloned repository has file changes -> bug?

2018-02-06 Thread Duy Nguyen

On Tue, Feb 6, 2018 at 8:09 PM, Filip Jorissen
 wrote:
> Hi all,
>
> Thank you for your quick responses. I was able to resolve the problem based 
> on your feedback!
>
> Based on this experience, I would like to suggest that git is somehow able to 
> avoid these problems by doing a case check itself rather than relying on the 
> host OS for this?

Git itself can't resolve the problem. Case checking is just a side
problem. You have two paths, only one of which can be checked out by
your OS. How can Git know which one to checkout?

Git could pick one at random and hide the other (using sparse
checkout, behind the scene). But that can still surprise the user "how
come I see this content (of file "AAA") while the other user see
another content (of file "aaa")?"

I don't see a good solution for this. I think the best we could do is
warn earlier at clone time "trouble ahead! Your repository and your OS
don't like each other!" so that users won't get puzzled at "git
status" later. Would that help?

> Kind regards!
>
> Filip
-- 
Duy

Re: Cloned repository has file changes -> bug?

2018-02-06 Thread Filip Jorissen

Hi all,

Thank you for your quick responses. I was able to resolve the problem based on 
your feedback!

Based on this experience, I would like to suggest that git is somehow able to 
avoid these problems by doing a case check itself rather than relying on the 
host OS for this? 

Kind regards!

Filip



> Op 28 jan. 2018, om 08:57 heeft Torsten Bögershausen  het 
> volgende geschreven:
> 
> On Sat, Jan 27, 2018 at 08:59:50PM +0100, Ævar Arnfjörð Bjarmason wrote:
>> 
>> On Sat, Jan 27 2018, Filip Jorissen jotted:
>> 
>>> I think our git repository is bugged. The reason why I say this is the
>>> following. When cloning the repository, the newly cloned repository
>>> immediately has file changes[...].
>> 
>> If you run this:
>> 
>>git ls-files | tr '[:upper:]' '[:lower:]' | sort | uniq -D | grep '^'
>> 
>> You'll see that the reason is that you have files that differ only in
>> case.
>> 
>> You are using a Mac, and Macs by default think that files that are
>> different binary strings are the same file, since they don't consider
>> case to be relevant. The file FOO, foo and FoO and fOo are all the same
>> file as far as your Mac is concerned, but would be 4 different files on
>> Linux.
>> 
>>> How can I fix the repository?
>> 
>> You could check it out on a OS that considers files that differ in case
>> to be different files, e.g. on Linux, move them around, push it, and new
>> clones should work on your Mac.
>> 
>> Alternatively I hear that you can create a loopback case-sensitive FS
>> image on Macs.
> 
> You can even fix the repo locally.
> There are 2 files with uppercase/lowercase collisions.
> I show you how to fix one off these, the other one goes similar.
> After that, do a commit and a push and pull request.
> 
> 
> 
> Changes not staged for commit:
>  (use "git add ..." to update what will be committed)
>  (use "git checkout -- ..." to discard changes in working directory)
> 
>modified:   
> IDEAS/Resources/ReferenceResults/Dymola/IDEAS_Fluid_HeatExchangers_GroundHeatExchangers_Borefield_Examples_MultipleBoreholesWithHeatPump.txt
>modified:   
> IDEAS/Resources/ReferenceResults/Dymola/IDEAS_Utilities_Psychrometrics_Functions_Examples_saturationPressure.txt
> 
> no changes added to commit (use "git add" and/or "git commit -a")
> user@mac:/tmp/IDEAS> git ls-files -s | grep -i 
> IDEAS/Resources/ReferenceResults/Dymola/IDEAS_Fluid_HeatExchangers_GroundHeatExchangers_Borefield_Examples_MultipleBoreholesWithHeatPump.txt
> 100644 f56cfcf14aa4b53dfc5ecfb488366f721c94c8e2 0   
> IDEAS/Resources/ReferenceResults/Dymola/IDEAS_Fluid_HeatExchangers_GroundHeatExchangers_Borefield_Examples_MultipleBoreholesWithHeatPump.txt
> 100644 e345e1372111d034b4c5a1c75eb791340b93f55e 0   
> IDEAS/Resources/ReferenceResults/Dymola/ideas_Fluid_HeatExchangers_GroundHeatExchangers_Borefield_Examples_MultipleBoreholesWithHeatPump.txt
> user@mac:/tmp/IDEAS> git mv 
> IDEAS/Resources/ReferenceResults/Dymola/ideas_Fluid_HeatExchangers_GroundHeatExchangers_Borefield_Examples_MultipleBoreholesWithHeatPump.txt
>  
> IDEAS/Resources/ReferenceResults/Dymola/ideas_Fluid_HeatExchangers_GroundHeatExchangers_Borefield_Examples_MultipleBoreholesWithHeatPump2.txt
> user@mac:/tmp/IDEAS> git checkout  
> IDEAS/Resources/ReferenceResults/Dymola/ideas_Fluid_HeatExchangers_GroundHeatExchangers_Borefield_Examples_MultipleBoreholesWithHeatPump2.txt
> user@mac:/tmp/IDEAS> git checkout 
> IDEAS/Resources/ReferenceResults/Dymola/IDEAS_Fluid_HeatExchangers_GroundHeatExchangers_Borefield_Examples_MultipleBoreholesWithHeatPump.txt
> user@mac:/tmp/IDEAS> git status
> On branch master
> Your branch is up to date with 'origin/master'.
> 
> Changes to be committed:
>  (use "git reset HEAD ..." to unstage)
> 
>renamed:
> IDEAS/Resources/ReferenceResults/Dymola/ideas_Fluid_HeatExchangers_GroundHeatExchangers_Borefield_Examples_MultipleBoreholesWithHeatPump.txt
>  -> 
> IDEAS/Resources/ReferenceResults/Dymola/ideas_Fluid_HeatExchangers_GroundHeatExchangers_Borefield_Examples_MultipleBoreholesWithHeatPump2.txt
> 
> Changes not staged for commit:
>  (use "git add ..." to update what will be committed)
>  (use "git checkout -- ..." to discard changes in working directory)
> 
>modified:   
> IDEAS/Resources/ReferenceResults/Dymola/IDEAS_Utilities_Psychrometrics_Functions_Examples_saturationPressure.txt
> 
> user@mac:/tmp/IDEAS>

Re: cherry-pick '-m' curiosity

2018-02-06 Thread Sergey Organov

Junio C Hamano  writes:

> Sergey Organov  writes:
>
>> Isn't it always the case that "mainline" is the first parent, as that's
>> how "git merge" happens to work?
>
> You may not be merging into the "mainline" in the first place.
>
> Imagine forking two topics at the same commit on the mainline, and
> merging these two topics of equal importance together into a single
> bigger topic, before asking the mainline to pull the whole.
>
> $ git checkout mainline
> $ git tag fork-point
> $ git checkout -b frontend-topic fork-point
> $ work work work
> $ git checkout -b backend-topic fork-point
> $ work work work
> $ git checkout -b combined
> $ git merge frontend-topic
> $ git tag merged
>
> The backend-topic may be recorded as the first-parent of the
> resulting merge, but logically the two topics are of equal footing,
> so merge^1..merge and merge^2..merge are both equally interesting.

Point taken, thanks! Now, if I reformulate my original question as:

"Isn't it _usually_ the case that "mainline" is the first parent?"

what is the answer?

For example, in the above case I'd likely rather: 

 $ git checkout -b combined fork-point
 $ git merge --no-ff frontend-topic
 $ git merge --no-ff backend-topic

and still have clear "mainline" on "-m1" for both merges.

I'm asking because those:

"Usually you cannot cherry-pick a merge because you do not know which
side of the merge should be considered the mainline."

in the manual page still feels confusing in the context of typical git
usage (as opposed to the context of abstract DAG operations where it'd
make sense indeed.)

-- Sergey

Re: [PATCH v1] dir.c: don't flag the index as dirty for changes to the untracked cache

2018-02-06 Thread Duy Nguyen

On Tue, Feb 6, 2018 at 7:27 PM, Duy Nguyen  wrote:
> On Tue, Feb 6, 2018 at 8:48 AM, Ben Peart  wrote:
>> With the new behavior, making a change in dir1/, then calling status would
>> update the dir1/ untracked cache entry but not write it out. On the next
>> status, git would detect the change in dir1/ again and update the untracked
>
> Thing only missing piece here I would add is, this dir1/ detection is
> done by watchman. We have to contact watchman and ask the set of
> changed paths since $TIME where $TIME is the last time we updated
> untracked cache and invalidate those paths in core. Then it should
> work correctly. I checked the watchman query in the fsmonitor hook and
> I think it's correct so far.

No I think I'm wrong. And worse, I think the interaction between FSNM
and UNTR extension is broken. This is partly from me guessing how
fsmonitor works so I may be double-wrong.

UNTR extension is supposed to cover the untracked state at timestamp
$X. Where $X is stored in FSNM extension. Correct?

When fsmonitor is used and read_directory() is executed (at timestamp
$Y), we ask watchman "what's new on worktree since $X?"). We get the
list, we invalidate relevant directories so we can see new untracked
entries (or lack of old untracked entries). We write FSNM with
timestamp $Y down. We may or may not write UNTR down because of this
patch, but let's suppose we do. All is good. UNTR now records the
state at $Y. FSNM says $Y. This is how it works (when there's no bugs)

UNTR extension is only updated when read_directory() is called. It
does not always happen. FSNM is updated all the time (almost at every
git command since most of them needs to read index, many write it
down) with new timestamp. Suppose FSNM now records timestamp $Z, UNTR
still records the state at $Y because in the last index update,
read_directory() is not executed. 4 files have been added between $Y
and $Z. When we ask watchman "what's new since $Z?" we will not see
those files, so we don't invalidate relevant directories and
read_directory() will not see those files.

What am I missing?
-- 
Duy

I NEED YOUR HELP

2018-02-06 Thread MISS SAIF IBRAHIM

Dear Mr/Sir,

My Name is Miss safi ibrahim from Libya, I am 22 years old, I am in
St.Christopher's Parish for refugee in Burkina Faso under United
Nations High commission for Refugee because I lost my parents in the
recent war in  Libya, right now am in Burkina Faso, please save my
life i am in danger.

I need your help in transferring my inheritance my father left behind
for me in a Bank in Burkina Faso here, I have every information to
guide you get the money transferred to your bank account, all i need
is a foreigner who will stand as the foreign partner to my father and
beneficiary of the fund.

The money deposited in the Bank is US$10.500,000.00 united state
dollae, I just need this fund to be transfer to your bank account so
that i will come over to your country and complete my education as you
know that my country have been in deep crisis due to the war, and I
cannot go back there again because I have nobody again all of my
family were killed in the war.

If you are interested to save me and help me receive my inheritance
fund Please get back to me on this email address for more details
about me and the inheritance funds.

Miss safi ibrahim.

Re: [PATCH v1] dir.c: don't flag the index as dirty for changes to the untracked cache

2018-02-06 Thread Duy Nguyen

On Tue, Feb 6, 2018 at 8:48 AM, Ben Peart  wrote:
> With the new behavior, making a change in dir1/, then calling status would
> update the dir1/ untracked cache entry but not write it out. On the next
> status, git would detect the change in dir1/ again and update the untracked

Thing only missing piece here I would add is, this dir1/ detection is
done by watchman. We have to contact watchman and ask the set of
changed paths since $TIME where $TIME is the last time we updated
untracked cache and invalidate those paths in core. Then it should
work correctly. I checked the watchman query in the fsmonitor hook and
I think it's correct so far.

Except one point. What if you activate fsmonitor, write down the
fsmonitor_last_update field there but _not_ create UNTR extension for
the same timestamp? UNTR extension is created only when
read_directory() is executed and we don't do that in every command. I
haven't checked fsmonitor.c carefully, maybe I'm missing something.

One thing I'd like to point out in this patch is untracked cache will
start degrading if you just make the initial UNTR extension once and
don't ever update it again. Dirty paths hit slow path and will start
poking the disk. If somebody accidentally change a minor thing in
every single directory (worst case scenario), the untracked cache
becomes useless. We need a threshold or something to start repairing
UNTR extension if it gets damaged too much.

If you rely on random index updates (e.g. the main entries got updated
and .git/index must be updated) to write UNTR extension down together.
Please don't do that, at least not this way. cache_changed mask should
reflect all dirty parts in .git/index. If UNTR extension is not marked
updated, it's legit to just skip generating/writing it down (e.g. if I
kept the old UNTR extension from the last time I read .git/index
around in memory)

> cache.  All of the other cached entries are still valid and the cache would
> be used for them.  The updated cache entry for dir1/ would not get persisted
> to disk until something that required the index to be written out.
>
> The behavior is correct in both cases.  You just don't get the benefit of
> the updated cache for the dir1/ entry until the index is persisted again.
> What you gain in exchange is that you don't have to write out the index
> which is (typically) a lot more expensive than checking dir1/ for changes.

This is another thing that bugs me. I know you're talking about huge
index files, but at what size should we start this sort of
optimization? Writing down a few MBs on linux is cheap enough that I
won't bother optimizing (and I get my UNTR extension repaired all the
time, so reduced lstat calls and stuff). This "typically" only comes
at certain size, what size?
-- 
Duy

Re: [PATCH] dir.c: ignore paths containing .git when invalidating untracked cache

2018-02-06 Thread Duy Nguyen

On Tue, Feb 6, 2018 at 12:44 AM, Ben Peart  wrote:
>
>
> On 2/4/2018 4:38 AM, Nguyễn Thái Ngọc Duy wrote:
>>
>> read_directory() code ignores all paths named ".git" even if it's not
>> a valid git repository. See treat_path() for details. Since ".git" is
>> basically invisible to read_directory(), when we are asked to
>> invalidate a path that contains ".git", we can safely ignore it
>> because the slow path would not consider it anyway.
>>
>> This helps when fsmonitor is used and we have a real ".git" repo at
>> worktree top. Occasionally .git/index will be updated and if the
>> fsmonitor hook does not filter it, untracked cache is asked to
>> invalidate the path ".git/index".
>>
>> Without this patch, we invalidate the root directory unncessarily,
>> which:
>>
>> - makes read_directory() fall back to slow path for root directory
>>(slower)
>>
>> - makes the index dirty (because UNTR extension is updated). Depending
>>on the index size, writing it down could also be slow.
>>
>> Noticed-by: Ævar Arnfjörð Bjarmason 
>> Signed-off-by: Nguyễn Thái Ngọc Duy 
>> ---
>>   Sorry for the resend, I forgot git@vger.
>>
>>   dir.c | 13 -
>>   git-compat-util.h |  2 ++
>>   wrapper.c | 12 
>>   3 files changed, 26 insertions(+), 1 deletion(-)
>>
>> diff --git a/dir.c b/dir.c
>> index 7c4b45e30e..f8b4cabba9 100644
>> --- a/dir.c
>> +++ b/dir.c
>> @@ -1773,7 +1773,7 @@ static enum path_treatment treat_path(struct
>> dir_struct *dir,
>> if (!de)
>> return treat_path_fast(dir, untracked, cdir, istate, path,
>>baselen, pathspec);
>> -   if (is_dot_or_dotdot(de->d_name) || !strcmp(de->d_name, ".git"))
>> +   if (is_dot_or_dotdot(de->d_name) || !fspathcmp(de->d_name,
>> ".git"))
>> return path_none;
>> strbuf_setlen(path, baselen);
>> strbuf_addstr(path, de->d_name);
>> @@ -2970,8 +2970,19 @@ static int invalidate_one_component(struct
>> untracked_cache *uc,
>>   void untracked_cache_invalidate_path(struct index_state *istate,
>>  const char *path)
>>   {
>> +   const char *end;
>> +   int skipped;
>> +
>> if (!istate->untracked || !istate->untracked->root)
>> return;
>
>
> Thank you for this patch!  It's great to see people helping improve the
> performance of git.
>
> I ran a quick test and this is not sufficient to prevent the index from
> getting flagged as dirty and written out to disk when fsmonitor detects
> changes for files under the .git folder.  What I'm seeing is that when
> ".git/index" is returned, the test below doesn't catch them and end early as
> would be expected.

Right. I focused too much on ".git" in the middle and the end of the
path and forgot that it's also at the beginning.

> As a result, invalidate_one_component() is called which calls
> invalidate_one_directory() which increments dir_invalidated counter and the
> regular dirty process continues which triggers the index to be written to
> disk unnecessarily.
>
>> +   if (!fspathcmp(path, ".git"))
>> +   return;
>> +   if (ignore_case)
>> +   skipped = skip_caseprefix(path, "/.git", &end);
>> +   else
>> +   skipped = skip_prefix(path, "/.git", &end);
>> +   if (skipped && (*end == '\0' || *end == '/'))
>> +   return;
>
>
> If I replace the above lines with:
>
> if (ignore_case)
> skipped = skip_caseprefix(path, ".git", &end);
> else
> skipped = skip_prefix(path, ".git", &end);
>
> Then it correctly skips _all_ files under ".git".  I'm not sure if by
> removing the leading slash, I'm breaking some other case but I was not able
> to find where that was expected or needed. Removing the leading slash also
> allows me to remove the fsmpathcmp() call as it is now redundant.

Removing "/" could catch things like abc/lala.git/def, which
treat_path does not consider special and may show up as untracked
entries. In that sense, the updated invalidate_... is broken if we
don't invalidate them properly.

I think what we need here is something like

if (!fspathcmp(path, ".git") || starts_with(path, ".git/"))

but can handle case-insensitivity as well (starts_with can't).

> I wondered what would happen if there was some other directory named ".git"
> and how that would behave.  With or without this patch and with or without
> the untracked cache, a file "dir1/.git/foo" is ignored by git so no change
> in behavior there either.

That's what I meant by treat_path() and invisibility in my commit
message. We always ignore directories (or even files) named ".git". It
does not matter if it contains anything remotely related to git. I'm
not saying it's a good thing, but it's how it is and changing that
requires a lot more work, possibly some performance degradation if you
start to check if ".git" is a valid repo before ignoring. So I

Re: [PATCH v2 3/3] worktree: teach "add" to check out existing branches

2018-02-06 Thread Duy Nguyen

On Tue, Feb 6, 2018 at 3:23 AM, Thomas Gummerer  wrote:
> On 02/05, Duy Nguyen wrote:
>> On Sun, Feb 04, 2018 at 10:13:05PM +, Thomas Gummerer wrote:
>> > -   if (opts->new_branch)
>> > +   if (opts->checkout_existing_branch)
>> > +   fprintf(stderr, _(", checking out existing branch '%s'"),
>> > +   refname);
>> > +   else if (opts->new_branch)
>> > fprintf(stderr, _(", creating new branch '%s'"), 
>> > opts->new_branch);
>>
>> I wonder if "creating branch" and "checkout out branch" are enough.
>
> I thought printing the branch name might be a good idea just to show
> more clearly what the dwim did.

No no printing branch name is definitely a good idea, especially when
I think of one thing and type another. I shortened my example phrases
too much. It should have been "creating branch %s" and "checkout out
branch %s"
-- 
Duy

Re: Missing git options

2018-02-06 Thread Duy Nguyen

On Tue, Feb 6, 2018 at 6:13 PM, Martin Häcker
 wrote:
> This however still freezes the default output of git forever.

Why is that a bad thing? Default output format should not change
(much) from version to version, or from machine to machine (because of
different ~/.gitconfig) for that matter. I don't want to type "git
log" on somebody else's machine and have a big surprise. You _can_
customize output but that should be explicit.

If you want good defaults, that's what aliases are for. I personally
rarely type "git log". I have "l", "lp", "l1"  and other aliases for
my everyday use.
-- 
Duy

Re: [PATCH 0/2] minor GETTEXT_POISON fixes

2018-02-06 Thread Jeff King

On Tue, Feb 06, 2018 at 10:06:53AM +0100, Lars Schneider wrote:

> 
> > On 06 Feb 2018, at 09:42, Jeff King  wrote:
> > 
> > I set NO_GETTEXT=1 in my config.mak, and happened to notice that running
> > the tests with GETTEXT_POISON fails. I think this has been broken for
> > years, but I don't generally play with GETTEXT_POISON. ;)
> 
> On Travis we run GETTEXT_POISON with gettext installed.
> What do you think about an additional job with GETTEXT_POISON and 
> NO_GETTEXT=1?

I think it's probably a waste of CPU. ;)

It would be running the whole test suite to fix this one particular bug,
which I don't think is pretty likely to regress (at least not more than
any other bug). Whereas running with GETTEXT_POISON is meant to shake
out strings that maybe shouldn't be translated, which is a very likely
bug to get introduced.

-Peff

1 2 >

1 - 100 of 113 matches

Mail list logo