Re: Git send-email not sending email patches as subsequent replies

2018-04-07 Thread Eric Wong
David Davis  wrote:
> I'm seeing 6 individual emails, how do I just see one email followed by 5
> email replies to the one? I don't want six individual emails.

The five email replies you want are still individual emails.
Emails are ALWAYS "individual", and reply emails simply have a
References: and/or In-Reply-To: header in them.

Whether your mail client shows them as individual mails is
another story.  I also suspect whatever mail client you're using
isn't rendering thread hierarchies correctly.

Most good mail clients use the same algorithm to display message
threads since the days of Netscape:

https://www.jwz.org/doc/threading.html

I've also modified that algorithm to be recursion-free for public-inbox:

https://rt.cpan.org/Public/Bug/Display.html?id=116727


Btw, please disable HTML mail in your mail client or it won't
show up on the list (and other people won't be able to help).
I may not be around much for a bit.


Re: [PATCH v11 4/4] ls-remote: create '--sort' option

2018-04-07 Thread Eric Sunshine
On Sat, Apr 7, 2018 at 12:42 PM, Harald Nordgren
 wrote:
> Create a '--sort' option for ls-remote, based on the one from
> for-each-ref. This e.g. allows ref names to be sorted by version
> semantics, so that v1.2 is sorted before v1.10.
>
> Signed-off-by: Harald Nordgren 
> ---
> diff --git a/Documentation/git-ls-remote.txt b/Documentation/git-ls-remote.txt
> @@ -60,6 +60,19 @@ OPTIONS
> +--sort=::
> +   Sort based on the key given.  Prefix `-` to sort in
> +   descending order of the value. You may use the --sort= option
> +   multiple times, in which case the last key becomes the primary

This "last becomes primary key" feels counterintuitive to me, however,
I see it mirrors precedence of other Git commands.

In what situations would it make sense to specify --sort= multiple
times in the context of ls-remote? If there are none, then I wonder if
this should instead be documented as "last wins"...

> +   key. Also supports "version:refname" or "v:refname" (tag

To what does "Also" refer?

> +   names are treated as versions). The "version:refname" sort
> +   order can also be affected by the "versionsort.suffix"
> +   configuration variable.
> +   The keys supported are the same as those in `git for-each-ref`,
> +   except that because `ls-remote` deals only with remotes, keys like
> +   `committerdate` that require access to the objects themselves will
> +   not work.

What does "not work" mean in this context? Will the command crash
outright? Will it emit a suitable error message or warning? Will the
sorting be somehow dysfunctional?

It seems like the sentence "The keys supported..." should go above the
"Also supports..." sentence for this explanation to be more cohesive.

Finally, how about adding a linkgit:git-for-each-ref[1] to give the
user an easy way to access the referenced documentation. For instance:

The keys supported are the same as those accepted by the
`--sort=` option of linkgit:git-for-each-ref[1], except...

> diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
> @@ -43,10 +44,13 @@ int cmd_ls_remote(int argc, const char **argv, const char 
> *prefix)
> +   struct ref_array array;
> +   static struct ref_sorting *sorting = NULL, **sorting_tail = 
> @@ -68,6 +74,8 @@ int cmd_ls_remote(int argc, const char **argv, const char 
> *prefix)
> +   memset(, 0, sizeof(array));

Can we have a more meaningful name than 'array'? Even a name a simple
as 'refs' would convey more information.

> @@ -104,13 +112,23 @@ int cmd_ls_remote(int argc, const char **argv, const 
> char *prefix)
> for ( ; ref; ref = ref->next) {
> +   struct ref_array_item *item;
> if (!check_ref_type(ref, flags))
> continue;
> if (!tail_match(pattern, ref->name))
> continue;
> +   item = ref_array_push(, ref->name, >old_oid);
> +   item->symref = xstrdup_or_null(ref->symref);

Do we need to worry about freeing memory allocated by these two lines?

More generally, do we care that 'array' and 'sorting' are being
leaked? If not, perhaps they should be annotated by UNLEAK.

> +   }
> diff --git a/t/t5512-ls-remote.sh b/t/t5512-ls-remote.sh
> @@ -39,6 +42,39 @@ test_expect_success 'ls-remote self' '
> +test_expect_success 'ls-remote --sort="version:refname" --tags self' '
> +   cat >expect <<-\EOF &&
> +   1bd44cb9d13204b0fe1958db0082f5028a16eb3arefs/tags/mark
> +   1bd44cb9d13204b0fe1958db0082f5028a16eb3arefs/tags/mark1.1
> +   1bd44cb9d13204b0fe1958db0082f5028a16eb3arefs/tags/mark1.2
> +   1bd44cb9d13204b0fe1958db0082f5028a16eb3arefs/tags/mark1.10
> +   EOF
> +   git ls-remote --sort="version:refname" --tags self >actual &&
> +   test_cmp expect actual
> +'

This test script is already filled with these hardcoded SHA-1's, so
this is not a new problem, but work is under way to make it possible
to replace SHA-1 with some other hashing algorithm. Test scripts are
being retrofitted to avoid hard-coding them; instead determining the
hash value dynamically (for example, [1]). It would be nice if the new
tests followed suit, thus saving someone else extra work down the
road. (This, itself, is not worth a re-roll, but something to consider
if you do re-roll.)

[1]: 
https://public-inbox.org/git/20180325192055.841459-10-sand...@crustytoothpaste.net/

> +test_expect_success 'ls-remote --sort="-version:refname" --tags self' '
> +   cat >expect <<-\EOF &&
> +   1bd44cb9d13204b0fe1958db0082f5028a16eb3arefs/tags/mark1.10
> +   1bd44cb9d13204b0fe1958db0082f5028a16eb3arefs/tags/mark1.2
> +   1bd44cb9d13204b0fe1958db0082f5028a16eb3arefs/tags/mark1.1
> +   1bd44cb9d13204b0fe1958db0082f5028a16eb3arefs/tags/mark
> +   EOF
> +   git ls-remote --sort="-version:refname" --tags self 

Re: [PATCH v2 0/4] Lazy-load trees when reading commit-graph

2018-04-07 Thread Derrick Stolee

On 4/7/2018 2:40 PM, Jakub Narebski wrote:

Derrick Stolee  writes:

[...]

On the Linux repository, performance tests were run for the following
command:

 git log --graph --oneline -1000

 Before: 0.92s
 After:  0.66s
 Rel %: -28.3%

Adding '-- kernel/' to the command requires loading the root tree
for every commit that is walked. There was no measureable performance
change as a result of this patch.

In the "Git Merge contributor summit notes" [1] one can read that:


- VSTS adds bloom filters to know which paths have changed on the commit
- tree-same check in the bloom filter is fast; speeds up file history checks
- if the file history is _very_ sparse, then bloom filter is useful

Could this method speed up also the second case mentioned here?  Can
anyone explain how this "path-changed bloom filter" works in VSTS?
   


The idea is simple: for every commit, store a Bloom filter containing 
the list of paths that are not TREESAME against the first parent. (A 
slight detail: have a max cap on the number of paths, and store simply 
"TOO_BIG" for commits with too many diffs.)


When performing 'git log -- path' queries, the most important detail for 
considering how to advance the walk is whether the commit is TREESAME to 
its first parent. For a deep path in a large repo, this is almost always 
true. When a Bloom filter says "TREESAME" (i.e. "this path is not in my 
set") it is always correct, so we can set the treesame bit and continue 
without walking any trees. When a Bloom filter says "MAYBE NOT TREESAME" 
(i.e. "this path is probably in my set") you only need to do the same 
work as before: walk the trees to compare against your first parent.


If a Bloom filter has a false-positive rate of X%, then you can possibly 
drop your number of tree comparisons by (100-X)%. This is very important 
for large repos where some paths were changed only ten times or so, the 
full graph needs to be walked and it is helpful to avoid parsing too 
many trees.



Could we add something like this to the commit-graph file? 


I'm not sure if it is necessary for client-side operations, but it is 
one of the reasons the commit-graph file has the idea of an "optional 
chunk". It could be added to the file format (without changing version 
numbers) and be ignored by clients that don't understand it. I could 
also be gated by a config setting for computing them. My guess is that 
only server-side operations will need the added response time, and can 
bear the cost of computing them when writing the commit-graph file. 
Clients are less likely to be patient waiting for a lot of diff 
calculations.


If we add commit-graph file downloads to the protocol, then the server 
could do this computation and send the data to all clients. But that 
would be "secondary" information that maybe clients want to verify, 
which is as difficult as computing it themselves.


Thanks,

-Stolee



Re: [PATCH 0/6] Compute and consume generation numbers

2018-04-07 Thread Derrick Stolee

On 4/7/2018 12:55 PM, Jakub Narebski wrote:

Currently I am at the stage of reproducing results in FELINE paper:
"Reachability Queries in Very Large Graphs: A Fast Refined Online Search
Approach" by Renê R. Veloso, Loïc Cerf, Wagner Meira Jr and Mohammed
J. Zaki (2014).  This paper is available in the PDF form at
https://openproceedings.org/EDBT/2014/paper_166.pdf

The Jupyter Notebook (which runs on Google cloud, but can be also run
locally) uses Python kernel, NetworkX librabry for graph manipulation,
and matplotlib (via NetworkX) for display.

Available at:
https://colab.research.google.com/drive/1V-U7_slu5Z3s5iEEMFKhLXtaxSu5xyzg
https://drive.google.com/file/d/1V-U7_slu5Z3s5iEEMFKhLXtaxSu5xyzg/view?usp=sharing

I hope that could be of help, or at least interesting


Let me know when you can give numbers (either raw performance or # of 
commits walked) for real-world Git commit graphs. The Linux repo is a 
good example to use for benchmarking, but I also use the Kotlin repo 
sometimes as it has over a million objects and over 250K commits.


Of course, the only important statistic at the end of the day is the 
end-to-end time of a 'git ...' command. Your investigations should 
inform whether it is worth prototyping the feature in the git codebase.


Thanks,

-Stolee



Re: [PATCH v11 1/4] ref-filter: use "struct object_id" consistently

2018-04-07 Thread Eric Sunshine
On Sat, Apr 7, 2018 at 12:42 PM, Harald Nordgren
 wrote:
> From: Jeff King 
>
> Internally we store a "struct object_id", and all of our
> callers have one to pass us. But we insist that they peel it
> to its bare-sha1 hash, which we then hashcpy() into place.
> Let's pass it around as an object_id, which future-proofs us
> for a post-sha1 world.
> ---

You incorrectly dropped Peff's sign-off[1] when re-sending the patches
he authored in the series. And, your sign-off should follow his.

Also, if you made any changes to Peff's patch, it's a good idea to
state so with a bracketed comment at the end of the commit message
(before the sign-offs). For instance:

[hn: tweaked doodle blap]

or such.

[1]: https://public-inbox.org/git/20180406185831.ga11...@sigill.intra.peff.net/


Re: [PATCH v7 03/14] commit-graph: add format document

2018-04-07 Thread Jakub Narebski
Derrick Stolee  writes:

> diff --git a/Documentation/technical/commit-graph-format.txt
> b/Documentation/technical/commit-graph-format.txt
> new file mode 100644
> index 00..ad6af8105c
> --- /dev/null
> +++ b/Documentation/technical/commit-graph-format.txt
> @@ -0,0 +1,97 @@
> +Git commit graph format
> +===
> +
> +The Git commit graph stores a list of commit OIDs and some associated
> +metadata, including:
> +
> +- The generation number of the commit. Commits with no parents have
> +  generation number 1; commits with parents have generation number
> +  one more than the maximum generation number of its parents. We
> +  reserve zero as special, and can be used to mark a generation
> +  number invalid or as "not computed".

Actually we will be reserving "all bits 1" as special, though I don't
think it is worth mentioning here, and at this time.

> +
> +- The root tree OID.
> +
> +- The commit date.
> +
> +- The parents of the commit, stored using positional references within
> +  the graph file.
> +
> +These positional references are stored as unsigned 32-bit integers
> +corresponding to the array position withing the list of commit OIDs. We
> +use the most-significant bit for special purposes, so we can store at most
> +(1 << 31) - 1 (around 2 billion) commits.
> +
> +== Commit graph files have the following format:
> +
> +In order to allow extensions that add extra data to the graph, we organize
> +the body into "chunks" and provide a binary lookup table at the beginning
> +of the body. The header includes certain values, such as number of chunks
> +and hash type.
> +
> +All 4-byte numbers are in network order.
> +
> +HEADER:
> +
> +  4-byte signature:
> +  The signature is: {'C', 'G', 'P', 'H'}

The mnemonics: CGPH = Commit GraPH

> +
> +  1-byte version number:
> +  Currently, the only valid version is 1.
> +
> +  1-byte Hash Version (1 = SHA-1)
> +  We infer the hash length (H) from this value.
> +
> +  1-byte number (C) of "chunks"
> +
> +  1-byte (reserved for later use)
> + Current clients should ignore this value.

All right, with this reserved byte that makes header word-aligned (be it
32-bit or 64-bit)

> +
> +CHUNK LOOKUP:
> +
> +  (C + 1) * 12 bytes listing the table of contents for the chunks:
> +  First 4 bytes describe the chunk id. Value 0 is a terminating label.

As I understand it, it is value 0 as 4-byte integer, that is 4 x byte 0.
This may need clarification (or may need not).

> +  Other 8 bytes provide the byte-offset in current file for chunk to
> +  start. (Chunks are ordered contiguously in the file, so you can infer
> +  the length using the next chunk position if necessary.) Each chunk
> +  ID appears at most once.
> +
> +  The remaining data in the body is described one chunk at a time, and
> +  these chunks may be given in any order. Chunks are required unless
> +  otherwise specified.

Perhaps there should be here list of all required chunks, and list of
all optional chunks.

> +
> +CHUNK DATA:
> +
> +  OID Fanout (ID: {'O', 'I', 'D', 'F'}) (256 * 4 bytes)

The mnemonics: OIDF = CID Fanout

> +  The ith entry, F[i], stores the number of OIDs with first
> +  byte at most i. Thus F[255] stores the total
> +  number of commits (N).

So in other words this is cumulative histogram?  Just ensuring that I
understand it correctly.

> +
> +  OID Lookup (ID: {'O', 'I', 'D', 'L'}) (N * H bytes)

The mnemonics: OIDL = OID Lookup

> +  The OIDs for all commits in the graph, sorted in ascending order.
> +
> +  Commit Data (ID: {'C', 'G', 'E', 'T' }) (N * (H + 16) bytes)

The mnemonics: CGET = ???

It is not CDAT, CGDT, DATA, etc.

> +* The first H bytes are for the OID of the root tree.
> +* The next 8 bytes are for the positions of the first two parents
> +  of the ith commit. Stores value 0x if no parent in that
> +  position. If there are more than two parents, the second value
> +  has its most-significant bit on and the other bits store an array
> +  position into the Large Edge List chunk.

Possibly better:

  +  position into the Large Edge List chunk (EDGE chunk).

> +* The next 8 bytes store the generation number of the commit and
> +  the commit time in seconds since EPOCH.

The commit time is committer date (without timezone info), that is the
date that the commit object was created, isn't it?

> The generation number
> +  uses the higher 30 bits of the first 4 bytes, while the commit
> +  time uses the 32 bits of the second 4 bytes, along with the lowest
> +  2 bits of the lowest byte, storing the 33rd and 34th bit of the
> +  commit time.
> +
> +  Large Edge List (ID: {'E', 'D', 'G', 'E'}) [Optional]

Well, it is optional in the sense that it may not be here if the project
doesn't have any octopus merges.  It is not optional in the sense that
Git can ignore this chunk if 

Re: Git send-email not sending email patches as subsequent replies

2018-04-07 Thread Eric Wong
David Davis  wrote:
> Hello, I have the following git send-email command:
> 
> git send-email -5 --quiet --thread --no-chain-reply-to --compose
> --subject='Recent Base Prototype Changes Summary'
> --to=davisda...@google.com --from=davisda...@google.com
> 
> It's sending emails but as 5 individual emails and not as subsequent
> replies as suggested on https://git-scm.com/docs/git-send-email below:

So, I expect the above command should give you six emails:

  [PATCH 0/5] Recent Base Prototype Changes Summary
[PATCH 1/5]
[PATCH 2/5]
[PATCH 3/5]
[PATCH 4/5]
[PATCH 5/5]

Is that not what you're seeing?

You're quoting the manpage for --in-reply-to option below,
but you're not using that option.

> So for example when --thread and --no-chain-reply-to are specified,
> the second and subsequent patches will be replies to the first one
> like in the illustration below where [PATCH v2 0/3] is in reply to
> [PATCH 0/2]:
> 
> [PATCH 0/2] Here is what I did...
>   [PATCH 1/2] Clean up and tests
>   [PATCH 2/2] Implementation
>   [PATCH v2 0/3] Here is a reroll
> [PATCH v2 1/3] Clean up
> [PATCH v2 2/3] New tests
> [PATCH v2 3/3] Implementation
> 
> What am I missing?

I'm not sure...  I suspect you're either not using --in-reply-to=
or misreading the manpage, somehow.


Re: [PATCH v7 02/14] csum-file: refactor finalize_hashfile() method

2018-04-07 Thread Jakub Narebski
Derrick Stolee  writes:

> From: Derrick Stolee 
>
> If we want to use a hashfile on the temporary file for a lockfile, then
> we need finalize_hashfile() to fully write the trailing hash but also keep
> the file descriptor open.
>
> Do this by adding a new CSUM_HASH_IN_STREAM flag along with a functional
> change that checks this flag before writing the checksum to the stream.
> This differs from previous behavior since it would be written if either
> CSUM_CLOSE or CSUM_FSYNC is provided.

I'm sorry, but I don't understand from this description what this flag
does and what it is meant to do.

> Signed-off-by: Derrick Stolee 
> ---

[...]
> diff --git a/csum-file.h b/csum-file.h
> index 9ba87f0a6c..c5a2e335e7 100644
> --- a/csum-file.h
> +++ b/csum-file.h
> @@ -27,8 +27,9 @@ extern void hashfile_checkpoint(struct hashfile *, struct 
> hashfile_checkpoint *)
>  extern int hashfile_truncate(struct hashfile *, struct hashfile_checkpoint 
> *);
>  
>  /* finalize_hashfile flags */
> -#define CSUM_CLOSE   1
> -#define CSUM_FSYNC   2
> +#define CSUM_CLOSE   1
> +#define CSUM_FSYNC   2
> +#define CSUM_HASH_IN_STREAM  4

Especially that it is not commented / described here, and the name is
unsufficiently descriptive for me.

[...]
> diff --git a/csum-file.c b/csum-file.c
> index e6c95a6915..53ce37f7ca 100644
> --- a/csum-file.c
> +++ b/csum-file.c
> @@ -61,11 +61,11 @@ int finalize_hashfile(struct hashfile *f, unsigned char 
> *result, unsigned int fl
>   the_hash_algo->final_fn(f->buffer, >ctx);
>   if (result)
>   hashcpy(result, f->buffer);
> - if (flags & (CSUM_CLOSE | CSUM_FSYNC)) {
> - /* write checksum and close fd */
> + if (flags & CSUM_HASH_IN_STREAM)
>   flush(f, f->buffer, the_hash_algo->rawsz);

Wouldn't CSUM_FLUSH be a better name for this flag?

> - if (flags & CSUM_FSYNC)
> - fsync_or_die(f->fd, f->name);
> + if (flags & CSUM_FSYNC)
> + fsync_or_die(f->fd, f->name);
> + if (flags & CSUM_CLOSE) {
>   if (close(f->fd))
>   die_errno("%s: sha1 file error on close", f->name);
>   fd = 0;

--
Jakub Narębski


Re: [PATCH v4 00/13] Serialized Git Commit Graph

2018-04-07 Thread Jakub Narebski
Derrick Stolee  writes:

> On 4/2/2018 10:46 AM, Jakub Narebski wrote:
>> Derrick Stolee  writes:
> [...]
>> I see the FELINE-index as a stronger form of generation numbers (called
>> also level of the vertex / node), in that it allows to negative-cut even
>> more, pruning paths that are known to be unreachable (or marking nodes
>> known to be unreachable in the "calculate difference" scenario).
>>
>> Also, FELINE-index uses two integer numbers (coordinates in 2d space);
>> one of those indices can be the topological numbering (topological
>> sorting order) of nodes in the commit graph.  That would help to answer
>> even more Git questions.
>
> This two-dimensional generation number is helpful for non-reachability
> queries, but is something that needs the "full" commit graph in order
> to define the value for a single commit (hence the O(N lg N)
> performance mentioned below). Generation numbers are effective while
> being easy to compute and immutable.

O(N log N) is the cost of sort algorithm, namely peforming topological
sorting of commits.  Which Git can currently do.

We can use the main idea behind FELINE index, namely weak dominance
drawing, while perhaps changing the detail.  The main idea is that if
you draw edges of DAG always to be in selected quadrant -- the default
is drawing them up and to the right, then paths would also always be in
the same quadrant; if target node is not in given quadrant with respect
to source node, then target node cannot be reached from source node.

For fast answering of reachability queries it is important to have
minimal number of false positives (falsely implied paths), that is
situation where FELINE index does imply that there could be a path, but
in fact target is not reachable from the source.  The authors propose a
heuristics: use positions in [some] topological order for X coordinates
of FELINE index, and generate new optimal locally topological sorting to
use for Y coordinates.


Generation numbers (which can be used together with FELINE index, and
what paper does use -- calling them Level Filter) have the advantage of
being able to be calculated incrementally. I think this is what you
wanted to say.

I think it would be possible to calculate FELINE index incrementally,
too.  If Git's topological order is stable, at least for X coordinates
of FELINE index it would be easy.


I have started experimenting with this approach in Jupyter Notebook,
available on Google Colaboratory (you can examine it, run it and edit it
from web browser -- the default is to use remote runtime on Google
cloud).  Currently I am at the stage of reproducing the theoretical
parts of the FELINE paper.

  https://colab.research.google.com/drive/1V-U7_slu5Z3s5iEEMFKhLXtaxSu5xyzg
  
https://drive.google.com/file/d/1V-U7_slu5Z3s5iEEMFKhLXtaxSu5xyzg/view?usp=sharing

> I wonder if FELINE was compared directly to a one-dimensional index (I
> apologize that I have not read the paper in detail, so I don't
> understand the indexes they compare against).

They have compared FELINE using real graphs (like ArXiv, Pubmed,
CiteseerX, Uniprot150m) and synthetic DAG against:
 - GRAIL (Graph Reachability Indexing via RAndomized Interval Labeling)
 - FERRARI (Flexible and Efficient Reachability Range Assignment for
   gRapth Indexing), with interval set compression to approximate ones
 - Nuutila's INTERVAL, which extracts complete transitive closure of
   the graph and stores it using PWAH bit-vector compression
 - TF-Label (Topological Folding LABELling)

>It also appears the
> graphs they use for their tests are not commit graphs, which have a
> different shape than many of the digraphs studies by that work.

I plan to check how FELINE index works for commit graphs in
above-mentioned Jupyter Notebook.

> This is all to say: I would love to see an interesting study in this
> direction, specifically comparing this series' definition of
> generation numbers to the 2-dimensional system in FELINE, and on a
> large sample of commit graphs available in open-source data sets
> (Linux kernel, Git, etc.) and possibly on interesting closed-source
> data sets.

I wonder if there are more recent works, with even faster solutions.

>>> The case for "Can X reach Y?" is mostly for commands like 'git branch
>>> --contains', when 'git fetch' checks for forced-updates of branches,
>>> or when the server decides enough negotiation has occurred during a
>>> 'git fetch'. While these may be worth investigating, they also benefit
>>> greatly from the accelerated graph walk introduced in the current
>>> format.
>>>
>>> I would be happy to review any effort to extend the commit-graph
>>> format to include such indexes, as long as the performance benefits
>>> outweigh the complexity to create them.
>>>
>>> [2] 
>>> http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.719.8396=rep1=pdf

I wonder if next low-hanging branch would be to store 

Hello friend, did your received my email?.

2018-04-07 Thread Gonzalez
my message...


Re: Git Merge contributor summit notes

2018-04-07 Thread Jakub Narebski
Brandon Williams  writes:
> On 03/26, Jeff Hostetler wrote:

[...]
>> All of these cases could be eliminated if the type/size were available
>> in the OID.
>> 
>> Just a thought.  While we are converting to a new hash it seems like
>> this would be a good time to at least discuss it.
>
> Echoing what Stefan said.  I don't think its a good idea to embed this
> sort of data into the OID.  There are a lot of reasons but one of them
> being that would block having access to this data behind completing the
> hash transition (which could very well still be years away from
> completing).
>
> I think that a much better approach would be to create a meta-data data
> structure, much like the commit graph that stolee has been working on)
> which can store this data along side the objects (but not in the
> packfiles themselves).  It could be a stacking structure which is
> periodically coalesced and we could add in a wire feature to fetch this
> meta data from the server upon fetching objects.

Well, the type of the object is available, from what I remember, in the
bitmap file for a packfile (if one does enable creaating them).  There
are four compressed bit vectors, one for each type, with bit set to 1 on
i-th place if i-th object in packfile is of given type.

Just FYI.
--
Jakub Narębski


Witam potrzebuję Pilnej odpowiedzi, abyśmy mogli porozmawiać o ważnej sprawie

2018-04-07 Thread Sergeant Schieble Anderson




Re: [PATCH] git-svn: avoid warning on undef readline()

2018-04-07 Thread brian m. carlson
On Fri, Apr 06, 2018 at 01:15:14PM +, Ævar Arnfjörð Bjarmason wrote:
> Change code in Git.pm that sometimes calls chomp() on undef to only do
> so the value is defined.
> 
> This code has been chomping undef values ever since it was added in
> b26098fc2f ("git-svn: reduce scope of input record separator change",
> 2016-10-14), but started warning due to the introduction of "use
> warnings" to Git.pm in my f0e19cb7ce ("Git.pm: add the "use warnings"
> pragma", 2018-02-25) released with 2.17.0.
> 
> Since this function will return undef in those cases it's still
> possible that the code using it will warn if it does a chomp of its
> own, as the code added in b26098fc2f ("git-svn: reduce scope of input
> record separator change", 2016-10-14) might do, but since git-svn has
> "use warnings" already that's clearly not a codepath that's going to
> warn.

Thanks for this patch.  I ran into this earlier this week (with git svn
fetch) and had intended to send a patch, but you clearly got to it
before I did.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH v2 0/4] Lazy-load trees when reading commit-graph

2018-04-07 Thread Jakub Narebski
Derrick Stolee  writes:

[...]
> On the Linux repository, performance tests were run for the following
> command:
>
> git log --graph --oneline -1000
>
> Before: 0.92s
> After:  0.66s
> Rel %: -28.3%
>
> Adding '-- kernel/' to the command requires loading the root tree
> for every commit that is walked. There was no measureable performance
> change as a result of this patch.

In the "Git Merge contributor summit notes" [1] one can read that:

> - VSTS adds bloom filters to know which paths have changed on the commit
> - tree-same check in the bloom filter is fast; speeds up file history checks
> - if the file history is _very_ sparse, then bloom filter is useful

Could this method speed up also the second case mentioned here?  Can
anyone explain how this "path-changed bloom filter" works in VSTS?

Could we add something like this to the commit-graph file?

[1]: 
https://public-inbox.org/git/alpine.DEB.2.20.1803091557510.23109@alexmv-linux/

Best regards,
--
Jakub Narębski


Re: [PATCH 4/3] Makefile: untangle DEVELOPER and -Werror

2018-04-07 Thread Ævar Arnfjörð Bjarmason

On Sat, Apr 07 2018, Duy Nguyen wrote:

> On Sat, Apr 7, 2018 at 2:36 PM, Ævar Arnfjörð Bjarmason
>  wrote:
>> Anyway, I see you've pushed a new version with DEVOPTS. I'll submit mine
>> on top of that once your new version lands (unless you want to try to
>> integrate it yourself).
>
> Actually I think I'll just drop both EAGER_DEVELOPER and DEVOTPS.

OK. I'll submit something on top to optionally get rid of -Werror, since
that's the only bit I actually care about & have a use-case for.


Re: [PATCH 0/6] Compute and consume generation numbers

2018-04-07 Thread Jakub Narebski
Derrick Stolee  writes:

> On 4/3/2018 2:03 PM, Brandon Williams wrote:
>> On 04/03, Derrick Stolee wrote:
>>> This is the first of several "small" patches that follow the serialized
>>> Git commit graph patch (ds/commit-graph).
>>>
>>> As described in Documentation/technical/commit-graph.txt, the generation
>>> number of a commit is one more than the maximum generation number among
>>> its parents (trivially, a commit with no parents has generation number
>>> one).
[...]
>>> A more substantial refactoring of revision.c is required before making
>>> 'git log --graph' use generation numbers effectively.
>>
>> log --graph should benefit a lot more from this correct?  I know we've
>> talked a bit about negotiation and I wonder if these generation numbers
>> should be able to help out a little bit with that some day.
>
> 'log --graph' should be a HUGE speedup, when it is refactored. Since
> the topo-order can "stream" commits to the pager, it can be very
> responsive to return the graph in almost all conditions. (The case
> where generation numbers are not enough is when filters reduce the set
> of displayed commits to be very sparse, so many commits are walked
> anyway.)

I wonder if next big speedup would be to store [some] topological
ordering of commits in the commit graph... It could be done for example
in two chunks: a mapping to position in topological order, and list of
commits sorted in topological order.

Note also that FELINE index uses (or can use -- but it is supposedly the
optimal choice) position of vertex/node in topological order as one of
the two values in the pair that composes FELINE index.

> If we have generic "can X reach Y?" queries, then we can also use
> generation numbers there to great effect (by not walking commits Z
> with gen(Z) <= gen(Y)). Perhaps I should look at that "git branch
> --contains" thread for ideas.

This is something that is shown in the Google Colab [Jupyter] Notebook
I have mentioned:

  https://colab.research.google.com/drive/1V-U7_slu5Z3s5iEEMFKhLXtaxSu5xyzg
  
https://drive.google.com/file/d/1V-U7_slu5Z3s5iEEMFKhLXtaxSu5xyzg/view?usp=sharing

> For negotiation, there are some things we can do here. VSTS uses
> generation numbers as a heuristic for determining "all wants connected
> to haves" which is a condition for halting negotiation. The idea is
> very simple, and I'd be happy to discuss it on a separate thread.

Nice.  How much speedup it gives?

Best regards,
--
Jakub Narębski


Re: [PATCH 4/3] Makefile: untangle DEVELOPER and -Werror

2018-04-07 Thread Duy Nguyen
On Sat, Apr 7, 2018 at 2:36 PM, Ævar Arnfjörð Bjarmason
 wrote:
> Anyway, I see you've pushed a new version with DEVOPTS. I'll submit mine
> on top of that once your new version lands (unless you want to try to
> integrate it yourself).

Actually I think I'll just drop both EAGER_DEVELOPER and DEVOTPS.
-- 
Duy


Re: [PATCH 0/6] Compute and consume generation numbers

2018-04-07 Thread Jakub Narebski
Hello,

Derrick Stolee  writes:

> This is the first of several "small" patches that follow the serialized
> Git commit graph patch (ds/commit-graph).
>
> As described in Documentation/technical/commit-graph.txt, the generation
> number of a commit is one more than the maximum generation number among
> its parents (trivially, a commit with no parents has generation number
> one).
>
> This series makes the computation of generation numbers part of the
> commit-graph write process.
>
> Finally, generation numbers are used [...].
>
> This does not have a significant performance benefit in repositories
> of normal size, but in the Windows repository, some merge-base
> calculations improve from 3.1s to 2.9s. A modest speedup, but provides
> an actual consumer of generation numbers as a starting point.
>
> A more substantial refactoring of revision.c is required before making
> 'git log --graph' use generation numbers effectively.

I have started working on Jupyter Notebook on Google Colaboratory to
find out how much speedup we can get using generation numbers (level
negative-cut filter), FELINE index (negative-cut filter) and min-post
intervals in some spanning tree (positive-cut filter, if I understand it
correctly the base of GRAIL method) in commit graphs.

Currently I am at the stage of reproducing results in FELINE paper:
"Reachability Queries in Very Large Graphs: A Fast Refined Online Search
Approach" by Renê R. Veloso, Loïc Cerf, Wagner Meira Jr and Mohammed
J. Zaki (2014).  This paper is available in the PDF form at
https://openproceedings.org/EDBT/2014/paper_166.pdf

The Jupyter Notebook (which runs on Google cloud, but can be also run
locally) uses Python kernel, NetworkX librabry for graph manipulation,
and matplotlib (via NetworkX) for display.

Available at:
https://colab.research.google.com/drive/1V-U7_slu5Z3s5iEEMFKhLXtaxSu5xyzg
https://drive.google.com/file/d/1V-U7_slu5Z3s5iEEMFKhLXtaxSu5xyzg/view?usp=sharing

I hope that could be of help, or at least interesting
--
Jakub Narębski


[PATCH v11 2/4] ref-filter: make ref_array_item allocation more consistent

2018-04-07 Thread Harald Nordgren
From: Jeff King 

We have a helper function to allocate ref_array_item
structs, but it only takes a subset of the possible fields
in the struct as initializers. We could have it accept an
argument for _every_ field, but that becomes a pain for the
fields which some callers don't want to set initially.

Instead, let's be explicit that it takes only the minimum
required to create the ref, and that callers should then
fill in the rest themselves.
---
 ref-filter.c | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index ade97a848..c1c3cc948 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1824,15 +1824,18 @@ static const struct object_id *match_points_at(struct 
oid_array *points_at,
return NULL;
 }
 
-/* Allocate space for a new ref_array_item and copy the objectname and flag to 
it */
+/*
+ * Allocate space for a new ref_array_item and copy the name and oid to it.
+ *
+ * Callers can then fill in other struct members at their leisure.
+ */
 static struct ref_array_item *new_ref_array_item(const char *refname,
-const struct object_id *oid,
-int flag)
+const struct object_id *oid)
 {
struct ref_array_item *ref;
+
FLEX_ALLOC_STR(ref, refname, refname);
oidcpy(>objectname, oid);
-   ref->flag = flag;
 
return ref;
 }
@@ -1927,12 +1930,13 @@ static int ref_filter_handler(const char *refname, 
const struct object_id *oid,
 * to do its job and the resulting list may yet to be pruned
 * by maxcount logic.
 */
-   ref = new_ref_array_item(refname, oid, flag);
+   ref = new_ref_array_item(refname, oid);
ref->commit = commit;
+   ref->flag = flag;
+   ref->kind = kind;
 
REALLOC_ARRAY(ref_cbdata->array->items, ref_cbdata->array->nr + 1);
ref_cbdata->array->items[ref_cbdata->array->nr++] = ref;
-   ref->kind = kind;
return 0;
 }
 
@@ -2169,7 +2173,7 @@ void pretty_print_ref(const char *name, const struct 
object_id *oid,
  const struct ref_format *format)
 {
struct ref_array_item *ref_item;
-   ref_item = new_ref_array_item(name, oid, 0);
+   ref_item = new_ref_array_item(name, oid);
ref_item->kind = ref_kind_from_refname(name);
show_ref_array_item(ref_item, format);
free_array_item(ref_item);
-- 
2.14.3 (Apple Git-98)



[PATCH v11 4/4] ls-remote: create '--sort' option

2018-04-07 Thread Harald Nordgren
Create a '--sort' option for ls-remote, based on the one from
for-each-ref. This e.g. allows ref names to be sorted by version
semantics, so that v1.2 is sorted before v1.10.

Signed-off-by: Harald Nordgren 
---

Notes:
Rebasing my patch on Jeff King's refatoring patches

 Documentation/git-ls-remote.txt | 15 ++-
 builtin/ls-remote.c | 22 --
 t/t5512-ls-remote.sh| 41 -
 3 files changed, 74 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-ls-remote.txt b/Documentation/git-ls-remote.txt
index 5f2628c8f..fa4505fd7 100644
--- a/Documentation/git-ls-remote.txt
+++ b/Documentation/git-ls-remote.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 
 [verse]
 'git ls-remote' [--heads] [--tags] [--refs] [--upload-pack=]
- [-q | --quiet] [--exit-code] [--get-url]
+ [-q | --quiet] [--exit-code] [--get-url] [--sort=]
  [--symref] [ [...]]
 
 DESCRIPTION
@@ -60,6 +60,19 @@ OPTIONS
upload-pack only shows the symref HEAD, so it will be the only
one shown by ls-remote.
 
+--sort=::
+   Sort based on the key given.  Prefix `-` to sort in
+   descending order of the value. You may use the --sort= option
+   multiple times, in which case the last key becomes the primary
+   key. Also supports "version:refname" or "v:refname" (tag
+   names are treated as versions). The "version:refname" sort
+   order can also be affected by the "versionsort.suffix"
+   configuration variable.
+   The keys supported are the same as those in `git for-each-ref`,
+   except that because `ls-remote` deals only with remotes, keys like
+   `committerdate` that require access to the objects themselves will
+   not work.
+
 ::
The "remote" repository to query.  This parameter can be
either a URL or the name of a remote (see the GIT URLS and
diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index 540d56429..4375e8322 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -1,6 +1,7 @@
 #include "builtin.h"
 #include "cache.h"
 #include "transport.h"
+#include "ref-filter.h"
 #include "remote.h"
 
 static const char * const ls_remote_usage[] = {
@@ -43,10 +44,13 @@ int cmd_ls_remote(int argc, const char **argv, const char 
*prefix)
int show_symref_target = 0;
const char *uploadpack = NULL;
const char **pattern = NULL;
+   int i;
 
struct remote *remote;
struct transport *transport;
const struct ref *ref;
+   struct ref_array array;
+   static struct ref_sorting *sorting = NULL, **sorting_tail = 
 
struct option options[] = {
OPT__QUIET(, N_("do not print remote URL")),
@@ -60,6 +64,8 @@ int cmd_ls_remote(int argc, const char **argv, const char 
*prefix)
OPT_BIT(0, "refs", , N_("do not show peeled tags"), 
REF_NORMAL),
OPT_BOOL(0, "get-url", _url,
 N_("take url..insteadOf into account")),
+   OPT_CALLBACK(0 , "sort", sorting_tail, N_("key"),
+N_("field name to sort on"), 
_opt_ref_sorting),
OPT_SET_INT_F(0, "exit-code", ,
  N_("exit with exit code 2 if no matching refs are 
found"),
  2, PARSE_OPT_NOCOMPLETE),
@@ -68,6 +74,8 @@ int cmd_ls_remote(int argc, const char **argv, const char 
*prefix)
OPT_END()
};
 
+   memset(, 0, sizeof(array));
+
argc = parse_options(argc, argv, prefix, options, ls_remote_usage,
 PARSE_OPT_STOP_AT_NON_OPTION);
dest = argv[0];
@@ -104,13 +112,23 @@ int cmd_ls_remote(int argc, const char **argv, const char 
*prefix)
if (!dest && !quiet)
fprintf(stderr, "From %s\n", *remote->url);
for ( ; ref; ref = ref->next) {
+   struct ref_array_item *item;
if (!check_ref_type(ref, flags))
continue;
if (!tail_match(pattern, ref->name))
continue;
+   item = ref_array_push(, ref->name, >old_oid);
+   item->symref = xstrdup_or_null(ref->symref);
+   }
+
+   if (sorting)
+   ref_array_sort(sorting, );
+
+   for (i = 0; i < array.nr; i++) {
+   const struct ref_array_item *ref = array.items[i];
if (show_symref_target && ref->symref)
-   printf("ref: %s\t%s\n", ref->symref, ref->name);
-   printf("%s\t%s\n", oid_to_hex(>old_oid), ref->name);
+   printf("ref: %s\t%s\n", ref->symref, ref->refname);
+   printf("%s\t%s\n", oid_to_hex(>objectname), ref->refname);
status = 0; /* we found something */
}
return status;
diff --git a/t/t5512-ls-remote.sh b/t/t5512-ls-remote.sh
index 

[PATCH v11 3/4] ref-filter: factor ref_array pushing into its own function

2018-04-07 Thread Harald Nordgren
From: Jeff King 

In preparation for callers constructing their own ref_array
structs, let's move our own internal push operation into its
own function.

While we're at it, we can replace REALLOC_ARRAY() with
ALLOC_GROW(), which should give the growth operation
amortized linear complexity (as opposed to growing by one,
which is potentially quadratic, though in-place realloc
growth often makes this faster in practice).
---
 ref-filter.c | 16 +---
 ref-filter.h |  8 
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index c1c3cc948..6e9328b27 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1840,6 +1840,18 @@ static struct ref_array_item *new_ref_array_item(const 
char *refname,
return ref;
 }
 
+struct ref_array_item *ref_array_push(struct ref_array *array,
+ const char *refname,
+ const struct object_id *oid)
+{
+   struct ref_array_item *ref = new_ref_array_item(refname, oid);
+
+   ALLOC_GROW(array->items, array->nr + 1, array->alloc);
+   array->items[array->nr++] = ref;
+
+   return ref;
+}
+
 static int ref_kind_from_refname(const char *refname)
 {
unsigned int i;
@@ -1930,13 +1942,11 @@ static int ref_filter_handler(const char *refname, 
const struct object_id *oid,
 * to do its job and the resulting list may yet to be pruned
 * by maxcount logic.
 */
-   ref = new_ref_array_item(refname, oid);
+   ref = ref_array_push(ref_cbdata->array, refname, oid);
ref->commit = commit;
ref->flag = flag;
ref->kind = kind;
 
-   REALLOC_ARRAY(ref_cbdata->array->items, ref_cbdata->array->nr + 1);
-   ref_cbdata->array->items[ref_cbdata->array->nr++] = ref;
return 0;
 }
 
diff --git a/ref-filter.h b/ref-filter.h
index 68268f9eb..76cf87cb6 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -135,4 +135,12 @@ void setup_ref_filter_porcelain_msg(void);
 void pretty_print_ref(const char *name, const struct object_id *oid,
  const struct ref_format *format);
 
+/*
+ * Push a single ref onto the array; this can be used to construct your own
+ * ref_array without using filter_refs().
+ */
+struct ref_array_item *ref_array_push(struct ref_array *array,
+ const char *refname,
+ const struct object_id *oid);
+
 #endif /*  REF_FILTER_H  */
-- 
2.14.3 (Apple Git-98)



[PATCH v11 1/4] ref-filter: use "struct object_id" consistently

2018-04-07 Thread Harald Nordgren
From: Jeff King 

Internally we store a "struct object_id", and all of our
callers have one to pass us. But we insist that they peel it
to its bare-sha1 hash, which we then hashcpy() into place.
Let's pass it around as an object_id, which future-proofs us
for a post-sha1 world.
---
 builtin/tag.c|  2 +-
 builtin/verify-tag.c |  2 +-
 ref-filter.c | 10 +-
 ref-filter.h |  2 +-
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/builtin/tag.c b/builtin/tag.c
index da186691e..42278f516 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -117,7 +117,7 @@ static int verify_tag(const char *name, const char *ref,
return -1;
 
if (format->format)
-   pretty_print_ref(name, oid->hash, format);
+   pretty_print_ref(name, oid, format);
 
return 0;
 }
diff --git a/builtin/verify-tag.c b/builtin/verify-tag.c
index ad7b79fa5..6fa04b751 100644
--- a/builtin/verify-tag.c
+++ b/builtin/verify-tag.c
@@ -72,7 +72,7 @@ int cmd_verify_tag(int argc, const char **argv, const char 
*prefix)
}
 
if (format.format)
-   pretty_print_ref(name, oid.hash, );
+   pretty_print_ref(name, , );
}
return had_error;
 }
diff --git a/ref-filter.c b/ref-filter.c
index 45fc56216..ade97a848 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1826,12 +1826,12 @@ static const struct object_id *match_points_at(struct 
oid_array *points_at,
 
 /* Allocate space for a new ref_array_item and copy the objectname and flag to 
it */
 static struct ref_array_item *new_ref_array_item(const char *refname,
-const unsigned char 
*objectname,
+const struct object_id *oid,
 int flag)
 {
struct ref_array_item *ref;
FLEX_ALLOC_STR(ref, refname, refname);
-   hashcpy(ref->objectname.hash, objectname);
+   oidcpy(>objectname, oid);
ref->flag = flag;
 
return ref;
@@ -1927,7 +1927,7 @@ static int ref_filter_handler(const char *refname, const 
struct object_id *oid,
 * to do its job and the resulting list may yet to be pruned
 * by maxcount logic.
 */
-   ref = new_ref_array_item(refname, oid->hash, flag);
+   ref = new_ref_array_item(refname, oid, flag);
ref->commit = commit;
 
REALLOC_ARRAY(ref_cbdata->array->items, ref_cbdata->array->nr + 1);
@@ -2165,11 +2165,11 @@ void show_ref_array_item(struct ref_array_item *info,
putchar('\n');
 }
 
-void pretty_print_ref(const char *name, const unsigned char *sha1,
+void pretty_print_ref(const char *name, const struct object_id *oid,
  const struct ref_format *format)
 {
struct ref_array_item *ref_item;
-   ref_item = new_ref_array_item(name, sha1, 0);
+   ref_item = new_ref_array_item(name, oid, 0);
ref_item->kind = ref_kind_from_refname(name);
show_ref_array_item(ref_item, format);
free_array_item(ref_item);
diff --git a/ref-filter.h b/ref-filter.h
index 0d98342b3..68268f9eb 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -132,7 +132,7 @@ void setup_ref_filter_porcelain_msg(void);
  * Print a single ref, outside of any ref-filter. Note that the
  * name must be a fully qualified refname.
  */
-void pretty_print_ref(const char *name, const unsigned char *sha1,
+void pretty_print_ref(const char *name, const struct object_id *oid,
  const struct ref_format *format);
 
 #endif /*  REF_FILTER_H  */
-- 
2.14.3 (Apple Git-98)



Re: [PATCH 3/3] ref-filter: factor ref_array pushing into its own function

2018-04-07 Thread Harald Nordgren
Looks good to me.

Reviewed-by: Harald Nordgren 

On Fri, Apr 6, 2018 at 9:27 PM, Derrick Stolee  wrote:
> On 4/6/2018 2:59 PM, Jeff King wrote:
>>
>> In preparation for callers constructing their own ref_array
>> structs, let's move our own internal push operation into its
>> own function.
>>
>> While we're at it, we can replace REALLOC_ARRAY() with
>> ALLOC_GROW(), which should give the growth operation
>> amortized linear complexity (as opposed to growing by one,
>> which is potentially quadratic, though in-place realloc
>> growth often makes this faster in practice).
>>
>> Signed-off-by: Jeff King 
>> ---
>>   ref-filter.c | 16 +---
>>   ref-filter.h |  8 
>>   2 files changed, 21 insertions(+), 3 deletions(-)
>>
>> diff --git a/ref-filter.c b/ref-filter.c
>> index c1c3cc9480..6e9328b274 100644
>> --- a/ref-filter.c
>> +++ b/ref-filter.c
>> @@ -1840,6 +1840,18 @@ static struct ref_array_item
>> *new_ref_array_item(const char *refname,
>> return ref;
>>   }
>>   +struct ref_array_item *ref_array_push(struct ref_array *array,
>> + const char *refname,
>> + const struct object_id *oid)
>> +{
>> +   struct ref_array_item *ref = new_ref_array_item(refname, oid);
>> +
>> +   ALLOC_GROW(array->items, array->nr + 1, array->alloc);
>> +   array->items[array->nr++] = ref;
>> +
>> +   return ref;
>> +}
>> +
>>   static int ref_kind_from_refname(const char *refname)
>>   {
>> unsigned int i;
>> @@ -1930,13 +1942,11 @@ static int ref_filter_handler(const char *refname,
>> const struct object_id *oid,
>>  * to do its job and the resulting list may yet to be pruned
>>  * by maxcount logic.
>>  */
>> -   ref = new_ref_array_item(refname, oid);
>> +   ref = ref_array_push(ref_cbdata->array, refname, oid);
>> ref->commit = commit;
>> ref->flag = flag;
>> ref->kind = kind;
>>   - REALLOC_ARRAY(ref_cbdata->array->items, ref_cbdata->array->nr +
>> 1);
>> -   ref_cbdata->array->items[ref_cbdata->array->nr++] = ref;
>> return 0;
>>   }
>>   diff --git a/ref-filter.h b/ref-filter.h
>> index 68268f9ebc..76cf87cb6c 100644
>> --- a/ref-filter.h
>> +++ b/ref-filter.h
>> @@ -135,4 +135,12 @@ void setup_ref_filter_porcelain_msg(void);
>>   void pretty_print_ref(const char *name, const struct object_id *oid,
>>   const struct ref_format *format);
>>   +/*
>> + * Push a single ref onto the array; this can be used to construct your
>> own
>> + * ref_array without using filter_refs().
>> + */
>> +struct ref_array_item *ref_array_push(struct ref_array *array,
>> + const char *refname,
>> + const struct object_id *oid);
>> +
>>   #endif /*  REF_FILTER_H  */
>
>
> The three patches in this series look good to me.
>
> Reviewed-by: Derrick Stolee 


Re: [PATCH 4/3] Makefile: untangle DEVELOPER and -Werror

2018-04-07 Thread Ævar Arnfjörð Bjarmason

On Sat, Apr 07 2018, Duy Nguyen wrote:

> On Fri, Apr 6, 2018 at 11:42 PM, Jeff King  wrote:
>> On Tue, Apr 03, 2018 at 05:17:00PM +0200, Duy Nguyen wrote:
>>
>>> It's not that complex. With the EAGER_DEVELOPER patch removed, we can
>>> have something like this where eager devs just need to put
>>>
>>> DEVOPTS = gentle no-suppression
>>>
>>> and you put
>>>
>>> DEVOPTS = gentle
>>>
>>> (bad naming, I didn't spend time thinking about names)
>>
>> It seems to me like we're losing the point of DEVELOPER here. I thought
>> the idea was to have a turn-key flag you could set to get extra linting
>> on your commits. But now we're tweaking all kinds of individual options.
>> At some point are we better off just letting you put "-Wno-foo" in your
>> CFLAGS yourself?
>
> It's because what AEvar wanted is no longer a dev thing :)

It's still a dev thing, but just for a slightly different use-case. I
hack my patches up with DEVELOPER=1, and then eventually deploy a
patched git version with those (and others) on top of the latest release
tag.

At that point I want the same compiler assertions, but might be building
on much older compilers, so I just want the warning output without
-Werror.

Anyway, I see you've pushed a new version with DEVOPTS. I'll submit mine
on top of that once your new version lands (unless you want to try to
integrate it yourself).


business Proposal / Geschäftsvorschlag

2018-04-07 Thread Anders Karlsson
I have a business Proposal for you, contact me directly 
This business has a cash involvement of $250,000,000.00

Anders Karlsson

Ich habe einen Geschäftsvorschlag für Sie, kontaktieren Sie mich direkt

Dieses Unternehmen hat eine Beteiligung von $ 250.000.000,00

- [] Anders Karlsson


Re: [PATCH] send-email: fix docs regarding storing password with git credential

2018-04-07 Thread Michał Nazarewicz
2018-04-04 22:14 GMT+01:00 Jeff King :
> On the other hand, I'm not sure why we need to pre-seed here. Wouldn't
> it be sufficient to just issue a "git send-email", which would then
> prompt for the password? And then you'd input your generated token,
> which would get saved via the approve mechanism?

Yeah, this is precisely what I’ve figured as well.  As long as the credentials
helper is configured git-send-email will approve the password and it’ll be
stored then. New patch sent.


[PATCH] send-email: simplify Gmail example in the documentation

2018-04-07 Thread Michal Nazarewicz
There is no need for use to manually call ‘git credential’ especially
as the interface isn’t super user-friendly and a bit confusing.  ‘git
send-email’ will do that for them at the first execution and if the
password matches, it will be saved in the store.

Simplify the documentaion so it dosn’t include the ‘git credential’
invocation (which was incorrect anyway as it should use ‘approve’
instead of ‘fill’) and instead just mentions that credentials helper
must be set up.

Signed-off-by: Michał Nazarewicz 
---
 Documentation/git-send-email.txt | 16 ++--
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/Documentation/git-send-email.txt b/Documentation/git-send-email.txt
index 71ef97ba9..af07840b4 100644
--- a/Documentation/git-send-email.txt
+++ b/Documentation/git-send-email.txt
@@ -473,16 +473,7 @@ edit ~/.gitconfig to specify your account settings:

 If you have multifactor authentication setup on your gmail account, you will
 need to generate an app-specific password for use with 'git send-email'. Visit
-https://security.google.com/settings/security/apppasswords to setup an
-app-specific password.  Once setup, you can store it with the credentials
-helper:
-
-   $ git credential fill
-   protocol=smtp
-   host=smtp.gmail.com
-   username=youn...@gmail.com
-   password=app-password
-
+https://security.google.com/settings/security/apppasswords to create it.

 Once your commits are ready to be sent to the mailing list, run the
 following commands:
@@ -491,7 +482,11 @@ following commands:
$ edit outgoing/-*
$ git send-email outgoing/*

+The first time you run it, you will be prompted for your credentials.  Enter 
the
+app-specific or your regular password as appropriate.  If you have credential
+helper configured (see linkgit:git-credential[1]), the password will be saved 
in
+the credential store so you won't have to type it the next time.
+
 Note: the following perl modules are required
   Net::SMTP::SSL, MIME::Base64 and Authen::SASL

-- 
2.16.2


Re: [PATCH 4/3] Makefile: untangle DEVELOPER and -Werror

2018-04-07 Thread Duy Nguyen
On Fri, Apr 6, 2018 at 11:42 PM, Jeff King  wrote:
> On Tue, Apr 03, 2018 at 05:17:00PM +0200, Duy Nguyen wrote:
>
>> It's not that complex. With the EAGER_DEVELOPER patch removed, we can
>> have something like this where eager devs just need to put
>>
>> DEVOPTS = gentle no-suppression
>>
>> and you put
>>
>> DEVOPTS = gentle
>>
>> (bad naming, I didn't spend time thinking about names)
>
> It seems to me like we're losing the point of DEVELOPER here. I thought
> the idea was to have a turn-key flag you could set to get extra linting
> on your commits. But now we're tweaking all kinds of individual options.
> At some point are we better off just letting you put "-Wno-foo" in your
> CFLAGS yourself?

It's because what AEvar wanted is no longer a dev thing :)

> I don't mind the version-based checks because they're automatic, so the
> feature remains turn-key. But this kind of DEVOPTS seems like a step in
> the wrong direction.
>
> -Peff



-- 
Duy


Re: [RFC PATCH 00/19] object-store refactoring 3 (replace objects, main ref store)

2018-04-07 Thread Duy Nguyen
On Sat, Apr 7, 2018 at 1:21 AM, Stefan Beller  wrote: *
> diff --git a/repository.h b/repository.h
> index 09df94a472..2922d3a28b 100644
> --- a/repository.h
> +++ b/repository.h
> @@ -26,6 +26,11 @@ struct repository {
>  */
> struct raw_object_store *objects;
>
> +   /*
> +* The store in which the refs are hold.
> +*/
> +   struct ref_store *main_ref_store;
> +

We probably should drop the main_ prefix here because this could also
be a submodule ref store (when the repository is about a submodule).
worktree ref store is a different story and I don't know how to best
present it here yet (I'm still thinking a separate struct repository).
But we can worry about that when struct repository supports multiple
worktree.
-- 
Duy


wir bieten 2% Kredite

2018-04-07 Thread Ronald Bernstein
Sehr geehrte Damen und Herren,

Sie brauchen Geld? Sie sind auf der suche nach einem Darlehnen? Seriös und
unkompliziert?
Dann sind Sie hier bei uns genau richtig.
Durch unsere jahrelange Erfahrung und kompetente Beratung sind wir
Europaweit tätig.

Wir bieten jedem ein GÜNSTIGES Darlehnen zu TOP Konditionen an.
Darlehnen zwischen 5000 CHF/Euro bis zu 20 Millionen CHF/Euro möglich.
Wir erheben dazu 2% Zinssatz.

Lassen Sie sich von unserem kompetenten Team beraten.

Zögern Sie nicht und kontaktieren Sie mich unter für weitere Infos &
Anfragen unter der eingeblendeten Email Adresse.


Ich freue mich von Ihnen zu hören.


OK

2018-04-07 Thread AHMED ZAMA
Dear  Friend,



Please can both of us handle a lucrative deal.?? I will give you the
full detail explanation as soon as I hear from you.



Faithfully yours,
Mr Ahmed Zama


Re: [RFC][PATCH] git-stash: convert git stash list to C builtin

2018-04-07 Thread Eric Sunshine
On Sat, Apr 7, 2018 at 4:56 AM, Eric Sunshine  wrote:
> The existing git-stash requires a working directory:
>
> % git stash list
> fatal: not a git repository (or any of the parent directories): .git
> %

Correction on the error message:

% git stash list
fatal: git-stash cannot be used without a working tree.
%


Re: [RFC][PATCH] git-stash: convert git stash list to C builtin

2018-04-07 Thread Eric Sunshine
On Tue, Apr 3, 2018 at 5:38 PM, Paul-Sebastian Ungureanu
 wrote:
> On 25.03.2018 10:08, Eric Sunshine wrote:
>> On Sat, Mar 24, 2018 at 2:23 PM, Paul-Sebastian Ungureanu
>>  wrote:
>>> diff --git a/git.c b/git.c
>>> @@ -466,6 +466,7 @@ static struct cmd_struct commands[] = {
>>>  { "stage", cmd_add, RUN_SETUP | NEED_WORK_TREE },
>>> +   { "stash--helper", cmd_stash__helper, RUN_SETUP },
>>
>> You don't require a working tree? Seems odd for git-stash.
>
> For now, I do not think that it is necessary (for stash list), but I am
> pretty sure that it will be required in the future when porting other
> commands.

The existing git-stash requires a working directory:

% git stash list
fatal: not a git repository (or any of the parent directories): .git
%

so it would make sense for the C port to follow suit.

More generally, a stash is a temporary storage area for changes to the
_work tree_ (and index). Without a work tree, you wouldn't be able to
create any stashes in the first place, so "git stash list" without a
work tree would be meaningless.


Re: [PATCH v5 1/3] builtin/config: introduce `--default`

2018-04-07 Thread Eric Sunshine
On Fri, Apr 6, 2018 at 8:58 PM, Taylor Blau  wrote:
> On Fri, Apr 06, 2018 at 03:40:56AM -0400, Eric Sunshine wrote:
>> On Fri, Apr 6, 2018 at 2:53 AM, Eric Sunshine  
>> wrote:
>> One other issue. If "git config --default ..." fails, the --unset line
>> will never be invoked, thus cleanup won't happen.
>>
>> To ensure cleanup, either use:
>>
>> test_when_finished "git config --unset core.foo" &&
>>
>> earlier in the test (just after the --add line) or use the
>> test_config() function to both set the value and ensure its cleanup.
>
> Neat. I wasn't aware of 'test_when_finished'. I think that I prefer the
> explicitness of 'test_when_finished "git config ..."' over
> 'test_config()', but I am happy to use whichever is preferred. Since
> t1310 is new, there's no historical precedent to follow. Please let me
> know if you have a preference one way or another.

test_config() is preferred; not only is it shorter but, because it
automates cleanup, there's less opportunity to screw up (like
forgetting to use test_when_finished()).

However, since this config setting needs to be present for only a
single command, you can go even simpler by collapsing the entire test
to:

test_expect_success 'does not use --default when entry present' '
echo bar >expect &&
git -c core.foo=bar config --default baz core.foo >actual &&
test_cmp expect actual
'


Re: [PATCH v5 1/3] builtin/config: introduce `--default`

2018-04-07 Thread Eric Sunshine
On Fri, Apr 6, 2018 at 8:49 PM, Taylor Blau  wrote:
> On Fri, Apr 06, 2018 at 02:53:45AM -0400, Eric Sunshine wrote:
>> On Fri, Apr 6, 2018 at 2:30 AM, Taylor Blau  wrote:
>> > +test_expect_success 'uses --default when missing entry' '
>> > +   echo quux >expect &&
>> > +   git config -f config --default quux core.foo >actual &&
>> > +   test_cmp expect actual
>> > +'
>> >
>> > +test_expect_success 'uses entry when available' '
>> > +   echo bar >expect &&
>> > +   git config --add core.foo bar &&
>> > +   git config --default baz core.foo >actual &&
>> > +   git config --unset core.foo &&
>> > +   test_cmp expect actual
>> > +'
>>
>> If you happen to re-roll, can we move this test so it immediately
>> follows the "uses --default when missing entry" test? That's where I
>> had expected to find it and had even written a review comment saying
>> so (but deleted the comment once I got down to this spot in the
>> patch). Also, perhaps rename the test to make it clearer that it
>> complements the "uses --default when missing entry" test; perhaps
>> "does not fallback to --default when entry present" or something.
>
> That location makes much more sense, as does using --default=yes to
> ensure that canonicalization is taking place. I've updated that locally,
> as well as the preceding test to make it clearer that they are
> contrasting tests:
>
> - 'falls back to --default when missing entry'
> - 'does not fallback to --default when entry present'
>
> Though I am not sure about "falls back" to mean "uses --default". I am
> not sure which to pick here... what are your thoughts?

It's nice for the titles to show that the two tests complement one
another but the exact wording is not terribly important. Taking your
original test title (slightly modified), perhaps:

- 'uses --default when entry missing'
- 'does not use --default when entry present'


Re: [PATCH v6 2/2] builtin/config.c: support `--type=` as preferred alias for `--type`

2018-04-07 Thread Eric Sunshine
On Fri, Apr 6, 2018 at 8:39 PM, Taylor Blau  wrote:
> On Fri, Apr 06, 2018 at 03:04:53AM -0400, Eric Sunshine wrote:
>> Sorry for being such a stickler, but this is still too mushy. The
>> first two sentences are saying effectively the same thing. One or the
>> other should be dropped or they should somehow be combined in a way
>> that says everything that needs to be said without repetition.
>
> How do you feel about?
>
>   The `--type=` option instructs 'git config' to ensure that
>   incoming and outgoing values are canonicalize-able under the given
>   .  If no `--type=` is given, no canonicalization will be
>   performed. Callers may unset the existing `--type` specifier with
>   `--no-type`.

This sounds fine. (Maybe s/the existing/an existing/)

> I think documenting that `--no-type` unsets any pre-existing `--type`
> specifier is worthwhile. That said, I also think that there's a way to
> combine the last two sentences, but it might be clearer as two smaller
> pieces rather than one big one.

Smaller, simpler, more easily digested sentences are a win.

>> If necessary, iterate over updates of this paragraph here in the email
>> thread until a polished version is reached rather than re-rolling the
>> entire series every few minutes.
>
> Thanks, and will do :-). I am quite new to this and was unaware of the
> situations when re-rolling is and isn't desirable. I am going to wait to
> re-roll this series until it has gathered more feedback, or at least
> consensus,

Just as it's a burden for you to repeatedly re-roll, it's also a
burden on reviewers to repeatedly re-read the series. Ideally, we'd
like fewer, rather than more, re-rolls, so it's good to nail down
questionable or ambiguous issues via normal email discussion before
going for a re-roll, and some reviewers try to assist with deciding if
a re-roll is needed by saying whether or not a review comment warrants
one.

Allowing time for others to review and possibly comment on a series
before re-rolling is indeed a good idea.

Another useful tactic is to include, in the cover letter, an interdiff
between the previous and new versions, which gives reviewers a way to
quickly examine the changed parts of the series without necessarily
having to re-review each patch in entirety (an often time-consuming
task).

> after which I think it will be ready for queueing as-is.

Famous last words ;-)


Re: Is support for 10.8 dropped?

2018-04-07 Thread Eric Sunshine
On Fri, Apr 6, 2018 at 10:20 PM, Igor Korot  wrote:
>>> dyld: lazy symbol binding failed: Symbol not found: ___strlcpy_chk
>>>   Referenced from: /usr/local/git/libexec/git-core/git
>>>   Expected in: /usr/lib/libSystem.B.dylib
>>
>> It's not clear what installer you used? Was it the package from
>> git-scm? Was it from Homebrew?
>
> I just tried the git-scm installer and got exactly the same error
> during the runtime.

Have you tried any of the suggestions at these pages for resolving this issue?

https://stackoverflow.com/questions/22920497/git-mountain-lion-dyld-lazy-symbol-binding-failed-symbol-not-found-str

https://stackoverflow.com/questions/20929689/git-commands-not-working-in-mac-terminal-dyld-symbol-not-found-strlcpy-ch

>> I would guess that, even if the git-scm installer no longer supports
>> 10.8, it is likely that Homebrew does. Have you tried it?
>
> I don't want to pollute my system with Homebrew.
>
>> If both those options fail, you can always build from source.
>
> Where do I get the soure code? And how do I build it?
> I guess I have only one option left. ;-)

Source code for the latest release is at:
https://github.com/git/git/archive/v2.17.0.zip

To build it, you'll need to have the MacOS Developer Tools installed.
It's also quite likely that you'll need to build some prerequisite
libraries; at minimum OpenSSL. You may be able to skip other libraries
if you don't care about the functionality. Build settings which you
may need to adjust to disable dependence on libraries in which you're
not interested are documented at the top of Makefile. Place overrides
for those settings in a file named config.mak in the git directory.


Re: [PATCH v2 4/4] Document the new color.* settings to colorize push errors/hints

2018-04-07 Thread Eric Sunshine
On Fri, Apr 6, 2018 at 8:15 AM, Johannes Schindelin
 wrote:
> On Fri, 6 Apr 2018, Eric Sunshine wrote:
>> On Thu, Apr 5, 2018 at 6:48 PM, Johannes Schindelin
>>  wrote:
>> > +color.advice.advice::
>> > +   Use customized color for hints.
>>
>> Is "color.advice.advice" correct?
>
> As per the patch, yes. But you're right, it sounds silly. Will change to
> `color.advice.hint`, okay?

That's more understandable. Thanks.


Re: [PATCH 13/19] refs: store the main ref store inside the repository struct

2018-04-07 Thread Eric Sunshine
On Fri, Apr 6, 2018 at 7:21 PM, Stefan Beller  wrote:
> diff --git a/repository.h b/repository.h
> @@ -26,6 +26,11 @@ struct repository {
> +   /*
> +* The store in which the refs are hold.
> +*/

s/hold/held/

Also, this comment is short enough to fit on one line: /* ... */

> +   struct ref_store *main_ref_store;


Re: [PATCH 06/19] refs: add repository argument to get_main_ref_store

2018-04-07 Thread Eric Sunshine
On Fri, Apr 6, 2018 at 7:21 PM, Stefan Beller  wrote:
> Add a repository argument to allow the get_main_ref_store caller
> to be more specific about which repository to handle. This is a small
> mechanical change; it doesn't change the implementation to handle
> repositories other than the_repository yet.
>
> As with the previous commits, use a macro to catch callers passing a
> repository other than the_repository at compile time.
>
> Signed-off-by: Jonathan Nieder 
> Signed-off-by: Stefan Beller 
>
> # Conflicts:
> #   t/helper/test-ref-store.c

Meh.


Guten Morgen,

2018-04-07 Thread noreply

Nachricht des Absenders:
/Guten Morgen,
Es tut mir sehr leid, Ihnen diese Nachricht zu senden, die Sie heute von mir
nicht erwartet haben. Bitte seien Sie nicht überrascht, meine Nachricht zu
lesen, weil Sie der beabsichtigte Empfänger sind. Ich habe Ihre
E-Mail-Adresse über ein Online-Verzeichnis erhalten, und ich habe Sie
persönlich mit Ihrer E-Mail-Adresse kontaktiert, weil ich Ihre
Unterstützung dringend benötigte. Ich heiße Jana Olav. Ich bin 19 Jahre
alt und die einzige Tochter meiner verstorbenen Eltern Herr und Frau Desire
und Caroline Olav. Mein Vater war ein Kakaohändler und er arbeitete auch mit
einem Ölbohrunternehmen hier in meinem Land, der Elfenbeinküste Westafrika.
Mit tiefem Gefühl der Trauer und großen Verlorenen möchte ich Ihnen
mitteilen, dass mein Vater bei einem seiner Geschäftstreffen von seinen
Geschäftspartnern zu Tode vergiftet wurde. Meine Mutter starb, als ich noch
ein Baby war, und das war der Grund, warum mein Vater mich bis zu seinem Tod
am 20. Januar 2016 so besonders behandelt hat. Vor dem Tod meines Vaters in
einem Privatkrankenhaus hat er mir heimlich offenbart, dass er meinen Namen
hinterlegt hat (Jana Olav) eine Summe von 1.500.000,00 Euro (eine Million,
fünfhunderttausend Euro) in Westra Wemlands Sparbank in Schweden.
Ich habe dich heute kontaktiert, weil nach dem Tod meines Vaters mein
einziger Onkel, der sehr böse und böse ist, mich gezwungen hat, mein
Erbgeld von mir abzuholen, weil ich ein kleines Mädchen bin. Ich lehnte ab
und mein Onkel schickte mich vor fünf Monaten aus unserem Familienhaus. Ich
bleibe zurzeit in einem lokalen Hotel wegen meiner Sicherheit. Bitte ich
möchte, dass Sie mir auf folgende Weise helfen.
1- Um mir zu helfen, dieses Geld auf Ihr Bankkonto zu überweisen
2- Um mir zu helfen, die Mittel in ein gutes Geschäft in Ihrem Land nach der
Übertragung zu investieren
3- Um mir zu helfen, mich an einer medizinischen Universität zu
registrieren, wenn ich in dein Land komme.
Schließlich bin ich bereit, Ihnen 20% (450.000,00 Euro) als Entschädigung
für Ihre Hilfe nach der Überweisung auf Ihr Bankkonto anzubieten. Ich werde
auf Ihre schnelle Antwort und Zusicherung warten, um mir zu helfen, damit ich
Ihnen mehr Details gebe. Entschuldigung für meine schlechte 'deutsche'
Übersetzung, ich kann gute englische Sprache schreiben.
Beste Wünsche mit meiner ganzen Liebe,
Mit freundlichen Grüßen,
Frau Jana/
Parador de Alcañiz [1]

[1] http://www.parador.es/de/paradores/parador-de-alcaniz