Re: What's cooking in git.git (Nov 2013, #05; Thu, 21)

2013-11-22 Thread Vicent Marti
On Fri, Nov 22, 2013 at 6:26 PM, Jeff King wrote: >> Granted, the way I verified this was checking whether you renamed >> rlw_xor_run_bit() to something more fitting, so perhaps you just forgot >> that one thing but did all the rest. > > I didn't touch that. Vicent, did you have a comment on the n

Re: [PATCH 09/19] documentation: add documentation for the bitmap format

2013-10-30 Thread Vicent Marti
On Wed, Oct 30, 2013 at 11:23 AM, Shawn Pearce wrote: > The name-hash cache is probably important, but it would be nice to > have a variable or flag we can use to disable the name-cache > generation and thus permit Git to create JGit style v1 indexes, and > also use JGit v1 indexes if the name-cac

Re: [PATCH 09/19] documentation: add documentation for the bitmap format

2013-10-30 Thread Vicent Marti
On Wed, Oct 30, 2013 at 11:23 AM, Shawn Pearce wrote: > On Wed, Oct 30, 2013 at 7:50 AM, Jeff King wrote: >> On Fri, Oct 25, 2013 at 01:47:06PM +, Shawn O. Pearce wrote: >> >>> I think Colby and I talked about having additional optional sections >>> in this file, but Colby didn't want to over

Re: [PATCH 11/19] pack-objects: use bitmaps when packing objects

2013-10-30 Thread Vicent Marti
On Wed, Oct 30, 2013 at 11:38 AM, Shawn Pearce wrote: >> Since (1) is relatively rare, I think we are using this as a proxy for >> (2), so that we can do a regular walk rather than looking around for >> bitmaps that don't exist. But I may be misremembering the reasoning. >> Vicent? > > Ah. I am no

Re: [PATCH 10/19] pack-bitmap: add support for bitmap indexes

2013-10-30 Thread Vicent Marti
On Wed, Oct 30, 2013 at 9:10 AM, Jeff King wrote: > > In fact, I'm not quite sure that even a partial reuse up to an offset is > 100% safe. In a newly packed git repo it is, because we always put bases > before deltas (and OFS_DELTA objects need this). But if you had a bitmap > generated from a fi

[PATCH 07/16] compat: add endianness helpers

2013-06-24 Thread Vicent Marti
The POSIX standard doesn't currently define a `ntohll`/`htonll` function pair to perform network-to-host and host-to-network swaps of 64-bit data. These 64-bit swaps are necessary for the on-disk storage of EWAH bitmaps if they are not in native byte order. --- git-compat-util.h | 28 +++
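A minimal sketch of what such 64-bit host-to-network and network-to-host conversions can look like. The names `sketch_htonll` and `sketch_ntohll` are hypothetical and this is not the code from the patch; it assumes only that "network order" means big-endian, and it goes through a byte array so the result does not depend on the host's byte order.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): return a value whose
 * in-memory bytes are the big-endian encoding of `host`.
 */
static uint64_t sketch_htonll(uint64_t host)
{
	unsigned char bytes[8];
	uint64_t net;
	int i;

	/* Most significant byte goes first in network order. */
	for (i = 0; i < 8; i++)
		bytes[i] = (unsigned char)(host >> (56 - 8 * i));
	memcpy(&net, bytes, sizeof(net));
	return net;
}

/*
 * Hypothetical helper (not from the patch): interpret the in-memory
 * bytes of `net` as a big-endian 64-bit integer.
 */
static uint64_t sketch_ntohll(uint64_t net)
{
	unsigned char bytes[8];
	uint64_t host = 0;
	int i;

	memcpy(bytes, &net, sizeof(bytes));
	for (i = 0; i < 8; i++)
		host = (host << 8) | bytes[i];
	return host;
}
```

On a big-endian host both functions return their argument unchanged; on a little-endian host they reverse the byte order. A production version would more likely use a compiler byte-swap intrinsic where one is available.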

[PATCH 13/16] repack: consider bitmaps when performing repacks

2013-06-24 Thread Vicent Marti
Since `pack-objects` will write a `.bitmap` file next to the `.pack` and `.idx` files, this commit teaches `git-repack` to consider the new bitmap indexes (if they exist) when performing repack operations. This implies moving old bitmap indexes out of the way if we are repacking a repository that

[PATCH 05/16] revision: allow setting custom limiter function

2013-06-24 Thread Vicent Marti
This commit enables users of `struct rev_info` to perform custom limiting during a revision walk (i.e. `get_revision`). If the field `include_check` has been set to a callback, this callback will be issued once for each commit before it is added to the "pending" list of the revwalk. If the include

[PATCH 08/16] ewah: compressed bitmap implementation

2013-06-24 Thread Vicent Marti
EWAH is a word-aligned compressed variant of a bitset (i.e. a data structure that acts as a 0-indexed boolean array for many entries). It uses a 64-bit run-length encoding (RLE) compression scheme, trading some compression for better processing speed. The goal of this word-aligned implementation
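To illustrate the word-aligned run-length idea in rough terms, here is a simplified sketch that only counts how many 64-bit words a bitmap would occupy after compression. It assumes a reduced scheme in which one marker word describes a run of identical all-zero or all-one words followed by a group of literal words stored verbatim. The function name is hypothetical, the limits on how much a single marker can describe are ignored, and this is not the on-disk layout from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): number of 64-bit words
 * needed under the simplified scheme described above.
 */
static size_t sketch_rle_word_count(const uint64_t *words, size_t nr)
{
	size_t out = 0, i = 0;

	assert(words != NULL || nr == 0);

	while (i < nr) {
		/* One marker word per (run, literals) group. */
		out++;

		/* A run of identical "clean" words costs nothing extra. */
		if (words[i] == 0 || words[i] == UINT64_MAX) {
			uint64_t fill = words[i];
			while (i < nr && words[i] == fill)
				i++;
		}

		/* Mixed ("dirty") words are stored as-is, one word each. */
		while (i < nr && words[i] != 0 && words[i] != UINT64_MAX) {
			out++;
			i++;
		}
	}
	return out;
}
```

Long runs of clean words collapse into a single marker, while mixed words cost one word each plus a shared marker. Because everything stays aligned to 64-bit words, operations can skip whole runs at a time, which is the speed side of the trade-off the message describes.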

[PATCH 11/16] rev-list: add bitmap mode to speed up lists

2013-06-24 Thread Vicent Marti
The bitmap reachability index used to speed up the counting objects phase during `pack-objects` can also be used to optimize a normal rev-list if the only things required are the SHA1s of the objects during the list. Calling `git rev-list --use-bitmaps [committish]` is the equivalent of `git rev-li

[PATCH 04/16] pack-objects: make `pack_name_hash` global

2013-06-24 Thread Vicent Marti
The hash function used by `builtin/pack-objects.c` to efficiently find delta bases when packing can be of interest for other parts of Git that also have to deal with delta bases. --- builtin/pack-objects.c | 24 ++-- cache.h|2 ++ sha1_file.c|

[PATCH 02/16] sha1_file: refactor into `find_pack_object_pos`

2013-06-24 Thread Vicent Marti
Looking up the offset in the packfile for a given SHA1 involves the following:
- Finding the position in the index for the given SHA1
- Accessing the offset cache in the index for the found position
There are cases, however, where we'd like to find the position of a SHA1 in the inde

[PATCH 10/16] pack-objects: use bitmaps when packing objects

2013-06-24 Thread Vicent Marti
A bitmap index is used, if available, to speed up the Counting Objects phase during `pack-objects`. The bitmap index is a `.bitmap` file that can be found inside `$GIT_DIR/objects/pack/`, next to its corresponding packfile, and contains precalculated reachability information for selected commits.

[PATCH 15/16] write-bitmap: implement new git command to write bitmaps

2013-06-24 Thread Vicent Marti
The `pack-objects` builtin is capable of writing out bitmap indexes (.bitmap) next to their corresponding packfile, as part of the process of actually generating the packfile. This is a very efficient operation because all the required data for writing the bitmap index (commit traversal list,

[PATCH 14/16] sha1_file: implement `nth_packed_object_info`

2013-06-24 Thread Vicent Marti
A new helper function makes it possible to efficiently query the size and real type of an object in a packfile based on its position in the packfile index. This is particularly useful when trying to parse all the information of an index in memory. --- cache.h |1 + sha1_file.c |6 ++ 2 files

[PATCH 06/16] sha1_file: export `git_open_noatime`

2013-06-24 Thread Vicent Marti
The `git_open_noatime` helper can be of general interest for other consumers of git's different on-disk formats. --- cache.h |1 + sha1_file.c |4 +--- 2 files changed, 2 insertions(+), 3 deletions(-) diff --git a/cache.h b/cache.h index 95ef14d..bbe5e2a 100644 --- a/cache.h +++ b/cac

[PATCH 09/16] documentation: add documentation for the bitmap format

2013-06-24 Thread Vicent Marti
This is the technical documentation and design rationale for the new Bitmap v2 on-disk format. --- Documentation/technical/bitmap-format.txt | 235 + 1 file changed, 235 insertions(+) create mode 100644 Documentation/technical/bitmap-format.txt diff --git a/Documenta

[PATCH 16/16] rev-list: Optimize --count using bitmaps too

2013-06-24 Thread Vicent Marti
If bitmap indexes are available, the process of counting reachable commits with `git rev-list --count` can be greatly sped up. Instead of having to use callbacks that yield each object in the revision list, we can build the reachable bitmap for the list and then use an efficient popcount to find th
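The counting step can be sketched as a population count over the words of the resulting bitmap. The sketch below assumes an uncompressed array of 64-bit words, whereas the bitmaps in this series are EWAH-compressed; the function names are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of set bits in one 64-bit word. */
static unsigned sketch_popcount64(uint64_t w)
{
	unsigned n = 0;

	/* Each iteration clears the lowest set bit. */
	while (w) {
		w &= w - 1;
		n++;
	}
	return n;
}

/*
 * Hypothetical helper: total number of set bits in a bitmap held as
 * `nr` plain 64-bit words. With one bit per commit, this is the
 * number of commits the bitmap marks as reachable.
 */
static size_t sketch_count_set_bits(const uint64_t *words, size_t nr)
{
	size_t total = 0, i;

	assert(words != NULL || nr == 0);

	for (i = 0; i < nr; i++)
		total += sketch_popcount64(words[i]);
	return total;
}
```

The cost is proportional to the number of words in the bitmap rather than to the number of commits visited by a walk, which is where the speedup described above would come from.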

[PATCH 12/16] pack-objects: implement bitmap writing

2013-06-24 Thread Vicent Marti
This commit further extends the functionality of `pack-objects` by allowing it to write out a `.bitmap` index next to any written packs, together with the `.idx` index that currently gets written. If bitmaps are enabled for a given repository (either by calling `pack-objects` with the `--use-bitmaps`

[PATCH 00/16] Speed up Counting Objects with bitmap data

2013-06-24 Thread Vicent Marti
able, as we've been testing it in production on the world's largest Git host (Git Hub Dot Com The Web Site) with good results, so I'd love to have it upstreamed on Core Git. Strawberry kisses, vmg Jeff King (1): list-objects: mark tree as unparsed when we free its buffer

[PATCH 01/16] list-objects: mark tree as unparsed when we free its buffer

2013-06-24 Thread Vicent Marti
From: Jeff King We free the tree buffer during traversal to save memory. However, we do not reset the "parsed" flag, which leaves a landmine for the next person to use the tree. When they call parse_tree it will do nothing, and they will segfault when they try to access the buffer. This hasn't m

Re: libgit2 status

2012-08-25 Thread Vicent Marti
On Sat, Aug 25, 2012 at 2:56 AM, Andreas Ericsson wrote: > Politically, I'm not sure how keen the git community is on handing > over control to the core stuff of git to a commercial entity, The development of libgit2 happens 100% in the open. I don't know what "commercial entity" you are talking