Re: [PATCH 09/16] documentation: add documentation for the bitmap format

2013-07-07 Thread Shawn Pearce
On Sun, Jul 7, 2013 at 2:46 AM, Jeff King wrote: > On Mon, Jul 01, 2013 at 11:47:32AM -0700, Colby Ranger wrote: > >> > But I think we are comparing >> > apples to steaks here, Vincent is (rightfully) concerned about process >> > startup performance, whereas our timings were assuming the process w

Re: [PATCH 09/16] documentation: add documentation for the bitmap format

2013-07-07 Thread Jeff King
On Mon, Jul 01, 2013 at 11:47:32AM -0700, Colby Ranger wrote: > > But I think we are comparing > > apples to steaks here, Vincent is (rightfully) concerned about process > > startup performance, whereas our timings were assuming the process was > > already running. > > > > I did some timing on lo

Re: [PATCH 09/16] documentation: add documentation for the bitmap format

2013-07-01 Thread Shawn Pearce
On Mon, Jul 1, 2013 at 11:47 AM, Colby Ranger wrote: >> But I think we are comparing >> apples to steaks here, Vincent is (rightfully) concerned about process >> startup performance, whereas our timings were assuming the process was >> already running. >> > > I did some timing on loading the rever

Re: [PATCH 09/16] documentation: add documentation for the bitmap format

2013-07-01 Thread Colby Ranger
> Right, the format and implementation in JGit can do "Counting objects" > in 87ms for the Linux kernel history. Actually, that was the timing when I first pushed the change. With the improvements submitted throughout the year, we can do counting in 50ms, on my same machine. > But I think we are

Re: [PATCH 09/16] documentation: add documentation for the bitmap format

2013-06-27 Thread Jeff King
On Thu, Jun 27, 2013 at 09:07:38AM -0700, Shawn O. Pearce wrote: > > And the pack-order versus idx-order for the bitmaps is still up in the > > air. Do we have numbers on the on-disk sizes of the resulting EWAHs? > > I did not see any presented in this thread, and I am very interested > in this a

Re: [PATCH 09/16] documentation: add documentation for the bitmap format

2013-06-27 Thread Shawn Pearce
On Wed, Jun 26, 2013 at 7:45 PM, Jeff King wrote: > > In particular, it seems like the slowness we saw with the v1 bitmap > format is not what Shawn and Colby have experienced. So it's possible > that our test setup is bad or different. Or maybe the C v1 reading > implementation had some problems

Re: [PATCH 09/16] documentation: add documentation for the bitmap format

2013-06-26 Thread Jeff King
On Thu, Jun 27, 2013 at 04:36:54AM +0200, Vicent Martí wrote: > That was a very rude reply. :( > > Please refrain from interacting with me in the ML in the future. I'll > do accordingly. I agree that the pointer arithmetic thing may have been a little much, but I think there are some points we ne

Re: [PATCH 09/16] documentation: add documentation for the bitmap format

2013-06-26 Thread Vicent Martí
That was a very rude reply. :( Please refrain from interacting with me in the ML in the future. I'll do accordingly. Thanks! vmg On Thu, Jun 27, 2013 at 3:11 AM, Shawn Pearce wrote: > On Tue, Jun 25, 2013 at 4:08 PM, Vicent Martí wrote: >> On Tue, Jun 25, 2013 at 11:17 PM, Junio C Hamano wrote

Re: [PATCH 09/16] documentation: add documentation for the bitmap format

2013-06-26 Thread Shawn Pearce
On Wed, Jun 26, 2013 at 6:53 PM, Colby Ranger wrote: >> + Generating this reverse index at runtime is **not** free (around 900ms >> + generation time for a repository like `torvalds/linux`), and once again, >> + this generation time needs to happen every time `pack-objects` is >> + spawned. 9

Re: [PATCH 09/16] documentation: add documentation for the bitmap format

2013-06-26 Thread Shawn Pearce
On Tue, Jun 25, 2013 at 11:11 PM, Jeff King wrote: > On Tue, Jun 25, 2013 at 09:33:11PM +0200, Vicent Martí wrote: > >> > One way we side-stepped the size inflation problem in JGit was to only >> > use the bitmap index information when sending data on the wire to a >> > client. Here delta reuse pl

Re: [PATCH 09/16] documentation: add documentation for the bitmap format

2013-06-26 Thread Shawn Pearce
On Tue, Jun 25, 2013 at 4:08 PM, Vicent Martí wrote: > On Tue, Jun 25, 2013 at 11:17 PM, Junio C Hamano wrote: >> What case are you talking about? >> >> The n-th object must be one of these four types and can never be of >> more than one type at the same time, so a natural expectation from >> the

Re: [PATCH 09/16] documentation: add documentation for the bitmap format

2013-06-26 Thread Colby Ranger
> + Generating this reverse index at runtime is **not** free (around 900ms > + generation time for a repository like `torvalds/linux`), and once again, > + this generation time needs to happen every time `pack-objects` is > + spawned. If generating the reverse index is expensive, it is probabl

Re: [PATCH 09/16] documentation: add documentation for the bitmap format

2013-06-26 Thread Thomas Rast
Thomas Rast writes: [...] > The next word after `L_M` (if any) must again be a RLW, for the next > chunk. For efficient appending to the bitstream, the EWAH stores a > format to the last RLW in the stream. ^^ I have no idea what Freud did there, but "pointer" or some such is probably a sa

Re: [PATCH 09/16] documentation: add documentation for the bitmap format

2013-06-26 Thread Thomas Rast
Vicent Martí writes: > On Tue, Jun 25, 2013 at 5:58 PM, Thomas Rast wrote: >> >> Please document the RLW format here. > > Har har. I was going to comment on your review of the Ewah patchset, > but might as well do it here: the only thing I know about Ewah bitmaps > is that they work. And I know

Re: [PATCH 09/16] documentation: add documentation for the bitmap format

2013-06-26 Thread Colby Ranger
>> Pinning the bitmap index on the reverse index adds complexity (lookups >> are two-step: first find the entry in the reverse index, and then find >> the SHA1 in the index) and is measurably slower, in both loading and >> lookup times. Since Git doesn't have a memory problem, it's very hard >> to

Re: [PATCH 09/16] documentation: add documentation for the bitmap format

2013-06-26 Thread Colby Ranger
> Pinning the bitmap index on the reverse index adds complexity (lookups > are two-step: first find the entry in the reverse index, and then find > the SHA1 in the index) and is measurably slower, in both loading and > lookup times. Since Git doesn't have a memory problem, it's very hard > to make

Re: [PATCH 09/16] documentation: add documentation for the bitmap format

2013-06-25 Thread Jeff King
On Tue, Jun 25, 2013 at 09:33:11PM +0200, Vicent Martí wrote: > > One way we side-stepped the size inflation problem in JGit was to only > > use the bitmap index information when sending data on the wire to a > > client. Here delta reuse plays a significant factor in building the > > pack, and we

Re: [PATCH 09/16] documentation: add documentation for the bitmap format

2013-06-25 Thread Vicent Martí
On Tue, Jun 25, 2013 at 5:58 PM, Thomas Rast wrote: > >> This is the technical documentation and design rationale for the new >> Bitmap v2 on-disk format. > > Hrmpf, that's what I get for reading the series in order... > >> + The folowing flags are supported: >

Re: [PATCH 09/16] documentation: add documentation for the bitmap format

2013-06-25 Thread Vicent Martí
On Tue, Jun 25, 2013 at 11:17 PM, Junio C Hamano wrote: > What case are you talking about? > > The n-th object must be one of these four types and can never be of > more than one type at the same time, so a natural expectation from > the reader is "If you OR them together, you will get the same se

Re: [PATCH 09/16] documentation: add documentation for the bitmap format

2013-06-25 Thread Thomas Rast
Vicent Marti writes: > This is the technical documentation and design rationale for the new > Bitmap v2 on-disk format. Hrmpf, that's what I get for reading the series in order... > + The folowing flags are supported: ^^ typos marked by ^ > +

Re: [PATCH 09/16] documentation: add documentation for the bitmap format

2013-06-25 Thread Junio C Hamano
Vicent Martí writes: >>> + There is a bitmap for each Git object type, stored in the >>> following >>> + order: >>> + >>> + - Commits >>> + - Trees >>> + - Blobs >>> + - Tags >>> +

Re: [PATCH 09/16] documentation: add documentation for the bitmap format

2013-06-25 Thread Vicent Martí
On Tue, Jun 25, 2013 at 7:42 AM, Shawn Pearce wrote: > I very much hate seeing a file format that is supposed to be portable > that supports both big-endian and little-endian encoding. Well, the bitmap index is not supposed to be portable, as it doesn't get sent over the wire in any situation. Re

Re: [PATCH 09/16] documentation: add documentation for the bitmap format

2013-06-24 Thread Shawn Pearce
On Mon, Jun 24, 2013 at 5:23 PM, Vicent Marti wrote: > This is the technical documentation and design rationale for the new > Bitmap v2 on-disk format. > --- > Documentation/technical/bitmap-format.txt | 235 > + > 1 file changed, 235 insertions(+) > create mode 100

[PATCH 09/16] documentation: add documentation for the bitmap format

2013-06-24 Thread Vicent Marti
This is the technical documentation and design rationale for the new Bitmap v2 on-disk format. --- Documentation/technical/bitmap-format.txt | 235 + 1 file changed, 235 insertions(+) create mode 100644 Documentation/technical/bitmap-format.txt diff --git a/Documenta