Re: [PATCH v2 15/16] index-pack: use nr_objects_final as sha1_table size

2013-09-12 Thread Nicolas Pitre
On Tue, 10 Sep 2013, Duy Nguyen wrote:

> On Mon, Sep 9, 2013 at 10:01 PM, Nicolas Pitre  wrote:
> > However this means that the progress meter will now be wrong and that's
> > terrible !  Users *will* complain that the meter doesn't reach 100% and
> > they'll protest for being denied the remaining objects during the
> > transfer !
> >
> > Joking aside, we should think about doing something about it.  I was
> > wondering if some kind of prefix to the pack stream could be inserted
> > onto the wire when sending a pack v4.  Something like:
> >
> > 'T', 'H', 'I', 'N', 
> >
> > This 8-byte prefix would simply be discarded by index-pack after being
> > parsed.
> >
> > What do you think?
> 
> I have no problem with this. Although I rather we generalize the case
> to support multiple packs in the same stream (in some case the server
> can just stream away one big existing pack, followed by a smaller pack
> of recent updates), where "thin" is just a special pack that is not
> saved on disk. So except for the signature difference, it should at
> least follow the pack header (sig, version, nr_objects)

Except in this case this is not a separate pack.  This prefix is there 
to provide information that is valid only for the pack to follow and 
therefore cannot be considered as some independent data.


Nicolas
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 15/16] index-pack: use nr_objects_final as sha1_table size

2013-09-09 Thread Duy Nguyen
On Mon, Sep 9, 2013 at 10:01 PM, Nicolas Pitre  wrote:
> However this means that the progress meter will now be wrong and that's
> terrible !  Users *will* complain that the meter doesn't reach 100% and
> they'll protest for being denied the remaining objects during the
> transfer !
>
> Joking aside, we should think about doing something about it.  I was
> wondering if some kind of prefix to the pack stream could be inserted
> onto the wire when sending a pack v4.  Something like:
>
> 'T', 'H', 'I', 'N', 
>
> This 8-byte prefix would simply be discarded by index-pack after being
> parsed.
>
> What do you think?

I have no problem with this. Although I rather we generalize the case
to support multiple packs in the same stream (in some case the server
can just stream away one big existing pack, followed by a smaller pack
of recent updates), where "thin" is just a special pack that is not
saved on disk. So except for the signature difference, it should at
least follow the pack header (sig, version, nr_objects)
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 15/16] index-pack: use nr_objects_final as sha1_table size

2013-09-09 Thread Nicolas Pitre
On Mon, 9 Sep 2013, Junio C Hamano wrote:

> Nicolas Pitre  writes:
> 
> > Do we know the actual number of objects to send during the capability 
> > negociation?
> 
> No, and that is not what I meant.  We know upfront after capability
> negotiation (by seeing a request to give them a thin-pack) that we
> will send, in addition to the usual packfile, the prefix that
> carries that information and that is the important part.  That lets
> the receiver decide whether to _expect_ to see the prefix or no
> prefix.  Without such, there needs some clue in the prefix part
> itself if there are prefixes that carry information computed after
> capability negotiation finished (i.e. after "object enumeration").

In this case, if negociation concludes on "thin" and "pack-version=4" 
then that could mean there is a prefix to be expected.


Nicolas
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 15/16] index-pack: use nr_objects_final as sha1_table size

2013-09-09 Thread Junio C Hamano
Nicolas Pitre  writes:

> Do we know the actual number of objects to send during the capability 
> negociation?

No, and that is not what I meant.  We know upfront after capability
negotiation (by seeing a request to give them a thin-pack) that we
will send, in addition to the usual packfile, the prefix that
carries that information and that is the important part.  That lets
the receiver decide whether to _expect_ to see the prefix or no
prefix.  Without such, there needs some clue in the prefix part
itself if there are prefixes that carry information computed after
capability negotiation finished (i.e. after "object enumeration").

Sorry if I was unclear.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 15/16] index-pack: use nr_objects_final as sha1_table size

2013-09-09 Thread Nicolas Pitre
On Mon, 9 Sep 2013, Junio C Hamano wrote:

> Nicolas Pitre  writes:
> 
> >> > ...  I was 
> >> > wondering if some kind of prefix to the pack stream could be inserted 
> >> > onto the wire when sending a pack v4.  Something like:
> >> >
> >> > 'T', 'H', 'I', 'N', 
> >> >
> >> > This 8-byte prefix would simply be discarded by index-pack after being 
> >> > parsed.
> >> >
> >> > What do you think?
> >> 
> >> I do not think it is _too_ bad if the meter jumped from 92% to 100%
> >> when we finish reading from the other end ;-), as long as we can
> >> reliably tell that we read the right thing.
> >
> > Sure.  but eventually people will complain about this.  So while we're 
> > about to introduce a new pack format anyway, better think of this little 
> > cosmetic detail now when it can be included in the pack v4 capability 
> > negociation.
> 
> Oh, I completely agree on that part.  When we send a self-contained
> pack, would we send nothing?  That is, should the receiving end
> expect and rely on that the sending end will send a thin pack and
> never a fat pack when asked to send a thin pack (and vice versa)?
> 
> Also should we make the "even though we have negotiated the protocol
> parameters, after enumerating the objects and deciding what the pack
> stream would look like, we have a bit more information to tell you"
> the sending side gives the receiver extensible?  I am wondering if
> that prefix needs something like "end of prefix" marker (or "here
> comes N-bytes worth of prefix information" upfront); we probably do
> not need it, as the capability exchange will determine what kind of
> information will be sent (e.g. "actual objects in the thin pack data
> stream").

Do we know the actual number of objects to send during the capability 
negociation?  I don't think so as this is known only after the 
"compressing objects" phase, and that already depends on the capability 
negociation before it can start.


Nicolas
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 15/16] index-pack: use nr_objects_final as sha1_table size

2013-09-09 Thread Junio C Hamano
Nicolas Pitre  writes:

> On Mon, 9 Sep 2013, Nguyễn Thái Ngọc Duy wrote:
>
>> nr_objects in the next patch is used to reflect the number of actual
>> objects in the stream, which may be smaller than the number recorded
>> in pack header.
>
> This highlights an issue that has been nagging me for a while.
>
> We decided to send the final number of objects in the thin pack header 
> for two reasons:
>
> 1) it allows to properly size the SHA1 table upfront which already 
>contains entries for the omitted objects;
>
> 2) the whole pack doesn't have to be re-summed again after being 
>completed on the receiving end since we don't alter the header.
>
> However this means that the progress meter will now be wrong and that's 
> terrible !  Users *will* complain that the meter doesn't reach 100% and 
> they'll protest for being denied the remaining objects during the 
> transfer !
>
> Joking aside, we should think about doing something about it.  I was 
> wondering if some kind of prefix to the pack stream could be inserted 
> onto the wire when sending a pack v4.  Something like:
>
> 'T', 'H', 'I', 'N', 
>
> This 8-byte prefix would simply be discarded by index-pack after being 
> parsed.
>
> What do you think?

I do not think it is _too_ bad if the meter jumped from 92% to 100%
when we finish reading from the other end ;-), as long as we can
reliably tell that we read the right thing.

Which brings me to a tangent.  Do we have a means to make sure that
the data received over the wire is bit-for-bit correct as a whole
when it is a thin pack stream?  When it is a non-thin pack stream,
we have the checksum at the end added by sha1close() which
index-pack.c::parse_pack_objects() can (and does) verify.


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 15/16] index-pack: use nr_objects_final as sha1_table size

2013-09-09 Thread Nicolas Pitre
On Mon, 9 Sep 2013, Junio C Hamano wrote:

> Nicolas Pitre  writes:
> 
> > On Mon, 9 Sep 2013, Nguyễn Thái Ngọc Duy wrote:
> >
> >> nr_objects in the next patch is used to reflect the number of actual
> >> objects in the stream, which may be smaller than the number recorded
> >> in pack header.
> >
> > This highlights an issue that has been nagging me for a while.
> >
> > We decided to send the final number of objects in the thin pack header 
> > for two reasons:
> >
> > 1) it allows to properly size the SHA1 table upfront which already 
> >contains entries for the omitted objects;
> >
> > 2) the whole pack doesn't have to be re-summed again after being 
> >completed on the receiving end since we don't alter the header.
> >
> > However this means that the progress meter will now be wrong and that's 
> > terrible !  Users *will* complain that the meter doesn't reach 100% and 
> > they'll protest for being denied the remaining objects during the 
> > transfer !
> >
> > Joking aside, we should think about doing something about it.  I was 
> > wondering if some kind of prefix to the pack stream could be inserted 
> > onto the wire when sending a pack v4.  Something like:
> >
> > 'T', 'H', 'I', 'N', 
> >
> > This 8-byte prefix would simply be discarded by index-pack after being 
> > parsed.
> >
> > What do you think?
> 
> I do not think it is _too_ bad if the meter jumped from 92% to 100%
> when we finish reading from the other end ;-), as long as we can
> reliably tell that we read the right thing.

Sure.  but eventually people will complain about this.  So while we're 
about to introduce a new pack format anyway, better think of this little 
cosmetic detail now when it can be included in the pack v4 capability 
negociation.

> Which brings me to a tangent.  Do we have a means to make sure that
> the data received over the wire is bit-for-bit correct as a whole
> when it is a thin pack stream?  When it is a non-thin pack stream,
> we have the checksum at the end added by sha1close() which
> index-pack.c::parse_pack_objects() can (and does) verify.

The trailing checksum is still there.  Nothing has changed in that 
regard.


Nicolas


Re: [PATCH v2 15/16] index-pack: use nr_objects_final as sha1_table size

2013-09-09 Thread Junio C Hamano
Nicolas Pitre  writes:

>> > ...  I was 
>> > wondering if some kind of prefix to the pack stream could be inserted 
>> > onto the wire when sending a pack v4.  Something like:
>> >
>> > 'T', 'H', 'I', 'N', 
>> >
>> > This 8-byte prefix would simply be discarded by index-pack after being 
>> > parsed.
>> >
>> > What do you think?
>> 
>> I do not think it is _too_ bad if the meter jumped from 92% to 100%
>> when we finish reading from the other end ;-), as long as we can
>> reliably tell that we read the right thing.
>
> Sure.  but eventually people will complain about this.  So while we're 
> about to introduce a new pack format anyway, better think of this little 
> cosmetic detail now when it can be included in the pack v4 capability 
> negociation.

Oh, I completely agree on that part.  When we send a self-contained
pack, would we send nothing?  That is, should the receiving end
expect and rely on that the sending end will send a thin pack and
never a fat pack when asked to send a thin pack (and vice versa)?

Also should we make the "even though we have negotiated the protocol
parameters, after enumerating the objects and deciding what the pack
stream would look like, we have a bit more information to tell you"
the sending side gives the receiver extensible?  I am wondering if
that prefix needs something like "end of prefix" marker (or "here
comes N-bytes worth of prefix information" upfront); we probably do
not need it, as the capability exchange will determine what kind of
information will be sent (e.g. "actual objects in the thin pack data
stream").
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 15/16] index-pack: use nr_objects_final as sha1_table size

2013-09-09 Thread Nicolas Pitre
On Mon, 9 Sep 2013, Nguyễn Thái Ngọc Duy wrote:

> nr_objects in the next patch is used to reflect the number of actual
> objects in the stream, which may be smaller than the number recorded
> in pack header.

This highlights an issue that has been nagging me for a while.

We decided to send the final number of objects in the thin pack header 
for two reasons:

1) it allows to properly size the SHA1 table upfront which already 
   contains entries for the omitted objects;

2) the whole pack doesn't have to be re-summed again after being 
   completed on the receiving end since we don't alter the header.

However this means that the progress meter will now be wrong and that's 
terrible !  Users *will* complain that the meter doesn't reach 100% and 
they'll protest for being denied the remaining objects during the 
transfer !

Joking aside, we should think about doing something about it.  I was 
wondering if some kind of prefix to the pack stream could be inserted 
onto the wire when sending a pack v4.  Something like:

'T', 'H', 'I', 'N', 

This 8-byte prefix would simply be discarded by index-pack after being 
parsed.

What do you think?


Nicolas