Re: [PATCH v2 15/16] index-pack: use nr_objects_final as sha1_table size
On Tue, 10 Sep 2013, Duy Nguyen wrote: > On Mon, Sep 9, 2013 at 10:01 PM, Nicolas Pitre wrote: > > However this means that the progress meter will now be wrong and that's > > terrible ! Users *will* complain that the meter doesn't reach 100% and > > they'll protest for being denied the remaining objects during the > > transfer ! > > > > Joking aside, we should think about doing something about it. I was > > wondering if some kind of prefix to the pack stream could be inserted > > onto the wire when sending a pack v4. Something like: > > > > 'T', 'H', 'I', 'N', > > > > This 8-byte prefix would simply be discarded by index-pack after being > > parsed. > > > > What do you think? > > I have no problem with this. Although I rather we generalize the case > to support multiple packs in the same stream (in some case the server > can just stream away one big existing pack, followed by a smaller pack > of recent updates), where "thin" is just a special pack that is not > saved on disk. So except for the signature difference, it should at > least follow the pack header (sig, version, nr_objects) Except in this case this is not a separate pack. This prefix is there to provide information that is valid only for the pack to follow and therefore cannot be considered as some independent data. Nicolas -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 15/16] index-pack: use nr_objects_final as sha1_table size
On Mon, Sep 9, 2013 at 10:01 PM, Nicolas Pitre wrote: > However this means that the progress meter will now be wrong and that's > terrible ! Users *will* complain that the meter doesn't reach 100% and > they'll protest for being denied the remaining objects during the > transfer ! > > Joking aside, we should think about doing something about it. I was > wondering if some kind of prefix to the pack stream could be inserted > onto the wire when sending a pack v4. Something like: > > 'T', 'H', 'I', 'N', > > This 8-byte prefix would simply be discarded by index-pack after being > parsed. > > What do you think? I have no problem with this. Although I rather we generalize the case to support multiple packs in the same stream (in some case the server can just stream away one big existing pack, followed by a smaller pack of recent updates), where "thin" is just a special pack that is not saved on disk. So except for the signature difference, it should at least follow the pack header (sig, version, nr_objects) -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 15/16] index-pack: use nr_objects_final as sha1_table size
On Mon, 9 Sep 2013, Junio C Hamano wrote: > Nicolas Pitre writes: > > > Do we know the actual number of objects to send during the capability > > negociation? > > No, and that is not what I meant. We know upfront after capability > negotiation (by seeing a request to give them a thin-pack) that we > will send, in addition to the usual packfile, the prefix that > carries that information and that is the important part. That lets > the receiver decide whether to _expect_ to see the prefix or no > prefix. Without such, there needs some clue in the prefix part > itself if there are prefixes that carry information computed after > capability negotiation finished (i.e. after "object enumeration"). In this case, if negociation concludes on "thin" and "pack-version=4" then that could mean there is a prefix to be expected. Nicolas -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 15/16] index-pack: use nr_objects_final as sha1_table size
Nicolas Pitre writes: > Do we know the actual number of objects to send during the capability > negociation? No, and that is not what I meant. We know upfront after capability negotiation (by seeing a request to give them a thin-pack) that we will send, in addition to the usual packfile, the prefix that carries that information and that is the important part. That lets the receiver decide whether to _expect_ to see the prefix or no prefix. Without such, there needs some clue in the prefix part itself if there are prefixes that carry information computed after capability negotiation finished (i.e. after "object enumeration"). Sorry if I was unclear. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 15/16] index-pack: use nr_objects_final as sha1_table size
On Mon, 9 Sep 2013, Junio C Hamano wrote: > Nicolas Pitre writes: > > >> > ... I was > >> > wondering if some kind of prefix to the pack stream could be inserted > >> > onto the wire when sending a pack v4. Something like: > >> > > >> > 'T', 'H', 'I', 'N', > >> > > >> > This 8-byte prefix would simply be discarded by index-pack after being > >> > parsed. > >> > > >> > What do you think? > >> > >> I do not think it is _too_ bad if the meter jumped from 92% to 100% > >> when we finish reading from the other end ;-), as long as we can > >> reliably tell that we read the right thing. > > > > Sure. but eventually people will complain about this. So while we're > > about to introduce a new pack format anyway, better think of this little > > cosmetic detail now when it can be included in the pack v4 capability > > negociation. > > Oh, I completely agree on that part. When we send a self-contained > pack, would we send nothing? That is, should the receiving end > expect and rely on that the sending end will send a thin pack and > never a fat pack when asked to send a thin pack (and vice versa)? > > Also should we make the "even though we have negotiated the protocol > parameters, after enumerating the objects and deciding what the pack > stream would look like, we have a bit more information to tell you" > the sending side gives the receiver extensible? I am wondering if > that prefix needs something like "end of prefix" marker (or "here > comes N-bytes worth of prefix information" upfront); we probably do > not need it, as the capability exchange will determine what kind of > information will be sent (e.g. "actual objects in the thin pack data > stream"). Do we know the actual number of objects to send during the capability negociation? I don't think so as this is known only after the "compressing objects" phase, and that already depends on the capability negociation before it can start. Nicolas -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 15/16] index-pack: use nr_objects_final as sha1_table size
Nicolas Pitre writes: > On Mon, 9 Sep 2013, Nguyễn Thái Ngọc Duy wrote: > >> nr_objects in the next patch is used to reflect the number of actual >> objects in the stream, which may be smaller than the number recorded >> in pack header. > > This highlights an issue that has been nagging me for a while. > > We decided to send the final number of objects in the thin pack header > for two reasons: > > 1) it allows to properly size the SHA1 table upfront which already >contains entries for the omitted objects; > > 2) the whole pack doesn't have to be re-summed again after being >completed on the receiving end since we don't alter the header. > > However this means that the progress meter will now be wrong and that's > terrible ! Users *will* complain that the meter doesn't reach 100% and > they'll protest for being denied the remaining objects during the > transfer ! > > Joking aside, we should think about doing something about it. I was > wondering if some kind of prefix to the pack stream could be inserted > onto the wire when sending a pack v4. Something like: > > 'T', 'H', 'I', 'N', > > This 8-byte prefix would simply be discarded by index-pack after being > parsed. > > What do you think? I do not think it is _too_ bad if the meter jumped from 92% to 100% when we finish reading from the other end ;-), as long as we can reliably tell that we read the right thing. Which brings me to a tangent. Do we have a means to make sure that the data received over the wire is bit-for-bit correct as a whole when it is a thin pack stream? When it is a non-thin pack stream, we have the checksum at the end added by sha1close() which index-pack.c::parse_pack_objects() can (and does) verify. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 15/16] index-pack: use nr_objects_final as sha1_table size
On Mon, 9 Sep 2013, Junio C Hamano wrote: > Nicolas Pitre writes: > > > On Mon, 9 Sep 2013, Nguyễn Thái Ngọc Duy wrote: > > > >> nr_objects in the next patch is used to reflect the number of actual > >> objects in the stream, which may be smaller than the number recorded > >> in pack header. > > > > This highlights an issue that has been nagging me for a while. > > > > We decided to send the final number of objects in the thin pack header > > for two reasons: > > > > 1) it allows to properly size the SHA1 table upfront which already > >contains entries for the omitted objects; > > > > 2) the whole pack doesn't have to be re-summed again after being > >completed on the receiving end since we don't alter the header. > > > > However this means that the progress meter will now be wrong and that's > > terrible ! Users *will* complain that the meter doesn't reach 100% and > > they'll protest for being denied the remaining objects during the > > transfer ! > > > > Joking aside, we should think about doing something about it. I was > > wondering if some kind of prefix to the pack stream could be inserted > > onto the wire when sending a pack v4. Something like: > > > > 'T', 'H', 'I', 'N', > > > > This 8-byte prefix would simply be discarded by index-pack after being > > parsed. > > > > What do you think? > > I do not think it is _too_ bad if the meter jumped from 92% to 100% > when we finish reading from the other end ;-), as long as we can > reliably tell that we read the right thing. Sure. but eventually people will complain about this. So while we're about to introduce a new pack format anyway, better think of this little cosmetic detail now when it can be included in the pack v4 capability negociation. > Which brings me to a tangent. Do we have a means to make sure that > the data received over the wire is bit-for-bit correct as a whole > when it is a thin pack stream? When it is a non-thin pack stream, > we have the checksum at the end added by sha1close() which > index-pack.c::parse_pack_objects() can (and does) verify. The trailing checksum is still there. Nothing has changed in that regard. Nicolas
Re: [PATCH v2 15/16] index-pack: use nr_objects_final as sha1_table size
Nicolas Pitre writes: >> > ... I was >> > wondering if some kind of prefix to the pack stream could be inserted >> > onto the wire when sending a pack v4. Something like: >> > >> > 'T', 'H', 'I', 'N', >> > >> > This 8-byte prefix would simply be discarded by index-pack after being >> > parsed. >> > >> > What do you think? >> >> I do not think it is _too_ bad if the meter jumped from 92% to 100% >> when we finish reading from the other end ;-), as long as we can >> reliably tell that we read the right thing. > > Sure. but eventually people will complain about this. So while we're > about to introduce a new pack format anyway, better think of this little > cosmetic detail now when it can be included in the pack v4 capability > negociation. Oh, I completely agree on that part. When we send a self-contained pack, would we send nothing? That is, should the receiving end expect and rely on that the sending end will send a thin pack and never a fat pack when asked to send a thin pack (and vice versa)? Also should we make the "even though we have negotiated the protocol parameters, after enumerating the objects and deciding what the pack stream would look like, we have a bit more information to tell you" the sending side gives the receiver extensible? I am wondering if that prefix needs something like "end of prefix" marker (or "here comes N-bytes worth of prefix information" upfront); we probably do not need it, as the capability exchange will determine what kind of information will be sent (e.g. "actual objects in the thin pack data stream"). -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 15/16] index-pack: use nr_objects_final as sha1_table size
On Mon, 9 Sep 2013, Nguyễn Thái Ngọc Duy wrote: > nr_objects in the next patch is used to reflect the number of actual > objects in the stream, which may be smaller than the number recorded > in pack header. This highlights an issue that has been nagging me for a while. We decided to send the final number of objects in the thin pack header for two reasons: 1) it allows to properly size the SHA1 table upfront which already contains entries for the omitted objects; 2) the whole pack doesn't have to be re-summed again after being completed on the receiving end since we don't alter the header. However this means that the progress meter will now be wrong and that's terrible ! Users *will* complain that the meter doesn't reach 100% and they'll protest for being denied the remaining objects during the transfer ! Joking aside, we should think about doing something about it. I was wondering if some kind of prefix to the pack stream could be inserted onto the wire when sending a pack v4. Something like: 'T', 'H', 'I', 'N', This 8-byte prefix would simply be discarded by index-pack after being parsed. What do you think? Nicolas