Yeah, I think the spec should be strict.  And for convenience, I'd say
the length recorded in the Buffer metadata should probably be the padded
length (though I don't have a strong opinion).

Regards

Antoine.


On 03/10/2019 at 06:23, Micah Kornfield wrote:
> Hi Wes,
> It seems fine to be flexible here.  However:
> 
> 
>> This could have implications for hashing or
>> comparisons, for example, so I think that having the flexibility to do
>> either is a good idea.
> 
> This statement of use-cases makes me a little nervous.  It seems like it
> could lead to bugs if a consumer is reading from two producers that use
> different alternatives?
> 
> Thanks,
> Micah
> 
> On Mon, Sep 30, 2019 at 5:24 PM Wes McKinney <wesmck...@gmail.com> wrote:
> 
>> I just updated my pull request from May adding language to clarify
>> what protocol writers are expected to set when producing the Arrow
>> binary protocol
>>
>> https://github.com/apache/arrow/pull/4370
>>
>> Implementations may allocate small buffers, or use memory which does
>> not meet the 8-byte minimal padding requirements of the Arrow
>> protocol. It becomes a question, then, whether to set the in-memory
>> buffer size or the padded size when producing the protocol.
>>
>> This PR states that either is acceptable. As an example, a 1-byte
>> validity buffer could have Buffer metadata stating that the size
>> is either 1 byte or 8 bytes. Either way, 7 bytes of padding must be
>> written to conform to the protocol. The metadata, therefore, reflects
>> the "intent" of the protocol writer for the protocol reader. If the
>> writer says the length is 1, then the protocol reader understands that
>> the writer does not expect the reader to concern itself with the 7
>> bytes of padding. This could have implications for hashing or
>> comparisons, for example, so I think that having the flexibility to do
>> either is a good idea.
>>
>> For an application that wants to guarantee that AVX512 instructions
>> can be used on all buffers on the receiver side, it would be
>> appropriate to include 512-bit padding in the accounting.
>>
>> Let me know if others think differently so we can have this properly
>> documented for the 1.0.0 Format release.
>>
>> Thanks,
>> Wes
>>
> 
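
To make the two options concrete, here is a minimal Python sketch. It is
not taken from the PR or from any Arrow implementation; the helper names
(padded_length, write_buffer, digest_as_declared) are hypothetical and
only illustrate the 1-byte validity buffer example quoted above, assuming
the padding bytes are written as zeros.

```python
import hashlib


def padded_length(nbytes: int, alignment: int = 8) -> int:
    """Round nbytes up to the next multiple of alignment."""
    if nbytes < 0 or alignment <= 0:
        raise ValueError("nbytes must be >= 0 and alignment must be > 0")
    return (nbytes + alignment - 1) // alignment * alignment


def write_buffer(data: bytes, alignment: int = 8, report_padded: bool = True):
    """Return (declared_length, body_bytes) for one buffer.

    The body is always padded to the alignment; only the declared
    length differs between the two alternatives.
    """
    padded = padded_length(len(data), alignment)
    body = data + b"\x00" * (padded - len(data))  # assumed zero padding
    declared = padded if report_padded else len(data)
    return declared, body


def digest_as_declared(declared: int, body: bytes) -> str:
    """Hash only the bytes the writer declared (a naive reader)."""
    return hashlib.sha256(body[:declared]).hexdigest()


# The same 1-byte validity buffer, written by two hypothetical producers.
strict = write_buffer(b"\x01", report_padded=True)    # declares 8 bytes
loose = write_buffer(b"\x01", report_padded=False)    # declares 1 byte

# Both bodies are identical on the wire (1 data byte + 7 padding bytes),
# but a reader that hashes "declared" bytes gets two different digests.
same_body = strict[1] == loose[1]
same_digest = digest_as_declared(*strict) == digest_as_declared(*loose)
```

With these assumptions, same_body is true while same_digest is false,
which is the kind of mismatch Micah describes when a consumer reads from
two producers that picked different alternatives.  Passing alignment=64
would correspond to the 512-bit padding case Wes mentions.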
