Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte Flatbuffer alignment requirements (2nd vote)

2019-09-13 Thread Neal Richardson
Nice work!

On Fri, Sep 13, 2019 at 10:43 AM Wes McKinney  wrote:
>
> Integration tests are passing:
>
> https://travis-ci.org/apache/arrow/jobs/584361357
>
> I'm merging the feature branch into master. Thanks all
>
> On Thu, Sep 12, 2019 at 6:15 PM Wes McKinney  wrote:
> >
> > Thanks all!
> >
> > It looks like all the changes are in, so we need to see if the
> > integration tests pass before we can merge these changes into master
> >
> > On Thu, Sep 12, 2019 at 7:19 AM Sebastien Binet  wrote:
> > >
> > > hi there,
> > >
> > > On Thu, Sep 12, 2019 at 12:45 AM Wes McKinney  wrote:
> > >
> > > > Thanks Bryan.
> > > >
> > > > I merged the Java patch with the EOS change and submitted a C++ patch
> > > > which also updates the specification
> > > >
> > > > https://github.com/apache/arrow/pull/5361
> > > >
> > > > Let me know when the JS or C# patches are ready to go and I can merge
> > > > those.
> > > >
> > > > I updated https://issues.apache.org/jira/browse/ARROW-6545 to track
> > > > the Go change corresponding to this
> > > >
> > >
> > > done. (I think)
> > >
> > > -s


Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte Flatbuffer alignment requirements (2nd vote)

2019-09-13 Thread Wes McKinney
Integration tests are passing:

https://travis-ci.org/apache/arrow/jobs/584361357

I'm merging the feature branch into master. Thanks all

On Thu, Sep 12, 2019 at 6:15 PM Wes McKinney  wrote:
>
> Thanks all!
>
> It looks like all the changes are in, so we need to see if the
> integration tests pass before we can merge these changes into master
>
> On Thu, Sep 12, 2019 at 7:19 AM Sebastien Binet  wrote:
> >
> > hi there,
> >
> > On Thu, Sep 12, 2019 at 12:45 AM Wes McKinney  wrote:
> >
> > > Thanks Bryan.
> > >
> > > I merged the Java patch with the EOS change and submitted a C++ patch
> > > which also updates the specification
> > >
> > > https://github.com/apache/arrow/pull/5361
> > >
> > > Let me know when the JS or C# patches are ready to go and I can merge
> > > those.
> > >
> > > I updated https://issues.apache.org/jira/browse/ARROW-6545 to track
> > > the Go change corresponding to this
> > >
> >
> > done. (I think)
> >
> > -s


Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte Flatbuffer alignment requirements (2nd vote)

2019-09-12 Thread Sebastien Binet
hi there,

On Thu, Sep 12, 2019 at 12:45 AM Wes McKinney  wrote:

> Thanks Bryan.
>
> I merged the Java patch with the EOS change and submitted a C++ patch
> which also updates the specification
>
> https://github.com/apache/arrow/pull/5361
>
> Let me know when the JS or C# patches are ready to go and I can merge
> those.
>
> I updated https://issues.apache.org/jira/browse/ARROW-6545 to track
> the Go change corresponding to this
>

done. (I think)

-s


Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte Flatbuffer alignment requirements (2nd vote)

2019-09-11 Thread Wes McKinney
Thanks Bryan.

I merged the Java patch with the EOS change and submitted a C++ patch
which also updates the specification

https://github.com/apache/arrow/pull/5361

Let me know when the JS or C# patches are ready to go and I can merge those.

I updated https://issues.apache.org/jira/browse/ARROW-6545 to track
the Go change corresponding to this

On Wed, Sep 11, 2019 at 12:33 AM Bryan Cutler  wrote:
>
> I have the patch for the EOS with Java writers up here
> https://github.com/apache/arrow/pull/5345. Just to clarify, the EOS of
> {0x, 0x} is used for both stream and file formats, in
> non-legacy writing mode.
>
> On Mon, Sep 9, 2019 at 8:01 PM Bryan Cutler  wrote:
>
> > Sounds good to me also and I don't think we need a vote either.
> >
> > On Sat, Sep 7, 2019 at 7:36 PM Micah Kornfield 
> > wrote:
> >
> >> +1 on this, I also don't think a vote is necessary as long as we make the
> >> change before 0.15.0
> >>
> >> On Saturday, September 7, 2019, Wes McKinney  wrote:
> >>
> >> > I see, thank you for catching this nuance.
> >> >
> >> > I agree that using {0x, 0x} for EOS will resolve the
> >> > issue while allowing implementations to be backwards compatible (i.e.
> >> > handling the 4-byte EOS from older payloads).
> >> >
> >> > I'm not sure that we need to have a vote about this, what do others
> >> think?
> >> >
> >> > On Sat, Sep 7, 2019 at 12:47 AM Ji Liu 
> >> wrote:
> >> > >
> >> > > Hi all,
> >> > >
> >> > > During the java code review[1], seems there is a problem with the
> >> > current implementations(C++/Java etc) when reaching EOS, since the new
> >> > format EOS is 8 bytes and the reader only reads 4 bytes when reach the
> >> end
> >> > of stream, and the additional 4 bytes will not be read which cause
> >> problems
> >> > for following up readings.
> >> > >
> >> > > There are some optional suggestions[2] as below, we should reach
> >> > consistent and fix this problem before 0.15 release.
> >> > > i. For the new format, an 8-byte EOS token should look like
> >> {0x,
> >> > 0x}, so we read the continuation token first, and then know to
> >> read
> >> > the next 4 bytes, which are then 0 to signal EOS.ii. Reader just
> >> remember
> >> > the state, so if it reads the continuation token from the beginning,
> >> then
> >> > read all 8 bytes at the end.
> >> > >
> >> > > Thanks,
> >> > > Ji Liu
> >> > >
> >> > > [1] https://github.com/apache/arrow/pull/5229
> >> > > [2] https://github.com/apache/arrow/pull/5229#discussion_r321715682
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > From:Eric Erhardt 
> >> > > Send Time:2019年9月5日(星期四) 07:16
> >> > > To:dev@arrow.apache.org ; Ji Liu <
> >> > niki...@aliyun.com>
> >> > > Cc:emkornfield ; Paul Taylor <
> >> ptay...@apache.org>
> >> > > Subject:RE: [RESULT] [VOTE] Alter Arrow binary protocol to address
> >> > 8-byte Flatbuffer alignment requirements (2nd vote)
> >> > >
> >> > > The C# PR is up.
> >> > >
> >> > > https://github.com/apache/arrow/pull/5280
> >> > >
> >> > > Eric
> >> > >
> >> > > -----Original Message-
> >> > > From: Eric Erhardt 
> >> > > Sent: Wednesday, September 4, 2019 10:12 AM
> >> > > To: dev@arrow.apache.org; Ji Liu 
> >> > > Cc: emkornfield ; Paul Taylor <
> >> ptay...@apache.org
> >> > >
> >> > > Subject: RE: [RESULT] [VOTE] Alter Arrow binary protocol to address
> >> > 8-byte Flatbuffer alignment requirements (2nd vote)
> >> > >
> >> > > I'm working on a PR for the C# bindings. I hope to have it up in the
> >> > next day or two. Integration tests for C# would be a great addition at
> >> some
> >> > point - it's been on my backlog. For now I plan on manually testing it.
> >> > >
> >> > > -Original Message-
> >> > > From: Wes McKinney 
> >> > > Sent: Tuesday, September 3, 2019 10:17 PM
&g

Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte Flatbuffer alignment requirements (2nd vote)

2019-09-10 Thread Bryan Cutler
I have the patch for the EOS with Java writers up here
https://github.com/apache/arrow/pull/5345. Just to clarify, the EOS of
{0x, 0x} is used for both stream and file formats, in
non-legacy writing mode.

On Mon, Sep 9, 2019 at 8:01 PM Bryan Cutler  wrote:

> Sounds good to me also and I don't think we need a vote either.
>
> On Sat, Sep 7, 2019 at 7:36 PM Micah Kornfield 
> wrote:
>
>> +1 on this, I also don't think a vote is necessary as long as we make the
>> change before 0.15.0
>>
>> On Saturday, September 7, 2019, Wes McKinney  wrote:
>>
>> > I see, thank you for catching this nuance.
>> >
>> > I agree that using {0x, 0x} for EOS will resolve the
>> > issue while allowing implementations to be backwards compatible (i.e.
>> > handling the 4-byte EOS from older payloads).
>> >
>> > I'm not sure that we need to have a vote about this, what do others
>> think?
>> >
>> > On Sat, Sep 7, 2019 at 12:47 AM Ji Liu 
>> wrote:
>> > >
>> > > Hi all,
>> > >
>> > > During the java code review[1], seems there is a problem with the
>> > current implementations(C++/Java etc) when reaching EOS, since the new
>> > format EOS is 8 bytes and the reader only reads 4 bytes when reach the
>> end
>> > of stream, and the additional 4 bytes will not be read which cause
>> problems
>> > for following up readings.
>> > >
>> > > There are some optional suggestions[2] as below, we should reach
>> > consistent and fix this problem before 0.15 release.
>> > > i. For the new format, an 8-byte EOS token should look like
>> {0x,
>> > 0x}, so we read the continuation token first, and then know to
>> read
>> > the next 4 bytes, which are then 0 to signal EOS.ii. Reader just
>> remember
>> > the state, so if it reads the continuation token from the beginning,
>> then
>> > read all 8 bytes at the end.
>> > >
>> > > Thanks,
>> > > Ji Liu
>> > >
>> > > [1] https://github.com/apache/arrow/pull/5229
>> > > [2] https://github.com/apache/arrow/pull/5229#discussion_r321715682
>> > >
>> > >
>> > >
>> > >
>> > > --
>> > > From:Eric Erhardt 
>> > > Send Time:2019年9月5日(星期四) 07:16
>> > > To:dev@arrow.apache.org ; Ji Liu <
>> > niki...@aliyun.com>
>> > > Cc:emkornfield ; Paul Taylor <
>> ptay...@apache.org>
>> > > Subject:RE: [RESULT] [VOTE] Alter Arrow binary protocol to address
>> > 8-byte Flatbuffer alignment requirements (2nd vote)
>> > >
>> > > The C# PR is up.
>> > >
>> > > https://github.com/apache/arrow/pull/5280
>> > >
>> > > Eric
>> > >
>> > > -Original Message-
>> > > From: Eric Erhardt 
>> > > Sent: Wednesday, September 4, 2019 10:12 AM
>> > > To: dev@arrow.apache.org; Ji Liu 
>> > > Cc: emkornfield ; Paul Taylor <
>> ptay...@apache.org
>> > >
>> > > Subject: RE: [RESULT] [VOTE] Alter Arrow binary protocol to address
>> > 8-byte Flatbuffer alignment requirements (2nd vote)
>> > >
>> > > I'm working on a PR for the C# bindings. I hope to have it up in the
>> > next day or two. Integration tests for C# would be a great addition at
>> some
>> > point - it's been on my backlog. For now I plan on manually testing it.
>> > >
>> > > -Original Message-
>> > > From: Wes McKinney 
>> > > Sent: Tuesday, September 3, 2019 10:17 PM
>> > > To: Ji Liu 
>> > > Cc: emkornfield ; dev ;
>> > Paul Taylor 
>> > > Subject: Re: [RESULT] [VOTE] Alter Arrow binary protocol to address
>> > 8-byte Flatbuffer alignment requirements (2nd vote)
>> > >
>> > > hi folks,
>> > >
>> > > We now have patches up for Java, JS, and Go. How are we doing on the
>> > code reviews for getting these in?
>> > >
>> > > Since C# implements the binary protocol, the C# developers might want
>> to
>> > look at this before the 0.15.0 release also. Absent integration tests
>> it's
>> > difficult to verify the C# library, though
>> > >
>> > > Thanks
>> > >
>> > &g

Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte Flatbuffer alignment requirements (2nd vote)

2019-09-09 Thread Bryan Cutler
Sounds good to me also and I don't think we need a vote either.

On Sat, Sep 7, 2019 at 7:36 PM Micah Kornfield 
wrote:

> +1 on this, I also don't think a vote is necessary as long as we make the
> change before 0.15.0
>
> On Saturday, September 7, 2019, Wes McKinney  wrote:
>
> > I see, thank you for catching this nuance.
> >
> > I agree that using {0x, 0x} for EOS will resolve the
> > issue while allowing implementations to be backwards compatible (i.e.
> > handling the 4-byte EOS from older payloads).
> >
> > I'm not sure that we need to have a vote about this, what do others
> think?
> >
> > On Sat, Sep 7, 2019 at 12:47 AM Ji Liu 
> wrote:
> > >
> > > Hi all,
> > >
> > > During the java code review[1], seems there is a problem with the
> > current implementations(C++/Java etc) when reaching EOS, since the new
> > format EOS is 8 bytes and the reader only reads 4 bytes when reach the
> end
> > of stream, and the additional 4 bytes will not be read which cause
> problems
> > for following up readings.
> > >
> > > There are some optional suggestions[2] as below, we should reach
> > consistent and fix this problem before 0.15 release.
> > > i. For the new format, an 8-byte EOS token should look like
> {0x,
> > 0x}, so we read the continuation token first, and then know to
> read
> > the next 4 bytes, which are then 0 to signal EOS.ii. Reader just remember
> > the state, so if it reads the continuation token from the beginning, then
> > read all 8 bytes at the end.
> > >
> > > Thanks,
> > > Ji Liu
> > >
> > > [1] https://github.com/apache/arrow/pull/5229
> > > [2] https://github.com/apache/arrow/pull/5229#discussion_r321715682
> > >
> > >
> > >
> > >
> > > ------------------
> > > From:Eric Erhardt 
> > > Send Time:2019年9月5日(星期四) 07:16
> > > To:dev@arrow.apache.org ; Ji Liu <
> > niki...@aliyun.com>
> > > Cc:emkornfield ; Paul Taylor <
> ptay...@apache.org>
> > > Subject:RE: [RESULT] [VOTE] Alter Arrow binary protocol to address
> > 8-byte Flatbuffer alignment requirements (2nd vote)
> > >
> > > The C# PR is up.
> > >
> > > https://github.com/apache/arrow/pull/5280
> > >
> > > Eric
> > >
> > > -Original Message-
> > > From: Eric Erhardt 
> > > Sent: Wednesday, September 4, 2019 10:12 AM
> > > To: dev@arrow.apache.org; Ji Liu 
> > > Cc: emkornfield ; Paul Taylor <
> ptay...@apache.org
> > >
> > > Subject: RE: [RESULT] [VOTE] Alter Arrow binary protocol to address
> > 8-byte Flatbuffer alignment requirements (2nd vote)
> > >
> > > I'm working on a PR for the C# bindings. I hope to have it up in the
> > next day or two. Integration tests for C# would be a great addition at
> some
> > point - it's been on my backlog. For now I plan on manually testing it.
> > >
> > > -Original Message-
> > > From: Wes McKinney 
> > > Sent: Tuesday, September 3, 2019 10:17 PM
> > > To: Ji Liu 
> > > Cc: emkornfield ; dev ;
> > Paul Taylor 
> > > Subject: Re: [RESULT] [VOTE] Alter Arrow binary protocol to address
> > 8-byte Flatbuffer alignment requirements (2nd vote)
> > >
> > > hi folks,
> > >
> > > We now have patches up for Java, JS, and Go. How are we doing on the
> > code reviews for getting these in?
> > >
> > > Since C# implements the binary protocol, the C# developers might want
> to
> > look at this before the 0.15.0 release also. Absent integration tests
> it's
> > difficult to verify the C# library, though
> > >
> > > Thanks
> > >
> > > On Thu, Aug 29, 2019 at 8:13 AM Ji Liu  wrote:
> > > >
> > > > Here is the Java implementation
> > > >
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgith
> > > > ub.com
> %2Fapache%2Farrow%2Fpull%2F5229data=02%7C01%7CEric.Erhardt%
> > > > 40microsoft.com
> %7C90f02600c4ce40ff5c9008d730e66b68%7C72f988bf86f141af9
> > > >
> 1ab2d7cd011db47%7C1%7C0%7C637031638512163816sdata=b87u5x8lLvfdnU5
> > > > 6LrGzYR8H0Jh8FfwY2cVjbOsY9hY%3Dreserved=0
> > > >
> > > > cc @Wes McKinney @emkornfield
> > > >
> > > > Thanks,
> > > > Ji Liu
> > > >
> > > > 

Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte Flatbuffer alignment requirements (2nd vote)

2019-09-07 Thread Micah Kornfield
+1 on this, I also don't think a vote is necessary as long as we make the
change before 0.15.0

On Saturday, September 7, 2019, Wes McKinney  wrote:

> I see, thank you for catching this nuance.
>
> I agree that using {0x, 0x} for EOS will resolve the
> issue while allowing implementations to be backwards compatible (i.e.
> handling the 4-byte EOS from older payloads).
>
> I'm not sure that we need to have a vote about this, what do others think?
>
> On Sat, Sep 7, 2019 at 12:47 AM Ji Liu  wrote:
> >
> > Hi all,
> >
> > During the java code review[1], seems there is a problem with the
> current implementations(C++/Java etc) when reaching EOS, since the new
> format EOS is 8 bytes and the reader only reads 4 bytes when reach the end
> of stream, and the additional 4 bytes will not be read which cause problems
> for following up readings.
> >
> > There are some optional suggestions[2] as below, we should reach
> consistent and fix this problem before 0.15 release.
> > i. For the new format, an 8-byte EOS token should look like {0x,
> 0x}, so we read the continuation token first, and then know to read
> the next 4 bytes, which are then 0 to signal EOS.ii. Reader just remember
> the state, so if it reads the continuation token from the beginning, then
> read all 8 bytes at the end.
> >
> > Thanks,
> > Ji Liu
> >
> > [1] https://github.com/apache/arrow/pull/5229
> > [2] https://github.com/apache/arrow/pull/5229#discussion_r321715682
> >
> >
> >
> >
> > --
> > From:Eric Erhardt 
> > Send Time:2019年9月5日(星期四) 07:16
> > To:dev@arrow.apache.org ; Ji Liu <
> niki...@aliyun.com>
> > Cc:emkornfield ; Paul Taylor 
> > Subject:RE: [RESULT] [VOTE] Alter Arrow binary protocol to address
> 8-byte Flatbuffer alignment requirements (2nd vote)
> >
> > The C# PR is up.
> >
> > https://github.com/apache/arrow/pull/5280
> >
> > Eric
> >
> > -----Original Message-----
> > From: Eric Erhardt 
> > Sent: Wednesday, September 4, 2019 10:12 AM
> > To: dev@arrow.apache.org; Ji Liu 
> > Cc: emkornfield ; Paul Taylor  >
> > Subject: RE: [RESULT] [VOTE] Alter Arrow binary protocol to address
> 8-byte Flatbuffer alignment requirements (2nd vote)
> >
> > I'm working on a PR for the C# bindings. I hope to have it up in the
> next day or two. Integration tests for C# would be a great addition at some
> point - it's been on my backlog. For now I plan on manually testing it.
> >
> > -Original Message-
> > From: Wes McKinney 
> > Sent: Tuesday, September 3, 2019 10:17 PM
> > To: Ji Liu 
> > Cc: emkornfield ; dev ;
> Paul Taylor 
> > Subject: Re: [RESULT] [VOTE] Alter Arrow binary protocol to address
> 8-byte Flatbuffer alignment requirements (2nd vote)
> >
> > hi folks,
> >
> > We now have patches up for Java, JS, and Go. How are we doing on the
> code reviews for getting these in?
> >
> > Since C# implements the binary protocol, the C# developers might want to
> look at this before the 0.15.0 release also. Absent integration tests it's
> difficult to verify the C# library, though
> >
> > Thanks
> >
> > On Thu, Aug 29, 2019 at 8:13 AM Ji Liu  wrote:
> > >
> > > Here is the Java implementation
> > > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgith
> > > ub.com%2Fapache%2Farrow%2Fpull%2F5229data=02%7C01%7CEric.Erhardt%
> > > 40microsoft.com%7C90f02600c4ce40ff5c9008d730e66b68%7C72f988bf86f141af9
> > > 1ab2d7cd011db47%7C1%7C0%7C637031638512163816sdata=b87u5x8lLvfdnU5
> > > 6LrGzYR8H0Jh8FfwY2cVjbOsY9hY%3Dreserved=0
> > >
> > > cc @Wes McKinney @emkornfield
> > >
> > > Thanks,
> > > Ji Liu
> > >
> > > --
> > > From:Ji Liu  Send Time:2019年8月28日(星期三)
> > > 17:34 To:emkornfield ; dev
> > >  Cc:Paul Taylor 
> > > Subject:Re: [RESULT] [VOTE] Alter Arrow binary protocol to address
> > > 8-byte Flatbuffer alignment requirements (2nd vote)
> > >
> > > I could take the Java implementation and will take a close watch on
> this issue in the next few days.
> > >
> > > Thanks,
> > > Ji Liu
> > >
> > >
> > > --
> > > From:Micah Kornfield  Send Time:2019年8月28日(星期三)
> > > 17:14 To:dev  Cc:Paul Taylor
> > > 
> > >

Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte Flatbuffer alignment requirements (2nd vote)

2019-09-07 Thread Wes McKinney
I see, thank you for catching this nuance.

I agree that using {0x, 0x} for EOS will resolve the
issue while allowing implementations to be backwards compatible (i.e.
handling the 4-byte EOS from older payloads).

I'm not sure that we need to have a vote about this, what do others think?

On Sat, Sep 7, 2019 at 12:47 AM Ji Liu  wrote:
>
> Hi all,
>
> During the java code review[1], seems there is a problem with the current 
> implementations(C++/Java etc) when reaching EOS, since the new format EOS is 
> 8 bytes and the reader only reads 4 bytes when reach the end of stream, and 
> the additional 4 bytes will not be read which cause problems for following up 
> readings.
>
> There are some optional suggestions[2] as below, we should reach consistent 
> and fix this problem before 0.15 release.
> i. For the new format, an 8-byte EOS token should look like {0x, 
> 0x}, so we read the continuation token first, and then know to read 
> the next 4 bytes, which are then 0 to signal EOS.ii. Reader just remember the 
> state, so if it reads the continuation token from the beginning, then read 
> all 8 bytes at the end.
>
> Thanks,
> Ji Liu
>
> [1] https://github.com/apache/arrow/pull/5229
> [2] https://github.com/apache/arrow/pull/5229#discussion_r321715682
>
>
>
>
> --
> From:Eric Erhardt 
> Send Time:2019年9月5日(星期四) 07:16
> To:dev@arrow.apache.org ; Ji Liu 
> Cc:emkornfield ; Paul Taylor 
> Subject:RE: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte 
> Flatbuffer alignment requirements (2nd vote)
>
> The C# PR is up.
>
> https://github.com/apache/arrow/pull/5280
>
> Eric
>
> -Original Message-
> From: Eric Erhardt 
> Sent: Wednesday, September 4, 2019 10:12 AM
> To: dev@arrow.apache.org; Ji Liu 
> Cc: emkornfield ; Paul Taylor 
> Subject: RE: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte 
> Flatbuffer alignment requirements (2nd vote)
>
> I'm working on a PR for the C# bindings. I hope to have it up in the next day 
> or two. Integration tests for C# would be a great addition at some point - 
> it's been on my backlog. For now I plan on manually testing it.
>
> -Original Message-
> From: Wes McKinney 
> Sent: Tuesday, September 3, 2019 10:17 PM
> To: Ji Liu 
> Cc: emkornfield ; dev ; Paul 
> Taylor 
> Subject: Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte 
> Flatbuffer alignment requirements (2nd vote)
>
> hi folks,
>
> We now have patches up for Java, JS, and Go. How are we doing on the code 
> reviews for getting these in?
>
> Since C# implements the binary protocol, the C# developers might want to look 
> at this before the 0.15.0 release also. Absent integration tests it's 
> difficult to verify the C# library, though
>
> Thanks
>
> On Thu, Aug 29, 2019 at 8:13 AM Ji Liu  wrote:
> >
> > Here is the Java implementation
> > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgith
> > ub.com%2Fapache%2Farrow%2Fpull%2F5229data=02%7C01%7CEric.Erhardt%
> > 40microsoft.com%7C90f02600c4ce40ff5c9008d730e66b68%7C72f988bf86f141af9
> > 1ab2d7cd011db47%7C1%7C0%7C637031638512163816sdata=b87u5x8lLvfdnU5
> > 6LrGzYR8H0Jh8FfwY2cVjbOsY9hY%3Dreserved=0
> >
> > cc @Wes McKinney @emkornfield
> >
> > Thanks,
> > Ji Liu
> >
> > --
> > From:Ji Liu  Send Time:2019年8月28日(星期三)
> > 17:34 To:emkornfield ; dev
> >  Cc:Paul Taylor 
> > Subject:Re: [RESULT] [VOTE] Alter Arrow binary protocol to address
> > 8-byte Flatbuffer alignment requirements (2nd vote)
> >
> > I could take the Java implementation and will take a close watch on this 
> > issue in the next few days.
> >
> > Thanks,
> > Ji Liu
> >
> >
> > --
> > From:Micah Kornfield  Send Time:2019年8月28日(星期三)
> > 17:14 To:dev  Cc:Paul Taylor
> > 
> > Subject:Re: [RESULT] [VOTE] Alter Arrow binary protocol to address
> > 8-byte Flatbuffer alignment requirements (2nd vote)
> >
> > I should have integration tests with 0.14.1 generated binaries in the
> > next few days.  I think the one remaining unassigned piece of work in
> > the Java implementation, i can take that up next if no one else gets to it.
> >
> > On Tue, Aug 27, 2019 at 7:19 PM Wes McKinney  wrote:
> >
> > > Here's the C++ changes
> > >
> > > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgi
> > > thub.com%2Fapache%2Far

Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte Flatbuffer alignment requirements (2nd vote)

2019-09-06 Thread Ji Liu
Hi all,

During the java code review[1], seems there is a problem with the current 
implementations(C++/Java etc) when reaching EOS, since the new format EOS is 8 
bytes and the reader only reads 4 bytes when reach the end of stream, and the 
additional 4 bytes will not be read which cause problems for following up 
readings.

There are some optional suggestions[2] as below, we should reach consistent and 
fix this problem before 0.15 release.
i. For the new format, an 8-byte EOS token should look like {0x, 
0x}, so we read the continuation token first, and then know to read the 
next 4 bytes, which are then 0 to signal EOS.ii. Reader just remember the 
state, so if it reads the continuation token from the beginning, then read all 
8 bytes at the end.

Thanks,
Ji Liu

[1] https://github.com/apache/arrow/pull/5229
[2] https://github.com/apache/arrow/pull/5229#discussion_r321715682




--
From:Eric Erhardt 
Send Time:2019年9月5日(星期四) 07:16
To:dev@arrow.apache.org ; Ji Liu 
Cc:emkornfield ; Paul Taylor 
Subject:RE: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte 
Flatbuffer alignment requirements (2nd vote)

The C# PR is up.

https://github.com/apache/arrow/pull/5280

Eric

-Original Message-
From: Eric Erhardt  
Sent: Wednesday, September 4, 2019 10:12 AM
To: dev@arrow.apache.org; Ji Liu 
Cc: emkornfield ; Paul Taylor 
Subject: RE: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte 
Flatbuffer alignment requirements (2nd vote)

I'm working on a PR for the C# bindings. I hope to have it up in the next day 
or two. Integration tests for C# would be a great addition at some point - it's 
been on my backlog. For now I plan on manually testing it.

-Original Message-
From: Wes McKinney 
Sent: Tuesday, September 3, 2019 10:17 PM
To: Ji Liu 
Cc: emkornfield ; dev ; Paul 
Taylor 
Subject: Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte 
Flatbuffer alignment requirements (2nd vote)

hi folks,

We now have patches up for Java, JS, and Go. How are we doing on the code 
reviews for getting these in?

Since C# implements the binary protocol, the C# developers might want to look 
at this before the 0.15.0 release also. Absent integration tests it's difficult 
to verify the C# library, though

Thanks

On Thu, Aug 29, 2019 at 8:13 AM Ji Liu  wrote:
>
> Here is the Java implementation
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgith
> ub.com%2Fapache%2Farrow%2Fpull%2F5229data=02%7C01%7CEric.Erhardt%
> 40microsoft.com%7C90f02600c4ce40ff5c9008d730e66b68%7C72f988bf86f141af9
> 1ab2d7cd011db47%7C1%7C0%7C637031638512163816sdata=b87u5x8lLvfdnU5
> 6LrGzYR8H0Jh8FfwY2cVjbOsY9hY%3Dreserved=0
>
> cc @Wes McKinney @emkornfield
>
> Thanks,
> Ji Liu
>
> --
> From:Ji Liu  Send Time:2019年8月28日(星期三)
> 17:34 To:emkornfield ; dev 
>  Cc:Paul Taylor 
> Subject:Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 
> 8-byte Flatbuffer alignment requirements (2nd vote)
>
> I could take the Java implementation and will take a close watch on this 
> issue in the next few days.
>
> Thanks,
> Ji Liu
>
>
> --
> From:Micah Kornfield  Send Time:2019年8月28日(星期三)
> 17:14 To:dev  Cc:Paul Taylor 
> 
> Subject:Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 
> 8-byte Flatbuffer alignment requirements (2nd vote)
>
> I should have integration tests with 0.14.1 generated binaries in the 
> next few days.  I think the one remaining unassigned piece of work in 
> the Java implementation, i can take that up next if no one else gets to it.
>
> On Tue, Aug 27, 2019 at 7:19 PM Wes McKinney  wrote:
>
> > Here's the C++ changes
> >
> > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgi
> > thub.com%2Fapache%2Farrow%2Fpull%2F5211data=02%7C01%7CEric.Erha
> > rdt%40microsoft.com%7C90f02600c4ce40ff5c9008d730e66b68%7C72f988bf86f
> > 141af91ab2d7cd011db47%7C1%7C0%7C637031638512163816sdata=zWaHS8X
> > YIQA85xcFG%2FMrOcSfrI8xZtyuHRoaDH%2FIP2g%3Dreserved=0
> >
> > I'm going to create a integration branch where we can merge each 
> > patch before merging to master
> >
> > On Fri, Aug 23, 2019 at 9:03 AM Wes McKinney  wrote:
> > >
> > > It isn't implemented in C++ yet but I will try to get a patch up 
> > > for that soon (today maybe). I think we should create a branch 
> > > where we can stack the patches that implement this for each language.
> > >
> > > On Fri, Aug 23, 2019 at 4:04 AM Paul Taylor 
> > > 
> > wrote:
> > > >
> > > > I'll do the JS updates. Is it safe to 

RE: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte Flatbuffer alignment requirements (2nd vote)

2019-09-04 Thread Eric Erhardt
The C# PR is up.

https://github.com/apache/arrow/pull/5280

Eric

-Original Message-
From: Eric Erhardt  
Sent: Wednesday, September 4, 2019 10:12 AM
To: dev@arrow.apache.org; Ji Liu 
Cc: emkornfield ; Paul Taylor 
Subject: RE: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte 
Flatbuffer alignment requirements (2nd vote)

I'm working on a PR for the C# bindings. I hope to have it up in the next day 
or two. Integration tests for C# would be a great addition at some point - it's 
been on my backlog. For now I plan on manually testing it.

-Original Message-
From: Wes McKinney 
Sent: Tuesday, September 3, 2019 10:17 PM
To: Ji Liu 
Cc: emkornfield ; dev ; Paul 
Taylor 
Subject: Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte 
Flatbuffer alignment requirements (2nd vote)

hi folks,

We now have patches up for Java, JS, and Go. How are we doing on the code 
reviews for getting these in?

Since C# implements the binary protocol, the C# developers might want to look 
at this before the 0.15.0 release also. Absent integration tests it's difficult 
to verify the C# library, though

Thanks

On Thu, Aug 29, 2019 at 8:13 AM Ji Liu  wrote:
>
> Here is the Java implementation
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgith
> ub.com%2Fapache%2Farrow%2Fpull%2F5229data=02%7C01%7CEric.Erhardt%
> 40microsoft.com%7C90f02600c4ce40ff5c9008d730e66b68%7C72f988bf86f141af9
> 1ab2d7cd011db47%7C1%7C0%7C637031638512163816sdata=b87u5x8lLvfdnU5
> 6LrGzYR8H0Jh8FfwY2cVjbOsY9hY%3Dreserved=0
>
> cc @Wes McKinney @emkornfield
>
> Thanks,
> Ji Liu
>
> --
> From:Ji Liu  Send Time:2019年8月28日(星期三)
> 17:34 To:emkornfield ; dev 
>  Cc:Paul Taylor 
> Subject:Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 
> 8-byte Flatbuffer alignment requirements (2nd vote)
>
> I could take the Java implementation and will take a close watch on this 
> issue in the next few days.
>
> Thanks,
> Ji Liu
>
>
> --
> From:Micah Kornfield  Send Time:2019年8月28日(星期三)
> 17:14 To:dev  Cc:Paul Taylor 
> 
> Subject:Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 
> 8-byte Flatbuffer alignment requirements (2nd vote)
>
> I should have integration tests with 0.14.1 generated binaries in the 
> next few days.  I think the one remaining unassigned piece of work in 
> the Java implementation, i can take that up next if no one else gets to it.
>
> On Tue, Aug 27, 2019 at 7:19 PM Wes McKinney  wrote:
>
> > Here's the C++ changes
> >
> > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgi
> > thub.com%2Fapache%2Farrow%2Fpull%2F5211data=02%7C01%7CEric.Erha
> > rdt%40microsoft.com%7C90f02600c4ce40ff5c9008d730e66b68%7C72f988bf86f
> > 141af91ab2d7cd011db47%7C1%7C0%7C637031638512163816sdata=zWaHS8X
> > YIQA85xcFG%2FMrOcSfrI8xZtyuHRoaDH%2FIP2g%3Dreserved=0
> >
> > I'm going to create a integration branch where we can merge each 
> > patch before merging to master
> >
> > On Fri, Aug 23, 2019 at 9:03 AM Wes McKinney  wrote:
> > >
> > > It isn't implemented in C++ yet but I will try to get a patch up 
> > > for that soon (today maybe). I think we should create a branch 
> > > where we can stack the patches that implement this for each language.
> > >
> > > On Fri, Aug 23, 2019 at 4:04 AM Paul Taylor 
> > > 
> > wrote:
> > > >
> > > > I'll do the JS updates. Is it safe to validate against the Arrow
> > > > C++ integration tests?
> > > >
> > > >
> > > > On 8/22/19 7:28 PM, Micah Kornfield wrote:
> > > > > I created
> > > > > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2
> > > > > F%2Fissues.apache.org%2Fjira%2Fbrowse%2FARROW-6313data=02
> > > > > %7C01%7CEric.Erhardt%40microsoft.com%7C90f02600c4ce40ff5c9008d
> > > > > 730e66b68%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C6370316
> > > > > 38512163816sdata=L57rZWFPdeuRtxFTkL%2F4g9RNI8lXFkRDXQadmj
> > > > > NiLxI%3Dreserved=0 as a
> > tracking
> > > > > issue with sub-issues on the development work.  So far no-one 
> > > > > has
> > claimed
> > > > > Java and Javascript tasks.
> > > > >
> > > > > Would it make sense to have a separate dev branch for this work?
> > > > >
> > > > > Thanks,
> > > > > Micah
> > > > >
> > > > > On Thu, Aug 22, 2019 at 3:24 PM Wes McKinney 
> > > > > 
&g

RE: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte Flatbuffer alignment requirements (2nd vote)

2019-09-04 Thread Eric Erhardt
I'm working on a PR for the C# bindings. I hope to have it up in the next day 
or two. Integration tests for C# would be a great addition at some point - it's 
been on my backlog. For now I plan on manually testing it.

-Original Message-
From: Wes McKinney  
Sent: Tuesday, September 3, 2019 10:17 PM
To: Ji Liu 
Cc: emkornfield ; dev ; Paul 
Taylor 
Subject: Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte 
Flatbuffer alignment requirements (2nd vote)

hi folks,

We now have patches up for Java, JS, and Go. How are we doing on the code 
reviews for getting these in?

Since C# implements the binary protocol, the C# developers might want to look 
at this before the 0.15.0 release also. Absent integration tests it's difficult 
to verify the C# library, though

Thanks

On Thu, Aug 29, 2019 at 8:13 AM Ji Liu  wrote:
>
> Here is the Java implementation
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgith
> ub.com%2Fapache%2Farrow%2Fpull%2F5229data=02%7C01%7CEric.Erhardt%
> 40microsoft.com%7C90f02600c4ce40ff5c9008d730e66b68%7C72f988bf86f141af9
> 1ab2d7cd011db47%7C1%7C0%7C637031638512163816sdata=b87u5x8lLvfdnU5
> 6LrGzYR8H0Jh8FfwY2cVjbOsY9hY%3Dreserved=0
>
> cc @Wes McKinney @emkornfield
>
> Thanks,
> Ji Liu
>
> --
> From:Ji Liu  Send Time:2019年8月28日(星期三) 
> 17:34 To:emkornfield ; dev 
>  Cc:Paul Taylor 
> Subject:Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 
> 8-byte Flatbuffer alignment requirements (2nd vote)
>
> I could take the Java implementation and will take a close watch on this 
> issue in the next few days.
>
> Thanks,
> Ji Liu
>
>
> --
> From:Micah Kornfield  Send Time:2019年8月28日(星期三) 
> 17:14 To:dev  Cc:Paul Taylor 
> 
> Subject:Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 
> 8-byte Flatbuffer alignment requirements (2nd vote)
>
> I should have integration tests with 0.14.1 generated binaries in the 
> next few days.  I think the one remaining unassigned piece of work in 
> the Java implementation, i can take that up next if no one else gets to it.
>
> On Tue, Aug 27, 2019 at 7:19 PM Wes McKinney  wrote:
>
> > Here's the C++ changes
> >
> > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgi
> > thub.com%2Fapache%2Farrow%2Fpull%2F5211data=02%7C01%7CEric.Erha
> > rdt%40microsoft.com%7C90f02600c4ce40ff5c9008d730e66b68%7C72f988bf86f
> > 141af91ab2d7cd011db47%7C1%7C0%7C637031638512163816sdata=zWaHS8X
> > YIQA85xcFG%2FMrOcSfrI8xZtyuHRoaDH%2FIP2g%3Dreserved=0
> >
> > I'm going to create a integration branch where we can merge each 
> > patch before merging to master
> >
> > On Fri, Aug 23, 2019 at 9:03 AM Wes McKinney  wrote:
> > >
> > > It isn't implemented in C++ yet but I will try to get a patch up 
> > > for that soon (today maybe). I think we should create a branch 
> > > where we can stack the patches that implement this for each language.
> > >
> > > On Fri, Aug 23, 2019 at 4:04 AM Paul Taylor 
> > > 
> > wrote:
> > > >
> > > > I'll do the JS updates. Is it safe to validate against the Arrow 
> > > > C++ integration tests?
> > > >
> > > >
> > > > On 8/22/19 7:28 PM, Micah Kornfield wrote:
> > > > > I created 
> > > > > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2
> > > > > F%2Fissues.apache.org%2Fjira%2Fbrowse%2FARROW-6313data=02
> > > > > %7C01%7CEric.Erhardt%40microsoft.com%7C90f02600c4ce40ff5c9008d
> > > > > 730e66b68%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C6370316
> > > > > 38512163816sdata=L57rZWFPdeuRtxFTkL%2F4g9RNI8lXFkRDXQadmj
> > > > > NiLxI%3Dreserved=0 as a
> > tracking
> > > > > issue with sub-issues on the development work.  So far no-one 
> > > > > has
> > claimed
> > > > > Java and Javascript tasks.
> > > > >
> > > > > Would it make sense to have a separate dev branch for this work?
> > > > >
> > > > > Thanks,
> > > > > Micah
> > > > >
> > > > > On Thu, Aug 22, 2019 at 3:24 PM Wes McKinney 
> > > > > 
> > wrote:
> > > > >
> > > > >> The vote carries with 4 binding +1 votes and 1 non-binding +1
> > > > >>
> > > > >> I'll merge the specification patch later today and we can 
> > > > >> begin working on implementations so we can get this done for 
> &

Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte Flatbuffer alignment requirements (2nd vote)

2019-09-03 Thread Wes McKinney
hi folks,

We now have patches up for Java, JS, and Go. How are we doing on the
code reviews for getting these in?

Since C# implements the binary protocol, the C# developers might want
to look at this before the 0.15.0 release also. Absent integration
tests it's difficult to verify the C# library, though

Thanks

On Thu, Aug 29, 2019 at 8:13 AM Ji Liu  wrote:
>
> Here is the Java implementation
> https://github.com/apache/arrow/pull/5229
>
> cc @Wes McKinney @emkornfield
>
> Thanks,
> Ji Liu
>
> --
> From:Ji Liu 
> Send Time:2019年8月28日(星期三) 17:34
> To:emkornfield ; dev 
> Cc:Paul Taylor 
> Subject:Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte 
> Flatbuffer alignment requirements (2nd vote)
>
> I could take the Java implementation and will take a close watch on this 
> issue in the next few days.
>
> Thanks,
> Ji Liu
>
>
> --
> From:Micah Kornfield 
> Send Time:2019年8月28日(星期三) 17:14
> To:dev 
> Cc:Paul Taylor 
> Subject:Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte 
> Flatbuffer alignment requirements (2nd vote)
>
> I should have integration tests with 0.14.1 generated binaries in the next
> few days.  I think the one remaining unassigned piece of work in the Java
> implementation, i can take that up next if no one else gets to it.
>
> On Tue, Aug 27, 2019 at 7:19 PM Wes McKinney  wrote:
>
> > Here's the C++ changes
> >
> > https://github.com/apache/arrow/pull/5211
> >
> > I'm going to create a integration branch where we can merge each patch
> > before merging to master
> >
> > On Fri, Aug 23, 2019 at 9:03 AM Wes McKinney  wrote:
> > >
> > > It isn't implemented in C++ yet but I will try to get a patch up for
> > > that soon (today maybe). I think we should create a branch where we
> > > can stack the patches that implement this for each language.
> > >
> > > On Fri, Aug 23, 2019 at 4:04 AM Paul Taylor 
> > wrote:
> > > >
> > > > I'll do the JS updates. Is it safe to validate against the Arrow C++
> > > > integration tests?
> > > >
> > > >
> > > > On 8/22/19 7:28 PM, Micah Kornfield wrote:
> > > > > I created https://issues.apache.org/jira/browse/ARROW-6313 as a
> > tracking
> > > > > issue with sub-issues on the development work.  So far no-one has
> > claimed
> > > > > Java and Javascript tasks.
> > > > >
> > > > > Would it make sense to have a separate dev branch for this work?
> > > > >
> > > > > Thanks,
> > > > > Micah
> > > > >
> > > > > On Thu, Aug 22, 2019 at 3:24 PM Wes McKinney 
> > wrote:
> > > > >
> > > > >> The vote carries with 4 binding +1 votes and 1 non-binding +1
> > > > >>
> > > > >> I'll merge the specification patch later today and we can begin
> > > > >> working on implementations so we can get this done for 0.15.0
> > > > >>
> > > > >> On Tue, Aug 20, 2019 at 12:30 PM Bryan Cutler 
> > wrote:
> > > > >>> +1 (non-binding)
> > > > >>>
> > > > >>> On Tue, Aug 20, 2019, 7:43 AM Antoine Pitrou 
> > > > >> wrote:
> > > > >>>> Sorry, had forgotten to send my vote on this.
> > > > >>>>
> > > > >>>> +1 from me.
> > > > >>>>
> > > > >>>> Regards
> > > > >>>>
> > > > >>>> Antoine.
> > > > >>>>
> > > > >>>>
> > > > >>>> On Wed, 14 Aug 2019 17:42:33 -0500
> > > > >>>> Wes McKinney  wrote:
> > > > >>>>> hi all,
> > > > >>>>>
> > > > >>>>> As we've been discussing [1], there is a need to introduce 4
> > bytes of
> > > > >>>>> padding into the preamble of the "encapsulated IPC message"
> > format to
> > > > >>>>> ensure that the Flatbuffers metadata payload begins on an 8-byte
> > > > >>>>> aligned memory offset. The alternative to this would be for Arrow
> > > > >>>>> implementations where alignment is important (e.g. C or C++) to
> > copy
> > > > >>>>> the metadata (whic

Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte Flatbuffer alignment requirements (2nd vote)

2019-08-29 Thread Ji Liu
Here is the Java implementation
https://github.com/apache/arrow/pull/5229

cc @Wes McKinney @emkornfield

Thanks,
Ji Liu


--
From:Ji Liu 
Send Time:2019年8月28日(星期三) 17:34
To:emkornfield ; dev 
Cc:Paul Taylor 
Subject:Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte 
Flatbuffer alignment requirements (2nd vote)

I could take the Java implementation and will take a close watch on this issue 
in the next few days.

Thanks,
Ji Liu


--
From:Micah Kornfield 
Send Time:2019年8月28日(星期三) 17:14
To:dev 
Cc:Paul Taylor 
Subject:Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte 
Flatbuffer alignment requirements (2nd vote)

I should have integration tests with 0.14.1 generated binaries in the next
few days.  I think the one remaining unassigned piece of work in the Java
implementation, i can take that up next if no one else gets to it.

On Tue, Aug 27, 2019 at 7:19 PM Wes McKinney  wrote:

> Here's the C++ changes
>
> https://github.com/apache/arrow/pull/5211
>
> I'm going to create a integration branch where we can merge each patch
> before merging to master
>
> On Fri, Aug 23, 2019 at 9:03 AM Wes McKinney  wrote:
> >
> > It isn't implemented in C++ yet but I will try to get a patch up for
> > that soon (today maybe). I think we should create a branch where we
> > can stack the patches that implement this for each language.
> >
> > On Fri, Aug 23, 2019 at 4:04 AM Paul Taylor 
> wrote:
> > >
> > > I'll do the JS updates. Is it safe to validate against the Arrow C++
> > > integration tests?
> > >
> > >
> > > On 8/22/19 7:28 PM, Micah Kornfield wrote:
> > > > I created https://issues.apache.org/jira/browse/ARROW-6313 as a
> tracking
> > > > issue with sub-issues on the development work.  So far no-one has
> claimed
> > > > Java and Javascript tasks.
> > > >
> > > > Would it make sense to have a separate dev branch for this work?
> > > >
> > > > Thanks,
> > > > Micah
> > > >
> > > > On Thu, Aug 22, 2019 at 3:24 PM Wes McKinney 
> wrote:
> > > >
> > > >> The vote carries with 4 binding +1 votes and 1 non-binding +1
> > > >>
> > > >> I'll merge the specification patch later today and we can begin
> > > >> working on implementations so we can get this done for 0.15.0
> > > >>
> > > >> On Tue, Aug 20, 2019 at 12:30 PM Bryan Cutler 
> wrote:
> > > >>> +1 (non-binding)
> > > >>>
> > > >>> On Tue, Aug 20, 2019, 7:43 AM Antoine Pitrou 
> > > >> wrote:
> > > >>>> Sorry, had forgotten to send my vote on this.
> > > >>>>
> > > >>>> +1 from me.
> > > >>>>
> > > >>>> Regards
> > > >>>>
> > > >>>> Antoine.
> > > >>>>
> > > >>>>
> > > >>>> On Wed, 14 Aug 2019 17:42:33 -0500
> > > >>>> Wes McKinney  wrote:
> > > >>>>> hi all,
> > > >>>>>
> > > >>>>> As we've been discussing [1], there is a need to introduce 4
> bytes of
> > > >>>>> padding into the preamble of the "encapsulated IPC message"
> format to
> > > >>>>> ensure that the Flatbuffers metadata payload begins on an 8-byte
> > > >>>>> aligned memory offset. The alternative to this would be for Arrow
> > > >>>>> implementations where alignment is important (e.g. C or C++) to
> copy
> > > >>>>> the metadata (which is not always small) into memory when it is
> > > >>>>> unaligned.
> > > >>>>>
> > > >>>>> Micah has proposed to address this by adding a
> > > >>>>> 4-byte "continuation" value at the beginning of the payload
> > > >>>>> having the value 0x. The reason to do it this way is that
> > > >>>>> old clients will see an invalid length (what is currently the
> > > >>>>> first 4 bytes of the message -- a 32-bit little endian signed
> > > >>>>> integer indicating the metadata length) rather than potentially
> > > >>>>> crashing on a valid length. We also propose to expand the "end of
> > > >>>>> stream" marker used i

Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte Flatbuffer alignment requirements (2nd vote)

2019-08-28 Thread Ji Liu
I could take the Java implementation and will take a close watch on this issue 
in the next few days.

Thanks,
Ji Liu


--
From:Micah Kornfield 
Send Time:2019年8月28日(星期三) 17:14
To:dev 
Cc:Paul Taylor 
Subject:Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte 
Flatbuffer alignment requirements (2nd vote)

I should have integration tests with 0.14.1 generated binaries in the next
few days.  I think the one remaining unassigned piece of work in the Java
implementation, i can take that up next if no one else gets to it.

On Tue, Aug 27, 2019 at 7:19 PM Wes McKinney  wrote:

> Here's the C++ changes
>
> https://github.com/apache/arrow/pull/5211
>
> I'm going to create a integration branch where we can merge each patch
> before merging to master
>
> On Fri, Aug 23, 2019 at 9:03 AM Wes McKinney  wrote:
> >
> > It isn't implemented in C++ yet but I will try to get a patch up for
> > that soon (today maybe). I think we should create a branch where we
> > can stack the patches that implement this for each language.
> >
> > On Fri, Aug 23, 2019 at 4:04 AM Paul Taylor 
> wrote:
> > >
> > > I'll do the JS updates. Is it safe to validate against the Arrow C++
> > > integration tests?
> > >
> > >
> > > On 8/22/19 7:28 PM, Micah Kornfield wrote:
> > > > I created https://issues.apache.org/jira/browse/ARROW-6313 as a
> tracking
> > > > issue with sub-issues on the development work.  So far no-one has
> claimed
> > > > Java and Javascript tasks.
> > > >
> > > > Would it make sense to have a separate dev branch for this work?
> > > >
> > > > Thanks,
> > > > Micah
> > > >
> > > > On Thu, Aug 22, 2019 at 3:24 PM Wes McKinney 
> wrote:
> > > >
> > > >> The vote carries with 4 binding +1 votes and 1 non-binding +1
> > > >>
> > > >> I'll merge the specification patch later today and we can begin
> > > >> working on implementations so we can get this done for 0.15.0
> > > >>
> > > >> On Tue, Aug 20, 2019 at 12:30 PM Bryan Cutler 
> wrote:
> > > >>> +1 (non-binding)
> > > >>>
> > > >>> On Tue, Aug 20, 2019, 7:43 AM Antoine Pitrou 
> > > >> wrote:
> > > >>>> Sorry, had forgotten to send my vote on this.
> > > >>>>
> > > >>>> +1 from me.
> > > >>>>
> > > >>>> Regards
> > > >>>>
> > > >>>> Antoine.
> > > >>>>
> > > >>>>
> > > >>>> On Wed, 14 Aug 2019 17:42:33 -0500
> > > >>>> Wes McKinney  wrote:
> > > >>>>> hi all,
> > > >>>>>
> > > >>>>> As we've been discussing [1], there is a need to introduce 4
> bytes of
> > > >>>>> padding into the preamble of the "encapsulated IPC message"
> format to
> > > >>>>> ensure that the Flatbuffers metadata payload begins on an 8-byte
> > > >>>>> aligned memory offset. The alternative to this would be for Arrow
> > > >>>>> implementations where alignment is important (e.g. C or C++) to
> copy
> > > >>>>> the metadata (which is not always small) into memory when it is
> > > >>>>> unaligned.
> > > >>>>>
> > > >>>>> Micah has proposed to address this by adding a
> > > >>>>> 4-byte "continuation" value at the beginning of the payload
> > > >>>>> having the value 0x. The reason to do it this way is that
> > > >>>>> old clients will see an invalid length (what is currently the
> > > >>>>> first 4 bytes of the message -- a 32-bit little endian signed
> > > >>>>> integer indicating the metadata length) rather than potentially
> > > >>>>> crashing on a valid length. We also propose to expand the "end of
> > > >>>>> stream" marker used in the stream and file format from 4 to 8
> > > >>>>> bytes. This has the additional effect of aligning the file footer
> > > >>>>> defined in File.fbs.
> > > >>>>>
> > > >>>>> This would be a backwards incompatible protocol change, so older
> > > >> Arrow
> > > >>&g

Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte Flatbuffer alignment requirements (2nd vote)

2019-08-28 Thread Micah Kornfield
I should have integration tests with 0.14.1 generated binaries in the next
few days.  I think the one remaining unassigned piece of work in the Java
implementation, i can take that up next if no one else gets to it.

On Tue, Aug 27, 2019 at 7:19 PM Wes McKinney  wrote:

> Here's the C++ changes
>
> https://github.com/apache/arrow/pull/5211
>
> I'm going to create a integration branch where we can merge each patch
> before merging to master
>
> On Fri, Aug 23, 2019 at 9:03 AM Wes McKinney  wrote:
> >
> > It isn't implemented in C++ yet but I will try to get a patch up for
> > that soon (today maybe). I think we should create a branch where we
> > can stack the patches that implement this for each language.
> >
> > On Fri, Aug 23, 2019 at 4:04 AM Paul Taylor 
> wrote:
> > >
> > > I'll do the JS updates. Is it safe to validate against the Arrow C++
> > > integration tests?
> > >
> > >
> > > On 8/22/19 7:28 PM, Micah Kornfield wrote:
> > > > I created https://issues.apache.org/jira/browse/ARROW-6313 as a
> tracking
> > > > issue with sub-issues on the development work.  So far no-one has
> claimed
> > > > Java and Javascript tasks.
> > > >
> > > > Would it make sense to have a separate dev branch for this work?
> > > >
> > > > Thanks,
> > > > Micah
> > > >
> > > > On Thu, Aug 22, 2019 at 3:24 PM Wes McKinney 
> wrote:
> > > >
> > > >> The vote carries with 4 binding +1 votes and 1 non-binding +1
> > > >>
> > > >> I'll merge the specification patch later today and we can begin
> > > >> working on implementations so we can get this done for 0.15.0
> > > >>
> > > >> On Tue, Aug 20, 2019 at 12:30 PM Bryan Cutler 
> wrote:
> > > >>> +1 (non-binding)
> > > >>>
> > > >>> On Tue, Aug 20, 2019, 7:43 AM Antoine Pitrou 
> > > >> wrote:
> > >  Sorry, had forgotten to send my vote on this.
> > > 
> > >  +1 from me.
> > > 
> > >  Regards
> > > 
> > >  Antoine.
> > > 
> > > 
> > >  On Wed, 14 Aug 2019 17:42:33 -0500
> > >  Wes McKinney  wrote:
> > > > hi all,
> > > >
> > > > As we've been discussing [1], there is a need to introduce 4
> bytes of
> > > > padding into the preamble of the "encapsulated IPC message"
> format to
> > > > ensure that the Flatbuffers metadata payload begins on an 8-byte
> > > > aligned memory offset. The alternative to this would be for Arrow
> > > > implementations where alignment is important (e.g. C or C++) to
> copy
> > > > the metadata (which is not always small) into memory when it is
> > > > unaligned.
> > > >
> > > > Micah has proposed to address this by adding a
> > > > 4-byte "continuation" value at the beginning of the payload
> > > > having the value 0x. The reason to do it this way is that
> > > > old clients will see an invalid length (what is currently the
> > > > first 4 bytes of the message -- a 32-bit little endian signed
> > > > integer indicating the metadata length) rather than potentially
> > > > crashing on a valid length. We also propose to expand the "end of
> > > > stream" marker used in the stream and file format from 4 to 8
> > > > bytes. This has the additional effect of aligning the file footer
> > > > defined in File.fbs.
> > > >
> > > > This would be a backwards incompatible protocol change, so older
> > > >> Arrow
> > > > libraries would not be able to read these new messages.
> Maintaining
> > > > forward compatibility (reading data produced by older libraries)
> > > >> would
> > > > be possible as we can reason that a value other than the
> continuation
> > > > value was produced by an older library (and then validate the
> > > > Flatbuffer message of course). Arrow implementations could offer
> a
> > > > backward compatibility mode for the sake of old readers if they
> > > >> desire
> > > > (this may also assist with testing).
> > > >
> > > > Additionally with this vote, we want to formally approve the
> change
> > > >> to
> > > > the Arrow "file" format to always write the (new 8-byte)
> > > >> end-of-stream
> > > > marker, which enables code that processes Arrow streams to safely
> > > >> read
> > > > the file's internal messages as though they were a normal stream.
> > > >
> > > > The PR making these changes to the IPC documentation is here
> > > >
> > > > https://github.com/apache/arrow/pull/4951
> > > >
> > > > Please vote to accept these changes. This vote will be open for
> at
> > > > least 72 hours
> > > >
> > > > [ ] +1 Adopt these Arrow protocol changes
> > > > [ ] +0
> > > > [ ] -1 I disagree because...
> > > >
> > > > Here is my vote: +1
> > > >
> > > > Thanks,
> > > > Wes
> > > >
> > > > [1]:
> > > >>
> https://lists.apache.org/thread.html/8440be572c49b7b2ffb76b63e6d935ada9efd9c1c2021369b6d27786@%3Cdev.arrow.apache.org%3E
> > > 
> > > 
> > > 
>


Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte Flatbuffer alignment requirements (2nd vote)

2019-08-27 Thread Wes McKinney
Here's the C++ changes

https://github.com/apache/arrow/pull/5211

I'm going to create a integration branch where we can merge each patch
before merging to master

On Fri, Aug 23, 2019 at 9:03 AM Wes McKinney  wrote:
>
> It isn't implemented in C++ yet but I will try to get a patch up for
> that soon (today maybe). I think we should create a branch where we
> can stack the patches that implement this for each language.
>
> On Fri, Aug 23, 2019 at 4:04 AM Paul Taylor  wrote:
> >
> > I'll do the JS updates. Is it safe to validate against the Arrow C++
> > integration tests?
> >
> >
> > On 8/22/19 7:28 PM, Micah Kornfield wrote:
> > > I created https://issues.apache.org/jira/browse/ARROW-6313 as a tracking
> > > issue with sub-issues on the development work.  So far no-one has claimed
> > > Java and Javascript tasks.
> > >
> > > Would it make sense to have a separate dev branch for this work?
> > >
> > > Thanks,
> > > Micah
> > >
> > > On Thu, Aug 22, 2019 at 3:24 PM Wes McKinney  wrote:
> > >
> > >> The vote carries with 4 binding +1 votes and 1 non-binding +1
> > >>
> > >> I'll merge the specification patch later today and we can begin
> > >> working on implementations so we can get this done for 0.15.0
> > >>
> > >> On Tue, Aug 20, 2019 at 12:30 PM Bryan Cutler  wrote:
> > >>> +1 (non-binding)
> > >>>
> > >>> On Tue, Aug 20, 2019, 7:43 AM Antoine Pitrou 
> > >> wrote:
> >  Sorry, had forgotten to send my vote on this.
> > 
> >  +1 from me.
> > 
> >  Regards
> > 
> >  Antoine.
> > 
> > 
> >  On Wed, 14 Aug 2019 17:42:33 -0500
> >  Wes McKinney  wrote:
> > > hi all,
> > >
> > > As we've been discussing [1], there is a need to introduce 4 bytes of
> > > padding into the preamble of the "encapsulated IPC message" format to
> > > ensure that the Flatbuffers metadata payload begins on an 8-byte
> > > aligned memory offset. The alternative to this would be for Arrow
> > > implementations where alignment is important (e.g. C or C++) to copy
> > > the metadata (which is not always small) into memory when it is
> > > unaligned.
> > >
> > > Micah has proposed to address this by adding a
> > > 4-byte "continuation" value at the beginning of the payload
> > > having the value 0x. The reason to do it this way is that
> > > old clients will see an invalid length (what is currently the
> > > first 4 bytes of the message -- a 32-bit little endian signed
> > > integer indicating the metadata length) rather than potentially
> > > crashing on a valid length. We also propose to expand the "end of
> > > stream" marker used in the stream and file format from 4 to 8
> > > bytes. This has the additional effect of aligning the file footer
> > > defined in File.fbs.
> > >
> > > This would be a backwards incompatible protocol change, so older
> > >> Arrow
> > > libraries would not be able to read these new messages. Maintaining
> > > forward compatibility (reading data produced by older libraries)
> > >> would
> > > be possible as we can reason that a value other than the continuation
> > > value was produced by an older library (and then validate the
> > > Flatbuffer message of course). Arrow implementations could offer a
> > > backward compatibility mode for the sake of old readers if they
> > >> desire
> > > (this may also assist with testing).
> > >
> > > Additionally with this vote, we want to formally approve the change
> > >> to
> > > the Arrow "file" format to always write the (new 8-byte)
> > >> end-of-stream
> > > marker, which enables code that processes Arrow streams to safely
> > >> read
> > > the file's internal messages as though they were a normal stream.
> > >
> > > The PR making these changes to the IPC documentation is here
> > >
> > > https://github.com/apache/arrow/pull/4951
> > >
> > > Please vote to accept these changes. This vote will be open for at
> > > least 72 hours
> > >
> > > [ ] +1 Adopt these Arrow protocol changes
> > > [ ] +0
> > > [ ] -1 I disagree because...
> > >
> > > Here is my vote: +1
> > >
> > > Thanks,
> > > Wes
> > >
> > > [1]:
> > >> https://lists.apache.org/thread.html/8440be572c49b7b2ffb76b63e6d935ada9efd9c1c2021369b6d27786@%3Cdev.arrow.apache.org%3E
> > 
> > 
> > 


Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte Flatbuffer alignment requirements (2nd vote)

2019-08-23 Thread Wes McKinney
It isn't implemented in C++ yet but I will try to get a patch up for
that soon (today maybe). I think we should create a branch where we
can stack the patches that implement this for each language.

On Fri, Aug 23, 2019 at 4:04 AM Paul Taylor  wrote:
>
> I'll do the JS updates. Is it safe to validate against the Arrow C++
> integration tests?
>
>
> On 8/22/19 7:28 PM, Micah Kornfield wrote:
> > I created https://issues.apache.org/jira/browse/ARROW-6313 as a tracking
> > issue with sub-issues on the development work.  So far no-one has claimed
> > Java and Javascript tasks.
> >
> > Would it make sense to have a separate dev branch for this work?
> >
> > Thanks,
> > Micah
> >
> > On Thu, Aug 22, 2019 at 3:24 PM Wes McKinney  wrote:
> >
> >> The vote carries with 4 binding +1 votes and 1 non-binding +1
> >>
> >> I'll merge the specification patch later today and we can begin
> >> working on implementations so we can get this done for 0.15.0
> >>
> >> On Tue, Aug 20, 2019 at 12:30 PM Bryan Cutler  wrote:
> >>> +1 (non-binding)
> >>>
> >>> On Tue, Aug 20, 2019, 7:43 AM Antoine Pitrou 
> >> wrote:
>  Sorry, had forgotten to send my vote on this.
> 
>  +1 from me.
> 
>  Regards
> 
>  Antoine.
> 
> 
>  On Wed, 14 Aug 2019 17:42:33 -0500
>  Wes McKinney  wrote:
> > hi all,
> >
> > As we've been discussing [1], there is a need to introduce 4 bytes of
> > padding into the preamble of the "encapsulated IPC message" format to
> > ensure that the Flatbuffers metadata payload begins on an 8-byte
> > aligned memory offset. The alternative to this would be for Arrow
> > implementations where alignment is important (e.g. C or C++) to copy
> > the metadata (which is not always small) into memory when it is
> > unaligned.
> >
> > Micah has proposed to address this by adding a
> > 4-byte "continuation" value at the beginning of the payload
> > having the value 0x. The reason to do it this way is that
> > old clients will see an invalid length (what is currently the
> > first 4 bytes of the message -- a 32-bit little endian signed
> > integer indicating the metadata length) rather than potentially
> > crashing on a valid length. We also propose to expand the "end of
> > stream" marker used in the stream and file format from 4 to 8
> > bytes. This has the additional effect of aligning the file footer
> > defined in File.fbs.
> >
> > This would be a backwards incompatible protocol change, so older
> >> Arrow
> > libraries would not be able to read these new messages. Maintaining
> > forward compatibility (reading data produced by older libraries)
> >> would
> > be possible as we can reason that a value other than the continuation
> > value was produced by an older library (and then validate the
> > Flatbuffer message of course). Arrow implementations could offer a
> > backward compatibility mode for the sake of old readers if they
> >> desire
> > (this may also assist with testing).
> >
> > Additionally with this vote, we want to formally approve the change
> >> to
> > the Arrow "file" format to always write the (new 8-byte)
> >> end-of-stream
> > marker, which enables code that processes Arrow streams to safely
> >> read
> > the file's internal messages as though they were a normal stream.
> >
> > The PR making these changes to the IPC documentation is here
> >
> > https://github.com/apache/arrow/pull/4951
> >
> > Please vote to accept these changes. This vote will be open for at
> > least 72 hours
> >
> > [ ] +1 Adopt these Arrow protocol changes
> > [ ] +0
> > [ ] -1 I disagree because...
> >
> > Here is my vote: +1
> >
> > Thanks,
> > Wes
> >
> > [1]:
> >> https://lists.apache.org/thread.html/8440be572c49b7b2ffb76b63e6d935ada9efd9c1c2021369b6d27786@%3Cdev.arrow.apache.org%3E
> 
> 
> 


Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte Flatbuffer alignment requirements (2nd vote)

2019-08-23 Thread Paul Taylor
I'll do the JS updates. Is it safe to validate against the Arrow C++ 
integration tests?



On 8/22/19 7:28 PM, Micah Kornfield wrote:

I created https://issues.apache.org/jira/browse/ARROW-6313 as a tracking
issue with sub-issues on the development work.  So far no-one has claimed
Java and Javascript tasks.

Would it make sense to have a separate dev branch for this work?

Thanks,
Micah

On Thu, Aug 22, 2019 at 3:24 PM Wes McKinney  wrote:


The vote carries with 4 binding +1 votes and 1 non-binding +1

I'll merge the specification patch later today and we can begin
working on implementations so we can get this done for 0.15.0

On Tue, Aug 20, 2019 at 12:30 PM Bryan Cutler  wrote:

+1 (non-binding)

On Tue, Aug 20, 2019, 7:43 AM Antoine Pitrou 

wrote:

Sorry, had forgotten to send my vote on this.

+1 from me.

Regards

Antoine.


On Wed, 14 Aug 2019 17:42:33 -0500
Wes McKinney  wrote:

hi all,

As we've been discussing [1], there is a need to introduce 4 bytes of
padding into the preamble of the "encapsulated IPC message" format to
ensure that the Flatbuffers metadata payload begins on an 8-byte
aligned memory offset. The alternative to this would be for Arrow
implementations where alignment is important (e.g. C or C++) to copy
the metadata (which is not always small) into memory when it is
unaligned.

Micah has proposed to address this by adding a
4-byte "continuation" value at the beginning of the payload
having the value 0x. The reason to do it this way is that
old clients will see an invalid length (what is currently the
first 4 bytes of the message -- a 32-bit little endian signed
integer indicating the metadata length) rather than potentially
crashing on a valid length. We also propose to expand the "end of
stream" marker used in the stream and file format from 4 to 8
bytes. This has the additional effect of aligning the file footer
defined in File.fbs.

This would be a backwards incompatible protocol change, so older

Arrow

libraries would not be able to read these new messages. Maintaining
forward compatibility (reading data produced by older libraries)

would

be possible as we can reason that a value other than the continuation
value was produced by an older library (and then validate the
Flatbuffer message of course). Arrow implementations could offer a
backward compatibility mode for the sake of old readers if they

desire

(this may also assist with testing).

Additionally with this vote, we want to formally approve the change

to

the Arrow "file" format to always write the (new 8-byte)

end-of-stream

marker, which enables code that processes Arrow streams to safely

read

the file's internal messages as though they were a normal stream.

The PR making these changes to the IPC documentation is here

https://github.com/apache/arrow/pull/4951

Please vote to accept these changes. This vote will be open for at
least 72 hours

[ ] +1 Adopt these Arrow protocol changes
[ ] +0
[ ] -1 I disagree because...

Here is my vote: +1

Thanks,
Wes

[1]:

https://lists.apache.org/thread.html/8440be572c49b7b2ffb76b63e6d935ada9efd9c1c2021369b6d27786@%3Cdev.arrow.apache.org%3E






Re: [RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte Flatbuffer alignment requirements (2nd vote)

2019-08-22 Thread Micah Kornfield
I created https://issues.apache.org/jira/browse/ARROW-6313 as a tracking
issue with sub-issues on the development work.  So far no-one has claimed
Java and Javascript tasks.

Would it make sense to have a separate dev branch for this work?

Thanks,
Micah

On Thu, Aug 22, 2019 at 3:24 PM Wes McKinney  wrote:

> The vote carries with 4 binding +1 votes and 1 non-binding +1
>
> I'll merge the specification patch later today and we can begin
> working on implementations so we can get this done for 0.15.0
>
> On Tue, Aug 20, 2019 at 12:30 PM Bryan Cutler  wrote:
> >
> > +1 (non-binding)
> >
> > On Tue, Aug 20, 2019, 7:43 AM Antoine Pitrou 
> wrote:
> >
> > >
> > > Sorry, had forgotten to send my vote on this.
> > >
> > > +1 from me.
> > >
> > > Regards
> > >
> > > Antoine.
> > >
> > >
> > > On Wed, 14 Aug 2019 17:42:33 -0500
> > > Wes McKinney  wrote:
> > > > hi all,
> > > >
> > > > As we've been discussing [1], there is a need to introduce 4 bytes of
> > > > padding into the preamble of the "encapsulated IPC message" format to
> > > > ensure that the Flatbuffers metadata payload begins on an 8-byte
> > > > aligned memory offset. The alternative to this would be for Arrow
> > > > implementations where alignment is important (e.g. C or C++) to copy
> > > > the metadata (which is not always small) into memory when it is
> > > > unaligned.
> > > >
> > > > Micah has proposed to address this by adding a
> > > > 4-byte "continuation" value at the beginning of the payload
> > > > having the value 0x. The reason to do it this way is that
> > > > old clients will see an invalid length (what is currently the
> > > > first 4 bytes of the message -- a 32-bit little endian signed
> > > > integer indicating the metadata length) rather than potentially
> > > > crashing on a valid length. We also propose to expand the "end of
> > > > stream" marker used in the stream and file format from 4 to 8
> > > > bytes. This has the additional effect of aligning the file footer
> > > > defined in File.fbs.
> > > >
> > > > This would be a backwards incompatible protocol change, so older
> Arrow
> > > > libraries would not be able to read these new messages. Maintaining
> > > > forward compatibility (reading data produced by older libraries)
> would
> > > > be possible as we can reason that a value other than the continuation
> > > > value was produced by an older library (and then validate the
> > > > Flatbuffer message of course). Arrow implementations could offer a
> > > > backward compatibility mode for the sake of old readers if they
> desire
> > > > (this may also assist with testing).
> > > >
> > > > Additionally with this vote, we want to formally approve the change
> to
> > > > the Arrow "file" format to always write the (new 8-byte)
> end-of-stream
> > > > marker, which enables code that processes Arrow streams to safely
> read
> > > > the file's internal messages as though they were a normal stream.
> > > >
> > > > The PR making these changes to the IPC documentation is here
> > > >
> > > > https://github.com/apache/arrow/pull/4951
> > > >
> > > > Please vote to accept these changes. This vote will be open for at
> > > > least 72 hours
> > > >
> > > > [ ] +1 Adopt these Arrow protocol changes
> > > > [ ] +0
> > > > [ ] -1 I disagree because...
> > > >
> > > > Here is my vote: +1
> > > >
> > > > Thanks,
> > > > Wes
> > > >
> > > > [1]:
> > >
> https://lists.apache.org/thread.html/8440be572c49b7b2ffb76b63e6d935ada9efd9c1c2021369b6d27786@%3Cdev.arrow.apache.org%3E
> > > >
> > >
> > >
> > >
> > >
>


[RESULT] [VOTE] Alter Arrow binary protocol to address 8-byte Flatbuffer alignment requirements (2nd vote)

2019-08-22 Thread Wes McKinney
The vote carries with 4 binding +1 votes and 1 non-binding +1

I'll merge the specification patch later today and we can begin
working on implementations so we can get this done for 0.15.0

On Tue, Aug 20, 2019 at 12:30 PM Bryan Cutler  wrote:
>
> +1 (non-binding)
>
> On Tue, Aug 20, 2019, 7:43 AM Antoine Pitrou  wrote:
>
> >
> > Sorry, had forgotten to send my vote on this.
> >
> > +1 from me.
> >
> > Regards
> >
> > Antoine.
> >
> >
> > On Wed, 14 Aug 2019 17:42:33 -0500
> > Wes McKinney  wrote:
> > > hi all,
> > >
> > > As we've been discussing [1], there is a need to introduce 4 bytes of
> > > padding into the preamble of the "encapsulated IPC message" format to
> > > ensure that the Flatbuffers metadata payload begins on an 8-byte
> > > aligned memory offset. The alternative to this would be for Arrow
> > > implementations where alignment is important (e.g. C or C++) to copy
> > > the metadata (which is not always small) into memory when it is
> > > unaligned.
> > >
> > > Micah has proposed to address this by adding a
> > > 4-byte "continuation" value at the beginning of the payload
> > > having the value 0x. The reason to do it this way is that
> > > old clients will see an invalid length (what is currently the
> > > first 4 bytes of the message -- a 32-bit little endian signed
> > > integer indicating the metadata length) rather than potentially
> > > crashing on a valid length. We also propose to expand the "end of
> > > stream" marker used in the stream and file format from 4 to 8
> > > bytes. This has the additional effect of aligning the file footer
> > > defined in File.fbs.
> > >
> > > This would be a backwards incompatible protocol change, so older Arrow
> > > libraries would not be able to read these new messages. Maintaining
> > > forward compatibility (reading data produced by older libraries) would
> > > be possible as we can reason that a value other than the continuation
> > > value was produced by an older library (and then validate the
> > > Flatbuffer message of course). Arrow implementations could offer a
> > > backward compatibility mode for the sake of old readers if they desire
> > > (this may also assist with testing).
> > >
> > > Additionally with this vote, we want to formally approve the change to
> > > the Arrow "file" format to always write the (new 8-byte) end-of-stream
> > > marker, which enables code that processes Arrow streams to safely read
> > > the file's internal messages as though they were a normal stream.
> > >
> > > The PR making these changes to the IPC documentation is here
> > >
> > > https://github.com/apache/arrow/pull/4951
> > >
> > > Please vote to accept these changes. This vote will be open for at
> > > least 72 hours
> > >
> > > [ ] +1 Adopt these Arrow protocol changes
> > > [ ] +0
> > > [ ] -1 I disagree because...
> > >
> > > Here is my vote: +1
> > >
> > > Thanks,
> > > Wes
> > >
> > > [1]:
> > https://lists.apache.org/thread.html/8440be572c49b7b2ffb76b63e6d935ada9efd9c1c2021369b6d27786@%3Cdev.arrow.apache.org%3E
> > >
> >
> >
> >
> >