hi Eric -- yes, that's correct. I'm planning to amend the Format docs
today regarding the EOS issue and also update the C++ library.

On Wed, Sep 11, 2019 at 11:21 AM Eric Erhardt
<eric.erha...@microsoft.com> wrote:
>
> I assume the plan is to merge the ARROW-6313-flatbuffer-alignment branch into 
> master before the 0.15 release, correct?
>
> BTW - I believe the C# alignment changes are ready to be merged into the 
> alignment branch -  https://github.com/apache/arrow/pull/5280/
>
> Eric
>
> -----Original Message-----
> From: Micah Kornfield <emkornfi...@gmail.com>
> Sent: Tuesday, September 10, 2019 10:24 PM
> To: Wes McKinney <wesmck...@gmail.com>
> Cc: dev <dev@arrow.apache.org>; niki.lj <niki...@aliyun.com>
> Subject: Re: Timeline for 0.15.0 release
>
> I should have a little more bandwidth to help with some of the packaging 
> starting tomorrow and going into the weekend.
>
> On Tuesday, September 10, 2019, Wes McKinney <wesmck...@gmail.com> wrote:
>
> > Hi folks,
> >
> > With the state of nightly packaging and integration builds things
> > aren't looking too good for being in release readiness by the end of
> > this week but maybe I'm wrong. I'm planning to be working to close as
> > many issues as I can and also to help with the ongoing alignment fixes.
> >
> > Wes
> >
> > On Thu, Sep 5, 2019, 11:07 PM Micah Kornfield <emkornfi...@gmail.com>
> > wrote:
> >
> >> Just for reference [1] has a dashboard of the current issues:
> >>
> >> https://cwiki.apache.org/confluence/display/ARROW/Arrow+0.15.0+Release
> >>
> >> On Thu, Sep 5, 2019 at 3:43 PM Wes McKinney <wesmck...@gmail.com> wrote:
> >>
> >>> hi all,
> >>>
> >>> It doesn't seem like we're going to be in a position to release at
> >>> the beginning of next week. I hope that one more week of work (or
> >>> less) will be enough to get us there. Aside from merging the
> >>> alignment changes, we need to make sure that our packaging jobs
> >>> required for the release candidate are all working.
> >>>
> >>> If folks could remove issues from the 0.15.0 backlog that they don't
> >>> think they will finish by end of next week that would help focus
> >>> efforts (there are currently 78 issues in 0.15.0 still). I am
> >>> looking to tackle a few small features related to dictionaries while
> >>> the release window is still open.
> >>>
> >>> - Wes
> >>>
> >>> On Tue, Aug 27, 2019 at 3:48 PM Wes McKinney <wesmck...@gmail.com>
> >>> wrote:
> >>> >
> >>> > hi,
> >>> >
> >>> > I think we should try to release the week of September 9, so
> >>> > development work should be completed by end of next week.
> >>> >
> >>> > Does that seem reasonable?
> >>> >
> >>> > I plan to get up a patch for the protocol alignment changes for
> >>> > C++ in the next couple of days -- I think that getting the
> >>> > alignment work done is the main barrier to releasing.
> >>> >
> >>> > Thanks
> >>> > Wes
> >>> >
> >>> > On Mon, Aug 19, 2019 at 12:25 PM Ji Liu
> >>> > <niki...@aliyun.com.invalid> wrote:
> >>> > >
> >>> > > Hi Wes, on the Java side, I can think of several bugs that need
> >>> > > to be fixed or flagged.
> >>> > >
> >>> > > i. ARROW-6040: Dictionary entries are required in IPC streams
> >>> > > even when empty [1]
> >>> > > This one is under review now. However, through this PR we found
> >>> > > what seems to be a bug in how Java reads and writes dictionaries
> >>> > > in IPC, which is inconsistent with the spec [2]: it assumes all
> >>> > > dictionaries are at the start of the stream (see details in the
> >>> > > PR comments; this fix may not make it into 0.15).
> >>> > > @Micah Kornfield
> >>> > >
> >>> > > ii. ARROW-1875: Write 64-bit ints as strings in integration test
> >>> > > JSON files [3]
> >>> > > The Java-side code is already checked in; the other
> >>> > > implementations do not seem to have this yet.
> >>> > >
> >>> > > iii. ARROW-6202: OutOfMemory in JdbcAdapter [4]
> >>> > > Caused by trying to load all records in one contiguous batch;
> >>> > > fixed by providing an iterator API for iterative reading in
> >>> > > ARROW-6219 [5].
> >>> > >
> >>> > > Thanks,
> >>> > > Ji Liu
> >>> > >
> >>> > > [1] https://github.com/apache/arrow/pull/4960
> >>> > > [2] https://arrow.apache.org/docs/ipc.html
> >>> > > [3] https://issues.apache.org/jira/browse/ARROW-1875
> >>> > > [4] https://issues.apache.org/jira/browse/ARROW-6202
> >>> > > [5] https://issues.apache.org/jira/browse/ARROW-6219
> >>> > >
> >>> > >
> >>> > >
> >>> > > ------------------------------------------------------------------
> >>> > > From: Wes McKinney <wesmck...@gmail.com>
> >>> > > Send Time: Monday, August 19, 2019, 23:03
> >>> > > To: dev <dev@arrow.apache.org>
> >>> > > Subject: Re: Timeline for 0.15.0 release
> >>> > >
> >>> > > I'm going to work on organizing the 0.15.0 backlog this week.
> >>> > > If anyone wants to help with grooming (particularly for
> >>> > > languages other than C++/Python, where I'm focusing), that
> >>> > > would be helpful. There have been almost 500 JIRA issues opened
> >>> > > since the 0.14.0 release, so we should make sure to check
> >>> > > whether there are any regressions or other serious bugs that we
> >>> > > should try to fix for 0.15.0.
> >>> > >
> >>> > > On Thu, Aug 15, 2019 at 6:23 PM Wes McKinney
> >>> > > <wesmck...@gmail.com> wrote:
> >>> > > >
> >>> > > > The Windows wheel issue in 0.14.1 seems to be
> >>> > > >
> >>> > > > https://issues.apache.org/jira/browse/ARROW-6015
> >>> > > >
> >>> > > > I think the root cause could be the Windows changes in
> >>> > > >
> >>> > > > https://github.com/apache/arrow/commit/223ae744cc2a12c60cecb5db593263a03c13f85a
> >>> > > >
> >>> > > > I would be appreciative if a volunteer would look into what
> >>> > > > was wrong with the 0.14.1 wheels on Windows. Otherwise 0.15.0
> >>> > > > Windows wheels will be broken, too.
> >>> > > >
> >>> > > > The bad wheels can be found at
> >>> > > >
> >>> > > > https://bintray.com/apache/arrow/python#files/python%2F0.14.1
> >>> > > >
> >>> > > > On Thu, Aug 15, 2019 at 1:28 PM Antoine Pitrou
> >>> > > > <solip...@pitrou.net> wrote:
> >>> > > > >
> >>> > > > > On Thu, 15 Aug 2019 11:17:07 -0700 Micah Kornfield
> >>> > > > > <emkornfi...@gmail.com> wrote:
> >>> > > > > > >
> >>> > > > > > > In C++ they are independent, we could have 32-bit array
> >>> > > > > > > lengths and variable-length types with 64-bit offsets if
> >>> > > > > > > we wanted (we just wouldn't be able to have a List child
> >>> > > > > > > with more than INT32_MAX elements).
> >>> > > > > >
> >>> > > > > > I think the point is we could do this in C++ but we don't.
> >>> > > > > > I'm not sure we would have introduced the "Large" types if
> >>> > > > > > we did.
> >>> > > > >
> >>> > > > > 64-bit offsets take twice as much space as 32-bit offsets,
> >>> > > > > so if you're storing lots of small-ish lists or strings,
> >>> > > > > 32-bit offsets are preferable.  So even with 64-bit array
> >>> > > > > lengths from the start it would still be beneficial to have
> >>> > > > > types with 32-bit offsets.
> >>> > > > >
> >>> > > > > > Going with the limited address space in Java and calling
> >>> > > > > > it a reference implementation seems suboptimal. If a
> >>> > > > > > consumer uses a "Large" type presumably it is because they
> >>> > > > > > need the ability to store more than INT32_MAX child
> >>> > > > > > elements in a column, otherwise it is just wasting space
> >>> > > > > > [1].
> >>> > > > >
> >>> > > > > Probably. Though if the individual elements (lists or
> >>> > > > > strings) are large, not much space is wasted in proportion,
> >>> > > > > so it may be simpler in such a case to always create a
> >>> > > > > "Large" type array.
> >>> > > > >
> >>> > > > > > [1] I suppose theoretically there might be some
> >>> > > > > > performance benefits on 64-bit architectures to using the
> >>> > > > > > native word sizes.
> >>> > > > >
> >>> > > > > Concretely, common 64-bit architectures don't do that, as
> >>> > > > > 32-bit is an extremely common integer size even in
> >>> > > > > high-performance code.
> >>> > > > >
> >>> > > > > Regards
> >>> > > > >
> >>> > > > > Antoine.
> >>> > > > >
> >>> > > > >
> >>> > >
> >>>
> >>
