I assume the plan is to merge the ARROW-6313-flatbuffer-alignment branch into master before the 0.15 release, correct?
BTW - I believe the C# alignment changes are ready to be merged into the alignment branch - https://github.com/apache/arrow/pull/5280/

Eric

-----Original Message-----
From: Micah Kornfield <emkornfi...@gmail.com>
Sent: Tuesday, September 10, 2019 10:24 PM
To: Wes McKinney <wesmck...@gmail.com>
Cc: dev <dev@arrow.apache.org>; niki.lj <niki...@aliyun.com>
Subject: Re: Timeline for 0.15.0 release

I should have a little more bandwidth to help with some of the packaging starting tomorrow and going into the weekend.

On Tuesday, September 10, 2019, Wes McKinney <wesmck...@gmail.com> wrote:

> Hi folks,
>
> With the state of nightly packaging and integration builds things
> aren't looking too good for being in release readiness by the end of
> this week but maybe I'm wrong. I'm planning to be working to close as
> many issues as I can and also to help with the ongoing alignment fixes.
>
> Wes
>
> On Thu, Sep 5, 2019, 11:07 PM Micah Kornfield <emkornfi...@gmail.com>
> wrote:
>
>> Just for reference [1] has a dashboard of the current issues:
>>
>> https://cwiki.apache.org/confluence/display/ARROW/Arrow+0.15.0+Release
>>
>> On Thu, Sep 5, 2019 at 3:43 PM Wes McKinney <wesmck...@gmail.com> wrote:
>>
>>> hi all,
>>>
>>> It doesn't seem like we're going to be in a position to release at
>>> the beginning of next week. I hope that one more week of work (or
>>> less) will be enough to get us there. Aside from merging the
>>> alignment changes, we need to make sure that our packaging jobs
>>> required for the release candidate are all working.
>>>
>>> If folks could remove issues from the 0.15.0 backlog that they don't
>>> think they will finish by end of next week that would help focus
>>> efforts (there are currently 78 issues in 0.15.0 still). I am
>>> looking to tackle a few small features related to dictionaries while
>>> the release window is still open.
>>>
>>> - Wes
>>>
>>> On Tue, Aug 27, 2019 at 3:48 PM Wes McKinney <wesmck...@gmail.com>
>>> wrote:
>>> >
>>> > hi,
>>> >
>>> > I think we should try to release the week of September 9, so
>>> > development work should be completed by end of next week.
>>> >
>>> > Does that seem reasonable?
>>> >
>>> > I plan to get up a patch for the protocol alignment changes for
>>> > C++ in the next couple of days -- I think that getting the
>>> > alignment work done is the main barrier to releasing.
>>> >
>>> > Thanks
>>> > Wes
>>> >
>>> > On Mon, Aug 19, 2019 at 12:25 PM Ji Liu
>>> > <niki...@aliyun.com.invalid> wrote:
>>> > >
>>> > > Hi, Wes, on the Java side, I can think of several bugs that need
>>> > > to be fixed or reminded.
>>> > >
>>> > > i. ARROW-6040: Dictionary entries are required in IPC streams
>>> > > even when empty[1]
>>> > > This one is under review now, however through this PR we find
>>> > > that there seems to be a bug in Java reading and writing
>>> > > dictionaries in IPC which is inconsistent with spec[2] since it
>>> > > assumes all dictionaries are at the start of stream (see details
>>> > > in PR comments, and this fix may not catch up with version 0.15).
>>> > > @Micah Kornfield
>>> > >
>>> > > ii. ARROW-1875: Write 64-bit ints as strings in integration test
>>> > > JSON files[3]
>>> > > Java side code already checked in, other implementations seem
>>> > > not to be.
>>> > >
>>> > > iii. ARROW-6202: OutOfMemory in JdbcAdapter[4]
>>> > > Caused by trying to load all records in one contiguous batch,
>>> > > fixed by providing iterator API for iteratively reading in
>>> > > ARROW-6219[5].
>>> > >
>>> > > Thanks,
>>> > > Ji Liu
>>> > >
>>> > > [1] https://github.com/apache/arrow/pull/4960
>>> > > [2] https://arrow.apache.org/docs/ipc.html
>>> > > [3] https://issues.apache.org/jira/browse/ARROW-1875
>>> > > [4] https://issues.apache.org/jira/browse/ARROW-6202
>>> > > [5] https://issues.apache.org/jira/browse/ARROW-6219
>>> > >
>>> > >
>>> > > ------------------------------------------------------------------
>>> > > From: Wes McKinney <wesmck...@gmail.com>
>>> > > Send Time: August 19, 2019 (Monday) 23:03
>>> > > To: dev <dev@arrow.apache.org>
>>> > > Subject: Re: Timeline for 0.15.0 release
>>> > >
>>> > > I'm going to work some on organizing the 0.15.0 backlog
>>> > > this week, if anyone wants to help with grooming (particularly
>>> > > for languages other than C++/Python where I'm focusing) that
>>> > > would be helpful. There have been almost 500 JIRA issues opened
>>> > > since the 0.14.0 release, so we should make sure to check whether
>>> > > there are any regressions or other serious bugs that we should
>>> > > try to fix for 0.15.0.
>>> > >
>>> > > On Thu, Aug 15, 2019 at 6:23 PM Wes McKinney
>>> > > <wesmck...@gmail.com> wrote:
>>> > > >
>>> > > > The Windows wheel issue in 0.14.1 seems to be
>>> > > >
>>> > > > https://issues.apache.org/jira/browse/ARROW-6015
>>> > > >
>>> > > > I think the root cause could be the Windows changes in
>>> > > >
>>> > > > https://github.com/apache/arrow/commit/223ae744cc2a12c60cecb5db593263a03c13f85a
>>> > > >
>>> > > > I would be appreciative if a volunteer would look into what
>>> > > > was wrong with the 0.14.1 wheels on Windows.
>>> > > > Otherwise 0.15.0 Windows wheels will be broken, too
>>> > > >
>>> > > > The bad wheels can be found at
>>> > > >
>>> > > > https://bintray.com/apache/arrow/python#files/python%2F0.14.1
>>> > > >
>>> > > > On Thu, Aug 15, 2019 at 1:28 PM Antoine Pitrou
>>> > > > <solip...@pitrou.net> wrote:
>>> > > > >
>>> > > > > On Thu, 15 Aug 2019 11:17:07 -0700 Micah Kornfield
>>> > > > > <emkornfi...@gmail.com> wrote:
>>> > > > > > >
>>> > > > > > > In C++ they are independent, we could have 32-bit array
>>> > > > > > > lengths and variable-length types with 64-bit offsets if
>>> > > > > > > we wanted (we just wouldn't be able to have a List child
>>> > > > > > > with more than INT32_MAX elements).
>>> > > > > >
>>> > > > > > I think the point is we could do this in C++ but we don't.
>>> > > > > > I'm not sure we would have introduced the "Large" types if
>>> > > > > > we did.
>>> > > > >
>>> > > > > 64-bit offsets take twice as much space as 32-bit offsets,
>>> > > > > so if you're storing lots of small-ish lists or strings,
>>> > > > > 32-bit offsets are preferable. So even with 64-bit array
>>> > > > > lengths from the start it would still be beneficial to have
>>> > > > > types with 32-bit offsets.
>>> > > > >
>>> > > > > > Going with the limited address space in Java and calling
>>> > > > > > it a reference implementation seems suboptimal. If a
>>> > > > > > consumer uses a "Large" type presumably it is because they
>>> > > > > > need the ability to store more than INT32_MAX child
>>> > > > > > elements in a column, otherwise it is just wasting
>>> > > > > > space [1].
>>> > > > >
>>> > > > > Probably. Though if the individual elements (lists or
>>> > > > > strings) are large, not much space is wasted in proportion,
>>> > > > > so it may be simpler in such a case to always create a
>>> > > > > "Large" type array.
>>> > > > >
>>> > > > > > [1] I suppose theoretically there might be some
>>> > > > > > performance benefits on 64-bit architectures to using the
>>> > > > > > native word sizes.
>>> > > > >
>>> > > > > Concretely, common 64-bit architectures don't do that, as
>>> > > > > 32-bit is an extremely common integer size even in
>>> > > > > high-performance code.
>>> > > > >
>>> > > > > Regards
>>> > > > >
>>> > > > > Antoine.
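
The space trade-off between 32-bit and 64-bit offsets discussed in the quoted messages can be put into rough numbers. The sketch below is only an illustration, not code from any Arrow implementation. It assumes the usual variable-length layout in which an array of n values carries n + 1 offsets, each 4 bytes wide for the regular types or 8 bytes wide for the "Large" types. The helper names `offsets_overhead` and `needs_large_type` are hypothetical.

```python
INT32_MAX = 2**31 - 1


def offsets_overhead(num_values, avg_value_bytes, offset_bytes):
    """Return (offsets buffer size in bytes, its share of offsets + data).

    Assumes n + 1 offsets for n values; offset_bytes is 4 for 32-bit
    offsets and 8 for 64-bit ("Large") offsets. Hypothetical helper.
    """
    offsets_size = (num_values + 1) * offset_bytes
    data_size = num_values * avg_value_bytes
    return offsets_size, offsets_size / (offsets_size + data_size)


def needs_large_type(total_child_elements):
    """True when 32-bit offsets cannot address all child elements."""
    return total_child_elements > INT32_MAX


if __name__ == "__main__":
    # Many small strings: doubling the offset width is a visible cost.
    for width in (4, 8):
        size, share = offsets_overhead(1_000_000, 4, width)
        print(f"small values, {width}-byte offsets: {size} bytes, {share:.1%} of total")

    # Few large strings: the offsets are negligible either way.
    for width in (4, 8):
        size, share = offsets_overhead(1_000, 10_000, width)
        print(f"large values, {width}-byte offsets: {size} bytes, {share:.3%} of total")
```

With small values the offsets buffer is a large fraction of the total, which is the case where 32-bit offsets pay off. With large values the offsets are a tiny fraction regardless of width, which matches the remark that always using a "Large" type may be simpler there.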