I assume the plan is to merge the ARROW-6313-flatbuffer-alignment branch into 
master before the 0.15 release, correct?

BTW - I believe the C# alignment changes are ready to be merged into the 
alignment branch -  https://github.com/apache/arrow/pull/5280/ 

Eric

-----Original Message-----
From: Micah Kornfield <emkornfi...@gmail.com> 
Sent: Tuesday, September 10, 2019 10:24 PM
To: Wes McKinney <wesmck...@gmail.com>
Cc: dev <dev@arrow.apache.org>; niki.lj <niki...@aliyun.com>
Subject: Re: Timeline for 0.15.0 release

I should have a little more bandwidth to help with some of the packaging 
starting tomorrow and going into the weekend.

On Tuesday, September 10, 2019, Wes McKinney <wesmck...@gmail.com> wrote:

> Hi folks,
>
> With the state of nightly packaging and integration builds things 
> aren't looking too good for being in release readiness by the end of 
> this week but maybe I'm wrong. I'm planning to be working to close as 
> many issues as I can and also to help with the ongoing alignment fixes.
>
> Wes
>
> On Thu, Sep 5, 2019, 11:07 PM Micah Kornfield <emkornfi...@gmail.com>
> wrote:
>
>> Just for reference [1] has a dashboard of the current issues:
>>
>> https://cwiki.apache.org/confluence/display/ARROW/Arrow+0.15.0+Release
>>
>> On Thu, Sep 5, 2019 at 3:43 PM Wes McKinney <wesmck...@gmail.com> wrote:
>>
>>> hi all,
>>>
>>> It doesn't seem like we're going to be in a position to release at 
>>> the beginning of next week. I hope that one more week of work (or 
>>> less) will be enough to get us there. Aside from merging the 
>>> alignment changes, we need to make sure that our packaging jobs 
>>> required for the release candidate are all working.
>>>
>>> If folks could remove issues from the 0.15.0 backlog that they don't 
>>> think they will finish by end of next week that would help focus 
>>> efforts (there are currently 78 issues in 0.15.0 still). I am 
>>> looking to tackle a few small features related to dictionaries while 
>>> the release window is still open.
>>>
>>> - Wes
>>>
>>> On Tue, Aug 27, 2019 at 3:48 PM Wes McKinney <wesmck...@gmail.com>
>>> wrote:
>>> >
>>> > hi,
>>> >
>>> > I think we should try to release the week of September 9, so 
>>> > development work should be completed by end of next week.
>>> >
>>> > Does that seem reasonable?
>>> >
>>> > I plan to get up a patch for the protocol alignment changes for 
>>> > C++ in the next couple of days -- I think that getting the 
>>> > alignment work done is the main barrier to releasing.
>>> >
>>> > Thanks
>>> > Wes
>>> >
>>> > On Mon, Aug 19, 2019 at 12:25 PM Ji Liu <niki...@aliyun.com.invalid> wrote:
>>> > >
>>> > > Hi Wes, on the Java side, I can think of several bugs that need
>>> > > to be fixed or kept in mind.
>>> > >
>>> > > i. ARROW-6040: Dictionary entries are required in IPC streams
>>> > > even when empty[1]
>>> > > This one is under review now. However, through this PR we found
>>> > > what seems to be a bug in how Java reads and writes dictionaries
>>> > > in IPC, which is inconsistent with the spec[2]: it assumes all
>>> > > dictionaries are at the start of the stream (see details in the
>>> > > PR comments; this fix may not make it into version 0.15).
>>> > > @Micah Kornfield
>>> > >
>>> > > ii. ARROW-1875: Write 64-bit ints as strings in integration test
>>> > > JSON files[3]
>>> > > The Java-side code is already checked in; the other
>>> > > implementations do not seem to be.
>>> > >
>>> > > iii. ARROW-6202: OutOfMemory in JdbcAdapter[4]
>>> > > Caused by trying to load all records in one contiguous batch;
>>> > > fixed by providing an iterator API for iterative reading in
>>> > > ARROW-6219[5].
>>> > >
>>> > > Thanks,
>>> > > Ji Liu
>>> > >
>>> > > [1] https://github.com/apache/arrow/pull/4960
>>> > > [2] https://arrow.apache.org/docs/ipc.html
>>> > > [3] https://issues.apache.org/jira/browse/ARROW-1875
>>> > > [4] https://issues.apache.org/jira/browse/ARROW-6202
>>> > > [5] https://issues.apache.org/jira/browse/ARROW-6219
>>> > >
>>> > >
>>> > >
>>> > > ------------------------------------------------------------------
>>> > > From: Wes McKinney <wesmck...@gmail.com>
>>> > > Sent: Monday, August 19, 2019, 23:03
>>> > > To: dev <dev@arrow.apache.org>
>>> > > Subject: Re: Timeline for 0.15.0 release
>>> > >
>>> > > I'm going to work on organizing the 0.15.0 backlog this week. If
>>> > > anyone wants to help with grooming (particularly for languages
>>> > > other than C++/Python, where I'm focusing), that would be
>>> > > helpful. There have been almost 500 JIRA issues opened since the
>>> > > 0.14.0 release, so we should make sure to check whether there
>>> > > are any regressions or other serious bugs that we should try to
>>> > > fix for 0.15.0.
>>> > >
>>> > > On Thu, Aug 15, 2019 at 6:23 PM Wes McKinney <wesmck...@gmail.com> wrote:
>>> > > >
>>> > > > The Windows wheel issue in 0.14.1 seems to be
>>> > > >
>>> > > > https://issues.apache.org/jira/browse/ARROW-6015
>>> > > >
>>> > > > I think the root cause could be the Windows changes in
>>> > > >
>>> > > > https://github.com/apache/arrow/commit/223ae744cc2a12c60cecb5db593263a03c13f85a
>>> > > >
>>> > > > I would be appreciative if a volunteer would look into what was
>>> > > > wrong with the 0.14.1 wheels on Windows. Otherwise 0.15.0
>>> > > > Windows wheels will be broken, too.
>>> > > >
>>> > > > The bad wheels can be found at
>>> > > >
>>> > > > https://bintray.com/apache/arrow/python#files/python%2F0.14.1
>>> > > >
>>> > > > On Thu, Aug 15, 2019 at 1:28 PM Antoine Pitrou <solip...@pitrou.net> wrote:
>>> > > > >
>>> > > > > On Thu, 15 Aug 2019 11:17:07 -0700 Micah Kornfield 
>>> > > > > <emkornfi...@gmail.com> wrote:
>>> > > > > > >
>>> > > > > > > In C++ they are independent, we could have 32-bit array
>>> > > > > > > lengths and variable-length types with 64-bit offsets if
>>> > > > > > > we wanted (we just wouldn't be able to have a List child
>>> > > > > > > with more than INT32_MAX elements).
>>> > > > > >
>>> > > > > > I think the point is we could do this in C++ but we don't.
>>> > > > > > I'm not sure we would have introduced the "Large" types if
>>> > > > > > we did.
>>> > > > >
>>> > > > > 64-bit offsets take twice as much space as 32-bit offsets, so
>>> > > > > if you're storing lots of small-ish lists or strings, 32-bit
>>> > > > > offsets are preferable.  So even with 64-bit array lengths
>>> > > > > from the start it would still be beneficial to have types
>>> > > > > with 32-bit offsets.
>>> > > > >
>>> > > > > > Going with the limited address space in Java and calling it
>>> > > > > > a reference implementation seems suboptimal. If a consumer
>>> > > > > > uses a "Large" type presumably it is because they need the
>>> > > > > > ability to store more than INT32_MAX child elements in a
>>> > > > > > column, otherwise it is just wasting space [1].
>>> > > > >
>>> > > > > Probably. Though if the individual elements (lists or
>>> > > > > strings) are large, not much space is wasted in proportion,
>>> > > > > so it may be simpler in such a case to always create a
>>> > > > > "Large" type array.
>>> > > > >
>>> > > > > > [1] I suppose theoretically there might be some performance
>>> > > > > > benefits on 64-bit architectures to using the native word
>>> > > > > > sizes.
>>> > > > >
>>> > > > > Concretely, common 64-bit architectures don't do that, as
>>> > > > > 32-bit is an extremely common integer size even in
>>> > > > > high-performance code.
>>> > > > >
>>> > > > > Regards
>>> > > > >
>>> > > > > Antoine.
>>> > > > >
>>> > > > >
>>> > >
>>>
>>
