Just for reference, [1] has a dashboard of the current issues.

[1] https://cwiki.apache.org/confluence/display/ARROW/Arrow+0.15.0+Release
On Thu, Sep 5, 2019 at 3:43 PM Wes McKinney <wesmck...@gmail.com> wrote:
> hi all,
>
> It doesn't seem like we're going to be in a position to release at the
> beginning of next week. I hope that one more week of work (or less)
> will be enough to get us there. Aside from merging the alignment
> changes, we need to make sure that our packaging jobs required for the
> release candidate are all working.
>
> If folks could remove issues from the 0.15.0 backlog that they don't
> think they will finish by end of next week, that would help focus
> efforts (there are currently 78 issues in 0.15.0 still). I am looking
> to tackle a few small features related to dictionaries while the
> release window is still open.
>
> - Wes
>
> On Tue, Aug 27, 2019 at 3:48 PM Wes McKinney <wesmck...@gmail.com> wrote:
> >
> > hi,
> >
> > I think we should try to release the week of September 9, so
> > development work should be completed by end of next week.
> >
> > Does that seem reasonable?
> >
> > I plan to get up a patch for the protocol alignment changes for C++ in
> > the next couple of days -- I think that getting the alignment work
> > done is the main barrier to releasing.
> >
> > Thanks
> > Wes
> >
> > On Mon, Aug 19, 2019 at 12:25 PM Ji Liu <niki...@aliyun.com.invalid> wrote:
> > >
> > > Hi, Wes, on the Java side, I can think of several bugs that need to be
> > > fixed or called out.
> > >
> > > i. ARROW-6040: Dictionary entries are required in IPC streams even
> > > when empty [1]
> > > This one is under review now. However, through this PR we found that
> > > there seems to be a bug in the Java code for reading and writing
> > > dictionaries in IPC, which is inconsistent with the spec [2] since it
> > > assumes all dictionaries are at the start of the stream (see details
> > > in the PR comments; this fix may not make it into version 0.15).
> > > @Micah Kornfield
> > >
> > > ii. ARROW-1875: Write 64-bit ints as strings in integration test JSON
> > > files [3]
> > > The Java-side code is already checked in; the other implementations
> > > do not seem to be done yet.
> > >
> > > iii. ARROW-6202: OutOfMemory in JdbcAdapter [4]
> > > Caused by trying to load all records in one contiguous batch; fixed by
> > > providing an iterator API for reading iteratively in ARROW-6219 [5].
> > >
> > > Thanks,
> > > Ji Liu
> > >
> > > [1] https://github.com/apache/arrow/pull/4960
> > > [2] https://arrow.apache.org/docs/ipc.html
> > > [3] https://issues.apache.org/jira/browse/ARROW-1875
> > > [4] https://issues.apache.org/jira/browse/ARROW-6202
> > > [5] https://issues.apache.org/jira/browse/ARROW-6219
> > >
> > > ------------------------------------------------------------------
> > > From: Wes McKinney <wesmck...@gmail.com>
> > > Send Time: 2019-08-19 (Monday) 23:03
> > > To: dev <dev@arrow.apache.org>
> > > Subject: Re: Timeline for 0.15.0 release
> > >
> > > I'm going to do some work on organizing the 0.15.0 backlog this
> > > week. If anyone wants to help with grooming (particularly for
> > > languages other than C++/Python, where I'm focusing), that would be
> > > helpful. There have been almost 500 JIRA issues opened since the
> > > 0.14.0 release, so we should make sure to check whether there are any
> > > regressions or other serious bugs that we should try to fix for
> > > 0.15.0.
> > >
> > > On Thu, Aug 15, 2019 at 6:23 PM Wes McKinney <wesmck...@gmail.com> wrote:
> > > >
> > > > The Windows wheel issue in 0.14.1 seems to be
> > > >
> > > > https://issues.apache.org/jira/browse/ARROW-6015
> > > >
> > > > I think the root cause could be the Windows changes in
> > > >
> > > > https://github.com/apache/arrow/commit/223ae744cc2a12c60cecb5db593263a03c13f85a
> > > >
> > > > I would be appreciative if a volunteer would look into what was wrong
> > > > with the 0.14.1 wheels on Windows.
> > > > Otherwise 0.15.0 Windows wheels will be broken, too.
> > > >
> > > > The bad wheels can be found at
> > > >
> > > > https://bintray.com/apache/arrow/python#files/python%2F0.14.1
> > > >
> > > > On Thu, Aug 15, 2019 at 1:28 PM Antoine Pitrou <solip...@pitrou.net> wrote:
> > > > >
> > > > > On Thu, 15 Aug 2019 11:17:07 -0700
> > > > > Micah Kornfield <emkornfi...@gmail.com> wrote:
> > > > > > >
> > > > > > > In C++ they are independent; we could have 32-bit array
> > > > > > > lengths and variable-length types with 64-bit offsets if we
> > > > > > > wanted (we just wouldn't be able to have a List child with
> > > > > > > more than INT32_MAX elements).
> > > > > >
> > > > > > I think the point is we could do this in C++ but we don't. I'm
> > > > > > not sure we would have introduced the "Large" types if we did.
> > > > >
> > > > > 64-bit offsets take twice as much space as 32-bit offsets, so if
> > > > > you're storing lots of small-ish lists or strings, 32-bit offsets
> > > > > are preferable. So even with 64-bit array lengths from the start
> > > > > it would still be beneficial to have types with 32-bit offsets.
> > > > >
> > > > > > Going with the limited address space in Java and calling it a
> > > > > > reference implementation seems suboptimal. If a consumer uses a
> > > > > > "Large" type, presumably it is because they need the ability to
> > > > > > store more than INT32_MAX child elements in a column; otherwise
> > > > > > it is just wasting space [1].
> > > > >
> > > > > Probably. Though if the individual elements (lists or strings) are
> > > > > large, not much space is wasted in proportion, so it may be
> > > > > simpler in such a case to always create a "Large" type array.
> > > > >
> > > > > > [1] I suppose theoretically there might be some performance
> > > > > > benefits on 64-bit architectures to using the native word sizes.
> > > > >
> > > > > Concretely, common 64-bit architectures don't do that, as 32-bit
> > > > > is an extremely common integer size even in high-performance code.
> > > > >
> > > > > Regards
> > > > >
> > > > > Antoine.
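
To put rough numbers on the offset-size trade-off discussed in the quoted exchange, here is a minimal back-of-the-envelope sketch in Python. It assumes a variable-length array stores one offset per value plus one extra, and it ignores validity bitmaps and buffer padding. The helper names `offsets_bytes` and `offsets_overhead` are made up for illustration and are not part of any Arrow API.

```python
def offsets_bytes(num_values: int, offset_width_bits: int) -> int:
    """Bytes used by an offsets buffer holding num_values + 1 offsets."""
    if offset_width_bits not in (32, 64):
        raise ValueError("offset width must be 32 or 64 bits")
    return (num_values + 1) * (offset_width_bits // 8)


def offsets_overhead(num_values: int, avg_value_bytes: int,
                     offset_width_bits: int) -> float:
    """Fraction of (offsets + data) bytes that is spent on offsets."""
    offsets = offsets_bytes(num_values, offset_width_bits)
    data = num_values * avg_value_bytes  # assumed average payload per value
    return offsets / (offsets + data)


if __name__ == "__main__":
    n = 1_000_000
    # Compare many tiny values against many large values.
    for avg in (4, 4096):
        small = offsets_overhead(n, avg, 32)
        large = offsets_overhead(n, avg, 64)
        print(f"avg value {avg} bytes: "
              f"32-bit offsets {small:.1%}, 64-bit offsets {large:.1%}")
```

Under these assumptions, the offsets dominate the footprint when values are only a few bytes each, and become a negligible share when each value is several kilobytes, which is the proportion argument made in the thread.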