[jira] [Created] (ARROW-6247) [Java] Provide a common interface for float4 and float8 vectors

2019-08-14 Thread Liya Fan (JIRA)
Liya Fan created ARROW-6247: Summary: [Java] Provide a common interface for float4 and float8 vectors Key: ARROW-6247 URL: https://issues.apache.org/jira/browse/ARROW-6247 Project: Apache Arrow

Re: [DISCUSS][Java] Provide an interface for numeric vectors

2019-08-14 Thread Fan Liya
Hi Micah, Thanks for the good points. I agree with you that we should improve the efficiency of algorithms. This is related to another improvement: reduce the if/switch statements in the code. To account for the edge cases, can we remove the set methods, leaving only the get method? This is

Re: [Java] CI builds failing on master

2019-08-14 Thread Fan Liya
Hi Ji, Thanks for fixing this. Best, Liya Fan On Thu, Aug 15, 2019 at 12:50 PM Micah Kornfield wrote: > I just merged this. Thank you Ji Liu. > > On Wed, Aug 14, 2019 at 4:50 PM Ji Liu wrote: > > > Hi, Wes, as described in JIRA, this was introduced by our recent two > > patches, I have just

Re: Timeline for 0.15.0 release

2019-08-14 Thread Micah Kornfield
Hi Wes, > > Do these need to be dependent on the 64-bit array length discussion? We could hack something that can read the lower 32-bit range, so I guess not, but this leaves a bad taste in my mouth. I think there is likely still enough time to have the discussion and get these implemented, one

Re: [Java] CI builds failing on master

2019-08-14 Thread Micah Kornfield
I just merged this. Thank you Ji Liu. On Wed, Aug 14, 2019 at 4:50 PM Ji Liu wrote: > Hi, Wes, as described in JIRA, this was introduced by our recent two > patches, I have just submitted a PR[1] to fix this. Thanks for tracking > this issue. > > > Thanks, > Ji Liu > > [1]

Re: [DISCUSS][Java] Provide an interface for numeric vectors

2019-08-14 Thread Micah Kornfield
Hi Liya Fan, I'm not sure if this is a good idea. First, floating point operations have more edge cases than integer arithmetic (e.g. dealing with NaNs). Second, and I apologize that I've been remiss in thinking this through on reviews, but I think we should be thinking about how to make

Re: [DISCUSS][JAVA] Make FixedSizedListVector inherit from ListVector

2019-08-14 Thread Ji Liu
My original thought was that introducing a new interface makes the hierarchy a little confusing (FixedListVector->BaseListVector, ListVector->BaseRepeatedVector->BaseListVector) and that we should try to avoid introducing new classes. And you are right, FixedSizeListVector should not include

Re: [ANNOUNCE] New Arrow PMC member: Sebastien Binet

2019-08-14 Thread Fan Liya
Congratulations, Sebastien! Best, Liya Fan On Thu, Aug 15, 2019 at 10:47 AM Ji Liu wrote: > Congrats Sebastian! > > > -- > From: Micah Kornfield > Send Time: Thursday, August 15, 2019, 10:46 > To: dev@arrow.apache.org > Subject: Re:

[jira] [Created] (ARROW-6246) [Website] Add link to R documentation site

2019-08-14 Thread Neal Richardson (JIRA)
Neal Richardson created ARROW-6246: Summary: [Website] Add link to R documentation site Key: ARROW-6246 URL: https://issues.apache.org/jira/browse/ARROW-6246 Project: Apache Arrow Issue

Re: [ANNOUNCE] New Arrow PMC member: Sebastien Binet

2019-08-14 Thread Ji Liu
Congrats Sebastian! -- From: Micah Kornfield Send Time: Thursday, August 15, 2019, 10:46 To: dev@arrow.apache.org Subject: Re: [ANNOUNCE] New Arrow PMC member: Sebastien Binet Congrats. Well deserved. On Wednesday, August 14, 2019, paddy

Re: [ANNOUNCE] New Arrow PMC member: Sebastien Binet

2019-08-14 Thread Micah Kornfield
Congrats. Well deserved. On Wednesday, August 14, 2019, paddy horan wrote: > Congrats Sebastian! > > From: Wes McKinney > Sent: Tuesday, August 13, 2019 4:54 PM > To: dev@arrow.apache.org > Subject: [ANNOUNCE] New

Re: [ANNOUNCE] New Arrow PMC member: Sebastien Binet

2019-08-14 Thread paddy horan
Congrats Sebastian! From: Wes McKinney Sent: Tuesday, August 13, 2019 4:54 PM To: dev@arrow.apache.org Subject: [ANNOUNCE] New Arrow PMC member: Sebastien Binet The Project Management Committee (PMC) for Apache Arrow

[jira] [Created] (ARROW-6245) [DISCUSS][Java] Provide an interface for numeric vectors

2019-08-14 Thread Liya Fan (JIRA)
Liya Fan created ARROW-6245: Summary: [DISCUSS][Java] Provide an interface for numeric vectors Key: ARROW-6245 URL: https://issues.apache.org/jira/browse/ARROW-6245 Project: Apache Arrow Issue

[DISCUSS][Java] Provide an interface for numeric vectors

2019-08-14 Thread Fan Liya
Dear all, We want to provide an interface for all vectors with numeric types (small int, float4, float8, etc). This interface will make it convenient for many operations on a vector, like average, sum, variance, etc. With this interface, the client code will be greatly simplified, with many
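As a rough illustration of the idea in this message, the sketch below shows how one shared read-only interface could let sum and average be written once for several numeric vector types. Everything here is an assumption made for illustration: NumericReader, Float8Stub, SmallIntStub and NumericAggregates are hypothetical names, not Arrow classes, and the stubs are backed by plain Java arrays rather than Arrow buffers.

```java
// Hypothetical sketch: none of these types are part of Arrow's Java API.

/** Read-only view over a vector of numeric values; a null slot holds no value. */
interface NumericReader {
    int getValueCount();
    boolean isNull(int index);
    double getAsDouble(int index);
}

/** Stand-in for a float8 vector, backed by boxed doubles (null = missing value). */
final class Float8Stub implements NumericReader {
    private final Double[] values;
    Float8Stub(Double[] values) { this.values = values; }
    public int getValueCount() { return values.length; }
    public boolean isNull(int index) { return values[index] == null; }
    public double getAsDouble(int index) { return values[index].doubleValue(); }
}

/** Stand-in for a small int vector, backed by primitive shorts (never null). */
final class SmallIntStub implements NumericReader {
    private final short[] values;
    SmallIntStub(short[] values) { this.values = values; }
    public int getValueCount() { return values.length; }
    public boolean isNull(int index) { return false; }
    public double getAsDouble(int index) { return values[index]; }
}

/** Aggregations written once against the interface, skipping null slots. */
final class NumericAggregates {
    static double sum(NumericReader v) {
        double total = 0.0;
        for (int i = 0; i < v.getValueCount(); i++) {
            if (!v.isNull(i)) {
                total += v.getAsDouble(i);
            }
        }
        return total;
    }

    /** Returns NaN when the vector has no non-null values. */
    static double average(NumericReader v) {
        int count = 0;
        for (int i = 0; i < v.getValueCount(); i++) {
            if (!v.isNull(i)) {
                count++;
            }
        }
        return count == 0 ? Double.NaN : sum(v) / count;
    }

    public static void main(String[] args) {
        NumericReader floats = new Float8Stub(new Double[] {1.0, null, 3.0});
        NumericReader ints = new SmallIntStub(new short[] {2, 4, 6});
        System.out.println(sum(floats) + " " + average(floats));
        System.out.println(sum(ints) + " " + average(ints));
    }
}
```

The client code in main does not branch on the concrete vector type, which is the simplification the proposal is after. Widening every value to double is a choice made for this sketch only; it loses precision for large 64-bit integers.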

Re: [Java] CI builds failing on master

2019-08-14 Thread Ji Liu
Hi, Wes, as described in JIRA, this was introduced by our recent two patches, I have just submitted a PR[1] to fix this. Thanks for tracking this issue. Thanks, Ji Liu [1] https://github.com/apache/arrow/pull/5090 -- From:Wes

[VOTE] Alter Arrow binary protocol to address 8-byte Flatbuffer alignment requirements (2nd vote)

2019-08-14 Thread Wes McKinney
hi all, As we've been discussing [1], there is a need to introduce 4 bytes of padding into the preamble of the "encapsulated IPC message" format to ensure that the Flatbuffers metadata payload begins on an 8-byte aligned memory offset. The alternative to this would be for Arrow implementations
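The alignment arithmetic behind this proposal can be sketched as follows. The sketch assumes that each encapsulated message starts on an 8-byte boundary and that the existing preamble is a 4-byte length prefix; AlignmentSketch and paddingFor are hypothetical names used only for illustration, and the code does not read or write the actual IPC format.

```java
// Hypothetical sketch of the padding arithmetic; not Arrow's IPC implementation.
final class AlignmentSketch {
    /** Number of bytes to add so that offset becomes a multiple of alignment. */
    static int paddingFor(long offset, int alignment) {
        if (offset < 0 || alignment <= 0) {
            throw new IllegalArgumentException("offset must be >= 0 and alignment > 0");
        }
        int remainder = (int) (offset % alignment);
        return remainder == 0 ? 0 : alignment - remainder;
    }

    public static void main(String[] args) {
        long messageStart = 0;      // assumed: message begins on an 8-byte boundary
        int lengthPrefixBytes = 4;  // assumed: preamble is a 4-byte length prefix
        long metadataStart = messageStart + lengthPrefixBytes;
        // Padding needed before the Flatbuffers metadata to reach 8-byte alignment.
        System.out.println(paddingFor(metadataStart, 8));
    }
}
```

Under those assumptions the metadata would otherwise begin at offset 4, so 4 extra bytes in the preamble move it to offset 8, matching the figure in the vote text.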

Re: [DISCUSS] Add GetFlightSchema to Flight RPC

2019-08-14 Thread Ryan Murray
Hi All, Does this require a vote? If yes, what is the process for initiating one? If no, I hope this has been enough time for feedback, and I would like to remove the draft designation from the PR. Best, Ryan On Wed, Aug 7, 2019 at 9:31 AM Ryan Murray wrote: > As per everyone's feedback I have renamed

Re: [Discuss][Java] 64-bit lengths for ValueVectors

2019-08-14 Thread Wes McKinney
On Sun, Aug 11, 2019 at 9:40 PM Micah Kornfield wrote: > > Hi Wes and Jacques, > See responses below. > > With regards to the reference implementation point. It is a good point. I'm > > on vacation this week. Unless you're pushing hard on this, can we pick this > > up and discuss more next week?

[jira] [Created] (ARROW-6244) [C++] Implement Partition DataSource

2019-08-14 Thread Francois Saint-Jacques (JIRA)
Francois Saint-Jacques created ARROW-6244: Summary: [C++] Implement Partition DataSource Key: ARROW-6244 URL: https://issues.apache.org/jira/browse/ARROW-6244 Project: Apache Arrow

[Java] CI builds failing on master

2019-08-14 Thread Wes McKinney
We've got some Java-related build failures occurring on master https://travis-ci.org/apache/arrow/jobs/571998256 Since we build the Java library in some of the C++/Python builds sorting this out is fairly urgent so we can continue to merge patches. I opened

[jira] [Created] (ARROW-6242) [C++] Implements basic Dataset/Scanner/ScannerBuilder

2019-08-14 Thread Francois Saint-Jacques (JIRA)
Francois Saint-Jacques created ARROW-6242: Summary: [C++] Implements basic Dataset/Scanner/ScannerBuilder Key: ARROW-6242 URL: https://issues.apache.org/jira/browse/ARROW-6242 Project: Apache

[jira] [Created] (ARROW-6241) [Java] Failures on master

2019-08-14 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-6241: Summary: [Java] Failures on master Key: ARROW-6241 URL: https://issues.apache.org/jira/browse/ARROW-6241 Project: Apache Arrow Issue Type: Bug

[jira] [Created] (ARROW-6240) [Ruby] Arrow::Decimal128Array returns BigDecimal

2019-08-14 Thread Sutou Kouhei (JIRA)
Sutou Kouhei created ARROW-6240: Summary: [Ruby] Arrow::Decimal128Array returns BigDecimal Key: ARROW-6240 URL: https://issues.apache.org/jira/browse/ARROW-6240 Project: Apache Arrow Issue

[jira] [Created] (ARROW-6239) [Python][Parquet] Add examples of using HDFS filesystem and Parquet files together

2019-08-14 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-6239: Summary: [Python][Parquet] Add examples of using HDFS filesystem and Parquet files together Key: ARROW-6239 URL: https://issues.apache.org/jira/browse/ARROW-6239

Re: Timeline for 0.15.0 release

2019-08-14 Thread Antoine Pitrou
Agreed with Wes. Regards Antoine. On 14/08/2019 at 20:30, Wes McKinney wrote: > For the record, I don't think we should hold a major release hostage > if we aren't able to complete various feature milestones in time. > Since it's been about 5-6 weeks since 0.14.0 we're coming close to the

Re: Timeline for 0.15.0 release

2019-08-14 Thread Wes McKinney
For the record, I don't think we should hold a major release hostage if we aren't able to complete various feature milestones in time. Since it's been about 5-6 weeks since 0.14.0 we're coming close to the desired 8-10 week timeline for major releases, so if we need to have 0.16.0 prior to 1.0.0,

[jira] [Created] (ARROW-6237) [R] Add option to set CXXFLAGS when compiling R package with $ARROW_R_CXXFLAGS

2019-08-14 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-6237: Summary: [R] Add option to set CXXFLAGS when compiling R package with $ARROW_R_CXXFLAGS Key: ARROW-6237 URL: https://issues.apache.org/jira/browse/ARROW-6237 Project:

Re: Timeline for 0.15.0 release

2019-08-14 Thread Wes McKinney
On Wed, Aug 14, 2019 at 11:43 AM Micah Kornfield wrote: > > > > > is there anything else that has come up that > > definitely needs to happen before we can release again? > > We need to decide on a way forward for LargeList, LargeBinary, etc, types... > Do these need to be dependent on the

Re: Timeline for 0.15.0 release

2019-08-14 Thread Micah Kornfield
> > is there anything else that has come up that > definitely needs to happen before we can release again? We need to decide on a way forward for LargeList, LargeBinary, etc, types... On Tue, Aug 13, 2019 at 8:27 PM Wes McKinney wrote: > hi folks, > > Since there have been a number of fairly

Re: Timeline for 0.15.0 release

2019-08-14 Thread Wes McKinney
Is there a JIRA for the issue that caused us to pull the 0.14.1 Windows Python wheel installers? If we want to have working wheels for 0.15.0 we'll need a volunteer to help address whatever was wrong with 0.14.1. On Tue, Aug 13, 2019 at 10:26 PM Wes McKinney wrote: > > hi folks, > > Since there

[jira] [Created] (ARROW-6236) [R] Deduplicate strings using Arrow hash tables instead of passing all values through R's global hash table

2019-08-14 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-6236: Summary: [R] Deduplicate strings using Arrow hash tables instead of passing all values through R's global hash table Key: ARROW-6236 URL:

[jira] [Created] (ARROW-6235) [R] Conversion from arrow::BinaryArray to R character vector not implemented

2019-08-14 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-6235: Summary: [R] Conversion from arrow::BinaryArray to R character vector not implemented Key: ARROW-6235 URL: https://issues.apache.org/jira/browse/ARROW-6235 Project:

[jira] [Created] (ARROW-6234) [Java] ListVector hashCode() is not correct

2019-08-14 Thread Ji Liu (JIRA)
Ji Liu created ARROW-6234: Summary: [Java] ListVector hashCode() is not correct Key: ARROW-6234 URL: https://issues.apache.org/jira/browse/ARROW-6234 Project: Apache Arrow Issue Type: Bug

Re: [DISCUSS][JAVA] Make FixedSizedListVector inherit from ListVector

2019-08-14 Thread Micah Kornfield
> > You are right, the main difference between FixSizedListVector and > ListVector is the offsetBuffer, but I think this could be avoided through > an allocateNewSafe() override that calls allocateOffsetBuffer() in > BaseRepeatedValueVector, in this way, offsetBuffer in FixSizedListVector > will

[jira] [Created] (ARROW-6233) Python package on PyPI no longer supports Windows?

2019-08-14 Thread Zhenyi Zhou (JIRA)
Zhenyi Zhou created ARROW-6233: Summary: Python package on PyPI no longer supports Windows? Key: ARROW-6233 URL: https://issues.apache.org/jira/browse/ARROW-6233 Project: Apache Arrow Issue