Re: [Discuss] C++ filenames: hyphens or underscores?

2019-08-06 Thread Philipp Moritz
I also have a small preference for underscores but would also be fine with dashes. Underscores seem to be more common (and therefore blend better with vendored code), agree with the style guide, and are closest to the existing code. Also, as an aside, having file_names look like variable_names is nice.

Re: [Discuss] C++ filenames: hyphens or underscores?

2019-08-06 Thread Micah Kornfield
I also have a preference for underscore but can get used to anything. I agree with the points François made above about the recommendation of the style guide and the smaller change to the existing code base. On Tue, Aug 6, 2019 at 6:52 PM Francois Saint-Jacques < fsaintjacq...@gmail.com> wrote:

[jira] [Created] (ARROW-6155) [Java] Extract a super interface for vectors whose elements reside in continuous memory segments

2019-08-06 Thread Liya Fan (JIRA)
Liya Fan created ARROW-6155: --- Summary: [Java] Extract a super interface for vectors whose elements reside in continuous memory segments Key: ARROW-6155 URL: https://issues.apache.org/jira/browse/ARROW-6155

Re: [Discuss] C++ filenames: hyphens or underscores?

2019-08-06 Thread Francois Saint-Jacques
My vote would go with underscores to minimize changes and minimize exceptions to the Google style guide reference. I also suggest that we add this to the linters somehow, if it's not too much trouble. François On Tue, Aug 6, 2019 at 9:35 PM Sutou Kouhei wrote: > > Hi, > > I like hyphens. > >
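A linter check of the kind suggested above could be sketched roughly as follows. This is a minimal sketch in Python, not the project's actual lint tooling: the function names `find_hyphenated` and `walk_tree` and the extension list are hypothetical, chosen only for illustration, and it assumes underscores are the agreed convention.

```python
import os

# Hypothetical extension list (an assumption); adjust to the tree being checked.
CPP_EXTENSIONS = (".h", ".hpp", ".cc", ".cpp")


def find_hyphenated(paths):
    """Return the C++ file paths whose base name contains a hyphen."""
    offenders = []
    for path in paths:
        # Only the file name matters; hyphens in directory names are ignored.
        name = os.path.basename(path)
        if name.endswith(CPP_EXTENSIONS) and "-" in name:
            offenders.append(path)
    return offenders


def walk_tree(root):
    """Yield the path of every file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            yield os.path.join(dirpath, filename)
```

A lint step could then fail the build whenever `find_hyphenated(walk_tree(root))` returns a non-empty list for the C++ source root.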

Re: [Discuss] C++ filenames: hyphens or underscores?

2019-08-06 Thread Sutou Kouhei
Hi, I like hyphens, because more Linux commands use hyphens than underscores. Here are counts on my Debian GNU/Linux machine: % ls /usr/bin/ | grep -- - | wc -l 956 % ls /usr/bin/ | grep _ | wc -l 343 Thanks, -- kou In <20190806140340.2a7ffab2@fsol> "[Discuss] C++ filenames: hyphens or
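The same count can be reproduced without the shell pipeline. The sketch below is an illustrative Python equivalent; the function name `count_separators` is hypothetical. As with the two `grep` commands, a name that contains both characters is counted once in each total. Note that `os.listdir` also returns dot-files, which a plain `ls` omits, so the totals may differ slightly.

```python
import os


def count_separators(names):
    """Return (names containing '-', names containing '_')."""
    hyphens = sum(1 for name in names if "-" in name)
    underscores = sum(1 for name in names if "_" in name)
    return hyphens, underscores


# Example usage on a Linux machine:
#   print(count_separators(os.listdir("/usr/bin")))
```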

[jira] [Created] (ARROW-6154) Too many open files (os error 24)

2019-08-06 Thread Yesh (JIRA)
Yesh created ARROW-6154: --- Summary: Too many open files (os error 24) Key: ARROW-6154 URL: https://issues.apache.org/jira/browse/ARROW-6154 Project: Apache Arrow Issue Type: Bug Components:

[jira] [Created] (ARROW-6153) [R] Address parquet deprecation warning

2019-08-06 Thread Neal Richardson (JIRA)
Neal Richardson created ARROW-6153: -- Summary: [R] Address parquet deprecation warning Key: ARROW-6153 URL: https://issues.apache.org/jira/browse/ARROW-6153 Project: Apache Arrow Issue Type:

Re: [Discuss] C++ filenames: hyphens or underscores?

2019-08-06 Thread Wes McKinney
I note that a change from underscores to hyphens would significantly affect the Parquet, Plasma, and Gandiva libraries so I think we need to hear from other developers of those subprojects. Underscores are definitely less disruptive to the status quo. On Tue, Aug 6, 2019 at 4:18 PM Wes McKinney

[jira] [Created] (ARROW-6152) [C++][Parquet] Write arrow::Array directly into parquet::TypedColumnWriter

2019-08-06 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-6152: --- Summary: [C++][Parquet] Write arrow::Array directly into parquet::TypedColumnWriter Key: ARROW-6152 URL: https://issues.apache.org/jira/browse/ARROW-6152 Project:

Re: [Discuss] C++ filenames: hyphens or underscores?

2019-08-06 Thread Wes McKinney
I have a slight gut preference for underscores but I am OK with changing everything to hyphens. The hyphens will probably grow on me as it means pressing the "shift" key less frequently. Is there any technical argument for using one over the other? My understanding is that `git blame` is pretty

[jira] [Created] (ARROW-6151) [R] See if possible to generate r/inst/NOTICE.txt rather than duplicate information

2019-08-06 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-6151: --- Summary: [R] See if possible to generate r/inst/NOTICE.txt rather than duplicate information Key: ARROW-6151 URL: https://issues.apache.org/jira/browse/ARROW-6151

[VOTE] Alter Arrow binary protocol to address 8-byte Flatbuffer alignment requirements

2019-08-06 Thread Wes McKinney
hi all, As we've been discussing for the last 5 weeks or so [1], there is a need to introduce 4 bytes of padding into the preamble of the "encapsulated IPC message" format to ensure that the Flatbuffers metadata payload begins on an 8-byte aligned memory offset. The alternative to this would be
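The alignment arithmetic behind this proposal can be illustrated with a short sketch. The helper `padding_for_alignment` is hypothetical, and the 4-byte preamble used in the example is an assumption made for illustration; the message above does not spell out the existing layout.

```python
def padding_for_alignment(offset, alignment=8):
    """Return how many padding bytes must follow `offset` so that the
    next byte starts on a multiple of `alignment`."""
    return (-offset) % alignment


# Assumption for illustration: the payload currently starts right after
# a 4-byte preamble, i.e. at offset 4, which is not a multiple of 8.
preamble_size = 4
padding = padding_for_alignment(preamble_size)  # 4 bytes of padding
payload_offset = preamble_size + padding        # 8, which is 8-byte aligned
```

Under that assumption, 4 bytes of padding is exactly what is needed to move the payload from offset 4 to offset 8.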

[jira] [Created] (ARROW-6150) Intermittent Pyarrow HDFS IO error

2019-08-06 Thread Saurabh Bajaj (JIRA)
Saurabh Bajaj created ARROW-6150: Summary: Intermittent Pyarrow HDFS IO error Key: ARROW-6150 URL: https://issues.apache.org/jira/browse/ARROW-6150 Project: Apache Arrow Issue Type: Bug

Arrow sync call tomorrow (August 7) at 12:00 US/Eastern, 16:00 UTC

2019-08-06 Thread Neal Richardson
Hi all, Reminder that the biweekly Arrow call is tomorrow at https://meet.google.com/vtm-teks-phx. All are welcome to join. Notes will be sent out to the mailing list afterwards. Neal

[jira] [Created] (ARROW-6149) [Parquet] Decimal comparisons used for min/max statistics are not correct

2019-08-06 Thread Philip Felton (JIRA)
Philip Felton created ARROW-6149: Summary: [Parquet] Decimal comparisons used for min/max statistics are not correct Key: ARROW-6149 URL: https://issues.apache.org/jira/browse/ARROW-6149 Project:

[jira] [Created] (ARROW-6148) Missing debian build dependencies

2019-08-06 Thread Francois Saint-Jacques (JIRA)
Francois Saint-Jacques created ARROW-6148: - Summary: Missing debian build dependencies Key: ARROW-6148 URL: https://issues.apache.org/jira/browse/ARROW-6148 Project: Apache Arrow

[jira] [Created] (ARROW-6147) [Go] implement a Flight client

2019-08-06 Thread Sebastien Binet (JIRA)
Sebastien Binet created ARROW-6147: -- Summary: [Go] implement a Flight client Key: ARROW-6147 URL: https://issues.apache.org/jira/browse/ARROW-6147 Project: Apache Arrow Issue Type: New

[jira] [Created] (ARROW-6146) [Go] implement a Plasma client

2019-08-06 Thread Sebastien Binet (JIRA)
Sebastien Binet created ARROW-6146: -- Summary: [Go] implement a Plasma client Key: ARROW-6146 URL: https://issues.apache.org/jira/browse/ARROW-6146 Project: Apache Arrow Issue Type: New

[Discuss] C++ filenames: hyphens or underscores?

2019-08-06 Thread Antoine Pitrou
Hello, The filenames in the C++ source tree are a bit ad hoc and inconsistent. Sometimes they use hyphens for word separation, sometimes underscores. In ARROW-4648 it was proposed that we unify C++ file naming, so there are two possible options: only hyphens, or only underscores. What

[jira] [Created] (ARROW-6145) [Java] UnionVector created by MinorType#getNewVector could not keep field type info properly

2019-08-06 Thread Ji Liu (JIRA)
Ji Liu created ARROW-6145: - Summary: [Java] UnionVector created by MinorType#getNewVector could not keep field type info properly Key: ARROW-6145 URL: https://issues.apache.org/jira/browse/ARROW-6145

[jira] [Created] (ARROW-6144) Implement random function in Gandiva

2019-08-06 Thread Prudhvi Porandla (JIRA)
Prudhvi Porandla created ARROW-6144: --- Summary: Implement random function in Gandiva Key: ARROW-6144 URL: https://issues.apache.org/jira/browse/ARROW-6144 Project: Apache Arrow Issue Type:

Re: [Discuss][Java] 64-bit lengths for ValueVectors

2019-08-06 Thread Fan Liya
Hi Micah, Thanks a lot for doing this. I am a little concerned about whether there is any negative performance impact on the current 32-bit-length-based applications. Can we run some performance comparisons on our existing benchmarks? Best, Liya Fan On Tue, Aug 6, 2019 at 3:35 PM Micah Kornfield

[Discuss][Java] 64-bit lengths for ValueVectors

2019-08-06 Thread Micah Kornfield
There have been some previous discussions on the mailing list about supporting 64-bit lengths for Java ValueVectors (this is what the IPC specification and C++ support). I created a PR [1] that changes all APIs that I could find that take an index to take a "long" instead of an "int" (and similarly
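The practical difference between the two index types can be shown with a little arithmetic. The sketch below is illustrative only: the constants are the well-known limits of Java's signed 32-bit `int` and 64-bit `long`, while the helper `fits_in_int32` is a hypothetical name and not part of any Arrow API.

```python
# Limits of Java's signed integer types.
INT32_MAX = 2**31 - 1  # largest value of a Java "int"
INT64_MAX = 2**63 - 1  # largest value of a Java "long"


def fits_in_int32(index):
    """Return True if a non-negative index is representable as a Java int."""
    return 0 <= index <= INT32_MAX


# A vector addressed with an "int" index tops out at INT32_MAX elements;
# the very next index only fits in a "long".
assert fits_in_int32(INT32_MAX)
assert not fits_in_int32(INT32_MAX + 1)
```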