[jira] [Resolved] (ARROW-6194) [Java] Add non-static approach in DictionaryEncoder making it easy to extend and reuse

2019-08-13 Thread Micah Kornfield (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved ARROW-6194. Resolution: Fixed Fix Version/s: 0.15.0 Issue resolved by pull request 5055

[jira] [Created] (ARROW-6232) [C++] Rename Argsort kernel to SortIndices

2019-08-13 Thread Sutou Kouhei (JIRA)
Sutou Kouhei created ARROW-6232: --- Summary: [C++] Rename Argsort kernel to SortIndices Key: ARROW-6232 URL: https://issues.apache.org/jira/browse/ARROW-6232 Project: Apache Arrow Issue Type:

[jira] [Updated] (ARROW-6232) [C++] Rename Argsort kernel to SortIndices

2019-08-13 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6232: -- Labels: pull-request-available (was: ) > [C++] Rename Argsort kernel to SortIndices >

[jira] [Commented] (ARROW-6230) [R] Reading in parquent files are 20x slower than reading fst files in R

2019-08-13 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906835#comment-16906835 ] Wes McKinney commented on ARROW-6230: - Oh, you're running into

[jira] [Commented] (ARROW-6230) [R] Reading in parquent files are 20x slower than reading fst files in R

2019-08-13 Thread Zhuo Jia Dai (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906828#comment-16906828 ] Zhuo Jia Dai commented on ARROW-6230: - Actually, I can't even read it in Python on the same machine

[jira] [Commented] (ARROW-6230) [R] Reading in parquent files are 20x slower than reading fst files in R

2019-08-13 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906826#comment-16906826 ] Wes McKinney commented on ARROW-6230: - cc [~romainfrancois] [~npr] > [R] Reading in parquent files

[jira] [Commented] (ARROW-6230) [R] Reading in parquent files are 20x slower than reading fst files in R

2019-08-13 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906825#comment-16906825 ] Wes McKinney commented on ARROW-6230: - For the record this file takes on the same order of magnitude

[jira] [Created] (ARROW-6231) [Python] Consider assigning default column names when reading CSV file and header_rows=0

2019-08-13 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-6231: --- Summary: [Python] Consider assigning default column names when reading CSV file and header_rows=0 Key: ARROW-6231 URL: https://issues.apache.org/jira/browse/ARROW-6231

[jira] [Updated] (ARROW-6230) [R] Reading in parquent files are 20x slower than reading fst files in R

2019-08-13 Thread Zhuo Jia Dai (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhuo Jia Dai updated ARROW-6230: Description: *Problem* Loading any of the data I mentioned below is 20x slower than the fst

[jira] [Updated] (ARROW-6206) [Java][Docs] Document environment variables/java properties

2019-08-13 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6206: -- Labels: pull-request-available (was: ) > [Java][Docs] Document environment variables/java

[jira] [Comment Edited] (ARROW-6230) [R] Reading in parquent files are 20x slower than reading fst files in R

2019-08-13 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906812#comment-16906812 ] Wes McKinney edited comment on ARROW-6230 at 8/14/19 2:40 AM: -- Thanks for

[jira] [Commented] (ARROW-6230) [R] Reading in parquent files are 20x slower than reading fst files in R

2019-08-13 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906812#comment-16906812 ] Wes McKinney commented on ARROW-6230: - Thanks for the example. I'm interested to see what the time is

[jira] [Updated] (ARROW-6230) [R] Reading in parquent files are 20x slower than reading fst files in R

2019-08-13 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-6230: Summary: [R] Reading in parquent files are 20x slower than reading fst files in R (was: Reading

[jira] [Resolved] (ARROW-6087) [Rust] [DataFusion] Implement parallel execution for CSV scan

2019-08-13 Thread Andy Grove (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove resolved ARROW-6087. --- Resolution: Fixed Fix Version/s: 0.15.0 This was resolved in 

[jira] [Commented] (ARROW-6204) [GLib] Add garrow_array_is_in_chunked_array()

2019-08-13 Thread Yosuke Shiro (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906774#comment-16906774 ] Yosuke Shiro commented on ARROW-6204: - Yes. I'll create a pull request. Thanks! > [GLib] Add

[jira] [Created] (ARROW-6230) Reading in parquent files are 20x slower than reading fst files in R

2019-08-13 Thread Zhuo Jia Dai (JIRA)
Zhuo Jia Dai created ARROW-6230: --- Summary: Reading in parquent files are 20x slower than reading fst files in R Key: ARROW-6230 URL: https://issues.apache.org/jira/browse/ARROW-6230 Project: Apache

[jira] [Commented] (ARROW-6180) [C++] Create InputStream that references a segment of a RandomAccessFile

2019-08-13 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906659#comment-16906659 ] Wes McKinney commented on ARROW-6180: - You can let me take care of this so we aren't going back and

[jira] [Commented] (ARROW-5952) [Python] Segfault when reading empty table with category as pandas dataframe

2019-08-13 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906655#comment-16906655 ] Wes McKinney commented on ARROW-5952: - I marked for 0.15.0. This probably isn't too difficult to fix,

[jira] [Updated] (ARROW-5952) [Python] Segfault when reading empty table with category as pandas dataframe

2019-08-13 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-5952: Fix Version/s: (was: 1.0.0) 0.15.0 > [Python] Segfault when reading empty

[jira] [Commented] (ARROW-6180) [C++] Create InputStream that references a segment of a RandomAccessFile

2019-08-13 Thread Deepak Majeti (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906653#comment-16906653 ] Deepak Majeti commented on ARROW-6180: -- [~wesmckinn], [~pitrou] looks like the issue can also happen

[jira] [Commented] (ARROW-6180) [C++] Create InputStream that references a segment of a RandomAccessFile

2019-08-13 Thread Antoine Pitrou (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906647#comment-16906647 ] Antoine Pitrou commented on ARROW-6180: --- I see. That sounds reasonable to me. > [C++] Create

[jira] [Commented] (ARROW-6180) [C++] Create InputStream that references a segment of a RandomAccessFile

2019-08-13 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906643#comment-16906643 ] Wes McKinney commented on ARROW-6180: - The idea is to have an private implementation of

[jira] [Commented] (ARROW-5952) [Python] Segfault when reading empty table with category as pandas dataframe

2019-08-13 Thread Daniel Nugent (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906634#comment-16906634 ] Daniel Nugent commented on ARROW-5952: -- FWIW, I tried to figure out where this was getting handled

[jira] [Resolved] (ARROW-6177) [C++] Add Array::Validate()

2019-08-13 Thread Sutou Kouhei (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sutou Kouhei resolved ARROW-6177. - Resolution: Fixed Fix Version/s: 0.15.0 Issue resolved by pull request 5067

[jira] [Updated] (ARROW-6186) [Packaging][C++] Plasma headers not included for ubuntu-xenial libplasma-dev debian package

2019-08-13 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6186: -- Labels: debian packaging pull-request-available (was: debian packaging) > [Packaging][C++]

[jira] [Commented] (ARROW-6180) [C++] Create InputStream that references a segment of a RandomAccessFile

2019-08-13 Thread Zherui Cao (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906614#comment-16906614 ] Zherui Cao commented on ARROW-6180: --- [~wesmckinn] If I don't change BufferedInputStream, I need to make

[jira] [Comment Edited] (ARROW-6180) [C++] Create InputStream that references a segment of a RandomAccessFile

2019-08-13 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16904683#comment-16904683 ] Wes McKinney edited comment on ARROW-6180 at 8/13/19 8:23 PM: -- The way I

[jira] [Commented] (ARROW-6180) [C++] Create InputStream that references a segment of a RandomAccessFile

2019-08-13 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906584#comment-16906584 ] Wes McKinney commented on ARROW-6180: - [~pitrou] the problem is that multiple threads are creating

[jira] [Closed] (ARROW-6109) [Integration] Docker image for integration testing can't be built on windows

2019-08-13 Thread Paddy Horan (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paddy Horan closed ARROW-6109. -- Resolution: Won't Fix Will be fixed upstream in docker at some point. > [Integration] Docker image

[jira] [Updated] (ARROW-3246) [Python][Parquet] direct reading/writing of pandas categoricals in parquet

2019-08-13 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-3246: -- Labels: parquet pull-request-available (was: parquet) > [Python][Parquet] direct

[jira] [Resolved] (ARROW-517) [C++] Verbose Array::Equals

2019-08-13 Thread Antoine Pitrou (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved ARROW-517. -- Resolution: Fixed Fix Version/s: 0.15.0 Issue resolved by pull request 4782

[jira] [Commented] (ARROW-6211) [Java] Remove dependency on RangeEqualsVisitor from ValueVector interface

2019-08-13 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906415#comment-16906415 ] Bryan Cutler commented on ARROW-6211: - This sounds good to me then, I agree it would be useful to

[jira] [Commented] (ARROW-6202) [Java] Exception in thread "main" org.apache.arrow.memory.OutOfMemoryException: Unable to allocate buffer of size 4 due to memory limit. Current allocation: 2147483646

2019-08-13 Thread Micah Kornfield (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906377#comment-16906377 ] Micah Kornfield commented on ARROW-6202: Sorry should clarify above. With an 800mb of data I

[jira] [Commented] (ARROW-6206) [Java][Docs] Document environment variables/java properties

2019-08-13 Thread Micah Kornfield (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906350#comment-16906350 ] Micah Kornfield commented on ARROW-6206: [~tianchen92] thanks for volunteering to do this. >

[jira] [Created] (ARROW-6228) [C++] Add context lines to Diff formatting

2019-08-13 Thread Benjamin Kietzman (JIRA)
Benjamin Kietzman created ARROW-6228: Summary: [C++] Add context lines to Diff formatting Key: ARROW-6228 URL: https://issues.apache.org/jira/browse/ARROW-6228 Project: Apache Arrow

[jira] [Assigned] (ARROW-6206) [Java][Docs] Document environment variables/java properties

2019-08-13 Thread Micah Kornfield (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned ARROW-6206: -- Assignee: Ji Liu > [Java][Docs] Document environment variables/java properties >

[jira] [Commented] (ARROW-6206) [Java][Docs] Document environment variables/java properties

2019-08-13 Thread Micah Kornfield (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906345#comment-16906345 ] Micah Kornfield commented on ARROW-6206: "is there a charter for what java usecases will be

[jira] [Created] (ARROW-6227) [Python] pyarrow.array() shouldn't coerce np.nan to string

2019-08-13 Thread Igor Yastrebov (JIRA)
Igor Yastrebov created ARROW-6227: - Summary: [Python] pyarrow.array() shouldn't coerce np.nan to string Key: ARROW-6227 URL: https://issues.apache.org/jira/browse/ARROW-6227 Project: Apache Arrow

[jira] [Created] (ARROW-6226) [C++] refactor Diff and PrettyPrint to share code

2019-08-13 Thread Benjamin Kietzman (JIRA)
Benjamin Kietzman created ARROW-6226: Summary: [C++] refactor Diff and PrettyPrint to share code Key: ARROW-6226 URL: https://issues.apache.org/jira/browse/ARROW-6226 Project: Apache Arrow

[jira] [Commented] (ARROW-6225) [Website] Update arrow-site/README and any other places to point website contributors in right direction

2019-08-13 Thread Neal Richardson (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906342#comment-16906342 ] Neal Richardson commented on ARROW-6225: I can make an Infra ticket–is that the way to go? >

[jira] [Commented] (ARROW-6225) [Website] Update arrow-site/README and any other places to point website contributors in right direction

2019-08-13 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906339#comment-16906339 ] Wes McKinney commented on ARROW-6225: - Yeah Infra would have to change the default branch >

[jira] [Commented] (ARROW-6225) [Website] Update arrow-site/README and any other places to point website contributors in right direction

2019-08-13 Thread Neal Richardson (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906338#comment-16906338 ] Neal Richardson commented on ARROW-6225: On [https://github.com/apache/arrow-site/pull/9],

[jira] [Assigned] (ARROW-6225) [Website] Update arrow-site/README and any other places to point website contributors in right direction

2019-08-13 Thread Neal Richardson (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson reassigned ARROW-6225: -- Assignee: Neal Richardson > [Website] Update arrow-site/README and any other places

[jira] [Commented] (ARROW-6202) [Java] Exception in thread "main" org.apache.arrow.memory.OutOfMemoryException: Unable to allocate buffer of size 4 due to memory limit. Current allocation: 2147483646

2019-08-13 Thread Micah Kornfield (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906332#comment-16906332 ] Micah Kornfield commented on ARROW-6202: Thanks for the update. The code on master might still

[jira] [Updated] (ARROW-6219) [Java] Add API for JDBC adapter that can convert less then the full result set at a time.

2019-08-13 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6219: -- Labels: pull-request-available (was: ) > [Java] Add API for JDBC adapter that can convert

[jira] [Closed] (ARROW-6191) [C++] buffer size default value will throw an error

2019-08-13 Thread Zherui Cao (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zherui Cao closed ARROW-6191. - Resolution: Not A Bug > [C++] buffer size default value will throw an error >

[jira] [Commented] (ARROW-6191) [C++] buffer size default value will throw an error

2019-08-13 Thread Zherui Cao (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906250#comment-16906250 ] Zherui Cao commented on ARROW-6191: --- letting caller set the buffer size themselves is ok. If they see

[jira] [Commented] (ARROW-6222) Serialising numpy array yields `pyarrow.lib.ArrowNotImplementedError: list`

2019-08-13 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906229#comment-16906229 ] Wes McKinney commented on ARROW-6222: - Seems like the error messages could be improved in the short

[jira] [Created] (ARROW-6225) [Website] Update arrow-site/README and any other places to point website contributors in right direction

2019-08-13 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-6225: --- Summary: [Website] Update arrow-site/README and any other places to point website contributors in right direction Key: ARROW-6225 URL:

[jira] [Updated] (ARROW-6224) [Python] remaining usages of the 'data' attribute (from previous Column) cause warnings

2019-08-13 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6224: -- Labels: pull-request-available (was: ) > [Python] remaining usages of the 'data' attribute

[jira] [Resolved] (ARROW-6205) ARROW_DEPRECATED warning when including io/interfaces.h from CUDA (.cu) source

2019-08-13 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-6205. - Resolution: Fixed Fix Version/s: 0.15.0 Issue resolved by pull request 5062

[jira] [Updated] (ARROW-6205) [C++] ARROW_DEPRECATED warning when including io/interfaces.h from CUDA (.cu) source

2019-08-13 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-6205: Summary: [C++] ARROW_DEPRECATED warning when including io/interfaces.h from CUDA (.cu) source

[jira] [Commented] (ARROW-6222) Serialising numpy array yields `pyarrow.lib.ArrowNotImplementedError: list`

2019-08-13 Thread Marcel Ackermann (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906204#comment-16906204 ] Marcel Ackermann commented on ARROW-6222: - Thank you very much! Looking forward to list support

[jira] [Commented] (ARROW-6223) [C++] Configuration error with Anaconda Python 3.7.4

2019-08-13 Thread Antoine Pitrou (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906201#comment-16906201 ] Antoine Pitrou commented on ARROW-6223: --- Reported upstream at

[jira] [Created] (ARROW-6224) [Python] remaining usages of the 'data' attribute (from previous Column) cause warnings

2019-08-13 Thread Joris Van den Bossche (JIRA)
Joris Van den Bossche created ARROW-6224: Summary: [Python] remaining usages of the 'data' attribute (from previous Column) cause warnings Key: ARROW-6224 URL:

[jira] [Assigned] (ARROW-6224) [Python] remaining usages of the 'data' attribute (from previous Column) cause warnings

2019-08-13 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche reassigned ARROW-6224: Assignee: Joris Van den Bossche > [Python] remaining usages of the 'data'

[jira] [Commented] (ARROW-6222) Serialising numpy array yields `pyarrow.lib.ArrowNotImplementedError: list`

2019-08-13 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906196#comment-16906196 ] Joris Van den Bossche commented on ARROW-6222: -- > Is there currently any way to serialize a

[jira] [Issue Comment Deleted] (ARROW-5295) [Python] accept pyarrow values / scalars in constructor functions ?

2019-08-13 Thread Marcel Ackermann (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcel Ackermann updated ARROW-5295: Comment: was deleted (was: This would be required for serializing dataframe that contain

[jira] [Updated] (ARROW-6223) [C++] Configuration error with Anaconda Python 3.7.4

2019-08-13 Thread Antoine Pitrou (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated ARROW-6223: -- Affects Version/s: 0.14.1 > [C++] Configuration error with Anaconda Python 3.7.4 >

[jira] [Updated] (ARROW-6223) [C++] Configuration error with Anaconda Python 3.7.4

2019-08-13 Thread Antoine Pitrou (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated ARROW-6223: -- Description: {code} CMake Error at cmake_modules/FindPythonLibsNew.cmake:127 (message):

[jira] [Updated] (ARROW-6223) [C++] Configuration error with Anaconda Python 3.7.4

2019-08-13 Thread Antoine Pitrou (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated ARROW-6223: -- Priority: Blocker (was: Major) > [C++] Configuration error with Anaconda Python 3.7.4 >

[jira] [Commented] (ARROW-5295) [Python] accept pyarrow values / scalars in constructor functions ?

2019-08-13 Thread Marcel Ackermann (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906193#comment-16906193 ] Marcel Ackermann commented on ARROW-5295: - This would be required for serializing dataframe that

[jira] [Updated] (ARROW-6223) [C++] Configuration error with Anaconda Python 3.7.4

2019-08-13 Thread Antoine Pitrou (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated ARROW-6223: -- Component/s: Python C++ > [C++] Configuration error with Anaconda Python

[jira] [Resolved] (ARROW-6181) [R] Only allow R package to install without libarrow on linux

2019-08-13 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-6181. - Resolution: Fixed Fix Version/s: 0.15.0 Issue resolved by pull request 5045

[jira] [Created] (ARROW-6223) [C++] Configuration error with Anaconda Python 3.7.4

2019-08-13 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-6223: - Summary: [C++] Configuration error with Anaconda Python 3.7.4 Key: ARROW-6223 URL: https://issues.apache.org/jira/browse/ARROW-6223 Project: Apache Arrow

[jira] [Commented] (ARROW-5295) [Python] accept pyarrow values / scalars in constructor functions ?

2019-08-13 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906190#comment-16906190 ] Joris Van den Bossche commented on ARROW-5295: -- Additional case (from ARROW-6222): pyarrow

[jira] [Commented] (ARROW-6222) Serialising numpy array yields `pyarrow.lib.ArrowNotImplementedError: list`

2019-08-13 Thread Marcel Ackermann (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906189#comment-16906189 ] Marcel Ackermann commented on ARROW-6222: - Thank you for the explanation [~jorisvandenbossche].

[jira] [Commented] (ARROW-6222) Serialising numpy array yields `pyarrow.lib.ArrowNotImplementedError: list`

2019-08-13 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906184#comment-16906184 ] Joris Van den Bossche commented on ARROW-6222: -- Focusing on the cases independent of pytorch

[jira] [Resolved] (ARROW-5746) [Website] Move website source out of apache/arrow

2019-08-13 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-5746. - Resolution: Fixed Fix Version/s: 0.15.0 Issue resolved by pull request 5015

[jira] [Commented] (ARROW-3762) [C++] Parquet arrow::Table reads error when overflowing capacity of BinaryArray

2019-08-13 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906173#comment-16906173 ] Wes McKinney commented on ARROW-3762: - Reopened. There have been a lot of patches affecting the

[jira] [Reopened] (ARROW-3762) [C++] Parquet arrow::Table reads error when overflowing capacity of BinaryArray

2019-08-13 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney reopened ARROW-3762: - > [C++] Parquet arrow::Table reads error when overflowing capacity of > BinaryArray >

[jira] [Updated] (ARROW-3762) [C++] Parquet arrow::Table reads error when overflowing capacity of BinaryArray

2019-08-13 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-3762: Fix Version/s: 0.15.0 > [C++] Parquet arrow::Table reads error when overflowing capacity of >

[jira] [Updated] (ARROW-6177) [C++] Add Array::Validate()

2019-08-13 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-6177: Summary: [C++] Add Array::Validate() (was: [C++] Add Arrow::Validate()) > [C++] Add

[jira] [Updated] (ARROW-6222) Serialising numpy array yields `pyarrow.lib.ArrowNotImplementedError: list`

2019-08-13 Thread Marcel Ackermann (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcel Ackermann updated ARROW-6222: Description: I want to serialize pytorch tensors, but as they are not implemented in arrow

[jira] [Comment Edited] (ARROW-5610) [Python] Define extension type API in Python to "receive" or "send" a foreign extension type

2019-08-13 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905256#comment-16905256 ] Joris Van den Bossche edited comment on ARROW-5610 at 8/13/19 1:03 PM:

[jira] [Commented] (ARROW-6222) Serialising numpy array yields `pyarrow.lib.ArrowNotImplementedError: list`

2019-08-13 Thread Marcel Ackermann (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906167#comment-16906167 ] Marcel Ackermann commented on ARROW-6222: - Sure: {code:python} import torch import pyarrow

[jira] [Commented] (ARROW-6222) Serialising numpy array yields `pyarrow.lib.ArrowNotImplementedError: list`

2019-08-13 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906156#comment-16906156 ] Joris Van den Bossche commented on ARROW-6222: -- Can you try to make a reproducible example?

[jira] [Updated] (ARROW-6222) Serialising numpy array yields `pyarrow.lib.ArrowNotImplementedError: list`

2019-08-13 Thread Marcel Ackermann (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcel Ackermann updated ARROW-6222: Description: I want to serialize pytorch tensors, but as they are not implemented in arrow

[jira] [Commented] (ARROW-5566) [Python] Overhaul type unification from Python sequence in arrow::py::InferArrowType

2019-08-13 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906151#comment-16906151 ] Joris Van den Bossche commented on ARROW-5566: -- Ah, yes, indeed, I would expect so. Can you

[jira] [Created] (ARROW-6222) Serialising numpy array yields `pyarrow.lib.ArrowNotImplementedError: list`

2019-08-13 Thread Marcel Ackermann (JIRA)
Marcel Ackermann created ARROW-6222: --- Summary: Serialising numpy array yields `pyarrow.lib.ArrowNotImplementedError: list` Key: ARROW-6222 URL: https://issues.apache.org/jira/browse/ARROW-6222

[jira] [Commented] (ARROW-3762) [C++] Parquet arrow::Table reads error when overflowing capacity of BinaryArray

2019-08-13 Thread Igor Yastrebov (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906096#comment-16906096 ] Igor Yastrebov commented on ARROW-3762: --- Interestingly enough, converting string columns to

[jira] [Commented] (ARROW-6211) [Java] Remove dependency on RangeEqualsVisitor from ValueVector interface

2019-08-13 Thread Pindikura Ravindra (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906084#comment-16906084 ] Pindikura Ravindra commented on ARROW-6211: --- It'll be useful for generic visitors to carry

[jira] [Commented] (ARROW-5566) [Python] Overhaul type unification from Python sequence in arrow::py::InferArrowType

2019-08-13 Thread Igor Yastrebov (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906052#comment-16906052 ] Igor Yastrebov commented on ARROW-5566: --- [~jorisvandenbossche] should it fail on 

[jira] [Commented] (ARROW-5566) [Python] Overhaul type unification from Python sequence in arrow::py::InferArrowType

2019-08-13 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906041#comment-16906041 ] Joris Van den Bossche commented on ARROW-5566: -- [~Igor Yastrebov] that's the expected

[jira] [Commented] (ARROW-6206) [Java][Docs] Document environment variables/java properties

2019-08-13 Thread Jim Northrup (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906033#comment-16906033 ] Jim Northrup commented on ARROW-6206: - NIO is not going to go away, and java is not going to stop

[jira] [Commented] (ARROW-6211) [Java] Remove dependency on RangeEqualsVisitor from ValueVector interface

2019-08-13 Thread Ji Liu (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906003#comment-16906003 ] Ji Liu commented on ARROW-6211: --- [~pravindra] One more question, 'IN value' above is unnecessary? > [Java]

[jira] [Resolved] (ARROW-6218) [Java] Add UINT type test in integration to avoid potential overflow

2019-08-13 Thread Praveen Kumar Desabandu (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Praveen Kumar Desabandu resolved ARROW-6218. Resolution: Fixed Fix Version/s: 1.0.0 Issue resolved by pull

[jira] [Commented] (ARROW-6202) [Java] Exception in thread "main" org.apache.arrow.memory.OutOfMemoryException: Unable to allocate buffer of size 4 due to memory limit. Current allocation: 2147483646

2019-08-13 Thread Jim Northrup (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905982#comment-16905982 ] Jim Northrup commented on ARROW-6202: - for the record we have pre-tensorflow column counts of about

[jira] [Commented] (ARROW-6202) [Java] Exception in thread "main" org.apache.arrow.memory.OutOfMemoryException: Unable to allocate buffer of size 4 due to memory limit. Current allocation: 2147483646

2019-08-13 Thread Jim Northrup (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905974#comment-16905974 ] Jim Northrup commented on ARROW-6202: - full program execution below, stderr. record count 6.6 million

[jira] [Commented] (ARROW-5566) [Python] Overhaul type unification from Python sequence in arrow::py::InferArrowType

2019-08-13 Thread Igor Yastrebov (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905960#comment-16905960 ] Igor Yastrebov commented on ARROW-5566: --- [~wesmckinn] I have found another issue of this type: when

[jira] [Commented] (ARROW-6211) [Java] Remove dependency on RangeEqualsVisitor from ValueVector interface

2019-08-13 Thread Pindikura Ravindra (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905880#comment-16905880 ] Pindikura Ravindra commented on ARROW-6211: --- yes, I think it should. > [Java] Remove