[jira] [Commented] (ARROW-3910) [Python] Set date_as_object to True in *.to_pandas as default after deduplicating logic implemented

2019-01-16 Thread Jim Crist (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-3910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744144#comment-16744144 ] Jim Crist commented on ARROW-3910: -- I'm just trying to determine if we should also output objects for

[jira] [Commented] (ARROW-3910) [Python] Set date_as_object to True in *.to_pandas as default after deduplicating logic implemented

2019-01-15 Thread Jim Crist (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-3910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743412#comment-16743412 ] Jim Crist commented on ARROW-3910: -- Should `to_pandas_dtype()` return object dtype for date types as

[jira] [Commented] (ARROW-3009) Python ORC failing on 0.10.0

2018-08-07 Thread Jim Crist (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571868#comment-16571868 ] Jim Crist commented on ARROW-3009: -- We run tests in the dask repository pulling one of the test datasets

[jira] [Commented] (ARROW-3009) Python ORC failing on 0.10.0

2018-08-07 Thread Jim Crist (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571856#comment-16571856 ] Jim Crist commented on ARROW-3009: -- Looks like it should be `pyarrow_wrap_batch`. Will issue a PR.

[jira] [Created] (ARROW-3009) Python ORC failing on 0.10.0

2018-08-07 Thread Jim Crist (JIRA)
Jim Crist created ARROW-3009: Summary: Python ORC failing on 0.10.0 Key: ARROW-3009 URL: https://issues.apache.org/jira/browse/ARROW-3009 Project: Apache Arrow Issue Type: Improvement

[jira] [Created] (ARROW-2444) Better handle reading empty parquet files

2018-04-10 Thread Jim Crist (JIRA)
Jim Crist created ARROW-2444: Summary: Better handle reading empty parquet files Key: ARROW-2444 URL: https://issues.apache.org/jira/browse/ARROW-2444 Project: Apache Arrow Issue Type:

[jira] [Commented] (ARROW-2083) Support skipping builds

2018-02-05 Thread Jim Crist (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-2083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16352711#comment-16352711 ] Jim Crist commented on ARROW-2083: -- I added something similar for dask to only build the hdfs tests on

[jira] [Commented] (ARROW-448) [Python] Load HdfsClient default options from core-site.xml or hdfs-site.xml, if available

2018-02-02 Thread Jim Crist (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351100#comment-16351100 ] Jim Crist commented on ARROW-448: - > Does this rely on any environment variables to work (maybe

[jira] [Created] (ARROW-2085) HadoopFileSystem.isdir and .isfile should return False if the path doesn't exist

2018-02-02 Thread Jim Crist (JIRA)
Jim Crist created ARROW-2085: Summary: HadoopFileSystem.isdir and .isfile should return False if the path doesn't exist Key: ARROW-2085 URL: https://issues.apache.org/jira/browse/ARROW-2085 Project:

[jira] [Updated] (ARROW-2081) Hdfs client isn't fork-safe

2018-02-01 Thread Jim Crist (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Crist updated ARROW-2081: - Component/s: Python C++

[jira] [Created] (ARROW-2081) Hdfs client isn't fork-safe

2018-02-01 Thread Jim Crist (JIRA)
Jim Crist created ARROW-2081: Summary: Hdfs client isn't fork-safe Key: ARROW-2081 URL: https://issues.apache.org/jira/browse/ARROW-2081 Project: Apache Arrow Issue Type: Bug

[jira] [Comment Edited] (ARROW-2079) Possibly use `_common_metadata` for schema if `_metadata` isn't available

2018-02-01 Thread Jim Crist (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-2079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349345#comment-16349345 ] Jim Crist edited comment on ARROW-2079 at 2/1/18 9:58 PM: -- cc [~xhochy] was

[jira] [Created] (ARROW-2079) Possibly use `_common_metadata` for schema if `_metadata` isn't available

2018-02-01 Thread Jim Crist (JIRA)
Jim Crist created ARROW-2079: Summary: Possibly use `_common_metadata` for schema if `_metadata` isn't available Key: ARROW-2079 URL: https://issues.apache.org/jira/browse/ARROW-2079 Project: Apache

[jira] [Commented] (ARROW-1999) [Python] from_numpy_dtype returns wrong types

2018-01-25 Thread Jim Crist (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-1999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16340237#comment-16340237 ] Jim Crist commented on ARROW-1999: -- This looks to be because the cython code is wrapping

[jira] [Commented] (ARROW-448) [Python] Load HdfsClient default options from core-site.xml or hdfs-site.xml, if available

2018-01-25 Thread Jim Crist (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16340219#comment-16340219 ] Jim Crist commented on ARROW-448: - Is this necessary? Per the libhdfs docs, if no host is provided, the

[jira] [Created] (ARROW-2036) NativeFile should support standard IOBase methods

2018-01-25 Thread Jim Crist (JIRA)
Jim Crist created ARROW-2036: Summary: NativeFile should support standard IOBase methods Key: ARROW-2036 URL: https://issues.apache.org/jira/browse/ARROW-2036 Project: Apache Arrow Issue Type:

[jira] [Commented] (ARROW-2031) HadoopFileSystem isn't pickleable

2018-01-24 Thread Jim Crist (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338622#comment-16338622 ] Jim Crist commented on ARROW-2031: -- Ah, good catch, sorry about that. Searching for "hdfs" didn't turn

[jira] [Created] (ARROW-2029) [Python] Program crash on `HdfsFile.tell` if file is closed

2018-01-24 Thread Jim Crist (JIRA)
Jim Crist created ARROW-2029: Summary: [Python] Program crash on `HdfsFile.tell` if file is closed Key: ARROW-2029 URL: https://issues.apache.org/jira/browse/ARROW-2029 Project: Apache Arrow

[jira] [Commented] (ARROW-2025) [Python/C++] HDFS Client disconnect closes all open clients

2018-01-24 Thread Jim Crist (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-2025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16337955#comment-16337955 ] Jim Crist commented on ARROW-2025: -- Actually, we should just use `hdfsBuilderSetForceNewInstance` to

[jira] [Commented] (ARROW-2025) [Python/C++] HDFS Client disconnect closes all open clients

2018-01-24 Thread Jim Crist (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-2025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16337932#comment-16337932 ] Jim Crist commented on ARROW-2025: -- Looking closer, I think this may be due to the libhdfs fs cache. From

[jira] [Created] (ARROW-2025) [Python/C++] HDFS Client disconnect closes all open clients

2018-01-24 Thread Jim Crist (JIRA)
Jim Crist created ARROW-2025: Summary: [Python/C++] HDFS Client disconnect closes all open clients Key: ARROW-2025 URL: https://issues.apache.org/jira/browse/ARROW-2025 Project: Apache Arrow

[jira] [Created] (ARROW-1982) [Python] Return parquet statistics min/max as values instead of strings

2018-01-10 Thread Jim Crist (JIRA)
Jim Crist created ARROW-1982: Summary: [Python] Return parquet statistics min/max as values instead of strings Key: ARROW-1982 URL: https://issues.apache.org/jira/browse/ARROW-1982 Project: Apache Arrow

[jira] [Created] (ARROW-1980) [Python] Race condition in `write_to_dataset`

2018-01-09 Thread Jim Crist (JIRA)
Jim Crist created ARROW-1980: Summary: [Python] Race condition in `write_to_dataset` Key: ARROW-1980 URL: https://issues.apache.org/jira/browse/ARROW-1980 Project: Apache Arrow Issue Type: Bug

[jira] [Created] (ARROW-1920) Add support for reading ORC files

2017-12-13 Thread Jim Crist (JIRA)
Jim Crist created ARROW-1920: Summary: Add support for reading ORC files Key: ARROW-1920 URL: https://issues.apache.org/jira/browse/ARROW-1920 Project: Apache Arrow Issue Type: New Feature