[jira] [Commented] (ARROW-13240) [C++][Parquet] Page statistics not written in v2?

2023-01-06 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655568#comment-17655568 ] Micah Kornfield commented on ARROW-13240: -

[jira] [Created] (ARROW-18253) [C++][Parquet] Improve bounds checking on some inputs

2022-11-04 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-18253: --- Summary: [C++][Parquet] Improve bounds checking on some inputs Key: ARROW-18253 URL: https://issues.apache.org/jira/browse/ARROW-18253 Project: Apache Arrow

[jira] [Commented] (ARROW-17983) [Parquet][C++][Python] "List index overflow" when read parquet file

2022-10-17 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17619234#comment-17619234 ] Micah Kornfield commented on ARROW-17983: - IIRC, I think offset type here is inferred from the

[jira] [Closed] (ARROW-10784) [Python] Loading pyarrow.compute isn't thread safe

2022-10-15 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield closed ARROW-10784. --- Resolution: Cannot Reproduce > [Python] Loading pyarrow.compute isn't thread safe >

[jira] [Commented] (ARROW-10784) [Python] Loading pyarrow.compute isn't thread safe

2022-10-15 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17618185#comment-17618185 ] Micah Kornfield commented on ARROW-10784: - yes, haven't had any luck with a repro > [Python]

[jira] [Commented] (ARROW-17535) [Python] List arrays aren't supported in to_pandas calls

2022-10-15 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17618152#comment-17618152 ] Micah Kornfield commented on ARROW-17535: - Yeah, so I agree with the conclusion that scalar

[jira] [Assigned] (ARROW-16326) [C++][Python] Add GCS Timeout parameter for GCS FileSystem.

2022-09-02 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned ARROW-16326: --- Assignee: Micah Kornfield > [C++][Python] Add GCS Timeout parameter for GCS

[jira] [Commented] (ARROW-16326) [C++][Python] Add GCS Timeout parameter for GCS FileSystem.

2022-09-02 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17599613#comment-17599613 ] Micah Kornfield commented on ARROW-16326: - This was actually done in the PR.

[jira] [Resolved] (ARROW-16326) [C++][Python] Add GCS Timeout parameter for GCS FileSystem.

2022-09-02 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved ARROW-16326. - Resolution: Fixed > [C++][Python] Add GCS Timeout parameter for GCS FileSystem. >

[jira] [Commented] (ARROW-17459) [C++] Support nested data conversions for chunked array

2022-09-01 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17599111#comment-17599111 ] Micah Kornfield commented on ARROW-17459: - It's probably a case of different batch sizes.

[jira] [Commented] (ARROW-17459) [C++] Support nested data conversions for chunked array

2022-09-01 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17599053#comment-17599053 ] Micah Kornfield commented on ARROW-17459: - [~arthurpassos] awesome, nice work. IMO, I don't

[jira] [Commented] (ARROW-17459) [C++] Support nested data conversions for chunked array

2022-08-31 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598742#comment-17598742 ] Micah Kornfield commented on ARROW-17459: - You would have to follow this up the stack from the

[jira] [Commented] (ARROW-17459) [C++] Support nested data conversions for chunked array

2022-08-31 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598501#comment-17598501 ] Micah Kornfield commented on ARROW-17459: - Yes, I think there are some code changes, we

[jira] [Commented] (ARROW-17459) [C++] Support nested data conversions for chunked array

2022-08-30 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598058#comment-17598058 ] Micah Kornfield commented on ARROW-17459: - i.e. LargeBinary, LargeString, LargeList these are

[jira] [Commented] (ARROW-17459) [C++] Support nested data conversions for chunked array

2022-08-30 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598038#comment-17598038 ] Micah Kornfield commented on ARROW-17459: - 1. ChunkedArrays have a Flatten method that will do

[jira] [Created] (ARROW-17535) [Python] List arrays aren't supported in to_pandas calls

2022-08-25 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-17535: --- Summary: [Python] List arrays aren't supported in to_pandas calls Key: ARROW-17535 URL: https://issues.apache.org/jira/browse/ARROW-17535 Project: Apache Arrow

[jira] [Commented] (ARROW-17069) [Python][R] GCSFIleSystem reports cannot resolve host on public buckets

2022-07-14 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17566710#comment-17566710 ] Micah Kornfield commented on ARROW-17069: - Also does it help if you increase retry_limit_seconds

[jira] [Commented] (ARROW-17069) [Python][R] GCSFIleSystem reports cannot resolve host on public buckets

2022-07-14 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17566709#comment-17566709 ] Micah Kornfield commented on ARROW-17069: - That is surprising, but I'm not an expert on GCS and why

[jira] [Commented] (ARROW-7494) [Java] Remove reader index and writer index from ArrowBuf

2022-07-03 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17561933#comment-17561933 ] Micah Kornfield commented on ARROW-7494: It still seems like a valid improvement to me, but it

[jira] [Commented] (ARROW-16339) [C++][Parquet] Parquet FileMetaData key_value_metadata not always mapped to Arrow Schema metadata

2022-05-09 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17534121#comment-17534121 ] Micah Kornfield commented on ARROW-16339: - Q1: I also think the answer is yes. Q2: Yes, that

[jira] [Created] (ARROW-16484) [Go][Parquet] Ensure a WriterVersion is written out in parquet go.

2022-05-05 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-16484: --- Summary: [Go][Parquet] Ensure a WriterVersion is written out in parquet go. Key: ARROW-16484 URL: https://issues.apache.org/jira/browse/ARROW-16484 Project:

[jira] [Commented] (ARROW-16433) [Release][C++] parquet-arrow-test test fails on windows

2022-05-02 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17530800#comment-17530800 ] Micah Kornfield commented on ARROW-16433: - Is there a stack trace or exception that can be

[jira] [Created] (ARROW-16326) [C++][Python] Add GCS Timeout parameter for GCS FileSystem.

2022-04-25 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-16326: --- Summary: [C++][Python] Add GCS Timeout parameter for GCS FileSystem. Key: ARROW-16326 URL: https://issues.apache.org/jira/browse/ARROW-16326 Project: Apache

[jira] [Commented] (ARROW-16118) [C++] Reduce memory usage when writing to IPC

2022-04-21 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525893#comment-17525893 ] Micah Kornfield commented on ARROW-16118: - Also, we should be careful how this is enabled, since if

[jira] [Created] (ARROW-16270) [C++][Python][FileSystem] Make directory paths returned uniform

2022-04-21 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-16270: --- Summary: [C++][Python][FileSystem] Make directory paths returned uniform Key: ARROW-16270 URL: https://issues.apache.org/jira/browse/ARROW-16270 Project:

[jira] [Commented] (ARROW-12203) [C++][Python] Switch default Parquet version to 2.4

2022-04-20 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525397#comment-17525397 ] Micah Kornfield commented on ARROW-12203: - CC [~willb_google] This will still potentially

[jira] [Created] (ARROW-16227) [Archery] Make cpp argument list keyword only

2022-04-18 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-16227: --- Summary: [Archery] Make cpp argument list keyword only Key: ARROW-16227 URL: https://issues.apache.org/jira/browse/ARROW-16227 Project: Apache Arrow

[jira] [Created] (ARROW-16226) [C++] Add better coverage for filesystem tell.

2022-04-18 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-16226: --- Summary: [C++] Add better coverage for filesystem tell. Key: ARROW-16226 URL: https://issues.apache.org/jira/browse/ARROW-16226 Project: Apache Arrow

[jira] [Commented] (ARROW-16160) [C++] IPC Stream Reader doesn't check if extra fields are present for RecordBatches

2022-04-08 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17519819#comment-17519819 ] Micah Kornfield commented on ARROW-16160: - It appears that on the master branch we get:

[jira] [Commented] (ARROW-16160) [C++] IPC Stream Reader doesn't check if extra fields are present for RecordBatches

2022-04-08 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17519818#comment-17519818 ] Micah Kornfield commented on ARROW-16160: - The opposite direction returns an error

[jira] [Created] (ARROW-16160) [C++] IPC Stream Reader doesn't check if extra fields are present for RecordBatches

2022-04-08 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-16160: --- Summary: [C++] IPC Stream Reader doesn't check if extra fields are present for RecordBatches Key: ARROW-16160 URL: https://issues.apache.org/jira/browse/ARROW-16160

[jira] [Commented] (ARROW-16147) [C++] ParquetFileWriter doesn't call sink_.Close when using GcsRandomAccessFile

2022-04-07 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17519176#comment-17519176 ] Micah Kornfield commented on ARROW-16147: - Thanks for the thorough tests. It isn't clear to me

[jira] [Commented] (ARROW-16102) [C++] Builds that use cpp/cmake_modules/FindgRPCAlt.cmake cannot build GCS support

2022-04-04 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17516633#comment-17516633 ] Micah Kornfield commented on ARROW-16102: -

[jira] [Updated] (ARROW-16102) [C++] Builds that use cpp/cmake_modules/FindgRPCAlt.cmake cannot build GCS support

2022-04-04 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated ARROW-16102: Component/s: C++ > [C++] Builds that use cpp/cmake_modules/FindgRPCAlt.cmake cannot build

[jira] [Updated] (ARROW-16102) [C++] Builds that use cpp/cmake_modules/FindgRPCAlt.cmake cannot build GCS support

2022-04-04 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated ARROW-16102: Summary: [C++] Builds that use cpp/cmake_modules/FindgRPCAlt.cmake cannot build GCS

[jira] [Commented] (ARROW-16102) [C++] Builds that us cpp/cmake_modules/FindgRPCAlt.cmake cannot build GCS support

2022-04-03 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17516576#comment-17516576 ] Micah Kornfield commented on ARROW-16102: - This affects mingw and os/x CI builds. > [C++]

[jira] [Created] (ARROW-16102) [C++] Builds that us cpp/cmake_modules/FindgRPCAlt.cmake cannot build GCS support

2022-04-03 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-16102: --- Summary: [C++] Builds that us cpp/cmake_modules/FindgRPCAlt.cmake cannot build GCS support Key: ARROW-16102 URL: https://issues.apache.org/jira/browse/ARROW-16102

[jira] [Comment Edited] (ARROW-16048) [PyArrow] Null buffers with Pickle protocol.

2022-03-28 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17513531#comment-17513531 ] Micah Kornfield edited comment on ARROW-16048 at 3/28/22, 5:47 PM: --- I

[jira] [Commented] (ARROW-16048) [PyArrow] Null buffers with Pickle protocol.

2022-03-28 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17513531#comment-17513531 ] Micah Kornfield commented on ARROW-16048: - I don't think this affects faithfulness. Another

[jira] [Created] (ARROW-16048) [PyArrow] Null buffers with Pickle protocol.

2022-03-28 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-16048: --- Summary: [PyArrow] Null buffers with Pickle protocol. Key: ARROW-16048 URL: https://issues.apache.org/jira/browse/ARROW-16048 Project: Apache Arrow

[jira] [Commented] (ARROW-14892) [Python] Add bindings for GCS filesystem

2022-03-15 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17507198#comment-17507198 ] Micah Kornfield commented on ARROW-14892: - Yes, spending some time on it today and hopefully

[jira] [Commented] (ARROW-15855) [Python] Add dictionary_pagesize_limit to Parquet writer

2022-03-07 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17502449#comment-17502449 ] Micah Kornfield commented on ARROW-15855: - This is a duplicate of

[jira] [Created] (ARROW-15783) [Python] Converting arrow MonthDayNanoInterval to pandas fails DCHECK

2022-02-24 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-15783: --- Summary: [Python] Converting arrow MonthDayNanoInterval to pandas fails DCHECK Key: ARROW-15783 URL: https://issues.apache.org/jira/browse/ARROW-15783 Project:

[jira] [Created] (ARROW-15728) [Python] Zstd IPC test is flaky.

2022-02-17 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-15728: --- Summary: [Python] Zstd IPC test is flaky. Key: ARROW-15728 URL: https://issues.apache.org/jira/browse/ARROW-15728 Project: Apache Arrow Issue Type:

[jira] [Created] (ARROW-15727) [Python] Lists of MonthDayNano Interval can't be converted to Pandas

2022-02-17 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-15727: --- Summary: [Python] Lists of MonthDayNano Interval can't be converted to Pandas Key: ARROW-15727 URL: https://issues.apache.org/jira/browse/ARROW-15727 Project:

[jira] [Commented] (ARROW-15492) [Python] handle timestamp type in parquet file for compatibility with older HiveQL

2022-02-10 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17490695#comment-17490695 ] Micah Kornfield commented on ARROW-15492: - OK so it isn't a bug with missing the logical type.  

[jira] [Comment Edited] (ARROW-12509) [C++] More fine-grained control of file creation in filesystem layer

2022-02-08 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17489286#comment-17489286 ] Micah Kornfield edited comment on ARROW-12509 at 2/9/22, 6:52 AM: -- Note

[jira] [Commented] (ARROW-12509) [C++] More fine-grained control of file creation in filesystem layer

2022-02-08 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17489286#comment-17489286 ] Micah Kornfield commented on ARROW-12509: - Note about the use-case on the blocking issue.  In

[jira] [Assigned] (ARROW-14893) [C++] Allow creating GCS filesystem from URI

2022-02-07 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned ARROW-14893: --- Assignee: Micah Kornfield > [C++] Allow creating GCS filesystem from URI >

[jira] [Assigned] (ARROW-14892) [Python] Add bindings for GCS filesystem

2022-02-07 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned ARROW-14892: --- Assignee: Micah Kornfield > [Python] Add bindings for GCS filesystem >

[jira] [Commented] (ARROW-15492) [Python] handle timestamp type in parquet file for compatibility with older HiveQL

2022-02-07 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17488603#comment-17488603 ] Micah Kornfield commented on ARROW-15492: - So this looks like an oversight with int96. The

[jira] [Created] (ARROW-15596) thift_internal.h assumes shared_ptr type in some cases

2022-02-06 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-15596: --- Summary: thift_internal.h assumes shared_ptr type in some cases Key: ARROW-15596 URL: https://issues.apache.org/jira/browse/ARROW-15596 Project: Apache Arrow

[jira] [Assigned] (ARROW-15080) [Python] Allow creation of month_day_nano interval from tuple

2022-02-06 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned ARROW-15080: --- Assignee: Micah Kornfield > [Python] Allow creation of month_day_nano interval

[jira] [Commented] (ARROW-15492) [Python] handle timestamp type in parquet file for compatibility with older HiveQL

2022-02-05 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17487634#comment-17487634 ] Micah Kornfield commented on ARROW-15492: - On the exposing the write field, per the other Jira I

[jira] [Comment Edited] (ARROW-9311) [JS] Use feature enum in javascript

2022-02-05 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17487632#comment-17487632 ] Micah Kornfield edited comment on ARROW-9311 at 2/6/22, 6:52 AM: - There

[jira] [Commented] (ARROW-9311) [JS] Use feature enum in javascript

2022-02-05 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17487632#comment-17487632 ] Micah Kornfield commented on ARROW-9311: There was a [Feature

[jira] [Commented] (ARROW-15548) [C++][Parquet] Field-level metadata are not supported? (ColumnMetadata.key_value_metadata)

2022-02-05 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17487624#comment-17487624 ] Micah Kornfield commented on ARROW-15548: - I don't think it is a bad idea to have something

[jira] [Commented] (ARROW-12203) [C++][Python] Switch default Parquet version to 2.4

2022-02-05 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17487623#comment-17487623 ] Micah Kornfield commented on ARROW-12203: - 8.0 release is targeted in the April time frame? >

[jira] [Created] (ARROW-15511) [Python] GIL not held for Ndarray1DIndexer on

2022-01-31 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-15511: --- Summary: [Python] GIL not held for Ndarray1DIndexer on Key: ARROW-15511 URL: https://issues.apache.org/jira/browse/ARROW-15511 Project: Apache Arrow

[jira] [Commented] (ARROW-5569) [C++] import avro C++ code to code base.

2021-12-14 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17459655#comment-17459655 ] Micah Kornfield commented on ARROW-5569: [~willjones127] agreed.  If you want to pursue support

[jira] [Commented] (ARROW-12203) [C++][Python] Switch default Parquet version to 2.4

2021-12-14 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17459648#comment-17459648 ] Micah Kornfield commented on ARROW-12203: - Unfortunately not yet, I think if we could wait until

[jira] [Commented] (ARROW-15080) [Python] Allow creation of month_day_nano interval from tuple

2021-12-14 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17459646#comment-17459646 ] Micah Kornfield commented on ARROW-15080: - I wasn't sure if we wanted this, if you think it is a

[jira] [Commented] (ARROW-12706) [Python] Drop python 3.6 and numpy 1.16 support

2021-12-14 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17459645#comment-17459645 ] Micah Kornfield commented on ARROW-12706: - My understanding is that most Google have committed

[jira] [Commented] (ARROW-15073) [C++][Parquet][Python] LZ4- and zstd- compressed parquet files are unreadable by (py)spark

2021-12-13 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17458302#comment-17458302 ] Micah Kornfield commented on ARROW-15073: - If LZ4 gets translated to LZ4_RAW depending on the

[jira] [Comment Edited] (ARROW-15073) [C++][Parquet][Python] LZ4- and zstd- compressed parquet files are unreadable by (py)spark

2021-12-13 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17458302#comment-17458302 ] Micah Kornfield edited comment on ARROW-15073 at 12/13/21, 10:56 AM: -

[jira] [Comment Edited] (ARROW-15073) [C++][Parquet][Python] LZ4- and zstd- compressed parquet files are unreadable by (py)spark

2021-12-11 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17457700#comment-17457700 ] Micah Kornfield edited comment on ARROW-15073 at 12/11/21, 6:07 PM:

[jira] [Commented] (ARROW-15073) [C++][Parquet][Python] LZ4- and zstd- compressed parquet files are unreadable by (py)spark

2021-12-11 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17457700#comment-17457700 ] Micah Kornfield commented on ARROW-15073: - This is expected; LZ4 has always had compatibility

[jira] [Commented] (ARROW-14960) [C++] Google style guide allows mutable references now, what do?

2021-12-06 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454381#comment-17454381 ] Micah Kornfield commented on ARROW-14960: - I think the size of Google's code base meant that

[jira] [Commented] (ARROW-8214) [C++] Flatbuffers based serialization protocol for Expressions

2021-11-26 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17449665#comment-17449665 ] Micah Kornfield commented on ARROW-8214: I think the IR model seems like the right way to go with

[jira] [Commented] (ARROW-11829) [C++] Update developer style guide on usage of shared_ptr

2021-11-26 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17449664#comment-17449664 ] Micah Kornfield commented on ARROW-11829: - Yes, that is the thread.  Sorry, hope to make some

[jira] [Commented] (ARROW-14770) Direct (individualized) access to definition levels, repetition levels, and numeric data of a column

2021-11-22 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17447632#comment-17447632 ] Micah Kornfield commented on ARROW-14770: - FWIW writing V2 data pages isn't production ready in

[jira] [Commented] (ARROW-11901) [Java] Investigate potential performance improvement of compression codec

2021-11-17 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17445041#comment-17445041 ] Micah Kornfield commented on ARROW-11901: - {quote}It's not about eliminating anything, it's

[jira] [Resolved] (ARROW-14601) [Java] Error comments for Minor Type of TIMESTAMPSEC

2021-11-05 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved ARROW-14601. - Resolution: Fixed Issue resolved by pull request 11618

[jira] [Assigned] (ARROW-14601) [Java] Error comments for Minor Type of TIMESTAMPSEC

2021-11-05 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned ARROW-14601: --- Assignee: Kun Liu > [Java] Error comments for Minor Type of TIMESTAMPSEC >

[jira] [Updated] (ARROW-14601) [Java] Error comments for Minor Type of TIMESTAMPSEC

2021-11-05 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated ARROW-14601: Affects Version/s: 6.0.0 > [Java] Error comments for Minor Type of TIMESTAMPSEC >

[jira] [Updated] (ARROW-14601) [Java] Error comments for Minor Type of TIMESTAMPSEC

2021-11-05 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated ARROW-14601: Component/s: Java > [Java] Error comments for Minor Type of TIMESTAMPSEC >

[jira] [Updated] (ARROW-14601) [Java] Error comments for Minor Type of TIMESTAMPSEC

2021-11-05 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated ARROW-14601: Fix Version/s: 7.0.0 > [Java] Error comments for Minor Type of TIMESTAMPSEC >

[jira] [Assigned] (ARROW-12970) [Python] Efficient "row accessor" for a pyarrow RecordBatch / Table

2021-11-05 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned ARROW-12970: --- Assignee: Micah Kornfield > [Python] Efficient "row accessor" for a pyarrow

[jira] [Commented] (ARROW-11901) [Java] Investigate potential performance improvement of compression codec

2021-11-05 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17439561#comment-17439561 ] Micah Kornfield commented on ARROW-11901: - {quote} As Samuel pointed out, it might be a valid

[jira] [Resolved] (ARROW-14547) Reading FixedSizeListArray from Parquet with nulls

2021-11-02 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved ARROW-14547. - Resolution: Fixed > Reading FixedSizeListArray from Parquet with nulls >

[jira] [Commented] (ARROW-14547) Reading FixedSizeListArray from Parquet with nulls

2021-11-02 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17437162#comment-17437162 ] Micah Kornfield commented on ARROW-14547: - Duplicate of

[jira] [Commented] (ARROW-11901) [Java] Investigate potential performance improvement of compression codec

2021-10-26 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17434646#comment-17434646 ] Micah Kornfield commented on ARROW-11901: - Does the presets library add a lot of value? Could

[jira] [Commented] (ARROW-14303) [C++][Parquet] Do not duplicate Schema metadata in Parquet schema metadata and serialized ARROW:schema value

2021-10-23 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17433403#comment-17433403 ] Micah Kornfield commented on ARROW-14303: - It's been a while since I looked at this code, but I

[jira] [Commented] (ARROW-14422) [Python] Allow parquet::WriterProperties::created_by to be set via pyarrow.ParquetWriter for compatibility with older parquet-mr

2021-10-23 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17433402#comment-17433402 ] Micah Kornfield commented on ARROW-14422: - [fastparquet created_by

[jira] [Commented] (ARROW-12976) [Python] Arrow-to-Python conversion is slow

2021-10-23 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17433399#comment-17433399 ] Micah Kornfield commented on ARROW-12976: - Yeah, given #1 and #2, I think I'll try to simply

[jira] [Commented] (ARROW-14422) [Python] Allow parquet::WriterProperties::created_by to be set via pyarrow.ParquetWriter for compatibility with older parquet-mr

2021-10-23 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17433398#comment-17433398 ] Micah Kornfield commented on ARROW-14422: - {quote}Maintaining some regression test between

[jira] [Updated] (ARROW-14345) [C++] Implement streaming reads for GCS FileSystem

2021-10-20 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated ARROW-14345: Fix Version/s: (was: 6.0.0) 7.0.0 > [C++] Implement streaming

[jira] [Resolved] (ARROW-14345) [C++] Implement streaming reads for GCS FileSystem

2021-10-20 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved ARROW-14345. - Fix Version/s: 6.0.0 Resolution: Fixed Issue resolved by pull request 11436

[jira] [Commented] (ARROW-12976) [Python] Arrow-to-Python conversion is slow

2021-10-15 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17429520#comment-17429520 ] Micah Kornfield commented on ARROW-12976: - One thing we discussed on the sync call is if a more

[jira] [Commented] (ARROW-12976) [Python] Arrow-to-Python conversion is slow

2021-10-09 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426744#comment-17426744 ] Micah Kornfield commented on ARROW-12976: - [~apitrou] [~jorisvandenbossche] going to see if I

[jira] [Assigned] (ARROW-12976) [Python] Arrow-to-Python conversion is slow

2021-10-09 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned ARROW-12976: --- Assignee: Micah Kornfield > [Python] Arrow-to-Python conversion is slow >

[jira] [Resolved] (ARROW-13604) [Java] Remove deprecation annotations for APIs representing unsupported operations

2021-10-07 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved ARROW-13604. - Fix Version/s: 6.0.0 Resolution: Fixed Issue resolved by pull request 10911

[jira] [Resolved] (ARROW-13257) [Java][Dataset] Allow passing empty columns for projection

2021-10-07 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved ARROW-13257. - Resolution: Fixed Issue resolved by pull request 10652

[jira] [Commented] (ARROW-14196) [C++][Parquet] Default to compliant nested types in Parquet writer

2021-10-07 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425938#comment-17425938 ] Micah Kornfield commented on ARROW-14196: - Also CC [~jpivarski] on thoughts on how this might

[jira] [Commented] (ARROW-14196) [C++][Parquet] Default to compliant nested types in Parquet writer

2021-10-07 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425937#comment-17425937 ] Micah Kornfield commented on ARROW-14196: - I think similar to other default changes we have

[jira] [Commented] (ARROW-14196) [C++][Parquet] Default to compliant nested types in Parquet writer

2021-10-07 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425933#comment-17425933 ] Micah Kornfield commented on ARROW-14196: - Yeah, I think it is really only the name change that

[jira] [Commented] (ARROW-13151) [Python] Unable to read single child field of struct column from Parquet

2021-10-06 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425351#comment-17425351 ] Micah Kornfield commented on ARROW-13151: - With the PR that is up this now seems to work:

[jira] [Comment Edited] (ARROW-13151) [Python] Unable to read single child field of struct column from Parquet

2021-10-06 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420211#comment-17420211 ] Micah Kornfield edited comment on ARROW-13151 at 10/7/21, 5:38 AM: ---

[jira] [Comment Edited] (ARROW-13806) [Python] Add conversion to/from Pandas/Python for Month, Day Nano Interval Type

2021-10-04 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17423775#comment-17423775 ] Micah Kornfield edited comment on ARROW-13806 at 10/4/21, 6:52 AM: ---
