[jira] [Created] (PARQUET-1524) [C++] remove dependency on getopt in parquet tools

2019-02-05 Thread Micah Kornfield (JIRA)
Micah Kornfield created PARQUET-1524: Summary: [C++] remove dependency on getopt in parquet tools Key: PARQUET-1524 URL: https://issues.apache.org/jira/browse/PARQUET-1524 Project: Parquet

[jira] [Commented] (PARQUET-1524) [C++] remove dependency on getopt in parquet tools

2019-02-05 Thread Micah Kornfield (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760555#comment-16760555 ] Micah Kornfield commented on PARQUET-1524: -- Created to replace ARROW-4456 > [C++] remove

[jira] [Created] (PARQUET-1582) [C++] Add ToString method ColumnDescriptor

2019-05-17 Thread Micah Kornfield (JIRA)
Micah Kornfield created PARQUET-1582: Summary: [C++] Add ToString method ColumnDescriptor Key: PARQUET-1582 URL: https://issues.apache.org/jira/browse/PARQUET-1582 Project: Parquet Issue

[jira] [Created] (PARQUET-1581) [C++] Fix undefined behavior in encoding.cc when num_dictionary_values is 0.

2019-05-17 Thread Micah Kornfield (JIRA)
Micah Kornfield created PARQUET-1581: Summary: [C++] Fix undefined behavior in encoding.cc when num_dictionary_values is 0. Key: PARQUET-1581 URL: https://issues.apache.org/jira/browse/PARQUET-1581

[jira] [Commented] (PARQUET-1621) [C++] Add encrypted parquet files to apache parquet-testing repository

2019-08-28 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918296#comment-16918296 ] Micah Kornfield commented on PARQUET-1621: -- It looks like the PR is merged, can should this be

[jira] [Commented] (PARQUET-1403) [C++] Coerce Arrow half-precision float to float32

2019-11-20 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16978970#comment-16978970 ] Micah Kornfield commented on PARQUET-1403: -- it seems like this should belong in the Arrow

[jira] [Commented] (PARQUET-1700) [C++] Stream API: Add support for repeated fields

2019-11-25 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16982125#comment-16982125 ] Micah Kornfield commented on PARQUET-1700: -- it might be nice to have something reusable

[jira] [Created] (PARQUET-1788) [C++] ColumnWriter has undefined behavior when writing arrow chunks

2020-02-05 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-1788: Summary: [C++] ColumnWriter has undefined behavior when writing arrow chunks Key: PARQUET-1788 URL: https://issues.apache.org/jira/browse/PARQUET-1788

[jira] [Assigned] (PARQUET-1841) [C++] Experiment to see if using SIMD shuffle operations for DecodeSpaced improves performance

2020-04-16 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned PARQUET-1841: Assignee: Micah Kornfield > [C++] Experiment to see if using SIMD shuffle

[jira] [Commented] (PARQUET-1841) [C++] Experiment to see if using SIMD shuffle operations for DecodeSpaced improves performance

2020-04-16 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085431#comment-17085431 ] Micah Kornfield commented on PARQUET-1841: -- For AVX512 enabled processor the mask_expand like

[jira] [Commented] (PARQUET-1841) [C++] Experiment to see if using SIMD shuffle operations for DecodeSpaced improves performance

2020-04-16 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085433#comment-17085433 ] Micah Kornfield commented on PARQUET-1841: -- To get this assigned to you you will need someone

[jira] [Comment Edited] (PARQUET-1841) [C++] Experiment to see if using SIMD shuffle operations for DecodeSpaced improves performance

2020-04-16 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085431#comment-17085431 ] Micah Kornfield edited comment on PARQUET-1841 at 4/17/20, 4:52 AM:

[jira] [Created] (PARQUET-1841) [C++] Experiment to see if using SIMD shuffle operations for DecodeSpaced improves performance

2020-04-13 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-1841: Summary: [C++] Experiment to see if using SIMD shuffle operations for DecodeSpaced improves performance Key: PARQUET-1841 URL:

[jira] [Created] (PARQUET-1838) [C++] Expose an API that allows direct writing of RLE information for rep/def levels when writing parquet files

2020-04-11 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-1838: Summary: [C++] Expose an API that allows direct writing of RLE information for rep/def levels when writing parquet files Key: PARQUET-1838 URL:

[jira] [Created] (PARQUET-1837) [C++] Expose an API that surface RLE information for rep/def levels when reading parquet files

2020-04-11 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-1837: Summary: [C++] Expose an API that surface RLE information for rep/def levels when reading parquet files Key: PARQUET-1837 URL:

[jira] [Updated] (PARQUET-1838) [C++] Expose an API that allows direct writing of RLE information for rep/def levels when writing parquet files

2020-04-11 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-1838: - Description: When writing data to parquet it can potentially be more efficient to

[jira] [Created] (PARQUET-1840) DecodeSpaced copies more values then necessary

2020-04-12 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-1840: Summary: DecodeSpaced copies more values then necessary Key: PARQUET-1840 URL: https://issues.apache.org/jira/browse/PARQUET-1840 Project: Parquet

[jira] [Updated] (PARQUET-1840) DecodeSpaced copies/touches more values then necessary

2020-04-12 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-1840: - Summary: DecodeSpaced copies/touches more values then necessary (was: DecodeSpaced

[jira] [Updated] (PARQUET-1840) [C++] DecodeSpaced copies/touches more values then necessary

2020-04-12 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-1840: - Summary: [C++] DecodeSpaced copies/touches more values then necessary (was:

[jira] [Updated] (PARQUET-1840) [C++] DecodeSpaced copies more values then necessary

2020-04-12 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-1840: - Summary: [C++] DecodeSpaced copies more values then necessary (was: [C++]

[jira] [Commented] (PARQUET-1841) [C++] Experiment to see if using SIMD shuffle operations for DecodeSpaced improves performance

2020-04-22 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089347#comment-17089347 ] Micah Kornfield commented on PARQUET-1841: -- I've been using

[jira] [Assigned] (PARQUET-1839) values_read not updated in ReadBatchSpaced

2020-05-03 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned PARQUET-1839: Assignee: Micah Kornfield > values_read not updated in ReadBatchSpaced >

[jira] [Created] (PARQUET-1899) [C++] Deprecated ReadBatchSpaced in parquet/column_reader

2020-08-19 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-1899: Summary: [C++] Deprecated ReadBatchSpaced in parquet/column_reader Key: PARQUET-1899 URL: https://issues.apache.org/jira/browse/PARQUET-1899 Project: Parquet

[jira] [Commented] (PARQUET-1904) [C++] Export file_offset in RowGroupMetaData

2020-08-26 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17185582#comment-17185582 ] Micah Kornfield commented on PARQUET-1904: -- [~wesm] [~uwe] I don't have access in the parquet

[jira] [Moved] (PARQUET-1904) [C++] Export file_offset in RowGroupMetaData

2020-08-26 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield moved ARROW-9824 to PARQUET-1904: - Component/s: (was: C++) parquet-cpp

[jira] [Resolved] (PARQUET-1904) [C++] Export file_offset in RowGroupMetaData

2020-08-26 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved PARQUET-1904. -- Resolution: Fixed > [C++] Export file_offset in RowGroupMetaData >

[jira] [Created] (PARQUET-1933) [Format] Clarify encodings and data page guidance.

2020-10-21 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-1933: Summary: [Format] Clarify encodings and data page guidance. Key: PARQUET-1933 URL: https://issues.apache.org/jira/browse/PARQUET-1933 Project: Parquet

[jira] [Updated] (PARQUET-1935) [C++][Parquet] nullptr access violation when writing arrays of non-nullable values

2020-10-23 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-1935: - Component/s: parquet-cpp > [C++][Parquet] nullptr access violation when writing arrays

[jira] [Moved] (PARQUET-1935) [C++][Parquet] nullptr access violation when writing arrays of non-nullable values

2020-10-23 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield moved ARROW-10377 to PARQUET-1935: -- Key: PARQUET-1935 (was: ARROW-10377)

[jira] [Commented] (PARQUET-1935) [C++][Parquet] nullptr access violation when writing arrays of non-nullable values

2020-10-23 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17219933#comment-17219933 ] Micah Kornfield commented on PARQUET-1935: -- One workaround for this is to detect the last

[jira] [Assigned] (PARQUET-1882) Writing an all-null column and then reading it with buffered_stream aborts the process

2020-07-11 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned PARQUET-1882: Assignee: Micah Kornfield > Writing an all-null column and then reading it with

[jira] [Created] (PARQUET-1877) [C++] Reconcile container size with string size for memory issues

2020-06-16 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-1877: Summary: [C++] Reconcile container size with string size for memory issues Key: PARQUET-1877 URL: https://issues.apache.org/jira/browse/PARQUET-1877 Project:

[jira] [Commented] (PARQUET-1946) Parquet File not readable by Google big query (works with Spark)

2020-11-29 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17240479#comment-17240479 ] Micah Kornfield commented on PARQUET-1946: -- Are you using V2 datapages? BQ doesn't yet

[jira] [Commented] (PARQUET-1946) Parquet File not readable by Google big query (works with Spark)

2020-12-06 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17244962#comment-17244962 ] Micah Kornfield commented on PARQUET-1946: -- I'm not an expert on the tool. Looking through

[jira] [Commented] (PARQUET-1936) WriteBatchSpaced writes incorrect value for parquet when input contains NULL list

2020-11-15 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17232503#comment-17232503 ] Micah Kornfield commented on PARQUET-1936: -- [~Ruta Dhaneshwar] sure, do you maybe want to make

[jira] [Commented] (PARQUET-1935) [C++][Parquet] nullptr access violation when writing arrays of non-nullable values

2020-10-30 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223994#comment-17223994 ] Micah Kornfield commented on PARQUET-1935: -- Yes this is possibly a 2.0.1 bug fix candidate if

[jira] [Commented] (PARQUET-1936) WriteBatchSpaced writes incorrect value for parquet when input contains NULL list

2020-10-30 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223995#comment-17223995 ] Micah Kornfield commented on PARQUET-1936: -- [~Ruta Dhaneshwar] part of this might be related

[jira] [Commented] (PARQUET-1958) Forced UTF8 encoding of BYTE_ARRAY on stream::read/write

2021-01-22 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17270548#comment-17270548 ] Micah Kornfield commented on PARQUET-1958: -- I actually am not sure that the check is needed at

[jira] [Assigned] (PARQUET-2056) [C++] Add ability for retrieving dictionary and indices separately for ColumnReader

2021-06-15 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned PARQUET-2056: Assignee: Jinpeng Zhou (was: Micah Kornfield) > [C++] Add ability for

[jira] [Resolved] (PARQUET-2056) [C++] Add ability for retrieving dictionary and indices separately for ColumnReader

2021-06-17 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved PARQUET-2056. -- Fix Version/s: cpp-5.0.0 Resolution: Fixed Issue resolved by pull request

[jira] [Moved] (PARQUET-2056) [C++] Add ability for retrieving dictionary and indices separately for ColumnReader

2021-06-08 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield moved ARROW-13012 to PARQUET-2056: -- Key: PARQUET-2056 (was: ARROW-13012) Workflow:

[jira] [Updated] (PARQUET-2056) [C++] Add ability for retrieving dictionary and indices separately for ColumnReader

2021-06-08 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-2056: - Component/s: parquet-cpp > [C++] Add ability for retrieving dictionary and indices

[jira] [Commented] (PARQUET-1990) [C++] ConvertedType::NA is written out in some cases

2021-03-31 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17312462#comment-17312462 ] Micah Kornfield commented on PARQUET-1990: -- Nice find on the reverted format change. I added

[jira] [Commented] (PARQUET-1991) Reserve ConvertedType==24 due to bug in parquet-cpp implementation

2021-03-31 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17312478#comment-17312478 ] Micah Kornfield commented on PARQUET-1991: -- I'm OK with won't fix. I was thinking thrift

[jira] [Resolved] (PARQUET-1122) [C++] Support 2-level list encoding in Arrow decoding

2021-04-02 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved PARQUET-1122. -- Resolution: Implemented > [C++] Support 2-level list encoding in Arrow decoding >

[jira] [Commented] (PARQUET-1122) [C++] Support 2-level list encoding in Arrow decoding

2021-04-02 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17314180#comment-17314180 ] Micah Kornfield commented on PARQUET-1122: -- Yes, all common types should not be readable (some

[jira] [Resolved] (PARQUET-1990) [C++] ConvertedType::NA is written out in some cases

2021-03-31 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved PARQUET-1990. -- Resolution: Fixed Issue resolved by pull request 9863

[jira] [Updated] (PARQUET-2003) Decimal Statistics emitted by parquet-cpp are broken

2021-03-18 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-2003: - Summary: Decimal Statistics emitted by parquet-cpp are broken (was: Decimal

[jira] [Created] (PARQUET-2003) Decimal Statistics admitted for parquet-cpp are broken

2021-03-18 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-2003: Summary: Decimal Statistics admitted for parquet-cpp are broken Key: PARQUET-2003 URL: https://issues.apache.org/jira/browse/PARQUET-2003 Project: Parquet

[jira] [Commented] (PARQUET-1995) [C++][Parquet] Crash at parquet::TypedColumnWriterImpl<>::WriteBatchSpaced

2021-03-09 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17298550#comment-17298550 ] Micah Kornfield commented on PARQUET-1995: -- Well it seems like a bug someplace. I'm not sure

[jira] [Commented] (PARQUET-1995) [C++][Parquet] Crash at parquet::TypedColumnWriterImpl<>::WriteBatchSpaced

2021-03-09 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17298523#comment-17298523 ] Micah Kornfield commented on PARQUET-1995: -- we've also had some other bugs related

[jira] [Commented] (PARQUET-1995) [C++][Parquet] Crash at parquet::TypedColumnWriterImpl<>::WriteBatchSpaced

2021-03-09 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17298515#comment-17298515 ] Micah Kornfield commented on PARQUET-1995: -- This is a little bit hard to diagnose, especially

[jira] [Updated] (PARQUET-1990) [C++] ConvertedType::NA is written out in some cases

2021-02-28 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-1990: - Summary: [C++] ConvertedType::NA is written out in some cases (was: [C++]

[jira] [Created] (PARQUET-1991) Reserve ConvertedType==24 due to bug in parquet-cpp implementation

2021-02-28 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-1991: Summary: Reserve ConvertedType==24 due to bug in parquet-cpp implementation Key: PARQUET-1991 URL: https://issues.apache.org/jira/browse/PARQUET-1991

[jira] [Created] (PARQUET-1990) [C++] ConvertedType::NA is attempted to be written out in some cases

2021-02-28 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-1990: Summary: [C++] ConvertedType::NA is attempted to be written out in some cases Key: PARQUET-1990 URL: https://issues.apache.org/jira/browse/PARQUET-1990

[jira] [Updated] (PARQUET-1990) [C++] ConvertedType::NA is attempted to be written out in some cases

2021-02-28 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-1990: - Description: This makes it an invalid thrift enum. ::NA is a placeholder enum

[jira] [Assigned] (PARQUET-1655) [C++] Decimal comparisons used for min/max statistics are not correct

2021-02-23 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned PARQUET-1655: Assignee: Micah Kornfield > [C++] Decimal comparisons used for min/max

[jira] [Commented] (PARQUET-1985) Improve integration tests between implementations

2021-02-17 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17286274#comment-17286274 ] Micah Kornfield commented on PARQUET-1985: -- I think trying to shoe horn structured data into

[jira] [Commented] (PARQUET-1987) Document how a schema can have columns splitted over different files

2021-02-20 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287833#comment-17287833 ] Micah Kornfield commented on PARQUET-1987: -- CC [~raduteodorescu] > Document how a schema can

[jira] [Commented] (PARQUET-1987) Document how a schema can have columns splitted over different files

2021-02-20 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287834#comment-17287834 ] Micah Kornfield commented on PARQUET-1987: -- This was discussed a little bit on

[jira] [Commented] (PARQUET-1361) [C++] 1.4.1 library allows creation of parquet file w/NULL values for INT types

2021-08-22 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17402882#comment-17402882 ] Micah Kornfield commented on PARQUET-1361: -- Sorry for the late reply, but I think this is an

[jira] [Updated] (PARQUET-2089) [C++] RowGroupMetaData file_offset set incorrectly

2021-09-13 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-2089: - Summary: [C++] RowGroupMetaData file_offset set incorrectly (was: RowGroupMetaData

[jira] [Moved] (PARQUET-2089) RowGroupMetaData file_offset set incorrectly

2021-09-13 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield moved ARROW-13609 to PARQUET-2089: -- Component/s: (was: C++) parquet-cpp

[jira] [Assigned] (PARQUET-2089) [C++] RowGroupMetaData file_offset set incorrectly

2021-09-13 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned PARQUET-2089: Assignee: Micah Kornfield > [C++] RowGroupMetaData file_offset set incorrectly

[jira] [Assigned] (PARQUET-2090) [C++] Parquet writes incorrect file_offset

2021-09-13 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned PARQUET-2090: Assignee: Micah Kornfield > [C++] Parquet writes incorrect file_offset >

[jira] [Resolved] (PARQUET-2090) [C++] Parquet writes incorrect file_offset

2021-09-13 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved PARQUET-2090. -- Resolution: Invalid > [C++] Parquet writes incorrect file_offset >

[jira] [Commented] (PARQUET-2090) [C++] Parquet writes incorrect file_offset

2021-09-13 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17414694#comment-17414694 ] Micah Kornfield commented on PARQUET-2090: -- CC [~zeroshade] > [C++] Parquet writes incorrect

[jira] [Commented] (PARQUET-2089) RowGroupMetaData file_offset set incorrectly

2021-09-13 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17414695#comment-17414695 ] Micah Kornfield commented on PARQUET-2089: -- CC [~zeroshade] > RowGroupMetaData file_offset

[jira] [Commented] (PARQUET-2092) [Go] Fix in go implementation

2021-09-14 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17415048#comment-17415048 ] Micah Kornfield commented on PARQUET-2092: -- I'm going to move this to the Arrow tracker. 

[jira] [Commented] (PARQUET-2092) [Go] Fix in go implementation

2021-09-14 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17415050#comment-17415050 ] Micah Kornfield commented on PARQUET-2092: -- Hmm, it doesn't look like I have permissions to

[jira] [Commented] (PARQUET-2092) [Go] Fix in go implementation

2021-09-14 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17415056#comment-17415056 ] Micah Kornfield commented on PARQUET-2092: -- [~zeroshade] the move option if you are allowed to

[jira] [Moved] (PARQUET-2090) [C++] Parquet writes incorrect file_offset

2021-09-13 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield moved ARROW-13941 to PARQUET-2090: -- Component/s: (was: Parquet) parquet-cpp

[jira] [Commented] (PARQUET-2090) [C++] Parquet writes incorrect file_offset

2021-09-13 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17414716#comment-17414716 ] Micah Kornfield commented on PARQUET-2090: -- [~csun] according the

[jira] [Commented] (PARQUET-2092) [Go] Fix in go implementation

2021-09-14 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17415060#comment-17415060 ] Micah Kornfield commented on PARQUET-2092: -- OK, would you mind closing this and opening up an

[jira] [Commented] (PARQUET-2095) [C++] Read Parquet file with MapArray

2021-09-25 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420158#comment-17420158 ] Micah Kornfield commented on PARQUET-2095: -- Hi it isn't clear if this is reporting a bug or

[jira] [Commented] (PARQUET-2095) [C++] Read Parquet file with MapArray

2021-10-07 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425958#comment-17425958 ] Micah Kornfield commented on PARQUET-2095: -- Hi [~longshanpdd] did the above response fix your

[jira] [Commented] (PARQUET-2095) [C++] Read Parquet file with MapArray

2021-09-26 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420316#comment-17420316 ] Micah Kornfield commented on PARQUET-2095: -- Can you run ValidateFull on the array? This would

[jira] [Created] (PARQUET-2099) [C++] Statistics::num_values() is misleading

2021-09-30 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-2099: Summary: [C++] Statistics::num_values() is misleading Key: PARQUET-2099 URL: https://issues.apache.org/jira/browse/PARQUET-2099 Project: Parquet

[jira] [Resolved] (PARQUET-2095) [C++] Read Parquet file with MapArray

2021-10-23 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved PARQUET-2095. -- Resolution: Not A Problem > [C++] Read Parquet file with MapArray >

[jira] [Updated] (PARQUET-1361) [C++] 1.4.1 library allows creation of parquet file w/NULL values for INT types

2022-01-03 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-1361: - Component/s: parquet-mr > [C++] 1.4.1 library allows creation of parquet file w/NULL

[jira] [Moved] (PARQUET-2066) [C++][Parquet] num_rows is incorrect for nested types

2021-07-16 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield moved ARROW-13349 to PARQUET-2066: -- Component/s: (was: Parquet) (was:

[jira] [Created] (PARQUET-2067) [C++] null_count and num_nulls incorrect for repeated columns

2021-07-16 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-2067: Summary: [C++] null_count and num_nulls incorrect for repeated columns Key: PARQUET-2067 URL: https://issues.apache.org/jira/browse/PARQUET-2067 Project:

[jira] [Resolved] (PARQUET-2130) Crash on non-standard map key name in debug

2022-03-04 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved PARQUET-2130. -- Fix Version/s: cpp-8.0.0 Resolution: Fixed Issue resolved by pull request

[jira] [Resolved] (PARQUET-2131) Number values decoded DCHECKs should be exceptions

2022-03-04 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved PARQUET-2131. -- Fix Version/s: cpp-8.0.0 Resolution: Fixed Issue resolved by pull request

[jira] [Updated] (PARQUET-2118) [C++] thift_internal.h assumes shared_ptr type in some cases

2022-02-06 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-2118: - Component/s: parquet-cpp > [C++] thift_internal.h assumes shared_ptr type in some

[jira] [Moved] (PARQUET-2118) thift_internal.h assumes shared_ptr type in some cases

2022-02-06 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield moved ARROW-15596 to PARQUET-2118: -- Key: PARQUET-2118 (was: ARROW-15596) Workflow:

[jira] [Updated] (PARQUET-2118) [C++] thift_internal.h assumes shared_ptr type in some cases

2022-02-06 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-2118: - Summary: [C++] thift_internal.h assumes shared_ptr type in some cases (was:

[jira] [Commented] (PARQUET-2133) Support Int8 and Int16 as basic type

2022-04-08 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17519738#comment-17519738 ] Micah Kornfield commented on PARQUET-2133: -- before we start working on it it should probably

[jira] [Commented] (PARQUET-2345) The Parquet Spec doesn't specify whether multiple columns are allowed to have the same name.

2023-10-01 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770902#comment-17770902 ] Micah Kornfield commented on PARQUET-2345: -- I've at least seen in the wild two columns

[jira] [Commented] (PARQUET-1711) [parquet-protobuf] stack overflow when work with well known json type

2022-05-29 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17543672#comment-17543672 ] Micah Kornfield commented on PARQUET-1711: -- the way one could handle this is allow users to

[jira] [Resolved] (PARQUET-2163) Parquet C++ Float Runtime Error in Decimal Schema

2022-07-06 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved PARQUET-2163. -- Fix Version/s: cpp-9.0.0 Resolution: Fixed Issue resolved by pull request

[jira] [Commented] (PARQUET-1711) [parquet-protobuf] stack overflow when work with well known json type

2022-07-23 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17570331#comment-17570331 ] Micah Kornfield commented on PARQUET-1711: -- {quote}[~emkornfield] Can we expect a fix any time

[jira] [Commented] (PARQUET-2122) Adding Bloom filter to small Parquet file bloats in size X1700

2022-05-09 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17534123#comment-17534123 ] Micah Kornfield commented on PARQUET-2122: -- I believe the answer is the Bloom filter

[jira] [Commented] (PARQUET-2175) Skip method skips levels and not rows for repeated fields

2022-08-24 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17584388#comment-17584388 ] Micah Kornfield commented on PARQUET-2175: -- I think the current signature is

[jira] [Commented] (PARQUET-1222) Specify a well-defined sorting order for float and double types

2022-09-29 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17611356#comment-17611356 ] Micah Kornfield commented on PARQUET-1222: -- I'd propose the following "fix": - Add a new

[jira] [Commented] (PARQUET-1222) Specify a well-defined sorting order for float and double types

2022-10-08 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17614581#comment-17614581 ] Micah Kornfield commented on PARQUET-1222: -- Elevating the specification level seems fine. I

[jira] [Assigned] (PARQUET-2172) [C++] Make field return const NodePtr& instead of forcing copy of shared_ptr

2022-08-12 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned PARQUET-2172: Assignee: Micah Kornfield > [C++] Make field return const NodePtr& instead of

[jira] [Created] (PARQUET-2172) [C++] Make field return const NodePtr& instead of forcing copy of shared_ptr

2022-08-12 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-2172: Summary: [C++] Make field return const NodePtr& instead of forcing copy of shared_ptr Key: PARQUET-2172 URL: https://issues.apache.org/jira/browse/PARQUET-2172

[jira] [Updated] (PARQUET-2172) [C++] Make field return const NodePtr& instead of forcing copy of shared_ptr

2022-08-12 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-2172: - Fix Version/s: cpp-10.0.0 > [C++] Make field return const NodePtr& instead of forcing

[jira] [Resolved] (PARQUET-2172) [C++] Make field return const NodePtr& instead of forcing copy of shared_ptr

2022-08-12 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved PARQUET-2172. -- Resolution: Fixed > [C++] Make field return const NodePtr& instead of forcing copy
