[jira] [Created] (PARQUET-965) [C++] FIXED_LEN_BYTE_ARRAY types are unhandled in the Arrow reader

2017-04-25 Thread Wes McKinney (JIRA)
Wes McKinney created PARQUET-965:


 Summary: [C++] FIXED_LEN_BYTE_ARRAY types are unhandled in the 
Arrow reader
 Key: PARQUET-965
 URL: https://issues.apache.org/jira/browse/PARQUET-965
 Project: Parquet
  Issue Type: Bug
  Components: parquet-cpp
Reporter: Wes McKinney
 Fix For: cpp-1.1.0


Currently, a {{dynamic_cast}} to a {{ByteArrayType}} reader fails, resulting in 
a segfault. We should check the Parquet column's physical type and dispatch to 
either a BYTE_ARRAY path or a FIXED_LEN_BYTE_ARRAY path instead of casting 
unconditionally. 
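The check-then-dispatch idea can be sketched as follows. This is illustrative Java, not the parquet-cpp C++ code, and every name in it (PhysicalType, ColumnChunk, ArrowColumnReader) is hypothetical:

```java
// Sketch of type-checked dispatch: inspect the column's declared physical
// type before choosing a reader path, instead of casting unconditionally
// (the unchecked-cast equivalent is what segfaults in the C++ reader).
// All names here are hypothetical, not the parquet-cpp API.
enum PhysicalType { INT32, BYTE_ARRAY, FIXED_LEN_BYTE_ARRAY }

class ColumnChunk {
    final PhysicalType type;
    ColumnChunk(PhysicalType type) { this.type = type; }
}

class ArrowColumnReader {
    static String read(ColumnChunk chunk) {
        switch (chunk.type) {
            case BYTE_ARRAY:
                return "variable-length binary path";
            case FIXED_LEN_BYTE_ARRAY:
                return "fixed-length binary path";
            default:
                // Fail loudly rather than crash on an unexpected layout.
                throw new UnsupportedOperationException(
                        "no binary reader path for " + chunk.type);
        }
    }
}
```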



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (PARQUET-964) Using ProtoParquet with Hive / AWS Athena: ParquetDecodingException: totalValueCount '0' <= 0

2017-04-25 Thread Constantin Muraru (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15983884#comment-15983884
 ] 

Constantin Muraru edited comment on PARQUET-964 at 4/26/17 12:30 AM:
-

Thanks [~julienledem]! I'll create a test on my branch that exposes the issue 
more clearly, and maybe we can discuss it there.

As a hint: given the parquet schema below, totalValueCount is 0 for the column 
columnReaderImpl.path = ColumnDescriptor {{\[first_array, array, inner_field\] 
INT32}} when that field is never populated.

{code}
message TestProtobuf.ListOfList {
  optional binary top_field (UTF8);
  required group first_array (LIST) {
repeated group array {
  optional int32 inner_field;
  required group second_array (LIST) {
repeated int32 array;
  }
}
  }
}
{code}

{code}
ListOfList message = ListOfList.newBuilder()
.setTopField("top_field")

.addFirstArray(ListOfListOuterClass.MyInnerMessage.newBuilder().addSecondArray(2))
 // inner_field missing here
.build();
{code}

!parquet_totalValueCount.png|width=800px!
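A {{totalValueCount}} of 0 for this column suggests the writer emitted no entries at all for records where {{inner_field}} is absent, rather than a null entry at a lower definition level. A minimal model of that accounting, assuming hypothetical class names (this is not the parquet-mr API):

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of per-column value accounting. In Parquet, an optional field
// that is absent from a record is still recorded in its column chunk as a
// definition level below the maximum, so totalValueCount counts nulls too.
// All names here are hypothetical.
class ColumnChunkModel {
    private final List<Integer> definitionLevels = new ArrayList<>();
    private final int maxDefinitionLevel;

    ColumnChunkModel(int maxDefinitionLevel) {
        this.maxDefinitionLevel = maxDefinitionLevel;
    }

    void writeValue() { definitionLevels.add(maxDefinitionLevel); } // field present
    void writeNull(int level) { definitionLevels.add(level); }      // field absent

    long totalValueCount() { return definitionLevels.size(); }
}

class WriterComparison {
    // Correct behavior: each record contributes an entry even when
    // inner_field is absent, so totalValueCount equals the record count.
    static long correctWriter(int records) {
        ColumnChunkModel chunk = new ColumnChunkModel(3);
        for (int i = 0; i < records; i++) chunk.writeNull(2);
        return chunk.totalValueCount();
    }

    // Suspected buggy behavior: absent fields are skipped entirely,
    // leaving totalValueCount at 0, which the reader then rejects.
    static long buggyWriter(int records) {
        ColumnChunkModel chunk = new ColumnChunkModel(3);
        return chunk.totalValueCount();
    }
}
```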


[jira] [Updated] (PARQUET-964) Using ProtoParquet with Hive / AWS Athena: ParquetDecodingException: totalValueCount '0' <= 0

2017-04-25 Thread Constantin Muraru (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Constantin Muraru updated PARQUET-964:
--
Attachment: parquet_totalValueCount.png

> Using ProtoParquet with Hive / AWS Athena: ParquetDecodingException: 
> totalValueCount '0' <= 0
> -
>
> Key: PARQUET-964
> URL: https://issues.apache.org/jira/browse/PARQUET-964
> Project: Parquet
>  Issue Type: Bug
>Reporter: Constantin Muraru
> Attachments: ListOfList.proto, ListOfListProtoParquetConverter.java, 
> parquet_totalValueCount.png
>
>
> Hi folks!
> We're working on adding support for ProtoParquet to work with Hive / AWS 
> Athena (Presto) \[1\]. The problem we've encountered appears whenever we 
> declare a repeated field (array) or a map in the protobuf schema and we then 
> try to convert it to parquet. The conversion works fine, but when we try to 
> query the data with Hive/Presto, we get some freaky errors.
> We've noticed though that AvroToParquet works great, even when we declare 
> such fields (arrays, maps)! 
> Comparing the parquet schema generated by protobuf vs avro, we've noticed a 
> few differences.
> Take the simple schema below (protobuf):
> {code}
> message ListOfList {
> string top_field = 1;
> repeated MyInnerMessage first_array = 2;
> }
> message MyInnerMessage {
> int32 inner_field = 1;
> repeated int32 second_array = 2;
> }
> {code}
> After using ProtoParquetWriter, the resulting parquet schema is the following:
> {code}
> message TestProtobuf.ListOfList {
>   optional binary top_field (UTF8);
>   repeated group first_array {
> optional int32 inner_field;
> repeated int32 second_array;
>   }
> }
> {code}
> When we try to query this data, we get parsing errors from Hive/Athena. The 
> parsing errors are related to the array/map fields.
> However, if we create a similar avro schema, the parquet result of the 
> AvroParquetWriter is the following:
> {code}
> message TestProtobuf.ListOfList {
>   required binary top_field (UTF8);
>   required group first_array (LIST) {
> repeated group array {
>   required int32 inner_field;
>   required group second_array (LIST) {
> repeated int32 array;
>   }
> }
>   }
> }
> {code}
> This works beautifully with Hive/Athena. Too bad our systems are stuck with 
> protobuf :-) .
> You can see the additional wrappers which are missing from protobuf: 
> {{required group first_array (LIST)}}.
> Our goal is to make the ProtoParquetWriter generate a parquet schema similar 
> to what Avro is doing. We basically want to add these wrappers around 
> lists/maps.
> Everything seemed to work great until we bumped into an issue. We tuned 
> ProtoParquetWriter to generate the same parquet schema as AvroParquetWriter. 
> However, one difference between protobuf and avro is that protobuf schemas can 
> have many optional fields. 
> {code}
> message TestProtobuf.ListOfList {
>   optional binary top_field (UTF8);
>   required group first_array (LIST) {
> repeated group array {
>   optional int32 inner_field;
>   required group second_array (LIST) {
> repeated int32 array;
>   }
> }
>   }
> }
> {code}
> Notice the *optional* int32 inner_field (for avro it was *required*).
> When testing with some real proto-parquet data, we get an error every time 
> inner_field is not populated, but the second_array is.
> {noformat}
> parquet-tools cat /tmp/test23.parquet
> org.apache.parquet.io.ParquetDecodingException: Can not read value at 0 in 
> block -1 in file file:/tmp/test23.parquet
>   at 
> org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:223)
>   at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:122)
>   at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:126)
>   at 
> org.apache.parquet.tools.command.CatCommand.execute(CatCommand.java:79)
>   at org.apache.parquet.proto.tools.Main.main(Main.java:214)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)
> Caused by: org.apache.parquet.io.ParquetDecodingException: totalValueCount 
> '0' <= 0
>   at 
> org.apache.parquet.column.impl.ColumnReaderImpl.(ColumnReaderImpl.java:349)
>   at 
> org.apache.parquet.column.impl.ColumnReadStoreImpl.newMemColumnReader(ColumnReadStoreImpl.java:82)
>   at 
> 

[jira] [Commented] (PARQUET-964) Using ProtoParquet with Hive / AWS Athena: ParquetDecodingException: totalValueCount '0' <= 0

2017-04-25 Thread Constantin Muraru (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15983884#comment-15983884
 ] 

Constantin Muraru commented on PARQUET-964:
---


[jira] [Commented] (PARQUET-964) Using ProtoParquet with Hive / AWS Athena: ParquetDecodingException: totalValueCount '0' <= 0

2017-04-25 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15983784#comment-15983784
 ] 

Julien Le Dem commented on PARQUET-964:
---

totalValueCount includes null values, so it should never be 0 unless you're 
creating empty parquet files. Separately, it should also not be negative (a 
negative value indicates an overflow, since the underlying metadata accepts a 
long).
Could you look into why totalValueCount == 0? It should be the sum of the 
value counts of all pages in that column chunk.
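That invariant, a chunk's totalValueCount being the sum of the per-page value counts with anything non-positive rejected, can be sketched like this (hypothetical names; not the actual ColumnReaderImpl):

```java
import java.util.List;

// Sketch of the reader-side check: totalValueCount is accumulated from the
// value count of every page in the column chunk (nulls included), and a
// non-positive total is rejected up front, mirroring the
// ParquetDecodingException "totalValueCount 'X' <= 0".
// Names here are illustrative, not the parquet-mr classes.
class ColumnChunkValidator {
    final long totalValueCount;

    ColumnChunkValidator(List<Long> pageValueCounts) {
        long total = 0;
        for (long count : pageValueCounts) total += count;
        if (total <= 0) {
            throw new IllegalStateException("totalValueCount '" + total + "' <= 0");
        }
        this.totalValueCount = total;
    }
}
```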



[jira] [Commented] (PARQUET-853) [C++] Add option to link with shared boost libraries when building Arrow in the thirdparty toolchain

2017-04-25 Thread Wes McKinney (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15983690#comment-15983690
 ] 

Wes McKinney commented on PARQUET-853:
--

We should respect the value of {{PARQUET_BOOST_USE_SHARED}}

> [C++] Add option to link with shared boost libraries when building Arrow in 
> the thirdparty toolchain
> 
>
> Key: PARQUET-853
> URL: https://issues.apache.org/jira/browse/PARQUET-853
> Project: Parquet
>  Issue Type: New Feature
>  Components: parquet-cpp
>Reporter: Wes McKinney
> Fix For: cpp-1.1.0
>
>
> See discussion in https://github.com/apache/parquet-cpp/pull/231





[jira] [Updated] (PARQUET-853) [C++] Add option to link with shared boost libraries when building Arrow in the thirdparty toolchain

2017-04-25 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated PARQUET-853:
-
Fix Version/s: cpp-1.1.0

> [C++] Add option to link with shared boost libraries when building Arrow in 
> the thirdparty toolchain
> 
>
> Key: PARQUET-853
> URL: https://issues.apache.org/jira/browse/PARQUET-853
> Project: Parquet
>  Issue Type: New Feature
>  Components: parquet-cpp
>Reporter: Wes McKinney
> Fix For: cpp-1.1.0
>
>
> See discussion in https://github.com/apache/parquet-cpp/pull/231





[jira] [Assigned] (PARQUET-910) C++: Support TIME logical type in parquet_arrow

2017-04-25 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned PARQUET-910:


Assignee: Wes McKinney

> C++: Support TIME logical type in parquet_arrow
> ---
>
> Key: PARQUET-910
> URL: https://issues.apache.org/jira/browse/PARQUET-910
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-cpp
>Reporter: Uwe L. Korn
>Assignee: Wes McKinney
> Fix For: cpp-1.1.0
>
>






[jira] [Resolved] (PARQUET-910) C++: Support TIME logical type in parquet_arrow

2017-04-25 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved PARQUET-910.
--
Resolution: Fixed

Resolved in PARQUET-915 
https://github.com/apache/parquet-cpp/commit/a8dee1fe983e19177059a66be4bf0558a3b8687d

> C++: Support TIME logical type in parquet_arrow
> ---
>
> Key: PARQUET-910
> URL: https://issues.apache.org/jira/browse/PARQUET-910
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-cpp
>Reporter: Uwe L. Korn
>Assignee: Wes McKinney
>






[jira] [Updated] (PARQUET-910) C++: Support TIME logical type in parquet_arrow

2017-04-25 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated PARQUET-910:
-
Fix Version/s: cpp-1.1.0

> C++: Support TIME logical type in parquet_arrow
> ---
>
> Key: PARQUET-910
> URL: https://issues.apache.org/jira/browse/PARQUET-910
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-cpp
>Reporter: Uwe L. Korn
>Assignee: Wes McKinney
> Fix For: cpp-1.1.0
>
>






[jira] [Resolved] (PARQUET-963) [C++] Disallow reading struct types in Arrow reader for now

2017-04-25 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved PARQUET-963.
--
Resolution: Fixed

Issue resolved by pull request 308
[https://github.com/apache/parquet-cpp/pull/308]

> [C++] Disallow reading struct types in Arrow reader for now
> ---
>
> Key: PARQUET-963
> URL: https://issues.apache.org/jira/browse/PARQUET-963
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-cpp
>Reporter: Wes McKinney
>Assignee: Wes McKinney
> Fix For: cpp-1.1.0
>
>
> This bug surfaced in ARROW-601. 





[jira] [Resolved] (PARQUET-915) Support Arrow Time Types in Schema

2017-04-25 Thread Uwe L. Korn (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved PARQUET-915.
-
   Resolution: Fixed
Fix Version/s: cpp-1.1.0

Issue resolved by pull request 311
[https://github.com/apache/parquet-cpp/pull/311]

> Support Arrow Time Types in Schema
> --
>
> Key: PARQUET-915
> URL: https://issues.apache.org/jira/browse/PARQUET-915
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-cpp
>Reporter: Miki Tebeka
>Assignee: Wes McKinney
> Fix For: cpp-1.1.0
>
>
> Support Time with MILLI and MICRO TimeUnit in arrow conversion.
> See also [ARROW-601|https://issues.apache.org/jira/browse/ARROW-601]





[jira] [Resolved] (PARQUET-958) [C++] Print Parquet metadata in JSON format

2017-04-25 Thread Uwe L. Korn (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved PARQUET-958.
-
   Resolution: Fixed
Fix Version/s: cpp-1.1.0

Issue resolved by pull request 310
[https://github.com/apache/parquet-cpp/pull/310]

> [C++] Print Parquet metadata in JSON format
> ---
>
> Key: PARQUET-958
> URL: https://issues.apache.org/jira/browse/PARQUET-958
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-cpp
>Reporter: Deepak Majeti
>Assignee: Deepak Majeti
> Fix For: cpp-1.1.0
>
>
> Extend the current parquet_reader to print metadata in JSON format
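A rough sketch of what a JSON metadata mode could emit for a few file-level fields (hand-rolled serialization; the field set and formatting are assumptions, not the actual parquet_reader output):

```java
// Toy JSON emission for representative file-level metadata fields.
// The field names and layout are hypothetical.
class MetadataJsonSketch {
    static String toJson(String createdBy, long numRows, int numRowGroups) {
        return "{\"created_by\": \"" + createdBy + "\", "
                + "\"num_rows\": " + numRows + ", "
                + "\"num_row_groups\": " + numRowGroups + "}";
    }
}
```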


