[jira] [Updated] (HIVE-6785) query fails when partitioned table's table level serde is ParquetHiveSerDe and partition level serde is of different SerDe

2014-04-27 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-6785:
---

Assignee: Tongjie Chen

> query fails when partitioned table's table level serde is ParquetHiveSerDe 
> and partition level serde is of different SerDe
> --
>
> Key: HIVE-6785
> URL: https://issues.apache.org/jira/browse/HIVE-6785
> Project: Hive
> Issue Type: Bug
> Components: File Formats, Serializers/Deserializers
> Affects Versions: 0.13.0
> Reporter: Tongjie Chen
> Assignee: Tongjie Chen
> Fix For: 0.14.0
>
> Attachments: HIVE-6785.1.patch.txt, HIVE-6785.2.patch.txt, 
> HIVE-6785.3.patch
>
>
> When a Hive table's SerDe is ParquetHiveSerDe while some partitions use a 
> different SerDe, and the table has string column(s), Hive generates a 
> confusing error message:
> "Failed with exception java.io.IOException:java.lang.ClassCastException: 
> parquet.hive.serde.primitive.ParquetStringInspector cannot be cast to 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.SettableTimestampObjectInspector"
> This is confusing because a timestamp inspector is mentioned even though the 
> table does not use timestamps. The reason is that when the table and 
> partition SerDes differ, Hive tries to convert between the object inspectors 
> of the two SerDes. ParquetHiveSerDe's object inspector for the string type is 
> ParquetStringInspector (newly introduced), which is neither a subclass of 
> WritableStringObjectInspector nor of JavaStringObjectInspector, the types 
> that ObjectInspectorConverters expects for a string-category object 
> inspector. There is no break statement in the STRING case, so execution falls 
> through to the following TIMESTAMP case, generating the confusing error 
> message.
> See also the following Parquet issue:
> https://github.com/Parquet/parquet-mr/issues/324
> The fix is relatively easy: make ParquetStringInspector a subclass of 
> JavaStringObjectInspector instead of AbstractPrimitiveJavaObjectInspector. 
> But because the constructor of JavaStringObjectInspector is package-scoped 
> rather than public or protected, we would need to move ParquetStringInspector 
> into the same package as JavaStringObjectInspector.
> ArrayWritableObjectInspector's setStructFieldData also needs to accept List 
> data, since the corresponding setStructFieldData and create methods return a 
> list. This is also needed when the table SerDe is ParquetHiveSerDe and the 
> partition SerDe is something else.
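
The fall-through described above can be reproduced in isolation. The following is a minimal sketch, not Hive's actual code: every type and method name in it (Inspector, SettableString, SettableTimestamp, ParquetLikeStringInspector, pickConverter) is a hypothetical stand-in for the classes named in the description, chosen only to show how a STRING case without a break ends up raising a cast error that mentions a timestamp type.

```java
public class FallThroughSketch {
    // Hypothetical stand-ins for the inspector types; these are NOT Hive classes.
    public interface Inspector {}
    public interface SettableString extends Inspector {}
    public interface SettableTimestamp extends Inspector {}

    // Plays the role of an inspector that handles strings but implements
    // none of the types the converter logic checks for.
    public static class ParquetLikeStringInspector implements Inspector {}

    public enum Category { STRING, TIMESTAMP }

    public static String pickConverter(Category category, Inspector output) {
        switch (category) {
            case STRING:
                if (output instanceof SettableString) {
                    return "string converter";
                }
                // No break or return here, so control falls through to TIMESTAMP.
            case TIMESTAMP:
                // Throws ClassCastException when output is not a SettableTimestamp.
                SettableTimestamp ts = (SettableTimestamp) output;
                return "timestamp converter";
            default:
                throw new IllegalArgumentException("unsupported category " + category);
        }
    }

    public static void main(String[] args) {
        try {
            pickConverter(Category.STRING, new ParquetLikeStringInspector());
        } catch (ClassCastException e) {
            // The message names the timestamp type even though STRING was requested.
            System.out.println("ClassCastException: " + e.getMessage());
        }
    }
}
```

In this sketch, making the string inspector implement the expected string type (the analogue of subclassing JavaStringObjectInspector) means the STRING case returns before the fall-through is reached.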



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6785) query fails when partitioned table's table level serde is ParquetHiveSerDe and partition level serde is of different SerDe

2014-04-27 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-6785:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk! Thank you for the contribution!

Thank you Szehon for the review!



[jira] [Updated] (HIVE-6785) query fails when partitioned table's table level serde is ParquetHiveSerDe and partition level serde is of different SerDe

2014-04-11 Thread Tongjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tongjie Chen updated HIVE-6785:
---

Attachment: HIVE-6785.3.patch

Replace deprecated Parquet class.



[jira] [Updated] (HIVE-6785) query fails when partitioned table's table level serde is ParquetHiveSerDe and partition level serde is of different SerDe

2014-04-05 Thread Tongjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tongjie Chen updated HIVE-6785:
---

Fix Version/s: 0.14.0
   Status: Patch Available  (was: Open)



[jira] [Updated] (HIVE-6785) query fails when partitioned table's table level serde is ParquetHiveSerDe and partition level serde is of different SerDe

2014-04-05 Thread Tongjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tongjie Chen updated HIVE-6785:
---

Attachment: HIVE-6785.2.patch.txt

Add a new qtest.



[jira] [Updated] (HIVE-6785) query fails when partitioned table's table level serde is ParquetHiveSerDe and partition level serde is of different SerDe

2014-04-02 Thread Tongjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tongjie Chen updated HIVE-6785:
---

Description: 
When a Hive table's SerDe is ParquetHiveSerDe while some partitions use a 
different SerDe, and the table has string column(s), Hive generates a 
confusing error message:

"Failed with exception java.io.IOException:java.lang.ClassCastException: 
parquet.hive.serde.primitive.ParquetStringInspector cannot be cast to 
org.apache.hadoop.hive.serde2.objectinspector.primitive.SettableTimestampObjectInspector"

This is confusing because a timestamp inspector is mentioned even though the 
table does not use timestamps. The reason is that when the table and partition 
SerDes differ, Hive tries to convert between the object inspectors of the two 
SerDes. ParquetHiveSerDe's object inspector for the string type is 
ParquetStringInspector (newly introduced), which is neither a subclass of 
WritableStringObjectInspector nor of JavaStringObjectInspector, the types that 
ObjectInspectorConverters expects for a string-category object inspector. There 
is no break statement in the STRING case, so execution falls through to the 
following TIMESTAMP case, generating the confusing error message.

See also the following Parquet issue:
https://github.com/Parquet/parquet-mr/issues/324

The fix is relatively easy: make ParquetStringInspector a subclass of 
JavaStringObjectInspector instead of AbstractPrimitiveJavaObjectInspector. But 
because the constructor of JavaStringObjectInspector is package-scoped rather 
than public or protected, we would need to move ParquetStringInspector into the 
same package as JavaStringObjectInspector.

ArrayWritableObjectInspector's setStructFieldData also needs to accept List 
data, since the corresponding setStructFieldData and create methods return a 
list. This is also needed when the table SerDe is ParquetHiveSerDe and the 
partition SerDe is something else.
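
The last point, a setter that tolerates both representations of a struct, might look roughly like the sketch below. It is an illustration under stated assumptions rather than the patch itself: StructSetterSketch and setField are hypothetical names, and a plain Object[] stands in for the ArrayWritable that the real inspector works with.

```java
import java.util.List;

public class StructSetterSketch {
    /**
     * Hypothetical helper, not Hive's API: sets field `index` on a struct that
     * may be held either as an Object[] (standing in for ArrayWritable) or as
     * a List, and returns the same struct instance.
     */
    @SuppressWarnings("unchecked")
    public static Object setField(Object struct, int index, Object value) {
        if (struct instanceof List) {
            // The struct was built by a method that returns a List.
            ((List<Object>) struct).set(index, value);
        } else if (struct instanceof Object[]) {
            // The struct is in the array-backed form.
            ((Object[]) struct)[index] = value;
        } else {
            throw new IllegalArgumentException(
                "unsupported struct representation: " + String.valueOf(struct));
        }
        return struct;
    }
}
```

Checking the runtime type in one place keeps callers unaware of which representation the converter handed them, which is the situation that arises when the table and partition SerDes differ.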




  was:
More specifically, if the table contains string-type columns, the query fails 
with the following exception: "Failed with exception 
java.io.IOException:java.lang.ClassCastException: 
parquet.hive.serde.primitive.ParquetStringInspector cannot be cast to 
org.apache.hadoop.hive.serde2.objectinspector.primitive.SettableTimestampObjectInspector"

see also in the following parquet issue:
https://github.com/Parquet/parquet-mr/issues/324








[jira] [Updated] (HIVE-6785) query fails when partitioned table's table level serde is ParquetHiveSerDe and partition level serde is of different SerDe

2014-04-01 Thread Tongjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tongjie Chen updated HIVE-6785:
---

Attachment: HIVE-6785.1.patch.txt

> query fails when partitioned table's table level serde is ParquetHiveSerDe 
> and partition level serde is of different SerDe
> --
>
> Key: HIVE-6785
> URL: https://issues.apache.org/jira/browse/HIVE-6785
> Project: Hive
> Issue Type: Bug
> Components: File Formats, Serializers/Deserializers
> Affects Versions: 0.13.0
> Reporter: Tongjie Chen
> Attachments: HIVE-6785.1.patch.txt
>
>
> More specifically, if the table contains string-type columns, the query 
> fails with the following exception: "Failed with exception 
> java.io.IOException:java.lang.ClassCastException: 
> parquet.hive.serde.primitive.ParquetStringInspector cannot be cast to 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.SettableTimestampObjectInspector"
> See also the following Parquet issue:
> https://github.com/Parquet/parquet-mr/issues/324


