[ 
https://issues.apache.org/jira/browse/HIVE-15082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksiy Sayankin updated HIVE-15082:
------------------------------------
    Description: 
*STEP 1. Create test data*

{code:sql}
select * from dual;
{code}

*EXPECTED RESULT:*

{noformat}
Pretty_UnIQUe_StrinG
{noformat}

{code:sql}
create table test_parquet1(login timestamp) stored as parquet;
insert overwrite table test_parquet1 select from_unixtime(unix_timestamp()) 
from dual;
select * from test_parquet1 limit 1;
{code}

*EXPECTED RESULT:*

No exceptions. Current timestamp as result.
{noformat}
2016-10-27 10:58:19
{noformat}

*STEP 2. Store the timestamp in an array in a Parquet file*

{code:sql}
create table test_parquet2(x array<timestamp>) stored as parquet;
insert overwrite table test_parquet2 select array(login) from test_parquet1;
select * from test_parquet2;
{code}

*EXPECTED RESULT:*

No exceptions. Current timestamp in brackets as result.
{noformat}
["2016-10-27 10:58:19"]
{noformat}

*ACTUAL RESULT:*

{noformat}
ERROR [main]: CliDriver (SessionState.java:printError(963)) - Failed with 
exception java.io.IOException:parquet.io.ParquetDecodingException: Can not read 
value at 0 in block -1 in file 
hdfs:///user/hive/warehouse/test_parquet2/000000_0
java.io.IOException: parquet.io.ParquetDecodingException: Can not read value at 
0 in block -1 in file hdfs:///user/hive/warehouse/test_parquet2/000000_0
{noformat}


*ROOT-CAUSE:*

Because the {{metadata}} {{HashMap}} is initialized incorrectly, it is 
{{null}} in the enumeration 
{{org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter}} when the 
following line executes:

{code:java}
  boolean skipConversion = 
Boolean.valueOf(metadata.get(HiveConf.ConfVars.HIVE_PARQUET_TIMESTAMP_SKIP_CONVERSION.varname));
{code}

in the {{ETIMESTAMP_CONVERTER}} element.

The JVM throws a {{NullPointerException}}, so the Parquet library cannot read 
data from the file and in turn throws:

{noformat}
java.io.IOException:parquet.io.ParquetDecodingException: Can not read value at 
0 in block -1 in file hdfs:///user/hive/warehouse/test_parquet2/000000_0
{noformat}
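
The failure mode described above can be reproduced outside Hive. The following is a minimal sketch, not Hive code: the class {{SkipConversionSketch}}, the method {{readSkipConversion}} and the constant {{SKIP_CONVERSION_KEY}} are hypothetical names, and the key string is only assumed to match the value of {{HiveConf.ConfVars.HIVE_PARQUET_TIMESTAMP_SKIP_CONVERSION.varname}}. It shows that a missing key is harmless, while a {{null}} map makes {{metadata.get(...)}} throw.

```java
import java.util.HashMap;
import java.util.Map;

final class SkipConversionSketch {
    // Assumption: stands in for
    // HiveConf.ConfVars.HIVE_PARQUET_TIMESTAMP_SKIP_CONVERSION.varname.
    static final String SKIP_CONVERSION_KEY = "hive.parquet.timestamp.skip.conversion";

    // Mirrors the shape of the line quoted in the root cause.
    // Boolean.valueOf(String) accepts a null argument and returns false,
    // so the only way this line can throw is a null metadata map.
    static boolean readSkipConversion(Map<String, String> metadata) {
        return Boolean.valueOf(metadata.get(SKIP_CONVERSION_KEY));
    }

    public static void main(String[] args) {
        // Map exists but the key is absent: no exception, result is false.
        Map<String, String> empty = new HashMap<>();
        System.out.println(readSkipConversion(empty));

        // Map itself is null: metadata.get(...) throws NullPointerException.
        try {
            readSkipConversion(null);
        } catch (NullPointerException e) {
            System.out.println("NPE: metadata map was null");
        }
    }
}
```

In this sketch the exception is caught directly. In the reported case it surfaces to the user wrapped in the {{ParquetDecodingException}} shown above.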

*SOLUTION:*

Perform the initialization in a separate method so that it is not overridden 
with a {{null}} value in the following block of code:

{code:java}
  if (parent != null) {
     setMetadata(parent.getMetadata());
  }
{code}
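
One way to picture the idea is sketched below. This is an illustration under stated assumptions, not the attached patch: {{ConverterNodeSketch}} and {{initMetadata}} are hypothetical names, and only {{setMetadata}} and {{getMetadata}} come from the snippet above. The sketch gives each node a non-null default and copies the parent's map in a dedicated method only when the parent actually has one, so nested converters never end up with a {{null}} map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a converter that inherits metadata from its parent.
class ConverterNodeSketch {
    // Non-null default, so lookups are safe even without a parent.
    private Map<String, String> metadata = new HashMap<>();

    ConverterNodeSketch(ConverterNodeSketch parent) {
        initMetadata(parent);
    }

    // Separate initialization step: never replaces the default with null.
    private void initMetadata(ConverterNodeSketch parent) {
        if (parent != null && parent.getMetadata() != null) {
            setMetadata(parent.getMetadata());
        }
    }

    public void setMetadata(Map<String, String> metadata) {
        this.metadata = metadata;
    }

    public Map<String, String> getMetadata() {
        return metadata;
    }

    public static void main(String[] args) {
        ConverterNodeSketch root = new ConverterNodeSketch(null);
        root.getMetadata().put("hive.parquet.timestamp.skip.conversion", "true");

        // Two levels of nesting, as with array<timestamp>.
        ConverterNodeSketch list = new ConverterNodeSketch(root);
        ConverterNodeSketch element = new ConverterNodeSketch(list);

        System.out.println(element.getMetadata().get("hive.parquet.timestamp.skip.conversion"));
    }
}
```

The extra {{null}} check on {{parent.getMetadata()}} is a design choice of this sketch. It keeps the default map in place when a parent was itself created without metadata.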


> Hive-1.2 cannot read data from complex data types with TIMESTAMP column, 
> stored in Parquet
> ------------------------------------------------------------------------------------------
>
>                 Key: HIVE-15082
>                 URL: https://issues.apache.org/jira/browse/HIVE-15082
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.2.0
>            Reporter: Oleksiy Sayankin
>            Assignee: Oleksiy Sayankin
>             Fix For: 1.2.2
>
>         Attachments: HIVE-15082-branch-1.2.patch, HIVE-15082-branch-1.patch
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
