[jira] [Updated] (HIVE-14306) Hive Failed to read Parquet Files generated by SparkSQL

Teng Yutong (JIRA) Thu, 21 Jul 2016 01:00:57 -0700

     [ 
https://issues.apache.org/jira/browse/HIVE-14306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Teng Yutong updated HIVE-14306:
-------------------------------
    Description: 
I'm trying to implement the following process:

1. Create a Hive Parquet table A using the Hive CLI.
2. Create an external table B with the same schema as A, pointing to an existing 
folder in HDFS that contains one CSV file.
3. Execute `insert into A select * from B` using SparkSQL.
4. Query table A.
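
For reference, the four steps can be sketched roughly as below. This is only an illustration: the column list and the HDFS folder are hypothetical placeholders (the real schema is not shown in this report), the DDL assumes a comma-delimited CSV, and creating B from the Hive CLI is also an assumption. Only the `insert` statement is taken verbatim from step 3.

```python
# Sketch of the four reproduction steps as SQL text, tagged with the tool
# that runs each one. Nothing here connects to Hive or Spark.

COLUMNS = "id INT, name STRING"            # hypothetical schema for A and B
CSV_DIR = "/path/to/existing/csv/folder"   # hypothetical HDFS folder with one CSV


def repro_statements(columns=COLUMNS, csv_dir=CSV_DIR):
    """Return (tool, statement) pairs in the order the steps are run."""
    return [
        # Step 1: Parquet-backed table A, created from the Hive CLI.
        ("hive", "CREATE TABLE A (%s) STORED AS PARQUET" % columns),
        # Step 2: external table B over the existing CSV folder
        # (assumed here to be created from the Hive CLI as well).
        ("hive", "CREATE EXTERNAL TABLE B (%s) "
                 "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
                 "LOCATION '%s'" % (columns, csv_dir)),
        # Step 3: the insert, run through SparkSQL (statement as in the report).
        ("spark-sql", "insert into A select * from B"),
        # Step 4: read A back from the Hive CLI; this is where the exception shows up.
        ("hive", "SELECT * FROM A"),
    ]


if __name__ == "__main__":
    for tool, stmt in repro_statements():
        print("[%s] %s;" % (tool, stmt))
```

Running step 3 from the Hive CLI instead of SparkSQL is the variant that works.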

Something weird happens in steps 3 and 4.

If the `insert into` statement is executed by SparkSQL (1.6.2), the Hive CLI throws 
an exception when querying table A:
```
Failed with exception java.io.IOException:parquet.io.ParquetDecodingException: 
Can not read value at 0 in block -1 in file 
hdfs://NEOInciteDataNode-1:8020/user/hive/warehouse/call_center/part-r-00000-b9b6962d-cbab-452b-835b-c10c6221b8fa.gz.parquet
```

However, SparkSQL can query table A without any trouble.

If the `insert` statement is executed by the Hive CLI, querying table A in the Hive CLI 
works fine.

So am I doing something wrong, or is this a bug?

  was:
I'm trying to implement the following process:

1. Create a Hive Parquet table A using the Hive CLI.
2. Create an external table B with the same schema as A, pointing to an existing 
folder in HDFS that contains one CSV file.
3. Execute `insert into A select * from B` using SparkSQL.
4. Query table A.

Something weird happens in steps 3 and 4.

If the `insert into` statement is executed by SparkSQL, the Hive CLI throws 
an exception when querying table A:
```
Failed with exception java.io.IOException:parquet.io.ParquetDecodingException: 
Can not read value at 0 in block -1 in file 
hdfs://NEOInciteDataNode-1:8020/user/hive/warehouse/call_center/part-r-00000-b9b6962d-cbab-452b-835b-c10c6221b8fa.gz.parquet
```

However, SparkSQL can query table A without any trouble.

If the `insert` statement is executed by the Hive CLI, querying table A in the Hive CLI 
works fine.

So am I doing something wrong, or is this a bug?


> Hive Failed to read Parquet Files generated by SparkSQL
> -------------------------------------------------------
>
>                 Key: HIVE-14306
>                 URL: https://issues.apache.org/jira/browse/HIVE-14306
>             Project: Hive
>          Issue Type: Bug
>          Components: CLI
>    Affects Versions: 1.2.1
>            Reporter: Teng Yutong



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
