Re: Why custom parquet format hive table execute "ParquetTableScan" physical plan, not "HiveTableScan"?

Xiaoyu Wang Fri, 16 Jan 2015 05:52:31 -0800

Thanks yana!
I will try it!
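
For reference, the suggestion quoted below could be tried roughly like this. It is only a sketch: it assumes the statements are run in the same HiveContext session that queries the table (for example through sqlContext.sql(...) in the Spark shell), and that the flag behaves the way yana describes. The resulting plan is not confirmed here.

```sql
-- Assumption: ask Spark SQL to go through the Hive SerDe path for
-- metastore Parquet tables instead of its built-in Parquet support.
SET spark.sql.hive.convertMetastoreParquet=false;

-- Re-check the physical plan for the same query as in the original message.
EXPLAIN SELECT * FROM test;
```

If the flag takes effect, the plan should no longer show ParquetTableScan for this table.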

> On Jan 16, 2015, at 20:51, yana <yana.kadiy...@gmail.com> wrote:
> 
> I think you might need to set 
> spark.sql.hive.convertMetastoreParquet to false, if I understand that flag 
> correctly.
> 
> -------- Original message --------
> From: Xiaoyu Wang
> Date: 01/16/2015 5:09 AM (GMT-05:00)
> To: user@spark.apache.org
> Subject: Why custom parquet format hive table execute "ParquetTableScan" 
> physical plan, not "HiveTableScan"?
> 
> Hi all!
> 
> In Spark SQL 1.2.0, I created a Hive table with a custom Parquet
> InputFormat and OutputFormat, like this:
> CREATE TABLE test(
>   id string, 
>   msg string)
> CLUSTERED BY ( 
>   id) 
> SORTED BY ( 
>   id ASC) 
> INTO 10 BUCKETS
> ROW FORMAT SERDE
>   'com.a.MyParquetHiveSerDe'
> STORED AS INPUTFORMAT 
>   'com.a.MyParquetInputFormat' 
> OUTPUTFORMAT 
>   'com.a.MyParquetOutputFormat';
> 
> In the Spark shell, the plan for "select * from test" is:
> 
> [== Physical Plan ==]
> [!OutputFaker [id#5,msg#6]]
> [ ParquetTableScan [id#12,msg#13], (ParquetRelation 
> hdfs://hadoop/user/hive/warehouse/test.db/test, Some(Configuration: 
> core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, 
> yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml), 
> org.apache.spark.sql.hive.HiveContext@6d15a113, []), []]
> 
> It is not a HiveTableScan!
> So it doesn't use my custom InputFormat.
> Why? How can I make it use my custom InputFormat?
> 
> Thanks!
