[
https://issues.apache.org/jira/browse/HAWQ-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15097046#comment-15097046
]
Goden Yao commented on HAWQ-335:
--------------------------------
I noticed your definition has "offset" in quotes. Is that a typo?
Also, can you post the table definition you have in Hive, stored as Parquet?
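For reference, {{OFFSET}} is a reserved keyword in the PostgreSQL-derived SQL grammar that HAWQ uses, so a column named {{offset}} can only be declared when double-quoted; the quoting in the reporter's DDL is therefore syntactically required rather than a typo. A minimal sketch (table name is illustrative, not from the issue):

{code}
-- "offset" must be double-quoted because OFFSET is a reserved word
-- in HAWQ's PostgreSQL-derived grammar; an unquoted
--   CREATE TABLE demo_reserved (offset int);
-- fails with a syntax error.
CREATE TABLE demo_reserved (
    id int,
    "offset" int
);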
> Cannot query parquet hive table through PXF
> -------------------------------------------
>
> Key: HAWQ-335
> URL: https://issues.apache.org/jira/browse/HAWQ-335
> Project: Apache HAWQ
> Issue Type: Bug
> Components: PXF
> Affects Versions: 2.0.0-beta-incubating
> Reporter: zharui
> Assignee: Goden Yao
>
> I created an external table in HAWQ for a table that exists in Hive in Parquet
> format, but I cannot query this table from HAWQ. The segment processes stay
> idle and nothing happens.
> The DDL used to create the external Hive Parquet table is as follows:
> {code}
> create external table zc_parquet800_partitioned
> (
> start_time bigint,
> cdr_id int,
> "offset" int,
> calling varchar(255),
> imsi varchar(255),
> user_ip int,
> tmsi int,
> p_tmsi int,
> imei varchar(255),
> mcc int,
> mnc int,
> lac int,
> rac int,
> cell_id int,
> bsc_ip int,
> opc int,
> dpc int,
> sgsn_sg_ip int,
> ggsn_sg_ip int,
> sgsn_data_ip int,
> ggsn_data_ip int,
> apn varchar(255),
> rat int,
> service_type smallint,
> service_group smallint,
> up_packets int,
> down_packets int,
> up_bytes int,
> down_bytes int,
> up_speed real,
> down_speed real,
> trans_time int,
> first_time timestamp,
> end_time timestamp,
> is_end int,
> user_port int,
> proto_type int,
> dest_ip int,
> dest_port int,
> paging_count smallint,
> assignment_count smallint,
> joiner_id varchar(255),
> operation smallint,
> country smallint,
> loc_prov smallint,
> loc_city smallint,
> roam_prov smallint,
> roam_city smallint,
> sgsn varchar(255),
> bsc_rnc varchar(255),
> terminal_fac smallint,
> terminal_type int,
> terminal_class smallint,
> roaming_type smallint,
> host_operator smallint,
> net_type smallint,
> time int,
> calling_hash int)
> LOCATION ('pxf://ws01.mzhen.cn:51200/zc_parquet800_partitioned?PROFILE=Hive')
> FORMAT 'custom' (formatter='pxfwritable_import');
> {code}
> The Catalina logs are as follows:
> {code}
> Jan 13, 2016 11:26:29 AM WARNING: parquet.hadoop.ParquetRecordReader: Can not
> initialize counter due to context is not a instance of
> TaskInputOutputContext, but is
> org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
> Jan 13, 2016 11:26:29 AM INFO: parquet.hadoop.InternalParquetRecordReader:
> RecordReader initialized will read a total of 1332450 records.
> Jan 13, 2016 11:26:29 AM INFO: parquet.hadoop.InternalParquetRecordReader: at
> row 0. reading next block
> Jan 13, 2016 11:26:30 AM INFO: parquet.hadoop.InternalParquetRecordReader:
> block read in memory in 398 ms. row count = 1332450
> Jan 13, 2016 11:26:58 AM WARNING: parquet.hadoop.ParquetRecordReader: Can not
> initialize counter due to context is not a instance of
> TaskInputOutputContext, but is
> org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
> Jan 13, 2016 11:26:58 AM INFO: parquet.hadoop.InternalParquetRecordReader:
> RecordReader initialized will read a total of 1460760 records.
> Jan 13, 2016 11:26:58 AM INFO: parquet.hadoop.InternalParquetRecordReader: at
> row 0. reading next block
> Jan 13, 2016 11:26:59 AM INFO: parquet.hadoop.InternalParquetRecordReader:
> block read in memory in 441 ms. row count = 1460760
> Jan 13, 2016 11:27:34 AM WARNING: parquet.hadoop.ParquetRecordReader: Can not
> initialize counter due to context is not a instance of
> TaskInputOutputContext, but is
> org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
> Jan 13, 2016 11:27:34 AM INFO: parquet.hadoop.InternalParquetRecordReader:
> RecordReader initialized will read a total of 1396605 records.
> Jan 13, 2016 11:27:34 AM INFO: parquet.hadoop.InternalParquetRecordReader: at
> row 0. reading next block
> Jan 13, 2016 11:27:34 AM INFO: parquet.hadoop.InternalParquetRecordReader:
> block read in memory in 367 ms. row count = 1396605
> Jan 13, 2016 11:28:06 AM WARNING: parquet.hadoop.ParquetRecordReader: Can not
> initialize counter due to context is not a instance of
> TaskInputOutputContext, but is
> org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
> Jan 13, 2016 11:28:06 AM INFO: parquet.hadoop.InternalParquetRecordReader:
> RecordReader initialized will read a total of 1337385 records.
> Jan 13, 2016 11:28:06 AM INFO: parquet.hadoop.InternalParquetRecordReader: at
> row 0. reading next block
> Jan 13, 2016 11:28:06 AM INFO: parquet.hadoop.InternalParquetRecordReader:
> block read in memory in 348 ms. row count = 1337385
> Jan 13, 2016 11:28:32 AM WARNING: parquet.hadoop.ParquetRecordReader: Can not
> initialize counter due to context is not a instance of
> TaskInputOutputContext, but is
> org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
> Jan 13, 2016 11:28:32 AM INFO: parquet.hadoop.InternalParquetRecordReader:
> RecordReader initialized will read a total of 1322580 records.
> Jan 13, 2016 11:28:32 AM INFO: parquet.hadoop.InternalParquetRecordReader: at
> row 0. reading next block
> Jan 13, 2016 11:28:33 AM INFO: parquet.hadoop.InternalParquetRecordReader:
> block read in memory in 459 ms. row count = 1322580
> Jan 13, 2016 11:28:59 AM WARNING: parquet.hadoop.ParquetRecordReader: Can not
> initialize counter due to context is not a instance of
> TaskInputOutputContext, but is
> org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
> Jan 13, 2016 11:28:59 AM INFO: parquet.hadoop.InternalParquetRecordReader:
> RecordReader initialized will read a total of 1431150 records.
> Jan 13, 2016 11:28:59 AM INFO: parquet.hadoop.InternalParquetRecordReader
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)