[
https://issues.apache.org/jira/browse/HAWQ-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Oleksandr Diachenko updated HAWQ-1211:
--------------------------------------
Description:
Steps to reproduce:
1) Create a Hive table that uses
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe:
{code}
CREATE TABLE hive_partitioned_table (t0 string, t1 string, num1 int, d1 double)
PARTITIONED BY (prt string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
{code}
2) Verify that the table uses LazySimpleSerDe:
{code}
hive> desc extended hive_partitioned_table;
OK
t0 string
t1 string
num1 int
d1 double
prt string
# Partition Information
# col_name data_type comment
prt string
Detailed Table Information Table(tableName:hive_partitioned_table,
dbName:default, owner:adiachenko, createTime:1481318673, lastAccessTime:0,
retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:t0, type:string,
comment:null), FieldSchema(name:t1, type:string, comment:null),
FieldSchema(name:num1, type:int, comment:null), FieldSchema(name:d1,
type:double, comment:null), FieldSchema(name:prt, type:string, comment:null)],
location:hdfs://0.0.0.0:8020/hive/warehouse/hive_partitioned_table,
inputFormat:org.apache.hadoop.mapred.TextInputFormat,
outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat,
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null,
serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,
parameters:{serialization.format=,, field.delim=,}), bucketCols:[],
sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[],
skewedColValues:[], skewedColValueLocationMaps:{}),
storedAsSubDirectories:false), partitionKeys:[FieldSchema(name:prt,
type:string, comment:null)], parameters:{transient_lastDdlTime=1481318673},
viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE)
Time taken: 0.039 seconds, Fetched: 12 row(s)
{code}
3) Create an external PXF table in HAWQ using the HiveORC profile:
{code}
CREATE EXTERNAL TABLE pxf_hive_partitioned_table (t1 text, t2 text, num1
integer, dub1 double precision, t3 text) LOCATION
(E'pxf://localhost:51200/hive_partitioned_table?PROFILE=HiveORC') FORMAT
'CUSTOM' (formatter='pxfwritable_import')
{code}
4) Query the external table; the request fails with:
{code}
ERROR: remote component error (500) from '10.64.4.25:51200': type Exception
report message java.lang.Exception:
org.apache.hawq.pxf.api.UserDataException: LAZY_SIMPLE_SERDE serializer isn't
supported by org.apache.hawq.pxf.plugins.hive.HiveORCAccessor description
The server encountered an internal error that prevented it from fulfilling this
request. exception javax.servlet.ServletException: java.lang.Exception:
org.apache.hawq.pxf.api.UserDataException: LAZY_SIMPLE_SERDE serializer isn't
supported by org.apache.hawq.pxf.plugins.hive.HiveORCAccessor (libchurl.c:897)
(seg4 localhost:40000 pid=90834)
DETAIL: External table pxf_hive_partitioned_table
{code}
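A plausible workaround (not verified here) is to point the external table at the generic Hive profile instead of HiveORC, since that profile resolves the accessor and serde per fragment rather than assuming ORC. The table name pxf_hive_partitioned_table_generic below is hypothetical, chosen only to avoid clashing with the table created in step 3:
{code}
-- Sketch of a workaround, assuming the generic Hive profile is available:
-- it picks the appropriate accessor/serde per fragment, so partitions backed
-- by LazySimpleSerDe (delimited text) should be readable.
CREATE EXTERNAL TABLE pxf_hive_partitioned_table_generic (t1 text, t2 text,
num1 integer, dub1 double precision, t3 text) LOCATION
(E'pxf://localhost:51200/hive_partitioned_table?PROFILE=Hive') FORMAT
'CUSTOM' (formatter='pxfwritable_import');

SELECT * FROM pxf_hive_partitioned_table_generic;
{code}
Regardless of the workaround, HiveORC should either support this serde or fail with a clearer message at table-definition time rather than at query time.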
> HiveOrc doesn't support Hive tables which use
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> ------------------------------------------------------------------------------------------------
>
> Key: HAWQ-1211
> URL: https://issues.apache.org/jira/browse/HAWQ-1211
> Project: Apache HAWQ
> Issue Type: Bug
> Components: PXF
> Reporter: Oleksandr Diachenko
> Assignee: Oleksandr Diachenko
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)