[ https://issues.apache.org/jira/browse/HIVE-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14565562#comment-14565562 ]
Aihua Xu commented on HIVE-10720:
---------------------------------
{noformat}
java.lang.Exception: java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:406)
Caused by: java.io.IOException: java.lang.NullPointerException
    at org.apache.hive.hcatalog.pig.HCatLoader.setLocation(HCatLoader.java:188)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.mergeSplitSpecificConf(PigInputFormat.java:138)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.createRecordReader(PigInputFormat.java:112)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:644)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:268)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hive.serde2.ColumnProjectionUtils.toReadColumnIDString(ColumnProjectionUtils.java:190)
    at org.apache.hadoop.hive.serde2.ColumnProjectionUtils.appendReadColumns(ColumnProjectionUtils.java:92)
    at org.apache.hadoop.hive.serde2.ColumnProjectionUtils.setReadColumnIDs(ColumnProjectionUtils.java:57)
    at org.apache.hive.hcatalog.pig.HCatLoader.setLocation(HCatLoader.java:183)
    ... 10 more
{noformat}
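The innermost frame is ColumnProjectionUtils.toReadColumnIDString, reached from HCatLoader.setLocation via setReadColumnIDs. One reading consistent with the trace (an assumption, not confirmed here) is that the list of column IDs is null when it is turned into a string. The sketch below shows a null-tolerant conversion of that kind. The class ReadColumnIdsSketch and the method toIdString are hypothetical names for illustration only; this is not the Hive code and not the attached patch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the Hive ColumnProjectionUtils code.
final class ReadColumnIdsSketch {

    // Builds a comma-separated string of column IDs.
    // A null or empty list yields "" instead of throwing.
    static String toIdString(List<Integer> ids) {
        if (ids == null || ids.isEmpty()) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        for (Integer id : ids) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(id);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toIdString(Arrays.asList(0, 1))); // prints 0,1
        System.out.println("[" + toIdString(null) + "]");    // prints []
    }
}
```

Whether an empty string is the right value for a null list depends on how the reader interprets "no columns requested", which this sketch does not settle.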
> Pig using HCatLoader to access RCFile and perform join but get incorrect
> result.
> --------------------------------------------------------------------------------
>
> Key: HIVE-10720
> URL: https://issues.apache.org/jira/browse/HIVE-10720
> Project: Hive
> Issue Type: Bug
> Components: HCatalog
> Affects Versions: 1.3.0
> Reporter: Aihua Xu
> Assignee: Aihua Xu
> Attachments: HIVE-10720.patch
>
>
> {noformat}
> Create table tbl1 (key string, value string) stored as rcfile;
> Create table tbl2 (key string, value string);
> insert into tbl1 values('1', 'value1');
> insert into tbl2 values('1', 'value2');
> {noformat}
> Pig script:
> {noformat}
> tbl1 = LOAD 'tbl1' USING org.apache.hive.hcatalog.pig.HCatLoader();
> tbl2 = LOAD 'tbl2' USING org.apache.hive.hcatalog.pig.HCatLoader();
> src_tbl1 = FILTER tbl1 BY (key == '1');
> prj_tbl1 = FOREACH src_tbl1 GENERATE
>     key as tbl1_key,
>     value as tbl1_value,
>     '333' as tbl1_v1;
>
> src_tbl2 = FILTER tbl2 BY (key == '1');
> prj_tbl2 = FOREACH src_tbl2 GENERATE
>     key as tbl2_key,
>     value as tbl2_value;
>
> result = JOIN prj_tbl1 BY (tbl1_key), prj_tbl2 BY (tbl2_key);
> prj_result = FOREACH result
>     GENERATE prj_tbl1::tbl1_key AS key1,
>         prj_tbl1::tbl1_value AS value1,
>         prj_tbl1::tbl1_v1 AS v1,
>         prj_tbl2::tbl2_key AS key2,
>         prj_tbl2::tbl2_value AS value2;
>
> dump prj_result;
> {noformat}
> We see varying invalid results, or no result at all, where a result should be returned.
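For reference, the result the script should produce can be worked out by hand from the two inserted rows: one joined tuple, (1,value1,333,1,value2). The sketch below replays the filter, projection, and join on in-memory rows to make that expectation explicit. The class ExpectedJoinSketch and the method expectedRows are hypothetical names for illustration; the code does not involve Pig, HCatLoader, or RCFile.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of the expected result; uses no Pig or Hive APIs.
final class ExpectedJoinSketch {

    // Each input row is {key, value}. Each output row is
    // {key1, value1, v1, key2, value2}, as in prj_result.
    static List<String[]> expectedRows(List<String[]> tbl1, List<String[]> tbl2) {
        List<String[]> out = new ArrayList<>();
        for (String[] r1 : tbl1) {
            if (!"1".equals(r1[0])) {
                continue; // FILTER tbl1 BY (key == '1')
            }
            for (String[] r2 : tbl2) {
                if (!"1".equals(r2[0])) {
                    continue; // FILTER tbl2 BY (key == '1')
                }
                if (r1[0].equals(r2[0])) { // JOIN ... BY tbl1_key, tbl2_key
                    out.add(new String[] {r1[0], r1[1], "333", r2[0], r2[1]});
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> tbl1 = Collections.singletonList(new String[] {"1", "value1"});
        List<String[]> tbl2 = Collections.singletonList(new String[] {"1", "value2"});
        for (String[] row : expectedRows(tbl1, tbl2)) {
            System.out.println("(" + String.join(",", row) + ")"); // prints (1,value1,333,1,value2)
        }
    }
}
```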
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)