Krzysztof Indyk created PIG-4705:
------------------------------------

             Summary: Error Schema for data cannot be determined using HCatalog
                 Key: PIG-4705
                 URL: https://issues.apache.org/jira/browse/PIG-4705
             Project: Pig
          Issue Type: Bug
          Components: tez
    Affects Versions: 0.15.0
         Environment: HDP 2.3.2
            Reporter: Krzysztof Indyk


When we use {{HCatalog}} as both the source and the destination of data for 
{{Pig}} on {{Tez}}, we get ??ERROR 1115: Schema for data cannot be determined??.
Pig works fine when we use MapReduce, or when we use HCatalog at only one of 
the endpoints, e.g. loading data directly from a file and storing it using HCatalog.
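
A minimal sketch of the file-based variant mentioned above, for comparison with the failing script. It assumes sample.csv is comma-delimited with two string columns named col1 and col2; only col2 appears in this report, so the delimiter, col1 and the column types are hypothetical.

{code}
-- Assumed layout: comma-delimited file, columns (col1, col2), both strings.
data = LOAD 'sample.csv' USING PigStorage(',')
         AS (col1:chararray, col2:chararray);
items_unique = DISTINCT data;

counted = FOREACH (GROUP items_unique BY col2)
            GENERATE
              group AS name,
              COUNT(items_unique) AS value;

-- HCatalog is used only on the output side here.
STORE counted INTO 'table_output' USING org.apache.hive.hcatalog.pig.HCatStorer();
{code}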

The error appears after upgrading from {{Pig 0.14}} on {{Tez 0.5.2}} to 
{{Pig 0.15}} on {{Tez 0.7.0}} ({{HDP 2.2.6}} to {{HDP 2.3.2}}).

To reproduce:
- create the Hive tables from hive_tables.hql
- load data into table_input from sample.csv
- run the following Pig script on Tez

{code}
data = LOAD 'table_input' USING org.apache.hive.hcatalog.pig.HCatLoader();
items_unique = DISTINCT data;

counted = FOREACH (GROUP items_unique BY col2)
            GENERATE
              group AS name,
              COUNT(items_unique) AS value;

STORE counted INTO 'table_output' USING org.apache.hive.hcatalog.pig.HCatStorer();
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
