[jira] [Commented] (HIVE-12087) IMPORT TABLE fails

2015-10-11 Thread Willem van Asperen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14952197#comment-14952197
 ] 

Willem van Asperen commented on HIVE-12087:
---

{code}
diff --git 
a/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java 
b/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java
index d349068..1165db4 100644
--- a/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java
+++ b/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java
@@ -1149,10 +1149,10 @@
 
 try {
   Class clazzDistCp = Class.forName("org.apache.hadoop.tools.DistCp");
-  Constructor c = clazzDistCp.getConstructor();
+  Constructor c = clazzDistCp.getConstructor(Configuration.class);
   c.setAccessible(true);
-  Tool distcp = (Tool)c.newInstance();
-  distcp.setConf(conf);
+  Tool distcp = (Tool)c.newInstance(conf);
+  //distcp.setConf(conf);
   rc = distcp.run(params);
 } catch (ClassNotFoundException e) {
   throw new IOException("Cannot find DistCp class package: " + 
e.getMessage());
{code}
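
For context, a self-contained sketch of the reflective construction the shim attempts (an illustration only, not the committed fix, and it assumes the {{DistCp}} class on the classpath still declares a no-arg constructor, only not a public one). {{Class.getConstructor()}} returns only public constructors, so a non-public no-arg {{DistCp}} constructor has to be located with {{getDeclaredConstructor()}} before {{setAccessible(true)}} is of any use:

{code}
import java.lang.reflect.Constructor;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.Tool;

// Illustration only, not the committed Hive fix: run DistCp reflectively as a Tool.
// getDeclaredConstructor() also finds a non-public no-arg constructor, which
// getConstructor() -- used by the current shim code -- does not.
public class DistCpReflectionSketch {
  static int runDistCpReflectively(Configuration conf, String[] params) throws Exception {
    Class<?> clazzDistCp = Class.forName("org.apache.hadoop.tools.DistCp");
    Constructor<?> c = clazzDistCp.getDeclaredConstructor(); // no-arg, possibly non-public
    c.setAccessible(true);                                   // required when it is not public
    Tool distcp = (Tool) c.newInstance();
    distcp.setConf(conf);                                    // Tool carries the job Configuration
    return distcp.run(params);                               // params: DistCp source/target arguments
  }
}
{code}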

> IMPORT TABLE fails
> --
>
> Key: HIVE-12087
> URL: https://issues.apache.org/jira/browse/HIVE-12087
> Project: Hive
>  Issue Type: Bug
>  Components: Import/Export
>Affects Versions: 1.2.1
> Environment: Hortonworks HDP 2.3
>Reporter: Willem van Asperen
>
> IMPORT TABLE fails for larger tables with:
> {code}
> 0: jdbc:hive2://hdpprdhiv01.prd.rsg:10001/> import from 
> '/tmp/export/repository/res_sales_navigator';
> INFO  : Copying data from 
> hdfs://hdpprdmas01.prd.rsg:8020/tmp/export/repository/res_sales_navigator/valid_from=20150825
>  to 
> hdfs://hdpprdmas01.prd.rsg:8020/tmp/export/repository/res_sales_navigator/.hive-staging_hive_2015-10-07_20-55-37_456_5706704167497413401-2/-ext-1
> INFO  : Copying file: 
> hdfs://hdpprdmas01.prd.rsg:8020/tmp/export/repository/res_sales_navigator/valid_from=20150825/part-r-0
> ERROR : Failed with exception Cannot get DistCp constructor: 
> org.apache.hadoop.tools.DistCp.<init>()
> java.io.IOException: Cannot get DistCp constructor: 
> org.apache.hadoop.tools.DistCp.<init>()
>   at 
> org.apache.hadoop.hive.shims.Hadoop23Shims.runDistCp(Hadoop23Shims.java:1160)
>   at org.apache.hadoop.hive.common.FileUtils.copy(FileUtils.java:553)
>   at org.apache.hadoop.hive.ql.exec.CopyTask.execute(CopyTask.java:82)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1653)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1412)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1054)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.CopyTask (state=08S01,code=1)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12087) IMPORT TABLE fails

2015-10-11 Thread Willem van Asperen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Willem van Asperen updated HIVE-12087:
--
Description: 
IMPORT TABLE fails for larger tables with:

{code}
0: jdbc:hive2://hdpprdhiv01.prd.xxx:10001/> import from 
'/tmp/export/repository/res_sales_navigator';
INFO  : Copying data from 
hdfs://hdpprdmas01.prd.xxx:8020/tmp/export/repository/res_sales_navigator/valid_from=20150825
 to 
hdfs://hdpprdmas01.prd.xxx:8020/tmp/export/repository/res_sales_navigator/.hive-staging_hive_2015-10-07_20-55-37_456_5706704167497413401-2/-ext-1
INFO  : Copying file: 
hdfs://hdpprdmas01.prd.xxx:8020/tmp/export/repository/res_sales_navigator/valid_from=20150825/part-r-0
ERROR : Failed with exception Cannot get DistCp constructor: 
org.apache.hadoop.tools.DistCp.<init>()
java.io.IOException: Cannot get DistCp constructor: 
org.apache.hadoop.tools.DistCp.<init>()
at 
org.apache.hadoop.hive.shims.Hadoop23Shims.runDistCp(Hadoop23Shims.java:1160)
at org.apache.hadoop.hive.common.FileUtils.copy(FileUtils.java:553)
at org.apache.hadoop.hive.ql.exec.CopyTask.execute(CopyTask.java:82)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1653)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1412)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1054)
at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154)
at 
org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71)
at 
org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at 
org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Error: Error while processing statement: FAILED: Execution Error, return code 1 
from org.apache.hadoop.hive.ql.exec.CopyTask (state=08S01,code=1)
{code}

  was:
IMPORT TABLE fails for larger tables with:

{code}
0: jdbc:hive2://hdpprdhiv01.prd.rsg:10001/> import from 
'/tmp/export/repository/res_sales_navigator';
INFO  : Copying data from 
hdfs://hdpprdmas01.prd.rsg:8020/tmp/export/repository/res_sales_navigator/valid_from=20150825
 to 
hdfs://hdpprdmas01.prd.rsg:8020/tmp/export/repository/res_sales_navigator/.hive-staging_hive_2015-10-07_20-55-37_456_5706704167497413401-2/-ext-1
INFO  : Copying file: 
hdfs://hdpprdmas01.prd.rsg:8020/tmp/export/repository/res_sales_navigator/valid_from=20150825/part-r-0
ERROR : Failed with exception Cannot get DistCp constructor: 
org.apache.hadoop.tools.DistCp.<init>()
java.io.IOException: Cannot get DistCp constructor: 
org.apache.hadoop.tools.DistCp.<init>()
at 
org.apache.hadoop.hive.shims.Hadoop23Shims.runDistCp(Hadoop23Shims.java:1160)
at org.apache.hadoop.hive.common.FileUtils.copy(FileUtils.java:553)
at org.apache.hadoop.hive.ql.exec.CopyTask.execute(CopyTask.java:82)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1653)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1412)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1054)
at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154)
at 
org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71)
at 
org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at 
org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218)
 

[jira] [Commented] (HIVE-12087) IMPORT TABLE fails

2015-10-11 Thread Willem van Asperen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14952192#comment-14952192
 ] 

Willem van Asperen commented on HIVE-12087:
---

The method {{runDistCp(Path src, Path dst, Configuration conf)}} of 
Hadoop23Shims, which is used for Hadoop 0.23, is not compatible with the 
distributed copy tool {{org.apache.hadoop.tools.DistCp}}. That {{DistCp}} class 
does not expose a public parameter-less constructor, yet the shim tries to 
obtain one by calling {{Constructor c = clazzDistCp.getConstructor()}} on it, 
which fails because {{getConstructor()}} only returns public constructors.

A work-around is to set the property {{hive.exec.copyfile.maxsize}} to a value 
large enough that files are copied directly instead of via distributed copy, 
thus bypassing the faulty code path.
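
For completeness, a hypothetical Beeline session illustrating that work-around 
(the threshold value is only an example and should exceed the largest file to 
be imported; if the property cannot be set at session level, it can be set in 
hive-site.xml instead):

{code}
-- example only: raise the copy-size threshold so the import copies files
-- directly instead of going through the DistCp code path
set hive.exec.copyfile.maxsize=107374182400;
import from '/tmp/export/repository/res_sales_navigator';
{code}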

> IMPORT TABLE fails
> --
>
> Key: HIVE-12087
> URL: https://issues.apache.org/jira/browse/HIVE-12087
> Project: Hive
>  Issue Type: Bug
>  Components: Import/Export
>Affects Versions: 1.2.1
> Environment: Hortonworks HDP 2.3
>Reporter: Willem van Asperen
>
> IMPORT TABLE fails for larger tables with:
> {code}
> 0: jdbc:hive2://hdpprdhiv01.prd.rsg:10001/> import from 
> '/tmp/export/repository/res_sales_navigator';
> INFO  : Copying data from 
> hdfs://hdpprdmas01.prd.rsg:8020/tmp/export/repository/res_sales_navigator/valid_from=20150825
>  to 
> hdfs://hdpprdmas01.prd.rsg:8020/tmp/export/repository/res_sales_navigator/.hive-staging_hive_2015-10-07_20-55-37_456_5706704167497413401-2/-ext-1
> INFO  : Copying file: 
> hdfs://hdpprdmas01.prd.rsg:8020/tmp/export/repository/res_sales_navigator/valid_from=20150825/part-r-0
> ERROR : Failed with exception Cannot get DistCp constructor: 
> org.apache.hadoop.tools.DistCp.<init>()
> java.io.IOException: Cannot get DistCp constructor: 
> org.apache.hadoop.tools.DistCp.<init>()
>   at 
> org.apache.hadoop.hive.shims.Hadoop23Shims.runDistCp(Hadoop23Shims.java:1160)
>   at org.apache.hadoop.hive.common.FileUtils.copy(FileUtils.java:553)
>   at org.apache.hadoop.hive.ql.exec.CopyTask.execute(CopyTask.java:82)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1653)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1412)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1054)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.CopyTask (state=08S01,code=1)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12087) IMPORT TABLE fails

2015-10-11 Thread Willem van Asperen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Willem van Asperen updated HIVE-12087:
--
Attachment: hive-shims23.patch

> IMPORT TABLE fails
> --
>
> Key: HIVE-12087
> URL: https://issues.apache.org/jira/browse/HIVE-12087
> Project: Hive
>  Issue Type: Bug
>  Components: Import/Export
>Affects Versions: 1.2.1
> Environment: Hortonworks HDP 2.3
>Reporter: Willem van Asperen
> Attachments: hive-shims23.patch
>
>
> IMPORT TABLE fails for larger tables with:
> {code}
> 0: jdbc:hive2://hdpprdhiv01.prd.xxx:10001/> import from 
> '/tmp/export/repository/res_sales_navigator';
> INFO  : Copying data from 
> hdfs://hdpprdmas01.prd.xxx:8020/tmp/export/repository/res_sales_navigator/valid_from=20150825
>  to 
> hdfs://hdpprdmas01.prd.xxx:8020/tmp/export/repository/res_sales_navigator/.hive-staging_hive_2015-10-07_20-55-37_456_5706704167497413401-2/-ext-1
> INFO  : Copying file: 
> hdfs://hdpprdmas01.prd.xxx:8020/tmp/export/repository/res_sales_navigator/valid_from=20150825/part-r-0
> ERROR : Failed with exception Cannot get DistCp constructor: 
> org.apache.hadoop.tools.DistCp.<init>()
> java.io.IOException: Cannot get DistCp constructor: 
> org.apache.hadoop.tools.DistCp.<init>()
>   at 
> org.apache.hadoop.hive.shims.Hadoop23Shims.runDistCp(Hadoop23Shims.java:1160)
>   at org.apache.hadoop.hive.common.FileUtils.copy(FileUtils.java:553)
>   at org.apache.hadoop.hive.ql.exec.CopyTask.execute(CopyTask.java:82)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1653)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1412)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1054)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.CopyTask (state=08S01,code=1)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11574) ERROR 2245: Cannot get schema from loadFunc org.apache.hive.hcatalog.pig.HCatLoader

2015-08-16 Thread Willem van Asperen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Willem van Asperen updated HIVE-11574:
--
Description: 
When running a job through Oozie, the user is propagated into the Pig job it 
launches. If that Pig job reads using {{HCatLoader}} or writes out using 
{{HCatStorer}}, a Kerberos delegation ticket request is triggered. This does 
not work as expected on a non-Kerberized cluster.

One would expect the value of the property {{hive.metastore.sasl.enabled}} to be 
checked before attempting to obtain the TGT. Instead, the function 
{{getHiveMetaClient}} of {{org.apache.hive.hcatalog.pig.PigHCatUtil}} checks 
whether a Kerberos server principal has been set via the property 
{{hive.metastore.kerberos.principal}}. Since that property always carries a 
default value, this results in unexpected behavior.
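
A hedged sketch (hypothetical helper, not the actual {{PigHCatUtil}} code) of 
the check one would expect, gating the Kerberos path on 
{{hive.metastore.sasl.enabled}} rather than on the mere presence of a principal:

{code}
import org.apache.hadoop.conf.Configuration;

// Hypothetical illustration, not the actual PigHCatUtil.getHiveMetaClient logic:
// only take the Kerberos/delegation-token path when SASL is actually enabled,
// not merely because hive.metastore.kerberos.principal has a (default) value.
public class SaslGuardSketch {
  static boolean shouldUseKerberos(Configuration conf) {
    boolean saslEnabled = conf.getBoolean("hive.metastore.sasl.enabled", false);
    String principal = conf.get("hive.metastore.kerberos.principal", "");
    return saslEnabled && !principal.isEmpty();
  }
}
{code}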

Setting {{hive.metastore.sasl.enabled}} to false is effectively undone by the 
above-mentioned function as soon as a {{hive.metastore.kerberos.principal}} is 
set. So even though the log file shows the property 
{{hive.metastore.sasl.enabled}} coming through as false, the system behaves as 
if SASL were requested and starts the Kerberos delegation ticket request. This 
results in the following stack trace:

{quote}
ERROR 2245: Cannot get schema from loadFunc 
org.apache.hive.hcatalog.pig.HCatLoader

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during 
parsing. Cannot get schema from loadFunc org.apache.hive.hcatalog.pig.HCatLoader
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1748)
at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1443)
at org.apache.pig.PigServer.parseAndBuild(PigServer.java:387)
at org.apache.pig.PigServer.executeBatch(PigServer.java:412)
at org.apache.pig.PigServer.executeBatch(PigServer.java:398)
at 
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:171)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:234)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:495)
at org.apache.pig.PigRunner.run(PigRunner.java:49)
at org.apache.oozie.action.hadoop.PigMain.runPigJob(PigMain.java:288)
at org.apache.oozie.action.hadoop.PigMain.run(PigMain.java:228)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39)
at org.apache.oozie.action.hadoop.PigMain.main(PigMain.java:75)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:226)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: Failed to parse: Can not retrieve schema from loader 
org.apache.hive.hcatalog.pig.HCatLoader@2acf062d
at 
org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:201)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1735)
... 27 more
Caused by: java.lang.RuntimeException: Can not retrieve schema from loader 
org.apache.hive.hcatalog.pig.HCatLoader@2acf062d
at 
org.apache.pig.newplan.logical.relational.LOLoad.init(LOLoad.java:91)
at 
org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:901)
at 
org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3568)
at 
org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1625)
at 
org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
at 
org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
at 
org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
at 
org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:191)
... 28 more
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2245: 
Cannot get schema from loadFunc 

[jira] [Updated] (HIVE-11574) ERROR 2245: Cannot get schema from loadFunc org.apache.hive.hcatalog.pig.HCatLoader

2015-08-15 Thread Willem van Asperen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Willem van Asperen updated HIVE-11574:
--
Description: 
When running a job through Oozie, the user is propagated into the Pig job it 
launches. If that Pig job reads using {{HCatLoader}} or writes out using 
{{HCatStorer}}, a Kerberos delegation ticket request is triggered. This does 
not work as expected on a non-Kerberized cluster.

One would expect the value of the property {{hive.metastore.sasl.enabled}} to be 
checked before attempting to obtain the TGT. Instead, the function 
{{getHiveMetaClient}} of {{PigHCatUtil}} checks whether a Kerberos server 
principal has been set via the property {{hive.metastore.kerberos.principal}}. 
Since that property always carries a default value, this results in unexpected 
behavior.

Setting {{hive.metastore.sasl.enabled}} to false is effectively undone by the 
above-mentioned function as soon as a {{hive.metastore.kerberos.principal}} is 
set. So even though the log file shows the property 
{{hive.metastore.sasl.enabled}} coming through as false, the system behaves as 
if SASL were requested and starts the Kerberos delegation ticket request. This 
results in the following stack trace:

{quote}
ERROR 2245: Cannot get schema from loadFunc 
org.apache.hive.hcatalog.pig.HCatLoader

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during 
parsing. Cannot get schema from loadFunc org.apache.hive.hcatalog.pig.HCatLoader
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1748)
at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1443)
at org.apache.pig.PigServer.parseAndBuild(PigServer.java:387)
at org.apache.pig.PigServer.executeBatch(PigServer.java:412)
at org.apache.pig.PigServer.executeBatch(PigServer.java:398)
at 
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:171)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:234)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:495)
at org.apache.pig.PigRunner.run(PigRunner.java:49)
at org.apache.oozie.action.hadoop.PigMain.runPigJob(PigMain.java:288)
at org.apache.oozie.action.hadoop.PigMain.run(PigMain.java:228)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39)
at org.apache.oozie.action.hadoop.PigMain.main(PigMain.java:75)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:226)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: Failed to parse: Can not retrieve schema from loader 
org.apache.hive.hcatalog.pig.HCatLoader@2acf062d
at 
org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:201)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1735)
... 27 more
Caused by: java.lang.RuntimeException: Can not retrieve schema from loader 
org.apache.hive.hcatalog.pig.HCatLoader@2acf062d
at 
org.apache.pig.newplan.logical.relational.LOLoad.init(LOLoad.java:91)
at 
org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:901)
at 
org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3568)
at 
org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1625)
at 
org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
at 
org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
at 
org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
at 
org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:191)
... 28 more
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2245: 
Cannot get schema from loadFunc org.apache.hive.hcatalog.pig.HCatLoader
at 

[jira] [Updated] (HIVE-11574) ERROR 2245: Cannot get schema from loadFunc org.apache.hive.hcatalog.pig.HCatLoader

2015-08-15 Thread Willem van Asperen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Willem van Asperen updated HIVE-11574:
--
Attachment: hive.patch

Suggested patch that also checks whether the user requires SASL.

 ERROR 2245: Cannot get schema from loadFunc 
 org.apache.hive.hcatalog.pig.HCatLoader
 ---

 Key: HIVE-11574
 URL: https://issues.apache.org/jira/browse/HIVE-11574
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.14.0
 Environment: HDP 2.4.4, CentOS 6.5
Reporter: Willem van Asperen
Priority: Minor
 Attachments: hive.patch


 When running a job through Oozie, the user is propagated into running a pig 
 job. If that pig job reads using the {{HCatLoader}} or writes out using the 
 {{HCatStorer}}, a kerberos delegation ticket request is triggered. This does 
 not work as expected in a non-kerberized cluster.
 One would expect that the value of property {{hive.metastore.sasl.enabled}} 
 is checked before attempting to obtain the TGT. Instead, the function 
 {{getHiveMetaClient}} of {{PigHCatUtil}} checks if a kerberos server 
 principal has been set using the property 
 {{hive.metastore.kerberos.principal}}. Since that is set to some default 
 value, this results in unexpected behavior.
 Setting {{hive.metastore.sasl.enabled}} to false is undone by the above 
 mentioned function as soon as a {{hive.metastore.kerberos.principal}} has 
 been set. So even though the log-file shows that the property 
 {{hive.metastore.sasl.enabled}} comes through as false, the system behaves as 
 if sasl is requested and starts the kerberos delegation ticket request. This 
 results in the following stack trace:
 {{ERROR 2245: Cannot get schema from loadFunc 
 org.apache.hive.hcatalog.pig.HCatLoader
 org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during 
 parsing. Cannot get schema from loadFunc 
 org.apache.hive.hcatalog.pig.HCatLoader
   at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1748)
   at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1443)
   at org.apache.pig.PigServer.parseAndBuild(PigServer.java:387)
   at org.apache.pig.PigServer.executeBatch(PigServer.java:412)
   at org.apache.pig.PigServer.executeBatch(PigServer.java:398)
   at 
 org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:171)
   at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:234)
   at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
   at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
   at org.apache.pig.Main.run(Main.java:495)
   at org.apache.pig.PigRunner.run(PigRunner.java:49)
   at org.apache.oozie.action.hadoop.PigMain.runPigJob(PigMain.java:288)
   at org.apache.oozie.action.hadoop.PigMain.run(PigMain.java:228)
   at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39)
   at org.apache.oozie.action.hadoop.PigMain.main(PigMain.java:75)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at 
 org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:226)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
 Caused by: Failed to parse: Can not retrieve schema from loader 
 org.apache.hive.hcatalog.pig.HCatLoader@2acf062d
   at 
 org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:201)
   at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1735)
   ... 27 more
 Caused by: java.lang.RuntimeException: Can not retrieve schema from loader 
 org.apache.hive.hcatalog.pig.HCatLoader@2acf062d
   at 
 org.apache.pig.newplan.logical.relational.LOLoad.init(LOLoad.java:91)
   at 
 org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:901)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3568)
   at