Saurabh Santhosh created SPARK-11633:
----------------------------------------
Summary: HiveContext throws TreeNode Exception : Failed to Copy Node
Key: SPARK-11633
URL: https://issues.apache.org/jira/browse/SPARK-11633
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 1.4.1
Reporter: Saurabh Santhosh
Priority: Critical
h2. HiveContext#sql throws the following exception in a specific scenario:
h2. Exception :
Caused by: org.apache.spark.sql.catalyst.errors.package$TreeNodeException:
Failed to copy node.
Is otherCopyArgs specified correctly for LogicalRDD.
Exception message: wrong number of arguments
ctor: public org.apache.spark.sql.execution.LogicalRDD
(scala.collection.Seq,org.apache.spark.rdd.RDD,org.apache.spark.sql.SQLContext)?
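The "wrong number of arguments" text matches what Java reflection reports when a constructor is invoked with an argument array whose length does not match its parameter list, which is how the node copy appears to fail here. The following is a minimal, Spark-free sketch of that failure mode only. {{CtorArityDemo}}, {{Node}} and {{rejects}} are hypothetical names for illustration and are not Spark classes.

```java
import java.lang.reflect.Constructor;

class CtorArityDemo {
    // Hypothetical stand-in for a plan node with a three-argument constructor
    // (not the real LogicalRDD).
    static class Node {
        final String output;
        final String rdd;
        final String context;

        public Node(String output, String rdd, String context) {
            this.output = output;
            this.rdd = rdd;
            this.context = context;
        }
    }

    // Returns true when reflection rejects the argument list with an
    // IllegalArgumentException, false when the node is constructed.
    static boolean rejects(Object... args) {
        try {
            Constructor<Node> ctor =
                Node.class.getConstructor(String.class, String.class, String.class);
            ctor.newInstance(args);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Two arguments for a three-argument constructor.
        System.out.println(rejects("x", "y"));
        // Matching argument count.
        System.out.println(rejects("x", "y", "z"));
    }
}
```

The sketch only shows where such a message can originate. It does not explain why the copy in this scenario ends up with a mismatched argument list.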
h2. Code :
{code:title=SparkClient.java|borderStyle=solid}
// Schema with two nullable string columns, F1 and F2.
StructField[] fields = new StructField[2];
fields[0] = new StructField("F1", DataTypes.StringType, true, Metadata.empty());
fields[1] = new StructField("F2", DataTypes.StringType, true, Metadata.empty());

// Single-row RDD, registered as table t1.
JavaRDD<Row> rdd =
    javaSparkContext.parallelize(Arrays.asList(RowFactory.create("", "", 0)));
DataFrame df = sparkHiveContext.createDataFrame(rdd, new StructType(fields));
sparkHiveContext.registerDataFrameAsTable(df, "t1");

// F1 is referenced in lower case (f1) and F2 is aliased to its own name.
DataFrame aliasedDf = sparkHiveContext.sql("select f1, F2 as F2 from t1");

// The same DataFrame is registered under two table names.
sparkHiveContext.registerDataFrameAsTable(aliasedDf, "t2");
sparkHiveContext.registerDataFrameAsTable(aliasedDf, "t3");

// Final query: join t2 and t3 on F2.
sparkHiveContext.sql("select a.F1 from t2 a inner join t3 b on a.F2=b.F2");
{code}
h2. Observations :
* If F1 (the exact field name) is used instead of f1, the code works correctly.
* If F2 is not aliased, the code works regardless of the case used for F1.
* If field F2 is not used in the final query, the code also works correctly.
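Since the first observation suggests that referencing a column with its exact declared case avoids the failure, a small pre-check can flag references that match a schema field only when case is ignored. This is a sketch under that assumption. {{ColumnCaseCheck}} and {{findCaseMismatches}} are hypothetical helpers, not part of the Spark API, and the referenced column names are passed in by hand rather than parsed from the SQL text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ColumnCaseCheck {
    // Hypothetical helper: returns the referenced names that equal a schema
    // field only when case is ignored (for example "f1" against "F1").
    static List<String> findCaseMismatches(List<String> schemaFields,
                                           List<String> referenced) {
        List<String> mismatches = new ArrayList<>();
        for (String ref : referenced) {
            boolean exactMatch = schemaFields.contains(ref);
            boolean matchIgnoringCase = false;
            for (String field : schemaFields) {
                if (field.equalsIgnoreCase(ref)) {
                    matchIgnoringCase = true;
                    break;
                }
            }
            if (!exactMatch && matchIgnoringCase) {
                mismatches.add(ref);
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        // Field names from the schema above and the columns used in
        // "select f1, F2 as F2 from t1".
        List<String> schema = Arrays.asList("F1", "F2");
        List<String> referenced = Arrays.asList("f1", "F2");
        System.out.println(findCaseMismatches(schema, referenced)); // prints [f1]
    }
}
```

Rewriting the flagged names to their declared case (here {{f1}} to {{F1}}) corresponds to the first workaround listed above.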