Re: PhysicalRDD problem?

Nitin Goyal Tue, 09 Dec 2014 22:21:36 -0800

Hi Michael,

I think I have found the exact problem in my case. I see that we have
something like the following in Analyzer.scala:


  // TODO: pass this in as a parameter.
  val fixedPoint = FixedPoint(100)


and


    Batch("Resolution", fixedPoint,
      ResolveReferences ::
      ResolveRelations ::
      ResolveSortReferences ::
      NewRelationInstances ::
      ImplicitGenerate ::
      StarExpansion ::
      ResolveFunctions ::
      GlobalAggregates ::
      UnresolvedHavingClauseAttributes ::
      TrimGroupingAliases ::
      typeCoercionRules ++
      extendedRules : _*),

Perhaps in my case the analyzer reaches the 100-iteration limit, breaks out
of the while loop in RuleExecutor.scala, and thus doesn't "resolve" all the
attributes.
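
As a rough illustration of the behaviour described above, here is a toy model
of a fixed-point loop with an iteration cap. It is only a sketch, not the
actual Catalyst RuleExecutor: the object name, the `Rule` alias, `execute`, and
the use of an Int (a count of unresolved attributes) as the "plan" are all
assumptions made for the example.

```scala
// Toy model only; names and types here are hypothetical, not Catalyst's.
object FixedPointSketch {
  // Stand-in for a rule: rewrites a "plan", modelled as the number of
  // attributes that are still unresolved.
  type Rule = Int => Int

  /** Applies `rules` in order, repeatedly, until the plan stops changing or
    * `maxIterations` passes have run. Returns the final plan and a flag
    * saying whether a fixed point was actually reached. */
  def execute(plan: Int, rules: Seq[Rule], maxIterations: Int): (Int, Boolean) = {
    var current = plan
    var iteration = 0
    var converged = false
    while (!converged && iteration < maxIterations) {
      val next = rules.foldLeft(current)((p, rule) => rule(p))
      converged = next == current
      current = next
      iteration += 1
    }
    (current, converged)
  }

  def main(args: Array[String]): Unit = {
    // Assumed worst case: each pass resolves at most one attribute.
    val resolveOne: Rule = n => if (n > 0) n - 1 else n
    println(execute(150, Seq(resolveOne), maxIterations = 100)) // (50,false)
    println(execute(150, Seq(resolveOne), maxIterations = 200)) // (0,true)
  }
}
```

With a cap of 100 the loop exits while 50 attributes are still unresolved,
which is the situation a later resolution check would then report as an error.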

Exception in my logs:

14/12/10 04:45:28 INFO HiveContext$$anon$4: Max iterations (100) reached for batch Resolution

14/12/10 04:45:28 ERROR [Sql]: Servlet.service() for servlet [Sql] in context with path [] threw exception [Servlet execution threw an exception] with root cause

org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved attributes: 'T1.SP AS SP#6566,'T1.DOWN_BYTESHTTPSUBCR AS DOWN_BYTESHTTPSUBCR#6567, tree:

'Project ['T1.SP AS SP#6566,'T1.DOWN_BYTESHTTPSUBCR AS DOWN_BYTESHTTPSUBCR#6567]

...

...

...

    at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$1.applyOrElse(Analyzer.scala:80)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$1.applyOrElse(Analyzer.scala:78)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:144)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:135)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:78)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:76)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59)
    at scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:51)
    at scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:60)
    at scala.collection.mutable.WrappedArray.foldLeft(WrappedArray.scala:34)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)
    at org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:411)
    at org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:411)
    at org.apache.spark.sql.CacheManager$$anonfun$cacheQuery$1.apply(CacheManager.scala:86)
    at org.apache.spark.sql.CacheManager$class.writeLock(CacheManager.scala:67)
    at org.apache.spark.sql.CacheManager$class.cacheQuery(CacheManager.scala:85)
    at org.apache.spark.sql.SQLContext.cacheQuery(SQLContext.scala:50)
    at org.apache.spark.sql.SchemaRDD.cache(SchemaRDD.scala:490)


I think the solution here is to make the FixedPoint constructor argument
configurable (as the TODO already notes). Is there a plan to do this in the
1.2 release? Otherwise I can take this up myself if you want, since this is
crucial for our release.
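
One possible shape for such a change, sketched under assumptions: the limit
becomes a constructor parameter that defaults to the current value of 100.
`SketchAnalyzer` and `ConfigurableLimitSketch` are hypothetical names, and the
`Strategy` / `FixedPoint` types below are simplified stand-ins written for this
example rather than the real Catalyst classes.

```scala
// Illustrative sketch; not the actual Analyzer or RuleExecutor code.
object ConfigurableLimitSketch {
  // Simplified stand-ins for an execution strategy with an iteration cap.
  sealed trait Strategy { def maxIterations: Int }
  case object Once extends Strategy { val maxIterations = 1 }
  final case class FixedPoint(maxIterations: Int) extends Strategy

  // Hypothetical analyzer-like class: the cap is passed in instead of being
  // hard-coded, with the existing value kept as the default.
  class SketchAnalyzer(maxIterations: Int = 100) {
    require(maxIterations > 0, "maxIterations must be positive")
    val fixedPoint: Strategy = FixedPoint(maxIterations)
  }

  def main(args: Array[String]): Unit = {
    println(new SketchAnalyzer().fixedPoint)    // FixedPoint(100)
    println(new SketchAnalyzer(500).fixedPoint) // FixedPoint(500)
  }
}
```

Keeping 100 as the default would leave existing callers unaffected, while a
context that needs more passes could supply a larger value.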


Thanks

-Nitin

On Wed, Dec 10, 2014 at 1:06 AM, Michael Armbrust <[email protected]>
wrote:

>> val newSchemaRDD = sqlContext.applySchema(existingSchemaRDD,
>> existingSchemaRDD.schema)
>
> This line is throwing away the logical information about existingSchemaRDD
> and thus Spark SQL can't know how to push down projections or predicates
> past this operator.
>
> Can you describe in more detail the problems that you see if you don't do
> this reapplication of the schema?
>



-- 
Regards
Nitin Goyal
