Hi Marcelo, yes, this is a known Spark SQL bug, and we have PRs to fix it
(#2887 and #2967). They haven't been merged yet because the newly merged
Hive 0.13.1 support causes some conflicts. Thanks for reporting this :)
On Tue, Oct 28, 2014 at 6:41 AM, Marcelo Vanzin van...@cloudera.com wrote:
Well, looks like a huge coincidence, but this was just sent to github:
https://github.com/apache/spark/pull/2967
On Mon, Oct 27, 2014 at 3:25 PM, Marcelo Vanzin van...@cloudera.com wrote:
Hey guys,
I've been using the HiveFromSpark example to test some changes and I
ran into an issue that manifests itself as an NPE inside Hive code
because some configuration object is null.
Tracing back, it seems that `sessionState` being a lazy val in
HiveContext is causing it. That variable is only evaluated in [1],
while the call in [2] causes a Driver to be initialized by [3], which
then tries to use the thread-local session state ([4]), which hasn't
been set yet.
This could be seen as a Hive bug ([3] should probably be calling the
constructor that takes a conf object), but is there a reason why these
fields are lazy in HiveContext? I explicitly called
SessionState.setCurrentSessionState() before the
CommandProcessorFactory call and that seems to fix the issue too.
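To make the ordering problem concrete, here is a minimal Scala sketch. It
does not use the real Spark or Hive classes: FakeSessionState, FakeDriver
and FakeContext are hypothetical stand-ins that only mimic the pattern
described above (a lazy val whose initializer registers a thread-local, and
a driver whose constructor reads that thread-local instead of taking a conf).

```scala
// Hypothetical stand-ins; these are NOT the real Spark or Hive classes.
object FakeSessionState {
  // Holds null for a thread until set() is called on that thread.
  private val current = new ThreadLocal[String]
  def set(conf: String): Unit = current.set(conf)
  def get: String = current.get
  def clear(): Unit = current.remove()
}

class FakeDriver {
  // Mimics a no-arg constructor that reads the thread-local state.
  private val conf: String = FakeSessionState.get
  // Throws NullPointerException if the thread-local was never set.
  def run(cmd: String): Int = conf.length + cmd.length
}

class FakeContext {
  // The thread-local is registered only when this lazy val is first read.
  lazy val sessionState: String = {
    val s = "conf"
    FakeSessionState.set(s)
    s
  }

  // Builds the driver without ever touching sessionState.
  def runUnsafe(cmd: String): Int = new FakeDriver().run(cmd)

  // Forces the lazy val first, so the thread-local is set before the driver reads it.
  def runSafe(cmd: String): Int = {
    val forced = sessionState
    new FakeDriver().run(cmd)
  }
}

object LazyValDemo {
  def main(args: Array[String]): Unit = {
    FakeSessionState.clear()
    val threwNpe =
      try { new FakeContext().runUnsafe("show tables"); false }
      catch { case _: NullPointerException => true }
    println(s"unsafe path threw NPE: $threwNpe")

    FakeSessionState.clear()
    println(s"safe path result: ${new FakeContext().runSafe("show tables")}")
  }
}
```

In this sketch, runSafe plays the same role as setting the session state
explicitly before the CommandProcessorFactory call: either way the
thread-local is populated before the driver is constructed. Making the
field a plain val would have a similar effect here, though whether that is
appropriate for HiveContext is exactly the open question.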
[1]
https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala#L305
[2]
https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala#L289
[3]
https://github.com/apache/hive/blob/9c63b2fdc35387d735f4c9d08761203711d4974b/ql/src/java/org/apache/hadoop/hive/ql/processors/CommandProcessorFactory.java#L104
[4]
https://github.com/apache/hive/blob/9c63b2fdc35387d735f4c9d08761203711d4974b/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L286
--
Marcelo