Hi Curtis,

I believe that on Windows the following command needs to be executed (you will need winutils installed):
D:\winutils\bin\winutils.exe chmod 777 D:\tmp\hive

On 6 June 2017 at 09:45, Curtis Burkhalter <curtisburkhal...@gmail.com> wrote:

> Hello all,
>
> I'm new to Spark and I'm trying to interact with it using Pyspark. I'm
> using the prebuilt version of Spark v. 2.1.1, and when I go to the command
> line and use the command 'bin\pyspark' I have initialization problems and
> get the following message:
>
> C:\spark\spark-2.1.1-bin-hadoop2.7> bin\pyspark
> Python 3.6.0 |Anaconda 4.3.1 (64-bit)| (default, Dec 23 2016, 11:57:41)
> [MSC v.1900 64 bit (AMD64)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
> Setting default log level to "WARN".
> To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
> 17/06/06 10:30:14 WARN NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> 17/06/06 10:30:21 WARN ObjectStore: Version information not found in
> metastore. hive.metastore.schema.verification is not enabled so recording
> the schema version 1.2.0
> 17/06/06 10:30:21 WARN ObjectStore: Failed to get database default,
> returning NoSuchObjectException
> Traceback (most recent call last):
>   File "C:\spark\spark-2.1.1-bin-hadoop2.7\python\pyspark\sql\utils.py", line 63, in deco
>     return f(*a, **kw)
>   File "C:\spark\spark-2.1.1-bin-hadoop2.7\python\lib\py4j-0.10.4-src.zip\py4j\protocol.py", line 319, in get_return_value
> py4j.protocol.Py4JJavaError: An error occurred while calling o22.sessionState.
> : java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState':
>         at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:981)
>         at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:110)
>         at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:109)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
>         at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
>         at py4j.Gateway.invoke(Gateway.java:280)
>         at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
>         at py4j.commands.CallCommand.execute(CallCommand.java:79)
>         at py4j.GatewayConnection.run(GatewayConnection.java:214)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.reflect.InvocationTargetException
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>         at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:978)
>         ... 13 more
> Caused by: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog':
>         at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:169)
>         at org.apache.spark.sql.internal.SharedState.<init>(SharedState.scala:86)
>         at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
>         at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
>         at scala.Option.getOrElse(Option.scala:121)
>         at org.apache.spark.sql.SparkSession.sharedState$lzycompute(SparkSession.scala:101)
>         at org.apache.spark.sql.SparkSession.sharedState(SparkSession.scala:100)
>         at org.apache.spark.sql.internal.SessionState.<init>(SessionState.scala:157)
>         at org.apache.spark.sql.hive.HiveSessionState.<init>(HiveSessionState.scala:32)
>         ... 18 more
> Caused by: java.lang.reflect.InvocationTargetException
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>         at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:166)
>         ... 26 more
> Caused by: java.lang.reflect.InvocationTargetException
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>         at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
>         at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:358)
>         at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:262)
>         at org.apache.spark.sql.hive.HiveExternalCatalog.<init>(HiveExternalCatalog.scala:66)
>         ... 31 more
> Caused by: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rw-rw-rw-
>         at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
>         at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:188)
>         ... 39 more
> Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rw-rw-rw-
>         at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:612)
>         at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:554)
>         at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508)
>         ... 40 more
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "C:\spark\spark-2.1.1-bin-hadoop2.7\bin\..\python\pyspark\shell.py", line 43, in <module>
>     spark = SparkSession.builder\
>   File "C:\spark\spark-2.1.1-bin-hadoop2.7\python\pyspark\sql\session.py", line 179, in getOrCreate
>     session._jsparkSession.sessionState().conf().setConfString(key, value)
>   File "C:\spark\spark-2.1.1-bin-hadoop2.7\python\lib\py4j-0.10.4-src.zip\py4j\java_gateway.py", line 1133, in __call__
>   File "C:\spark\spark-2.1.1-bin-hadoop2.7\python\pyspark\sql\utils.py", line 79, in deco
>     raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)
> pyspark.sql.utils.IllegalArgumentException: "Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState':"
> >>>
>
> Any help with what might be going wrong here would be greatly appreciated.
>
> Best
> --
> Curtis Burkhalter
> Postdoctoral Research Associate, National Audubon Society
>
> https://sites.google.com/site/curtisburkhalter/
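For context on why the chmod matters: the root cause buried in the trace is that Hive's scratch dir has mode rw-rw-rw- (no execute/search bit), while Hive's SessionState needs it to be a fully writable directory, i.e. rwxrwxrwx (0777). Here is a small, self-contained Python sketch (illustrative only, using a throwaway temp directory rather than the real \tmp\hive; note that on Windows os.chmod is limited, which is exactly why winutils.exe chmod is the right tool there):

```python
import os
import stat
import tempfile

# Create a throwaway directory to stand in for the Hive scratch dir.
scratch = tempfile.mkdtemp()

# Give it the mode Hive expects (what "winutils.exe chmod 777" does on Windows).
os.chmod(scratch, 0o777)

# Inspect the permission bits, the same information Hive checks at startup.
mode = os.stat(scratch).st_mode
print(oct(stat.S_IMODE(mode)))   # 0o777
print(stat.filemode(mode))       # drwxrwxrwx -- vs. the failing rw-rw-rw-
```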