[jira] [Commented] (SPARK-23338) Spark unable to run on HDP deployed Azure Blob File System
[ https://issues.apache.org/jira/browse/SPARK-23338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352421#comment-16352421 ]

Subhankar commented on SPARK-23338:
-----------------------------------

Thanks for your response, Sean. Could you please suggest a workaround for this? Should we raise a concern with Azure regarding this?

> Spark unable to run on HDP deployed Azure Blob File System
> ----------------------------------------------------------
>
>                 Key: SPARK-23338
>                 URL: https://issues.apache.org/jira/browse/SPARK-23338
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, Spark Shell
>    Affects Versions: 2.2.0
>         Environment: HDP 2.6.0.3
> Spark2 2.2.0
> HDFS 2.7.3
> CentOS 7.1
>            Reporter: Subhankar
>            Priority: Major
>              Labels: Azure, BLOB, HDP, azureblob, hadoop, hive, spark
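One way to narrow the problem down (a diagnostic sketch, not a confirmed fix) is to pass the WASB settings directly on the spark-shell command line via the standard `--conf` flag, so the result does not depend on which core-site.xml the shell picks up. `STORAGE_ACCOUNT_NAME`, `CONTAINER`, and `KEY` are the placeholders used in the report:

```shell
# Diagnostic sketch: supply the WASB account key and event-log paths
# explicitly, bypassing any cluster-managed configuration files.
# STORAGE_ACCOUNT_NAME, CONTAINER and KEY are placeholders from the report.
spark-shell \
  --conf spark.hadoop.fs.azure.account.key.STORAGE_ACCOUNT_NAME.blob.core.windows.net=KEY \
  --conf spark.eventLog.dir=wasb://CONTAINER@STORAGE_ACCOUNT_NAME.blob.core.windows.net/spark2-history/ \
  --conf spark.history.fs.logDirectory=wasb://CONTAINER@STORAGE_ACCOUNT_NAME.blob.core.windows.net/spark2-history/
```

If the same HiveExternalCatalog failure appears with the key passed explicitly, key propagation can be ruled out as the cause.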
[jira] [Created] (SPARK-23338) Spark unable to run on HDP deployed Azure Blob File System
Subhankar created SPARK-23338:
---------------------------------

             Summary: Spark unable to run on HDP deployed Azure Blob File System
                 Key: SPARK-23338
                 URL: https://issues.apache.org/jira/browse/SPARK-23338
             Project: Spark
          Issue Type: Bug
          Components: Spark Core, Spark Shell
    Affects Versions: 2.2.0
         Environment: HDP 2.6.0.3
Spark2 2.2.0
HDFS 2.7.3
CentOS 7.1
            Reporter: Subhankar


Hello,

It is impossible to run Spark on the BLOB storage file system deployed on HDP. I am unable to run Spark as it is giving errors related to HiveSessionState, HiveExternalCatalog, and various Azure File storage exceptions. I request you to kindly help in case you have a suggestion to address this. Or is it that my exercise is futile and Spark is not configured to run on BLOB storage after all. Thanks in advance.

Detailed Description:

h5. *We are unable to access spark/spark2 when we change the file system storage from HDFS to WASB. We are using the HDP 2.6 platform and running Hadoop 2.7.3. All other services are working fine.*

I have set the following configurations:

*HDFS* (core-site):
fs.defaultFS = wasb://CONTAINER@STORAGE_ACCOUNT_NAME.blob.core.windows.net
fs.AbstractFileSystem.wasb.impl = org.apache.hadoop.fs.azure.Wasb
fs.AbstractFileSystem.wasbs.impl = org.apache.hadoop.fs.azure.Wasbs
fs.azure.selfthrottling.read.factor = 1.0
fs.azure.selfthrottling.write.factor = 1.0
fs.azure.account.key.STORAGE_ACCOUNT_NAME.blob.core.windows.net = KEY
spark.hadoop.fs.azure.account.key.STORAGE_ACCOUNT_NAME.blob.core.windows.net = KEY

*SPARK2:*
spark.eventLog.dir = wasb://CONTAINER@STORAGE_ACCOUNT_NAME.blob.core.windows.net/spark2-history/
spark.history.fs.logDirectory = wasb://CONTAINER@STORAGE_ACCOUNT_NAME.blob.core.windows.net/spark2-history/

In spite of trying multiple times and irrespective of alternative configurations, the *spark-shell* command yields the results below:

$ spark-shell
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState':
  at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:983)
  at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:110)
  at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:109)
  at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:878)
  at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:878)
  at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
  at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
  at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
  at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
  at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
  at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:878)
  at org.apache.spark.repl.Main$.createSparkSession(Main.scala:96)
  ... 47 elided
Caused by: java.lang.reflect.InvocationTargetException: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog':
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
  at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:980)
  ... 58 more
Caused by: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog':
  at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:176)
  at org.apache.spark.sql.internal.SharedState.<init>(SharedState.scala:86)
  at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
  at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.sql.SparkSession.sharedState$lzycompute(SparkSession.scala:101)
  at org.apache.spark.sql.SparkSession.sharedState(SparkSession.scala:100)
  at org.apache.spark.sql.internal.SessionState.<init>(SessionState
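
For anyone reproducing the setup, the flat HDFS-side property list in the description corresponds to core-site.xml entries like the following (a sketch; `CONTAINER`, `STORAGE_ACCOUNT_NAME`, and `KEY` are the report's placeholders, and the `spark.hadoop.*` key would normally live in spark-defaults.conf rather than core-site.xml):

```xml
<!-- Sketch of the reported WASB settings as core-site.xml entries.
     CONTAINER, STORAGE_ACCOUNT_NAME and KEY are placeholders. -->
<property>
  <name>fs.defaultFS</name>
  <value>wasb://CONTAINER@STORAGE_ACCOUNT_NAME.blob.core.windows.net</value>
</property>
<property>
  <name>fs.AbstractFileSystem.wasb.impl</name>
  <value>org.apache.hadoop.fs.azure.Wasb</value>
</property>
<property>
  <name>fs.AbstractFileSystem.wasbs.impl</name>
  <value>org.apache.hadoop.fs.azure.Wasbs</value>
</property>
<property>
  <name>fs.azure.account.key.STORAGE_ACCOUNT_NAME.blob.core.windows.net</name>
  <value>KEY</value>
</property>
```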