[jira] [Commented] (SPARK-23338) Spark unable to run on HDP deployed Azure Blob File System

2018-02-08 Thread Marco Gaido (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-23338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16356771#comment-16356771 ]

Marco Gaido commented on SPARK-23338:
-------------------------------------

[~Subham] questions should be sent to the user mailing list; JIRA is for 
reporting bugs and feature requests. Anyway, your problem seems related to this: 
https://kitmenke.com/blog/2017/08/05/classcastexception-submitting-spark-apps-to-hdinsight/.
Hope this helps.
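
For anyone hitting the same wall: a quick way to tell whether the failure is 
confined to the Hive catalog layer (HiveSessionState/HiveExternalCatalog) 
rather than WASB itself is to start the shell with the in-memory catalog and 
try a read. A minimal sketch, assuming Spark 2.2 and reusing the 
CONTAINER/STORAGE_ACCOUNT_NAME placeholders from the description; 
spark.sql.catalogImplementation is an internal setting, so treat this purely 
as a diagnostic, and /some-path is a placeholder:

  $ spark-shell --conf spark.sql.catalogImplementation=in-memory

  scala> // If this read works, the WASB wiring is fine and the problem is
  scala> // in the Hive catalog initialization, not in the blob store.
  scala> spark.read.textFile("wasb://CONTAINER@STORAGE_ACCOUNT_NAME.blob.core.windows.net/some-path").show(5)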

> Spark unable to run on HDP deployed Azure Blob File System
> ----------------------------------------------------------
>
> Key: SPARK-23338
> URL: https://issues.apache.org/jira/browse/SPARK-23338
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core, Spark Shell
>Affects Versions: 2.2.0
> Environment: HDP 2.6.0.3
> Spark2 2.2.0
> HDFS 2.7.3
> CentOS 7.1
>Reporter: Subhankar
>Priority: Major
>  Labels: Azure, BLOB, HDP, azureblob, hadoop, hive, spark
>
> Hello,
> It is impossible to run Spark on the BLOB storage file system deployed on HDP.
> I am unable to run Spark as it is giving errors related to HiveSessionState, 
> HiveExternalCatalog and various Azure File storage exceptions.
> I would appreciate any suggestion to address this. Or is the exercise futile 
> because Spark simply cannot run on BLOB storage after all?
> Thanks in advance.
>  
> Detailed Description:
>  
> *We are unable to access spark/spark2 when we change the file system 
> storage from HDFS to WASB. We are using the HDP 2.6 platform and running 
> Hadoop 2.7.3. All other services are working fine.*
> I have set the following configurations:
> *HDFS*:
> core-site.xml:
> fs.defaultFS = wasb://CONTAINER@STORAGE_ACCOUNT_NAME.blob.core.windows.net
> fs.AbstractFileSystem.wasb.impl = org.apache.hadoop.fs.azure.Wasb
> fs.AbstractFileSystem.wasbs.impl = org.apache.hadoop.fs.azure.Wasbs
> fs.azure.selfthrottling.read.factor = 1.0
> fs.azure.selfthrottling.write.factor = 1.0
> fs.azure.account.key.STORAGE_ACCOUNT_NAME.blob.core.windows.net = KEY
> spark.hadoop.fs.azure.account.key.STORAGE_ACCOUNT_NAME.blob.core.windows.net = KEY
> *SPARK2:*
> spark.eventLog.dir = wasb://CONTAINER@STORAGE_ACCOUNT_NAME.blob.core.windows.net/spark2-history/
> spark.history.fs.logDirectory = wasb://CONTAINER@STORAGE_ACCOUNT_NAME.blob.core.windows.net/spark2-history/
> Despite multiple attempts and alternative configurations, the *spark-shell* 
> command keeps failing as follows:
> $ spark-shell
> Setting default log level to "WARN".
> To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use 
> setLogLevel(newLevel).
> java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState':
> at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:983)
> at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:110)
> at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:109)
> at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:878)
> at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:878)
> at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
> at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
> at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
> at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
> at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
> at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:878)
> at org.apache.spark.repl.Main$.createSparkSession(Main.scala:96)
> ... 47 elided
> Caused by: java.lang.reflect.InvocationTargetException: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog':
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:980)
> ... 58 more
> Caused by: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog':
> at
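
The configuration quoted above can also be exercised without Spark at all. 
Below is a minimal Scala sketch (an illustration, not from the reporter) that 
talks to the blob store through the Hadoop client alone, assuming hadoop-azure 
and azure-storage are on the classpath and reusing the 
CONTAINER/STORAGE_ACCOUNT_NAME/KEY placeholders from the description; if this 
fails too, the problem is in the WASB/Azure layer rather than in Spark or Hive:

  import java.net.URI
  import org.apache.hadoop.conf.Configuration
  import org.apache.hadoop.fs.{FileSystem, Path}

  // Configure only the account key, exactly as in core-site.xml above.
  val conf = new Configuration()
  conf.set("fs.azure.account.key.STORAGE_ACCOUNT_NAME.blob.core.windows.net", "KEY")

  // Bind a FileSystem to the wasb:// container URI, bypassing Spark and Hive.
  val fs = FileSystem.get(
    new URI("wasb://CONTAINER@STORAGE_ACCOUNT_NAME.blob.core.windows.net/"), conf)

  // Listing the container root proves authentication and connectivity;
  // any exception raised here comes from the Azure storage stack.
  fs.listStatus(new Path("/")).foreach(status => println(status.getPath))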

[jira] [Commented] (SPARK-23338) Spark unable to run on HDP deployed Azure Blob File System

2018-02-05 Thread Subhankar (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-23338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352421#comment-16352421 ]

Subhankar commented on SPARK-23338:
-----------------------------------

Thanks for your response, Sean. Could you please suggest a workaround for this? 
Should we raise a concern with Azure regarding this?


[jira] [Commented] (SPARK-23338) Spark unable to run on HDP deployed Azure Blob File System

2018-02-05 Thread Sean Owen (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-23338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352410#comment-16352410 ]

Sean Owen commented on SPARK-23338:
------------------------------------

This all shows an error from the Azure APIs, and ultimately a failure from the 
Azure blob store. This doesn't sound Spark-related.
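
One way to surface that Azure-side error directly is to unwrap the reflection 
layers Spark puts around it. A hedged sketch for a standalone app 
(spark-shell dies before user code runs, so this assumes spark-submit; 
rootCause is a hypothetical helper, not a Spark API):

  import org.apache.spark.sql.SparkSession

  // Walk the cause chain to the innermost exception; the interesting
  // Azure storage error is buried under InvocationTargetException wrappers.
  def rootCause(t: Throwable): Throwable =
    if (t.getCause == null || (t.getCause eq t)) t else rootCause(t.getCause)

  try {
    SparkSession.builder().appName("hive-on-wasb-check").enableHiveSupport().getOrCreate()
  } catch {
    case e: Throwable => println(s"Root cause: ${rootCause(e)}")
  }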
