[jira] [Commented] (SPARK-21697) NPE & ExceptionInInitializerError trying to load UTF from HDFS

2018-01-17 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16328928#comment-16328928
 ] 

Steve Loughran commented on SPARK-21697:


No, it's Spark's ability to have hdfs:// URLs on the classpath.

The classpath is being scanned for commons-logging properties, which forces 
in HDFS, which then NPEs because the logging code is called before 
commons-logging is fully set up. It's a kind of recursive class-init problem, 
triggered by the scan for commons-logging.properties.
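
The recursive class-init failure described above can be reproduced in plain 
Java, without Hadoop or commons-logging. This is only a minimal sketch of the 
mechanism: the class names {{LogSetup}} and {{Reader}} are hypothetical 
stand-ins for the logging setup and the HDFS client code, not the real classes 
involved.

```java
class RecursiveInitDemo {
    // Stand-in for logging setup whose initialization has a side effect
    // (the classpath scan) that forces another class to load.
    static class LogSetup {
        static final StringBuilder LOG;
        static {
            // Forces Reader to initialize before LOG has been assigned.
            Reader.touch();
            LOG = new StringBuilder();
        }
    }

    // Stand-in for client code that logs from its own static initializer.
    static class Reader {
        static {
            // LogSetup is mid-initialization on this same thread, so the JVM
            // does not wait for it and LOG still holds its default value, null.
            LogSetup.LOG.append("reader ready");
        }
        static void touch() { }
    }

    // A class whose initializer failed cannot be retried (later uses throw
    // NoClassDefFoundError), so the first outcome is cached.
    private static String cached;

    static synchronized String run() {
        if (cached == null) {
            try {
                LogSetup.LOG.append("first use"); // triggers LogSetup initialization
                cached = "no error";
            } catch (ExceptionInInitializerError e) {
                cached = e.getClass().getSimpleName() + " caused by "
                        + e.getCause().getClass().getSimpleName();
            }
        }
        return cached;
    }

    public static void main(String[] args) {
        // prints "ExceptionInInitializerError caused by NullPointerException"
        System.out.println(run());
    }
}
```

The NPE is raised inside {{Reader}}'s static initializer, gets wrapped in an 
{{ExceptionInInitializerError}}, and surfaces at whichever call first touched 
the logging class, which matches the shape of the reported failure.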

> NPE & ExceptionInInitializerError trying to load UTF from HDFS
> --
>
> Key: SPARK-21697
> URL: https://issues.apache.org/jira/browse/SPARK-21697
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.1.1
> Environment: Spark Client mode, Hadoop 2.6.0
> Reporter: Steve Loughran
> Priority: Minor
>
> Reported on [the 
> PR|https://github.com/apache/spark/pull/17342#issuecomment-321438157] for 
> SPARK-12868: trying to load a UDF from HDFS triggers an 
> {{ExceptionInInitializerError}}, caused by an NPE which should only happen if 
> the commons-logging {{LOG}} field is null.
> Hypothesis: the commons-logging scan for {{commons-logging.properties}} is 
> happening on a classpath that includes the HDFS JAR; this triggers a download 
> of the JAR, which needs to force in commons-logging and, as that's not 
> initialized yet, NPEs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-21697) NPE & ExceptionInInitializerError trying to load UTF from HDFS

2018-01-17 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16328777#comment-16328777
 ] 

Sean Owen commented on SPARK-21697:
---

Isn't this an HDFS problem? What could Spark do about it?




[jira] [Commented] (SPARK-21697) NPE & ExceptionInInitializerError trying to load UTF from HDFS

2018-01-16 Thread h2s (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16328304#comment-16328304
 ] 

h2s commented on SPARK-21697:
-

My environment: spark-2.2.0, hadoop-2.10.

After deleting spark/jars/commons-logging-1.1.3.jar, it works well.




[jira] [Commented] (SPARK-21697) NPE & ExceptionInInitializerError trying to load UTF from HDFS

2017-10-17 Thread Bang Xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207098#comment-16207098
 ] 

Bang Xiao commented on SPARK-21697:
---

In the case described above, I added "spark jars : file:///xxx.jar" to 
conf/spark-defaults.conf; when I then ran "add jar" through the SparkSQL CLI in 
yarn-client mode, the error occurred. 
If I instead added "spark jars : file:///xxx.jar, hdfs://xxx.jar" to 
conf/spark-defaults.conf, I could "add jar hdfs:///.jar" successfully 
through the SparkSQL CLI in yarn-client mode.




[jira] [Commented] (SPARK-21697) NPE & ExceptionInInitializerError trying to load UTF from HDFS

2017-08-11 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16123208#comment-16123208
 ] 

Steve Loughran commented on SPARK-21697:


What would a test to replicate this look like?

# Create a MiniDFS cluster (better: find a test with one already spun up at 
class level).
# Copy a JAR to it (issue: can't rely on the local test suite being in a JAR, 
due to SBT & IDEs doing things differently/faster than Maven).
# Add the JAR to the CP, or create a CP which *only* has the JAR in it.
# Load something from the CP which triggers the download. 

If you can assume that some common library (junit.jar?) is always in a JAR, then 
the JAR could be uploaded by locating its URL, translating that to a local path, 
and then using FileSystem.copyFromLocalFile() to upload it.

Or: create/find a UDF JAR, copy it to the MiniDFSCluster, and start Spark SQL 
with the HDFS URL. This would verify the desired codepath and be the best way to 
make sure the problem has gone away.
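
The "locate its URL, translate to a local path" step can be sketched with the 
JDK alone. This is an illustration under assumptions, not code from Spark or 
Hadoop: {{ClassLocator}} and {{locate}} are hypothetical names, and the upload 
itself (via Hadoop's FileSystem.copyFromLocalFile()) is left out so the sketch 
runs without a cluster.

```java
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.CodeSource;
import java.util.Optional;

class ClassLocator {
    /**
     * Returns the local JAR (or class directory) that the given class was
     * loaded from, or empty if that cannot be determined.
     */
    static Optional<Path> locate(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        // Classes from the bootstrap loader (e.g. java.lang.String) have no code source.
        if (source == null || source.getLocation() == null) {
            return Optional.empty();
        }
        URL location = source.getLocation();
        // Only file: URLs map directly onto a local path.
        if (!"file".equals(location.getProtocol())) {
            return Optional.empty();
        }
        try {
            return Optional.of(Paths.get(location.toURI()));
        } catch (URISyntaxException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Prints wherever this class was loaded from; under SBT or an IDE this
        // is typically a directory rather than a JAR, which is the caveat
        // raised in the list above.
        System.out.println(locate(ClassLocator.class));
        System.out.println(locate(String.class));
    }
}
```

A test would call {{locate}} on a class from the library assumed to be in a 
JAR, check that the result ends in ".jar", and only then hand the path to the 
upload step.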




[jira] [Commented] (SPARK-21697) NPE & ExceptionInInitializerError trying to load UTF from HDFS

2017-08-11 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16123193#comment-16123193
 ] 

Steve Loughran commented on SPARK-21697:


PS: right now, probably doesn't work at all




[jira] [Commented] (SPARK-21697) NPE & ExceptionInInitializerError trying to load UTF from HDFS

2017-08-11 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16123063#comment-16123063
 ] 

Steve Loughran commented on SPARK-21697:


# I don't see anything which can be done in HDFS here; the problem is in the 
libraries below it.
# Recent Hadoop releases have moved to SLF4J, which *may* not have this 
problem. But as log4j looks for log4j.properties, and as dependent libraries 
may use commons-logging, there's no guarantee of that.

What to do?
# Classloader games: bring up the log infra, then add the HDFS JARs to the CP. 
This may require knowing what to force in before anything else, e.g. using the 
new CP, do a stat of every JAR path, then inject them into the CP. Risky, as 
nobody really understands classpaths.
# Force a download of the remote artifact to the local temp FS before 
execution, as YARN does itself. Do it for HDFS, WASB, S3x, ..., all filesystems 
known by Hadoop FS. (Side issue: is there a way to enumerate these? Probably 
not, except by merging the list of service-discovered entries and those with an 
{{fs.SCHEME.impl}} entry.)

I think #2 is potentially the simplest and so the most viable. It's not quite 
as elegant as saying "this is a supported URL you can directly use in the CP", 
but it's the one that is going to avoid these problems.
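
A rough sketch of option #2, copying every non-local JAR to a temp directory 
so that only file: URLs ever reach the classloader. Everything named here is 
hypothetical: {{JarLocalizer}}, the {{RemoteOpener}} hook and the example URI 
are illustrations. In real code the hook would presumably delegate to Hadoop's 
FileSystem; here it is stubbed so the sketch runs on the JDK alone.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class JarLocalizer {
    /** Hypothetical hook that opens a stream for a non-local URI. */
    interface RemoteOpener {
        InputStream open(URI uri) throws IOException;
    }

    /** Returns file: URLs only; remote JARs are first copied into tempDir. */
    static List<URL> localize(List<URI> jars, Path tempDir, RemoteOpener opener) {
        List<URL> local = new ArrayList<>();
        try {
            for (URI jar : jars) {
                if ("file".equals(jar.getScheme())) {
                    local.add(jar.toURL()); // already local, use as-is
                    continue;
                }
                // Prefix with an index so equal file names from different
                // locations do not overwrite each other.
                String name = Paths.get(jar.getPath()).getFileName().toString();
                Path target = tempDir.resolve(local.size() + "-" + name);
                try (InputStream in = opener.open(jar)) {
                    Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
                }
                local.add(target.toUri().toURL());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return local;
    }

    /** Runs the sketch with a stubbed 3-byte "remote" JAR. */
    static String demo() {
        try {
            Path tmp = Files.createTempDirectory("localized-jars");
            List<URI> jars = Arrays.asList(
                    URI.create("hdfs://example-nn:8020/lib/demo-udf.jar")); // made-up URI
            List<URL> urls = localize(jars, tmp,
                    uri -> new ByteArrayInputStream(new byte[] {1, 2, 3}));
            URL first = urls.get(0);
            return first.getProtocol() + ":" + Files.size(Paths.get(first.toURI()));
        } catch (IOException | URISyntaxException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "file:3"
    }
}
```

The resulting list can be handed to a URLClassLoader. Because no hdfs:// URL 
is on the classpath, a resource scan such as the one for 
commons-logging.properties has nothing remote to pull in.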







[jira] [Commented] (SPARK-21697) NPE & ExceptionInInitializerError trying to load UTF from HDFS

2017-08-10 Thread Weiqing Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122234#comment-16122234
 ] 

Weiqing Yang commented on SPARK-21697:
--

Thanks for filing this issue!




[jira] [Commented] (SPARK-21697) NPE & ExceptionInInitializerError trying to load UTF from HDFS

2017-08-10 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122195#comment-16122195
 ] 

Steve Loughran commented on SPARK-21697:


{code}
Have u tried it in yarn-client mode? i add this path in v2.1.1 + Hadoop 2.6.0, 
when i run "add jar" through SparkSQL CLI , it comes out this error:
ERROR thriftserver.SparkSQLDriver: Failed in [add jar hdfs://SunshineNameNode3:8020/lib/clouddata-common-lib/chardet-0.0.1.jar]
java.lang.ExceptionInInitializerError
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:662)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:889)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:947)
at java.io.DataInputStream.read(DataInputStream.java:100)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:85)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:59)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:119)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:369)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:341)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:292)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:2107)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:2076)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:2052)
at org.apache.hadoop.hive.ql.session.SessionState.downloadResource(SessionState.java:1274)
at org.apache.hadoop.hive.ql.session.SessionState.resolveAndDownload(SessionState.java:1242)
at org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1163)
at org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1149)
at org.apache.hadoop.hive.ql.processors.AddResourceProcessor.run(AddResourceProcessor.java:67)
at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$runHive$1.apply(HiveClientImpl.scala:632)
at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$runHive$1.apply(HiveClientImpl.scala:601)
at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:278)
at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:225)
at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:224)
at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:267)
at org.apache.spark.sql.hive.client.HiveClientImpl.runHive(HiveClientImpl.scala:601)
at org.apache.spark.sql.hive.client.HiveClientImpl.runSqlHive(HiveClientImpl.scala:591)
at org.apache.spark.sql.hive.client.HiveClientImpl.addJar(HiveClientImpl.scala:738)
at org.apache.spark.sql.hive.HiveSessionState.addJar(HiveSessionState.scala:105)
at org.apache.spark.sql.execution.command.AddJarCommand.run(resources.scala:40)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:185)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:592)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:699)
at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:62)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:335)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:247)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:743)
at