[ 
https://issues.apache.org/jira/browse/SPARK-22651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-22651.
----------------------------------
       Resolution: Fixed
    Fix Version/s: 2.3.0

Issue resolved by pull request 19845
[https://github.com/apache/spark/pull/19845]

> Calling ImageSchema.readImages initiates multiple Hive clients
> -------------------------------------------------------------
>
>                 Key: SPARK-22651
>                 URL: https://issues.apache.org/jira/browse/SPARK-22651
>             Project: Spark
>          Issue Type: Bug
>          Components: ML, PySpark
>    Affects Versions: 2.3.0
>            Reporter: Hyukjin Kwon
>             Fix For: 2.3.0
>
>
> While playing with images, I realised that calling {{ImageSchema.readImages}} 
> multiple times seems to attempt to create multiple Hive clients.
> {code}
> from pyspark.ml.image import ImageSchema
> data_path = 'data/mllib/images/kittens'
> _ = ImageSchema.readImages(data_path, recursive=True, dropImageFailures=True).collect()
> _ = ImageSchema.readImages(data_path, recursive=True, dropImageFailures=True).collect()
> {code}
> {code}
> ...
> org.datanucleus.exceptions.NucleusDataStoreException: Unable to open a test connection to the given database. JDBC url = jdbc:derby:;databaseName=metastore_db;create=true, username = APP. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------
> java.sql.SQLException: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@742f639f, see the next exception for details.
> ...
>       at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source)
> ...
>       at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5762)
> ...
>       at org.apache.spark.sql.hive.client.HiveClientImpl.newState(HiveClientImpl.scala:180)
> ...
>       at org.apache.spark.sql.SparkSession.createDataFrame(SparkSession.scala:348)
>       at org.apache.spark.ml.image.ImageSchema$$anonfun$readImages$2$$anonfun$apply$1.apply(ImageSchema.scala:253)
> ...
> Caused by: ERROR XJ040: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@742f639f, see the next exception for details.
>       at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
>       at org.apache.derby.impl.jdbc.SQLExceptionFactory.wrapArgsForTransportAcrossDRDA(Unknown Source)
>       ... 121 more
> Caused by: ERROR XSDB6: Another instance of Derby may have already booted the database /.../spark/metastore_db.
> ...
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/.../spark/python/pyspark/ml/image.py", line 190, in readImages
>     dropImageFailures, float(sampleRatio), seed)
>   File "/.../spark/python/lib/py4j-0.10.6-src.zip/py4j/java_gateway.py", line 1160, in __call__
>   File "/.../spark/python/pyspark/sql/utils.py", line 69, in deco
>     raise AnalysisException(s.split(': ', 1)[1], stackTrace)
> pyspark.sql.utils.AnalysisException: u'java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient;'
> {code}
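
The trace above ends in Derby's XSDB6 error, which suggests that a second
client tried to boot the same embedded metastore_db. The sketch below is a toy
model of that failure mode in plain Python, not Spark or Derby code: the names
EmbeddedStore, Session, get_or_create, read_with_new_session and
read_with_shared_session are hypothetical and exist only for illustration. It
contrasts building a fresh session on every call with reusing one shared
session. This message does not say how pull request 19845 fixes the issue, so
treat the shared-session variant as one plausible approach under that
assumption rather than as the actual patch.

```python
# Toy model only: these classes are hypothetical stand-ins, not Spark or Derby APIs.


class EmbeddedStore:
    """Stand-in for an embedded metastore that only one client may boot."""

    _booted_paths = set()

    def __init__(self, path):
        if path in EmbeddedStore._booted_paths:
            raise RuntimeError(
                "Another instance may have already booted the database " + path
            )
        EmbeddedStore._booted_paths.add(path)
        self.path = path


class Session:
    """Stand-in for a session that lazily opens its own store client."""

    _active = None

    def __init__(self, store_path="metastore_db"):
        self.store_path = store_path
        self._store = None

    @classmethod
    def get_or_create(cls, store_path="metastore_db"):
        # Reuse the active session, so at most one store client is ever opened.
        if cls._active is None:
            cls._active = cls(store_path)
        return cls._active

    def create_frame(self, rows):
        # The first use of a session boots its store client.
        if self._store is None:
            self._store = EmbeddedStore(self.store_path)
        return list(rows)


def read_with_new_session(rows):
    # A fresh session per call means a fresh store client per call.
    return Session().create_frame(rows)


def read_with_shared_session(rows):
    # Every call goes through the same session and the same store client.
    return Session.get_or_create().create_frame(rows)
```

In this model, a second call to read_with_new_session raises because the store
has already been booted, while read_with_shared_session can be called
repeatedly without error.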



