I think the problem is that in yarn-cluster mode the Spark driver runs
inside the application master, so the Hive conf on the client is not
visible to the driver. Can you try setting those confs with
hiveContext.set(...)? Or, you could copy hive-site.xml to spark/conf on
the node running the application master.
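
For example, something along these lines (a minimal sketch; the metastore
URI is a placeholder for your environment, and I am not certain that a SET
issued after the HiveContext is created will re-point an already
initialized metastore connection, so shipping hive-site.xml may be the
more reliable route):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val sc = new SparkContext(new SparkConf().setAppName("HiveConfTest"))
val hiveContext = new HiveContext(sc)

// Placeholder host and port: point the Hive client at the remote metastore.
hiveContext.hql("SET hive.metastore.uris=thrift://your-metastore-host:9083")
hiveContext.hql("use ttt")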


On Tue, Aug 12, 2014 at 8:38 PM, Jenny Zhao <linlin200...@gmail.com> wrote:

>
> Hi Yin,
>
> hive-site.xml was copied to spark/conf and is the same as the one under
> $HIVE_HOME/conf.
>
> Through the Hive CLI, I don't see any problem. But for Spark in
> yarn-cluster mode, I am not able to switch to a database other than the
> default one; in yarn-client mode, it works fine.
>
> Thanks!
>
> Jenny
>
>
> On Tue, Aug 12, 2014 at 12:53 PM, Yin Huai <huaiyin....@gmail.com> wrote:
>
>> Hi Jenny,
>>
>> Have you copied hive-site.xml to the spark/conf directory? If not, can
>> you put it in conf/ and try again?
>>
>> Thanks,
>>
>> Yin
>>
>>
>> On Mon, Aug 11, 2014 at 8:57 PM, Jenny Zhao <linlin200...@gmail.com>
>> wrote:
>>
>>>
>>> Thanks Yin!
>>>
>>> Here is my hive-site.xml, which I copied from $HIVE_HOME/conf. I didn't
>>> experience any problem connecting to the metastore through Hive, which
>>> uses DB2 as the metastore database.
>>>
>>> <?xml version="1.0"?>
>>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>> <!--
>>>    Licensed to the Apache Software Foundation (ASF) under one or more
>>>    contributor license agreements.  See the NOTICE file distributed with
>>>    this work for additional information regarding copyright ownership.
>>>    The ASF licenses this file to You under the Apache License, Version 2.0
>>>    (the "License"); you may not use this file except in compliance with
>>>    the License.  You may obtain a copy of the License at
>>>
>>>        http://www.apache.org/licenses/LICENSE-2.0
>>>
>>>    Unless required by applicable law or agreed to in writing, software
>>>    distributed under the License is distributed on an "AS IS" BASIS,
>>>    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>>>    See the License for the specific language governing permissions and
>>>    limitations under the License.
>>> -->
>>> <configuration>
>>>  <property>
>>>   <name>hive.hwi.listen.port</name>
>>>   <value>9999</value>
>>>  </property>
>>>  <property>
>>>   <name>hive.querylog.location</name>
>>>   <value>/var/ibm/biginsights/hive/query/${user.name}</value>
>>>  </property>
>>>  <property>
>>>   <name>hive.metastore.warehouse.dir</name>
>>>   <value>/biginsights/hive/warehouse</value>
>>>  </property>
>>>  <property>
>>>   <name>hive.hwi.war.file</name>
>>>   <value>lib/hive-hwi-0.12.0.war</value>
>>>  </property>
>>>  <property>
>>>   <name>hive.metastore.metrics.enabled</name>
>>>   <value>true</value>
>>>  </property>
>>>  <property>
>>>   <name>javax.jdo.option.ConnectionURL</name>
>>>   <value>jdbc:db2://hdtest022.svl.ibm.com:50001/BIDB</value>
>>>  </property>
>>>  <property>
>>>   <name>javax.jdo.option.ConnectionDriverName</name>
>>>   <value>com.ibm.db2.jcc.DB2Driver</value>
>>>  </property>
>>>  <property>
>>>   <name>hive.stats.autogather</name>
>>>   <value>false</value>
>>>  </property>
>>>  <property>
>>>   <name>javax.jdo.mapping.Schema</name>
>>>   <value>HIVE</value>
>>>  </property>
>>>  <property>
>>>   <name>javax.jdo.option.ConnectionUserName</name>
>>>   <value>catalog</value>
>>>  </property>
>>>  <property>
>>>   <name>javax.jdo.option.ConnectionPassword</name>
>>>   <value>V2pJNWMxbFlVbWhaZHowOQ==</value>
>>>  </property>
>>>  <property>
>>>   <name>hive.metastore.password.encrypt</name>
>>>   <value>true</value>
>>>  </property>
>>>  <property>
>>>   <name>org.jpox.autoCreateSchema</name>
>>>   <value>true</value>
>>>  </property>
>>>  <property>
>>>   <name>hive.server2.thrift.min.worker.threads</name>
>>>   <value>5</value>
>>>  </property>
>>>  <property>
>>>   <name>hive.server2.thrift.max.worker.threads</name>
>>>   <value>100</value>
>>>  </property>
>>>  <property>
>>>   <name>hive.server2.thrift.port</name>
>>>   <value>10000</value>
>>>  </property>
>>>  <property>
>>>   <name>hive.server2.thrift.bind.host</name>
>>>   <value>hdtest022.svl.ibm.com</value>
>>>  </property>
>>>  <property>
>>>   <name>hive.server2.authentication</name>
>>>   <value>CUSTOM</value>
>>>  </property>
>>>  <property>
>>>   <name>hive.server2.custom.authentication.class</name>
>>>   <value>org.apache.hive.service.auth.WebConsoleAuthenticationProviderImpl</value>
>>>  </property>
>>>  <property>
>>>   <name>hive.server2.enable.impersonation</name>
>>>   <value>true</value>
>>>  </property>
>>>  <property>
>>>   <name>hive.security.webconsole.url</name>
>>>   <value>http://hdtest022.svl.ibm.com:8080</value>
>>>  </property>
>>>  <property>
>>>   <name>hive.security.authorization.enabled</name>
>>>   <value>true</value>
>>>  </property>
>>>  <property>
>>>   <name>hive.security.authorization.createtable.owner.grants</name>
>>>   <value>ALL</value>
>>>  </property>
>>> </configuration>
>>>
>>>
>>>
>>> On Mon, Aug 11, 2014 at 4:29 PM, Yin Huai <huaiyin....@gmail.com> wrote:
>>>
>>>> Hi Jenny,
>>>>
>>>> How's your metastore configured for both Hive and Spark SQL? Which
>>>> metastore mode are you using (based on
>>>> https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin
>>>> )?
>>>>
>>>> Thanks,
>>>>
>>>> Yin
>>>>
>>>>
>>>> On Mon, Aug 11, 2014 at 6:15 PM, Jenny Zhao <linlin200...@gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> You can reproduce this issue with the following steps (assuming you
>>>>> have a YARN cluster + Hive 0.12):
>>>>>
>>>>> 1) using hive shell, create a database, e.g: create database ttt
>>>>>
>>>>> 2) write a simple spark sql program
>>>>>
>>>>> import org.apache.spark.{SparkConf, SparkContext}
>>>>> import org.apache.spark.sql._
>>>>> import org.apache.spark.sql.hive.HiveContext
>>>>>
>>>>> object HiveSpark {
>>>>>   case class Record(key: Int, value: String)
>>>>>
>>>>>   def main(args: Array[String]) {
>>>>>     val sparkConf = new SparkConf().setAppName("HiveSpark")
>>>>>     val sc = new SparkContext(sparkConf)
>>>>>
>>>>>     // A HiveContext creates an instance of the Hive metastore in process.
>>>>>     val hiveContext = new HiveContext(sc)
>>>>>     import hiveContext._
>>>>>
>>>>>     hql("use ttt")
>>>>>     hql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
>>>>>     hql("LOAD DATA INPATH '/user/biadmin/kv1.txt' INTO TABLE src")
>>>>>
>>>>>     // Queries are expressed in HiveQL
>>>>>     println("Result of 'SELECT *': ")
>>>>>     hql("SELECT * FROM src").collect.foreach(println)
>>>>>     sc.stop()
>>>>>   }
>>>>> }
>>>>> 3) run it in yarn-cluster mode.
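>>>>>
>>>>> for example, submitted with something like this (the class and jar
>>>>> names are placeholders):
>>>>>
>>>>> spark-submit --class HiveSpark --master yarn-cluster \
>>>>>   hive-spark-example.jar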
>>>>>
>>>>>
>>>>> On Mon, Aug 11, 2014 at 9:44 AM, Cheng Lian <lian.cs....@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Since you were using hql(...), it's probably not related to the JDBC
>>>>>> driver. But I failed to reproduce this issue locally with a single-node
>>>>>> pseudo-distributed YARN cluster. Would you mind elaborating on the
>>>>>> steps to reproduce this bug? Thanks
>>>>>>
>>>>>>
>>>>>> On Sun, Aug 10, 2014 at 9:36 PM, Cheng Lian <lian.cs....@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Jenny, does this issue only happen when running Spark SQL with
>>>>>>> YARN in your environment?
>>>>>>>
>>>>>>>
>>>>>>> On Sat, Aug 9, 2014 at 3:56 AM, Jenny Zhao <linlin200...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I am able to run my HQL query in yarn-cluster mode when connecting
>>>>>>>> to the default Hive metastore defined in hive-site.xml.
>>>>>>>>
>>>>>>>> However, if I want to switch to a different database, like:
>>>>>>>>
>>>>>>>>   hql("use other-database")
>>>>>>>>
>>>>>>>>
>>>>>>>> it only works in yarn-client mode, but fails in yarn-cluster mode
>>>>>>>> with the following stack trace:
>>>>>>>>
>>>>>>>> 14/08/08 12:09:11 INFO HiveMetaStore: 0: get_database: tt
>>>>>>>> 14/08/08 12:09:11 INFO audit: ugi=biadmin      ip=unknown-ip-addr      
>>>>>>>> cmd=get_database: tt    
>>>>>>>> 14/08/08 12:09:11 ERROR RetryingHMSHandler: 
>>>>>>>> NoSuchObjectException(message:There is no database named tt)
>>>>>>>>        at 
>>>>>>>> org.apache.hadoop.hive.metastore.ObjectStore.getMDatabase(ObjectStore.java:431)
>>>>>>>>        at 
>>>>>>>> org.apache.hadoop.hive.metastore.ObjectStore.getDatabase(ObjectStore.java:441)
>>>>>>>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>>        at 
>>>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
>>>>>>>>        at 
>>>>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
>>>>>>>>        at java.lang.reflect.Method.invoke(Method.java:611)
>>>>>>>>        at 
>>>>>>>> org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:124)
>>>>>>>>        at $Proxy15.getDatabase(Unknown Source)
>>>>>>>>        at 
>>>>>>>> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_database(HiveMetaStore.java:628)
>>>>>>>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>>        at 
>>>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
>>>>>>>>        at 
>>>>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
>>>>>>>>        at java.lang.reflect.Method.invoke(Method.java:611)
>>>>>>>>        at 
>>>>>>>> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103)
>>>>>>>>        at $Proxy17.get_database(Unknown Source)
>>>>>>>>        at 
>>>>>>>> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabase(HiveMetaStoreClient.java:810)
>>>>>>>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>>        at 
>>>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
>>>>>>>>        at 
>>>>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
>>>>>>>>        at java.lang.reflect.Method.invoke(Method.java:611)
>>>>>>>>        at 
>>>>>>>> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
>>>>>>>>        at $Proxy18.getDatabase(Unknown Source)
>>>>>>>>        at 
>>>>>>>> org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1139)
>>>>>>>>        at 
>>>>>>>> org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1128)
>>>>>>>>        at 
>>>>>>>> org.apache.hadoop.hive.ql.exec.DDLTask.switchDatabase(DDLTask.java:3479)
>>>>>>>>        at 
>>>>>>>> org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:237)
>>>>>>>>        at 
>>>>>>>> org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
>>>>>>>>        at 
>>>>>>>> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
>>>>>>>>        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1414)
>>>>>>>>        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1192)
>>>>>>>>        at 
>>>>>>>> org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1020)
>>>>>>>>        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888)
>>>>>>>>        at 
>>>>>>>> org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:208)
>>>>>>>>        at 
>>>>>>>> org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:182)
>>>>>>>>        at 
>>>>>>>> org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd$lzycompute(HiveContext.scala:272)
>>>>>>>>        at 
>>>>>>>> org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd(HiveContext.scala:269)
>>>>>>>>        at 
>>>>>>>> org.apache.spark.sql.hive.HiveContext.hiveql(HiveContext.scala:86)
>>>>>>>>        at 
>>>>>>>> org.apache.spark.sql.hive.HiveContext.hql(HiveContext.scala:91)
>>>>>>>>        at 
>>>>>>>> org.apache.spark.examples.sql.hive.HiveSpark$.main(HiveSpark.scala:35)
>>>>>>>>        at 
>>>>>>>> org.apache.spark.examples.sql.hive.HiveSpark.main(HiveSpark.scala)
>>>>>>>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>>        at 
>>>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
>>>>>>>>        at 
>>>>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
>>>>>>>>        at java.lang.reflect.Method.invoke(Method.java:611)
>>>>>>>>        at 
>>>>>>>> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:186)
>>>>>>>>
>>>>>>>> 14/08/08 12:09:11 ERROR DDLTask: 
>>>>>>>> org.apache.hadoop.hive.ql.metadata.HiveException: Database does not 
>>>>>>>> exist: tt
>>>>>>>>        at 
>>>>>>>> org.apache.hadoop.hive.ql.exec.DDLTask.switchDatabase(DDLTask.java:3480)
>>>>>>>>        at 
>>>>>>>> org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:237)
>>>>>>>>        at 
>>>>>>>> org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
>>>>>>>>        at 
>>>>>>>> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
>>>>>>>>        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1414)
>>>>>>>>        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1192)
>>>>>>>>        at 
>>>>>>>> org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1020)
>>>>>>>>        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888)
>>>>>>>>        at 
>>>>>>>> org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:208)
>>>>>>>>        at 
>>>>>>>> org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:182)
>>>>>>>>        at 
>>>>>>>> org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd$lzycompute(HiveContext.scala:272)
>>>>>>>>        at 
>>>>>>>> org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd(HiveContext.scala:269)
>>>>>>>>        at 
>>>>>>>> org.apache.spark.sql.hive.HiveContext.hiveql(HiveContext.scala:86)
>>>>>>>>        at 
>>>>>>>> org.apache.spark.sql.hive.HiveContext.hql(HiveContext.scala:91)
>>>>>>>>        at 
>>>>>>>> org.apache.spark.examples.sql.hive.HiveSpark$.main(HiveSpark.scala:35)
>>>>>>>>        at 
>>>>>>>> org.apache.spark.examples.sql.hive.HiveSpark.main(HiveSpark.scala)
>>>>>>>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>>        at 
>>>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
>>>>>>>>        at 
>>>>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
>>>>>>>>        at java.lang.reflect.Method.invoke(Method.java:611)
>>>>>>>>        at 
>>>>>>>> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:186)
>>>>>>>>
>>>>>>>> Why is that? I'm not sure whether this has something to do with the
>>>>>>>> Hive JDBC driver.
>>>>>>>>
>>>>>>>> Thank you!
>>>>>>>>
>>>>>>>> Jenny
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
