Thank you for the reply. I tried your experiment, but it does not print the settings out in spark-shell (I'm using 1.3, by the way).
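A quick sanity check worth running first: confirm that hive-site.xml is visible on the driver classpath at all, since HiveConf locates the file through the classloader. A minimal sketch to run in spark-shell (null here means the conf/ directory is not on the classpath):

scala> // shows where (if anywhere) hive-site.xml would be loaded from
scala> Thread.currentThread.getContextClassLoader.getResource("hive-site.xml")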
Strangely, I have been experimenting with a SQL connection instead, and that works after all (though if I go into spark-shell and try to print the SQL settings I put in hive-site.xml, it still does not print them).

On Fri, May 15, 2015 at 7:22 PM, Yana Kadiyska <yana.kadiy...@gmail.com> wrote:
> My point was more about how to verify that properties are picked up from
> the hive-site.xml file. You don't really need hive.metastore.uris if
> you're not running against an external metastore. I just did an
> experiment with warehouse.dir.
>
> My hive-site.xml looks like this:
>
> <configuration>
>   <property>
>     <name>hive.metastore.warehouse.dir</name>
>     <value>/home/ykadiysk/Github/warehouse_dir</value>
>     <description>location of default database for the warehouse</description>
>   </property>
> </configuration>
>
> and spark-shell code:
>
> scala> val hc = new org.apache.spark.sql.hive.HiveContext(sc)
> hc: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@3036c16f
>
> scala> hc.sql("show tables").collect
> 15/05/15 14:12:57 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
> 15/05/15 14:12:57 INFO ObjectStore: ObjectStore, initialize called
> 15/05/15 14:12:57 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
> 15/05/15 14:12:58 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
> 15/05/15 14:12:58 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
> 15/05/15 14:13:03 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
> 15/05/15 14:13:03 INFO ObjectStore: Initialized ObjectStore
> 15/05/15 14:13:04 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 0.12.0-protobuf-2.5
> 15/05/15 14:13:05 INFO HiveMetaStore: 0: get_tables: db=default pat=.*
> 15/05/15 14:13:05 INFO audit: ugi=ykadiysk ip=unknown-ip-addr cmd=get_tables: db=default pat=.*
> 15/05/15 14:13:05 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
> 15/05/15 14:13:05 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
> res0: Array[org.apache.spark.sql.Row] = Array()
>
> scala> hc.getConf("hive.metastore.warehouse.dir")
> res1: String = /home/ykadiysk/Github/warehouse_dir
>
> I have not tried an HDFS path, but you should at least be able to verify
> that the variable is being read. It might be that your value is read but is
> otherwise not liked...
>
> On Fri, May 15, 2015 at 2:03 PM, Tamas Jambor <jambo...@gmail.com> wrote:
>
>> Thanks for the reply.
>> I am trying to use it without a Hive setup (Spark standalone), so it
>> prints something like this:
>>
>> hive_ctx.sql("show tables").collect()
>> 15/05/15 17:59:03 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
>> 15/05/15 17:59:03 INFO ObjectStore: ObjectStore, initialize called
>> 15/05/15 17:59:04 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
>> 15/05/15 17:59:04 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
>> 15/05/15 17:59:04 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
>> 15/05/15 17:59:05 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
>> 15/05/15 17:59:08 INFO BlockManagerMasterActor: Registering block manager xxxx:42819 with 3.0 GB RAM, BlockManagerId(2, xxx, 42819)
>> 15/05/15 17:59:18 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
>> 15/05/15 17:59:18 INFO MetaStoreDirectSql: MySQL check failed, assuming we are not on mysql: Lexical error at line 1, column 5. Encountered: "@" (64), after : "".
>> 15/05/15 17:59:20 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
>> 15/05/15 17:59:20 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
>> 15/05/15 17:59:28 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
>> 15/05/15 17:59:29 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
>> 15/05/15 17:59:31 INFO ObjectStore: Initialized ObjectStore
>> 15/05/15 17:59:32 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 0.13.1aa
>> 15/05/15 17:59:33 WARN MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-azure-file-system.properties,hadoop-metrics2.properties
>> 15/05/15 17:59:33 INFO MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>> 15/05/15 17:59:33 INFO MetricsSystemImpl: azure-file-system metrics system started
>> 15/05/15 17:59:33 INFO HiveMetaStore: Added admin role in metastore
>> 15/05/15 17:59:34 INFO HiveMetaStore: Added public role in metastore
>> 15/05/15 17:59:34 INFO HiveMetaStore: No user is added in admin role, since config is empty
>> 15/05/15 17:59:35 INFO SessionState: No Tez session required at this point. hive.execution.engine=mr.
>> 15/05/15 17:59:37 INFO HiveMetaStore: 0: get_tables: db=default pat=.*
>> 15/05/15 17:59:37 INFO audit: ugi=testuser ip=unknown-ip-addr cmd=get_tables: db=default pat=.*
>>
>> I am not sure what to put in hive.metastore.uris in this case?
>>
>> On Fri, May 15, 2015 at 2:52 PM, Yana Kadiyska <yana.kadiy...@gmail.com> wrote:
>>
>>> This should work. Which version of Spark are you using? Here is what I
>>> do -- make sure hive-site.xml is in the conf directory of the machine
>>> you're running the driver from.
>>> Now let's run spark-shell from that machine:
>>>
>>> scala> val hc = new org.apache.spark.sql.hive.HiveContext(sc)
>>> hc: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@6e9f8f26
>>>
>>> scala> hc.sql("show tables").collect
>>> 15/05/15 09:34:17 INFO metastore: Trying to connect to metastore with URI thrift://hostname.com:9083   <-- this should be the value from your hive-site.xml
>>> 15/05/15 09:34:17 INFO metastore: Waiting 1 seconds before next connection attempt.
>>> 15/05/15 09:34:18 INFO metastore: Connected to metastore.
>>> res0: Array[org.apache.spark.sql.Row] = Array([table1,false],
>>>
>>> scala> hc.getConf("hive.metastore.uris")
>>> res13: String = thrift://hostname.com:9083
>>>
>>> scala> hc.getConf("hive.metastore.warehouse.dir")
>>> res14: String = /user/hive/warehouse
>>>
>>> The first line tells you which metastore it's trying to connect to --
>>> this should be the string specified under the hive.metastore.uris property
>>> in your hive-site.xml file. I have not mucked with warehouse.dir too much,
>>> but I know that the value of the metastore URI is in fact picked up from
>>> there, as I regularly point to different systems...
>>>
>>> On Thu, May 14, 2015 at 6:26 PM, Tamas Jambor <jambo...@gmail.com> wrote:
>>>
>>>> I have tried putting the hive-site.xml file in the conf/ directory,
>>>> but it seems it is not being picked up from there.
>>>>
>>>> On Thu, May 14, 2015 at 6:50 PM, Michael Armbrust <mich...@databricks.com> wrote:
>>>>
>>>>> You can configure Spark SQL's Hive interaction by placing a
>>>>> hive-site.xml file in the conf/ directory.
>>>>>
>>>>> On Thu, May 14, 2015 at 10:24 AM, jamborta <jambo...@gmail.com> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> Is it possible to set hive.metastore.warehouse.dir, which is
>>>>>> internally created by Spark, so that the warehouse is stored
>>>>>> externally (e.g. S3 on AWS or WASB on Azure)?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://apache-spark-user-list.1001560.n3.nabble.com/store-hive-metastore-on-persistent-store-tp22891.html
>>>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
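On the original question: hive.metastore.warehouse.dir is just a path, so it can point at any filesystem Hadoop can talk to. A minimal spark-shell sketch (untested; the bucket name and s3n scheme are placeholders, and it assumes the matching Hadoop filesystem jars and credentials for S3/WASB are already configured). One caveat: the default database's location is recorded when the metastore is first initialized, so the value should be in place in hive-site.xml before the metastore_db is first created:

scala> val hc = new org.apache.spark.sql.hive.HiveContext(sc)
scala> hc.getConf("hive.metastore.warehouse.dir")   // should echo the hive-site.xml value,
scala>                                              // e.g. s3n://my-bucket/hive/warehouse (placeholder)
scala> hc.sql("CREATE TABLE test_tbl (id INT)")     // a managed table; its data should land under the warehouse path
scala> hc.sql("DESCRIBE FORMATTED test_tbl").collect.foreach(println)  // the Location: field shows where it actually went

Note also that the warehouse path only controls where table data goes; the metastore itself is separate, and when there is no external metastore (as in the standalone setup above) hive.metastore.uris can simply be left unset -- Spark falls back to the embedded Derby metastore (the local metastore_db directory).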