Hi,

I am using Spark with Java and connecting to Hive. The steps in my code are:

a) Create a SparkSession:
SparkSession.builder().config(conf).enableHiveSupport().getOrCreate();
b) Check the existence of a Hive database:
spark.catalog().databaseExists(config.getHiveDbName()); if false, I create
the database
c) Check the existence of a table within that database:
spark.catalog().tableExists(config.getHiveDbName(),
config.getHiveTableName()); if false, I create the table (external)
d) Land the data to the location specified in the external table
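
For reference, here is a minimal, self-contained sketch of steps a) to d).
The database name, table name, schema, and location are hypothetical
placeholders standing in for my config.getHiveDbName() /
config.getHiveTableName() values:

import org.apache.spark.SparkConf;
import org.apache.spark.sql.SparkSession;

public class HiveCatalogCheck {
    public static void main(String[] args) {
        // Step a: build a Hive-enabled session.
        SparkConf conf = new SparkConf().setAppName("hive-catalog-check");
        SparkSession spark = SparkSession.builder()
                .config(conf)
                .enableHiveSupport()
                .getOrCreate();

        // Hypothetical names; in my code these come from config.
        String dbName = "mydb";
        String tableName = "mytable";
        String location = "/data/" + dbName + "/" + tableName;

        // Step b: create the database if it does not exist.
        if (!spark.catalog().databaseExists(dbName)) {
            spark.sql("CREATE DATABASE IF NOT EXISTS " + dbName);
        }

        // Step c: create the external table if it does not exist
        // (schema here is illustrative only).
        if (!spark.catalog().tableExists(dbName, tableName)) {
            spark.sql("CREATE EXTERNAL TABLE IF NOT EXISTS "
                    + dbName + "." + tableName
                    + " (id INT, value STRING) STORED AS PARQUET"
                    + " LOCATION '" + location + "'");
        }

        // Step d: land the data at the external table's location,
        // e.g. df.write().mode("append").parquet(location);

        spark.stop();
    }
}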

The Hive metastore audit log for my user shows statements like the ones
below, suggesting it scans every database for functions before finally
calling get_database (and then get_table):

2019-08-20 11:08:05,435 INFO  [pool-7-thread-9250]: HiveMetaStore.audit
(HiveMetaStore.java:logAuditEvent(393)) - ugi=user ip=XX cmd=get_functions:
db=abc pat=*
2019-08-20 11:08:05,435 INFO  [pool-7-thread-9250]: HiveMetaStore.audit
(HiveMetaStore.java:logAuditEvent(393)) - ugi=user ip=XX cmd=get_functions:
db=def pat=*
... (similar get_functions calls for every remaining database)

2019-08-20 11:08:05,760 INFO  [pool-7-thread-9250]: HiveMetaStore.audit
(HiveMetaStore.java:logAuditEvent(393)) - ugi=user ip=XX cmd=get_database:
mydb


Is this expected behavior? At what step do these get_functions calls for
all databases happen?

Thank you
