I had a similar issue this summer while prototyping Spark on K8s. I ended up sticking with Hive Metastore 2 to stay on schedule. Not sure if I was using it correctly, but I only needed the Hadoop + Hive JARs on the classpath; I did not need to run HDFS, YARN, or any other Hadoop services. Pointing Spark at the metastore with an s3a warehouse.dir path seemed to work fine (rough config sketch below).
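For reference, the wiring looked roughly like this (a sketch only -- the thrift URI and bucket are placeholders for whatever you deploy, and s3a credentials are assumed to be configured elsewhere, e.g. via instance profile or Hadoop confs):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("spark-with-standalone-hms")
      // talk to the remote metastore service instead of spinning up
      // an embedded Derby-backed one
      .config("hive.metastore.uris", "thrift://hive-metastore:9083")
      // keep table data on S3 so no HDFS cluster is needed
      .config("spark.sql.warehouse.dir", "s3a://my-bucket/warehouse")
      .enableHiveSupport()
      .getOrCreate()

    // smoke test: the metadata lands in the metastore, the files land
    // under the s3a warehouse dir
    spark.sql("CREATE DATABASE IF NOT EXISTS demo")
    spark.sql("CREATE TABLE IF NOT EXISTS demo.t (id INT) USING parquet")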
When Spark supports Hive Metastore 3.0, things should get a bit easier, as HMS 3 has clearer instructions for standalone deployments:
https://cwiki.apache.org/confluence/display/Hive/AdminManual+Metastore+3.0+Administration

If you have more time and truly need to move away from everything Hadoop, you can also implement ExternalCatalog yourself (rough skeleton in the P.S. below):
https://github.com/apache/spark/blob/5264164a67df498b73facae207eda12ee133be7d/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala

See https://jira.apache.org/jira/browse/SPARK-23443 for ongoing progress on a Glue ExternalCatalog implementation. If you are using EMR, you can also check out:
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-spark-glue.html

On Mon, Oct 14, 2019 at 12:24 PM xweb <ashish8...@gmail.com> wrote:
>
> Is it possible to use our own metastore instead of Hive Metastore with
> Spark SQL?
>
> Can you please point me to some docs or code I can look at to get it done?
>
> We are moving away from everything Hadoop.
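P.S. Here is the very rough ExternalCatalog skeleton I mentioned. GlueExternalCatalog is just a made-up name for illustration, the trait has many more methods than the two shown, and the signatures are taken from the Spark sources linked above -- double-check them against your Spark version:

    import org.apache.spark.sql.catalyst.catalog.{CatalogTable, ExternalCatalog}

    // left abstract because only two of the trait's methods are sketched
    // here; a real implementation has to cover all of them
    abstract class GlueExternalCatalog extends ExternalCatalog {

      override def databaseExists(db: String): Boolean = {
        // e.g. call Glue's GetDatabase API and translate "not found" to false
        ???
      }

      override def getTable(db: String, table: String): CatalogTable = {
        // e.g. fetch the Glue table definition and map it onto CatalogTable
        ???
      }
    }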