You can also just make sure that each user is using their own directory. A rough example can be found in TestHive.
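One way to sketch the per-user-directory idea on a shared machine (a hypothetical illustration for Spark 1.x, not the TestHive code itself; the /tmp path scheme and the use of `javax.jdo.option.ConnectionURL` here are my assumptions):

```scala
// Sketch: give each user a private Derby metastore directory, so
// concurrent HiveContexts don't collide on a shared ./metastore_db.
// This must run before the metastore is first touched; `sc` is the
// SparkContext provided by spark-shell, and the /tmp path scheme is
// hypothetical.
import org.apache.spark.sql.hive.HiveContext

val user = sys.props("user.name")
val hiveContext = new HiveContext(sc)
hiveContext.setConf("javax.jdo.option.ConnectionURL",
  s"jdbc:derby:;databaseName=/tmp/metastore_db_$user;create=true")
```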
Note: in Spark 2.0 there should be no need to use HiveContext unless you need to talk to a metastore.

On Thu, May 26, 2016 at 1:36 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Well, make sure that you set up a reasonable RDBMS as the metastore. Ours is Oracle, but you can get away with others. Check the supported list in:
>
>     hduser@rhes564:: :/usr/lib/hive/scripts/metastore/upgrade> ltr
>     total 40
>     drwxr-xr-x 2 hduser hadoop 4096 Feb 21 23:48 postgres
>     drwxr-xr-x 2 hduser hadoop 4096 Feb 21 23:48 mysql
>     drwxr-xr-x 2 hduser hadoop 4096 Feb 21 23:48 mssql
>     drwxr-xr-x 2 hduser hadoop 4096 Feb 21 23:48 derby
>     drwxr-xr-x 3 hduser hadoop 4096 May 20 18:44 oracle
>
> You have a few good ones in the list. In general the base tables (without transactional support) are around 55 (Hive 2) and don't take much space (depending on the volume of tables). I attached an E-R diagram.
>
> HTH
>
> Dr Mich Talebzadeh
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> On 26 May 2016 at 19:09, Gerard Maas <gerard.m...@gmail.com> wrote:
>
>> Thanks a lot for the advice!
>>
>> I found out why the standalone HiveContext would not work: it was trying to deploy a Derby db, and the user had no rights to create the directory where the db is stored:
>>
>>     Caused by: java.sql.SQLException: Failed to create database 'metastore_db', see the next exception for details.
>>         at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
>>         at org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(Unknown Source)
>>         ... 129 more
>>     Caused by: java.sql.SQLException: Directory /usr/share/spark-notebook/metastore_db cannot be created.
>>
>> Now, the new issue is that we can't start more than one context at the same time.
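The "proper metastore" route Mich describes typically means putting a hive-site.xml on Spark's classpath (e.g. in conf/). A minimal sketch assuming a MySQL backend; the host, database name, and credentials below are placeholders, not values from this thread:

```xml
<!-- conf/hive-site.xml: minimal shared-metastore sketch; all values
     are placeholders. The MySQL JDBC driver jar must also be on the
     driver classpath. -->
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://metastore-host:3306/hive_metastore?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hiveuser</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hivepassword</value>
  </property>
</configuration>
```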
>> I think we will need to set up a proper metastore.
>>
>> Kind regards, Gerard.
>>
>> On Thu, May 26, 2016 at 3:06 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>
>>> Using HiveContext, which is basically a SQL API within Spark, without a proper Hive setup does not make sense. It is a superset of Spark's SQLContext.
>>>
>>> In addition, simple things like registerTempTable may not work.
>>>
>>> HTH
>>>
>>> Dr Mich Talebzadeh
>>>
>>> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>
>>> http://talebzadehmich.wordpress.com
>>>
>>> On 26 May 2016 at 13:01, Silvio Fiorito <silvio.fior...@granturing.com> wrote:
>>>
>>>> Hi Gerard,
>>>>
>>>> I’ve never had an issue using the HiveContext without a hive-site.xml configured. However, one issue you may have is if multiple users are starting the HiveContext from the same path: they’ll all be trying to store the default Derby metastore in the same location. Also, if you want them to be able to persist permanent table metadata for Spark SQL, then you’ll want to set up a true metastore.
>>>>
>>>> The other thing it could be is Hive dependency collisions on the classpath, but that shouldn’t be an issue since you said it’s standalone (not a Hadoop distro, right?).
>>>>
>>>> Thanks,
>>>>
>>>> Silvio
>>>>
>>>> *From:* Gerard Maas <gerard.m...@gmail.com>
>>>> *Date:* Thursday, May 26, 2016 at 5:28 AM
>>>> *To:* spark users <user@spark.apache.org>
>>>> *Subject:* HiveContext standalone => without a Hive metastore
>>>>
>>>> Hi,
>>>>
>>>> I'm helping some folks set up an analytics cluster with Spark.
>>>> They want to use the HiveContext to enable the window functions on DataFrames (*), but they don't have any Hive installation, nor do they need one at the moment (if it is not necessary for this feature).
>>>>
>>>> When we try to create a Hive context, we get the following error:
>>>>
>>>>     > val sqlContext = new org.apache.spark.sql.hive.HiveContext(sparkContext)
>>>>     java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
>>>>         at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
>>>>
>>>> Is my HiveContext failing b/c it wants to connect to an unconfigured Hive metastore?
>>>>
>>>> Is there a way to instantiate a HiveContext for the sake of window support without an underlying Hive deployment?
>>>>
>>>> The docs are explicit in saying that this should be the case: [1]
>>>>
>>>> "To use a HiveContext, you do not need to have an existing Hive setup, and all of the data sources available to a SQLContext are still available. HiveContext is only packaged separately to avoid including all of Hive’s dependencies in the default Spark build."
>>>>
>>>> So what is the right way to address this issue? How do we instantiate a HiveContext with Spark running on an HDFS cluster without Hive deployed?
>>>>
>>>> Thanks a lot!
>>>>
>>>> -Gerard.
>>>>
>>>> (*) The need for a HiveContext to use window functions is pretty obscure. The only documentation of this seems to be a runtime exception: "org.apache.spark.sql.AnalysisException: Could not resolve window function 'max'. Note that, using window functions currently requires a HiveContext;"
>>>>
>>>> [1] http://spark.apache.org/docs/latest/sql-programming-guide.html#getting-started

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
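For context on the window-function footnote in Gerard's question: the usage that triggers the "Could not resolve window function 'max'" exception looks roughly like this on Spark 1.x (a sketch; the DataFrame contents and column names are made up):

```scala
// Sketch for Spark 1.x: window functions over DataFrames need a
// HiveContext; the same query through a plain SQLContext raises
// "Could not resolve window function 'max'".
// `sc` is the SparkContext provided by spark-shell; data is made up.
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.max

val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
import hiveContext.implicits._

val df = sc.parallelize(Seq(("a", 1), ("a", 3), ("b", 2)))
           .toDF("key", "value")

// Maximum value per key, attached to every row of that key's partition.
val w = Window.partitionBy("key")
df.withColumn("max_per_key", max($"value").over(w)).show()
```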