Re: Datastore or DB for spark

2015-10-10 Thread Deenar Toraskar
The choice of datastore is driven by your use case. In fact, Spark can work with multiple datastores as well. Each datastore is optimised for certain kinds of data. For example, HDFS is great for analytics and large data sets at rest. It is scalable and very performant, but the data is immutable once written. NoSQL databases

Datastore or DB for spark

2015-10-09 Thread Rahul Jeevanandam
Hi guys, I wanted to know which databases you use with Spark. -- Regards, *Rahul J*

Re: Datastore or DB for spark

2015-10-09 Thread Xiao Li
FYI, in my local environment, Spark is connected to DB2 on z/OS, but that requires a special JDBC driver. Xiao Li 2015-10-09 8:38 GMT-07:00 Rahul Jeevanandam: > Hi Jörn Franke, > > I was sure that a relational database wouldn't be a good option for Spark. > But what about

Re: Datastore or DB for spark

2015-10-09 Thread Ted Yu
There are connectors for HBase, Cassandra, etc. Which data store do you use now? Cheers > On Oct 9, 2015, at 3:10 AM, Rahul Jeevanandam wrote: > > Hi guys, > > I wanted to know which databases you use with Spark. > > -- > Regards, > Rahul J

Re: Datastore or DB for spark

2015-10-09 Thread Jörn Franke
I am not aware of any empirical evidence, but I think Hadoop (HDFS) as a datastore for Spark is quite common. With relational databases you usually do not have as much data, and you do not benefit from data locality. On Fri, Oct 9, 2015 at 15:16, Rahul Jeevanandam wrote: >

Re: Datastore or DB for spark

2015-10-09 Thread Rahul Jeevanandam
I want to know what everyone is using. Which datastore is popular in the Spark community? On Fri, Oct 9, 2015 at 6:16 PM, Ted Yu wrote: > There are connectors for HBase, Cassandra, etc. > > Which data store do you use now? > > Cheers > > On Oct 9, 2015, at 3:10 AM, Rahul

Re: Datastore or DB for spark

2015-10-09 Thread Rahul Jeevanandam
Hi Jörn Franke, I was sure that a relational database wouldn't be a good option for Spark. But what about distributed databases like HBase, Cassandra, etc.? On Fri, Oct 9, 2015 at 7:21 PM, Jörn Franke wrote: > I am not aware of any empirical evidence, but I think Hadoop