The choice of datastore is driven by your use case. In fact, Spark can work
with multiple datastores too. Each datastore is optimised for certain kinds
of data.
For example, HDFS is great for analytics and large data sets at rest. It is
scalable and very performant, but its files are essentially write-once
rather than updatable in place. No-SQL databases
Hi Guys,
I wanted to know which databases you use with Spark.
--
Regards,
*Rahul J*
FYI, in my local environment, Spark is connected to DB2 on z/OS, but that
requires a special JDBC driver.
Xiao Li
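
For anyone wanting to try the JDBC route mentioned above, here is a minimal
PySpark sketch of how such a connection might look. It is not Xiao Li's
actual setup: the host, port, database, table and credentials are made-up
placeholders, the helper names are hypothetical, and it assumes the IBM DB2
JDBC driver jar is already on Spark's classpath (for example via
spark-submit --jars).

```python
def db2_jdbc_options(host, port, database, table, user, password):
    """Build the option map for Spark's generic JDBC data source.

    All argument values are supplied by the caller; nothing here is
    specific to any real environment.
    """
    return {
        # Standard DB2 type-4 JDBC URL form: jdbc:db2://host:port/database
        "url": f"jdbc:db2://{host}:{port}/{database}",
        # Driver class shipped in the IBM DB2 JDBC driver jar
        "driver": "com.ibm.db2.jcc.DB2Driver",
        "dbtable": table,
        "user": user,
        "password": password,
    }


def read_db2_table(spark, options):
    """Load the table as a DataFrame.

    `spark` is an existing SparkSession (a SQLContext on older
    Spark versions exposes the same `read` interface).
    """
    return spark.read.format("jdbc").options(**options).load()


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    opts = db2_jdbc_options(
        "db2.example.com", 446, "SAMPLEDB", "MYSCHEMA.ORDERS", "user", "secret"
    )
    print(opts["url"])
```

Keeping the option map separate from the read call makes it easy to check
the URL and driver settings without a running cluster.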
2015-10-09 8:38 GMT-07:00 Rahul Jeevanandam:
> Hi Jörn Franke
>
> I was sure that a relational database wouldn't be a good option for Spark.
> But what about
There are connectors for HBase, Cassandra, etc.
Which data store do you use now?
Cheers
> On Oct 9, 2015, at 3:10 AM, Rahul Jeevanandam wrote:
>
> Hi Guys,
>
> I wanted to know which databases you use with Spark.
>
> --
> Regards,
> Rahul J
I am not aware of any empirical evidence, but I think Hadoop (HDFS) as a
datastore for Spark is quite common. With relational databases you usually
do not have that much data, and you do not benefit from data locality.
On Fri, Oct 9, 2015 at 15:16, Rahul Jeevanandam wrote:
>
I want to know what everyone is using. Which datastore is popular among
the Spark community?
On Fri, Oct 9, 2015 at 6:16 PM, Ted Yu wrote:
> There are connectors for HBase, Cassandra, etc.
>
> Which data store do you use now?
>
> Cheers
>
> On Oct 9, 2015, at 3:10 AM, Rahul
Hi Jörn Franke
I was sure that a relational database wouldn't be a good option for Spark.
But what about distributed databases like HBase, Cassandra, etc.?
On Fri, Oct 9, 2015 at 7:21 PM, Jörn Franke wrote:
> I am not aware of any empirical evidence, but I think hadoop