I am not sure this will perform well. What do you want to achieve here? If it is fast lookups, then the Cassandra Ignite store might be the right solution. If you want more analytic-style queries, you can put the data on HDFS/Hive and use the Ignite HDFS cache to keep certain Hive partitions or tables in memory. If you want to run iterative machine learning algorithms, you can use Spark on top of this, and you can then also use the Ignite cache for Spark RDDs.
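
To make the fast-lookup case more concrete, here is a minimal sketch of the read-through pattern that a cache store provides, and of why data has to be loaded before it can be queried from the cache. It is plain Python and not the Ignite API: `ReadThroughCache`, `load_all`, `query` and the `cassandra_rows` dict are hypothetical names used only for illustration, with a dict standing in for the Cassandra table.

```python
class ReadThroughCache:
    """Toy in-memory cache that falls back to a slower backing store on a miss."""

    def __init__(self, backing_store):
        self.backing_store = backing_store  # stands in for the Cassandra table
        self.entries = {}                   # stands in for the in-memory cache
        self.store_reads = 0                # counts trips to the backing store

    def get(self, key):
        # Key lookup: a miss is served from the backing store and then cached.
        if key not in self.entries:
            self.store_reads += 1
            self.entries[key] = self.backing_store[key]
        return self.entries[key]

    def load_all(self):
        # Preload every row so that queries over the cache see the full data set.
        self.entries.update(self.backing_store)

    def query(self, predicate):
        # Scans only what is in memory; rows that were never loaded are invisible.
        return sorted(k for k, v in self.entries.items() if predicate(v))


cassandra_rows = {1: "a", 2: "b", 3: "a"}
cache = ReadThroughCache(cassandra_rows)
cache.get(1)  # miss: read from the backing store
cache.get(1)  # hit: served from memory
```

In this sketch, a query run before `load_all()` only sees key 1, while the same query after `load_all()` sees keys 1 and 3. Key lookups work without preloading, but scans and queries over the cached data do not.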

> On 4 Oct 2016, at 02:24, Alexey Kuznetsov <[email protected]> wrote:
>
> Hi, Vincent!
>
> Ignite also has SQL support (also scalable), I think it will be much faster
> to query directly from Ignite than query from Spark.
> Also please mind, that before executing queries you should load all needed
> data to cache.
> To load data from Cassandra to Ignite you may use Cassandra store [1].
>
> [1] https://apacheignite.readme.io/docs/ignite-with-apache-cassandra
>
>> On Tue, Oct 4, 2016 at 4:19 AM, vincent gromakowski
>> <[email protected]> wrote:
>> Hi,
>> I am evaluating the possibility to use Spark SQL (and its scalability) over
>> an Ignite cache with Cassandra persistent store to increase read workloads
>> like OLAP style analytics.
>> Is there any way to configure Spark thriftserver to load an external table
>> in Ignite like we can do in Cassandra ?
>> Here is an example of config for spark backed by cassandra
>>
>> CREATE EXTERNAL TABLE MyHiveTable
>> ( id int, data string )
>> STORED BY 'org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler'
>> TBLPROPERTIES ("cassandra.host" = "x.x.x.x", "cassandra.ks.name" =
>> "test" ,
>> "cassandra.cf.name" = "mytable" ,
>> "cassandra.ks.repfactor" = "1" ,
>> "cassandra.ks.strategy" =
>> "org.apache.cassandra.locator.SimpleStrategy" );
>>
>
> --
> Alexey Kuznetsov
