Spark & Cassandra work just fine together, but, as I said, Cassandra is
*primarily* used for OLTP.  If your main use case is analytics, I would use
something that's built for analytics.  If 90%+ of your queries are going to
be 1-10ms & customer facing, then you're good to go.  If you're building
something to replace OLAP cubes, I'd look at something else.

On Tue, Mar 1, 2016 at 8:52 AM Jack Krupansky <jack.krupan...@gmail.com>
wrote:

> OLAP using Cassandra and Spark:
>
> http://www.slideshare.net/EvanChan2/breakthrough-olap-performance-with-cassandra-and-spark
>
> What is the cardinality of your cube dimensions? Obviously any
> multi-dimensional data must be flattened.
>
> Cassandra tables have fixed named columns, but... the map datatype with
> string key values effectively gives you extensible columns.
>
>
>
> -- Jack Krupansky
>
> On Tue, Mar 1, 2016 at 11:22 AM, Andrés Ivaldi <iaiva...@gmail.com> wrote:
>
>> Jonathan, thanks for the link.
>> I believe it may be a good fit for the data store part, because it is fast
>> for I/O and handles time series; the analytics could be done with Apache
>> Ignite and/or Apache Spark.
>> What worries me is that it looks very complex to create the structure for
>> each fact table and then extend it.
>>
>> regards.
>>
>> On Sun, Feb 28, 2016 at 12:28 PM, Jonathan Haddad <j...@jonhaddad.com>
>> wrote:
>>
>>> Cassandra is primarily used as an OLTP database, not analytics. You
>>> should watch this 30 min video discussing Cassandra core concepts (coming
>>> from a relational background):
>>> https://academy.datastax.com/courses/ds101-introduction-cassandra
>>>
>>> On Sun, Feb 28, 2016 at 5:40 AM Andrés Ivaldi <iaiva...@gmail.com>
>>> wrote:
>>>
>>>> Hello, at my work we are looking for new technologies for an analysis
>>>> engine, and we are evaluating different technologies; one of them is
>>>> Cassandra as our data repository.
>>>>
>>>> Currently we can run analysis queries against an OLAP cube and an RDBMS,
>>>> using MSSQL as our data repository. The cube is obsolete and the SQL Server
>>>> engine is slow as a data repository.
>>>>
>>>> I don't know much about Cassandra. I have read some books, and it looks
>>>> like a good fit for what we need, but some things look like a problem
>>>> for us.
>>>>
>>>> Our engine is designed to be scalable, flexible and dynamic: any user
>>>> can add new dimensions or measures from any source. All the data is stored
>>>> in the cube (this is fixed data) and MSSQL (dynamic data), so we have
>>>> decoupled tables with the dimension values.
>>>>
>>>>
>>>> OK, with that context given, I'd like to clear up some doubts:
>>>>
>>>> - Am I able to flatten the table with all the possible dimension values
>>>> into Cassandra, creating the PK over the dimension columns? Will this give
>>>> me the "sensation" of a data pivot over the PK columns? If so, what if I
>>>> want to choose the order of the columns, or add or remove some of them?
>>>> - Is it possible to extend the values of a row dynamically? What we often
>>>> do is join a row against a value from mapped external data to extend the
>>>> dimension's hierarchical value structure (e.g. state->Country->Continent).
>>>>
>>>> I know we can do some of these things in the core of our engine, like
>>>> extending the dimension values or reducing columns, but as we are
>>>> evaluating different technologies it is good to know.
>>>>
>>>> Regards!!
>>>>
>>>>
>>>> --
>>>> Ing. Ivaldi Andres
>>>>
>>>
>>
>>
>> --
>> Ing. Ivaldi Andres
>>
>
>

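To make the two ideas raised in the thread concrete (flattening the fact
data so that the dimensions form the primary key, and using a map column as
"extensible columns" for the state->Country->Continent hierarchy), here is a
minimal sketch. Everything named in it is an assumption for illustration
only: the table `sales_by_state`, its columns, the `GEO_HIERARCHY` lookup and
the helper functions are hypothetical, and the CQL string is never executed.
Since Cassandra has no joins, the sketch resolves the hierarchy at write
time; the roll-up is done in plain Python here, standing in for whatever
Spark or the engine itself would do.

```python
from collections import defaultdict

# Hypothetical CQL for a flattened fact table (illustrative, not executed).
# The fixed dimensions form the primary key; extra_dims is a map that acts
# as a bag of additional, user-defined dimension values.
FACT_TABLE_CQL = """
CREATE TABLE sales_by_state (
    state text,
    day date,
    product text,
    amount decimal,
    extra_dims map<text, text>,
    PRIMARY KEY ((state), day, product)
);
"""

# Hypothetical external mapping: state -> (country, continent).
GEO_HIERARCHY = {
    "Cordoba": ("Argentina", "South America"),
    "Texas": ("United States", "North America"),
}


def flatten_fact(state, day, product, amount):
    """Build one denormalized row, resolving the hierarchy at write time."""
    country, continent = GEO_HIERARCHY[state]
    return {
        "state": state,
        "day": day,
        "product": product,
        "amount": amount,
        "extra_dims": {"country": country, "continent": continent},
    }


def roll_up(rows, dim):
    """Sum 'amount' by a dimension, whether it is a fixed column or a map key."""
    totals = defaultdict(float)
    for row in rows:
        if dim in row:
            key = row[dim]
        else:
            key = row["extra_dims"].get(dim)
        totals[key] += row["amount"]
    return dict(totals)


rows = [
    flatten_fact("Cordoba", "2016-03-01", "widget", 10.0),
    flatten_fact("Texas", "2016-03-01", "widget", 5.0),
    flatten_fact("Cordoba", "2016-03-02", "gadget", 2.5),
]
by_continent = roll_up(rows, "continent")
by_state = roll_up(rows, "state")
```

Note that the order of the primary key columns is fixed when the table is
created, so pivoting by a different column order would typically mean
writing the same facts to another table keyed for that query.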