Re: Cassandra Ussages

2016-03-01 Thread Andrés Ivaldi
Hello Jack,
What do you mean by "the map datatype with string key values effectively
gives you extensible columns"?

Regards

On Tue, Mar 1, 2016 at 1:34 PM, Jack Krupansky 
wrote:

> OLAP using Cassandra and Spark:
>
> http://www.slideshare.net/EvanChan2/breakthrough-olap-performance-with-cassandra-and-spark
>
> What is the cardinality of your cube dimensions? Obviously any
> multi-dimensional data must be flattened.
>
> Cassandra tables have fixed named columns, but... the map datatype with
> string key values effectively gives you extensible columns.
>
>
>
> -- Jack Krupansky


-- 
Ing. Ivaldi Andres


Re: Cassandra Ussages

2016-03-01 Thread Andrés Ivaldi
Thanks all for the tips.
Mainly we are replacing an OLAP cube, but our engine works fine directly with an
RDBMS, so with the low latency of Cassandra it could work nicely
(the extensibility is what worries me).
We will give Cassandra + Spark a try.

Thanks again!!

On Tue, Mar 1, 2016 at 2:59 PM, Jack Krupansky 
wrote:

> I would spin it as Cassandra being the right choice where your primary
> need is OLTP, with a secondary need for analytics. IOW, where you would
> otherwise need to use two separate databases for the same data.
>
>
> -- Jack Krupansky


-- 
Ing. Ivaldi Andres


Re: Cassandra Ussages

2016-03-01 Thread Jack Krupansky
I would spin it as Cassandra being the right choice where your primary need
is OLTP, with a secondary need for analytics. IOW, where you would
otherwise need to use two separate databases for the same data.


-- Jack Krupansky

On Tue, Mar 1, 2016 at 12:40 PM, Jonathan Haddad  wrote:

> Spark & Cassandra work just fine together, but, as I said, Cassandra is
> *primarily* used for OLTP.  If your main use case is analytics, I would use
> something that's built for analytics.  If 90%+ of your queries are going to
> be 1-10ms & customer facing, then you're good to go.  If you're building
> something to replace OLAP cubes, I'd look at something else.


Re: Cassandra Ussages

2016-03-01 Thread Jonathan Haddad
Spark & Cassandra work just fine together, but, as I said, Cassandra is
*primarily* used for OLTP.  If your main use case is analytics, I would use
something that's built for analytics.  If 90%+ of your queries are going to
be 1-10ms & customer facing, then you're good to go.  If you're building
something to replace OLAP cubes, I'd look at something else.

On Tue, Mar 1, 2016 at 8:52 AM Jack Krupansky 
wrote:

> OLAP using Cassandra and Spark:
>
> http://www.slideshare.net/EvanChan2/breakthrough-olap-performance-with-cassandra-and-spark
>
> What is the cardinality of your cube dimensions? Obviously any
> multi-dimensional data must be flattened.
>
> Cassandra tables have fixed named columns, but... the map datatype with
> string key values effectively gives you extensible columns.
>
>
>
> -- Jack Krupansky


Re: Cassandra Ussages

2016-03-01 Thread Jack Krupansky
OLAP using Cassandra and Spark:
http://www.slideshare.net/EvanChan2/breakthrough-olap-performance-with-cassandra-and-spark

What is the cardinality of your cube dimensions? Obviously any
multi-dimensional data must be flattened.

Cassandra tables have fixed named columns, but... the map datatype with
string key values effectively gives you extensible columns.
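
As a rough illustration of the map idea (a sketch under assumptions, not something taken from the slides above): a table could declare its fixed columns plus one map<text, text> column, and new attributes then become new map keys rather than new columns. The Python below only mimics that row shape in memory. The table name facts, the column attrs, and the helper set_attr are hypothetical, and the CQL in the comments is only what such a schema might look like.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical CQL that this sketch mimics (all names are illustrative):
#   CREATE TABLE facts (
#       fact_id text PRIMARY KEY,
#       amount  double,
#       attrs   map<text, text>
#   );
#   UPDATE facts SET attrs['continent'] = 'South America' WHERE fact_id = 'f1';


@dataclass
class FactRow:
    """In-memory stand-in for one row: fixed columns plus a string-keyed map."""
    fact_id: str   # fixed, named column (the primary key)
    amount: float  # fixed, named column
    attrs: Dict[str, str] = field(default_factory=dict)  # the "extensible columns"


def set_attr(row: FactRow, name: str, value: str) -> None:
    """Add or overwrite one map entry, like SET attrs['name'] = 'value'."""
    row.attrs[name] = value


row = FactRow(fact_id="f1", amount=120.0)
set_attr(row, "country", "Argentina")
set_attr(row, "continent", "South America")  # a new "column" without a schema change
print(sorted(row.attrs))
```

One design point to keep in mind: every value in such a map shares the single declared value type, so measures of mixed types would need separate maps or a text encoding.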



-- Jack Krupansky

On Tue, Mar 1, 2016 at 11:22 AM, Andrés Ivaldi  wrote:

> Jonathan, thanks for the link.
> I believe it may be a good fit for the data store part, because it is fast for I/O
> and handles time series; the analytics could be done with Apache Ignite and/or
> Apache Spark.
> What worries me is that it looks very complex to create the structure for
> each fact table and then extend it.
>
> regards.


Re: Cassandra Ussages

2016-03-01 Thread Andrés Ivaldi
Jonathan, thanks for the link.
I believe it may be a good fit for the data store part, because it is fast for I/O
and handles time series; the analytics could be done with Apache Ignite and/or
Apache Spark.
What worries me is that it looks very complex to create the structure for each
fact table and then extend it.

regards.

On Sun, Feb 28, 2016 at 12:28 PM, Jonathan Haddad  wrote:

> Cassandra is primarily used as an OLTP database, not analytics. You should
> watch this 30 min video discussing Cassandra core concepts (coming from a
> relational background):
> https://academy.datastax.com/courses/ds101-introduction-cassandra


-- 
Ing. Ivaldi Andres


Re: Cassandra Ussages

2016-02-28 Thread Jonathan Haddad
Cassandra is primarily used as an OLTP database, not analytics. You should
watch this 30 min video discussing Cassandra core concepts (coming from a
relational background):
https://academy.datastax.com/courses/ds101-introduction-cassandra

On Sun, Feb 28, 2016 at 5:40 AM Andrés Ivaldi wrote:

> Hello. At my work we are looking for new technologies for an analysis
> engine, and we are evaluating different technologies; one of them is
> Cassandra as our data repository.
>
> Currently we can execute analysis queries against an OLAP cube and an RDBMS,
> using MSSQL as our data repository. The cube is obsolete, and the SQL Server
> engine is slow as a data repository.
>
> I don't know much about Cassandra. I have read some books, and it looks like
> a good fit for what we need, but there are some things that look like a
> problem for us.
>
> Our engine is designed to be scalable, flexible and dynamic: any user can
> add new dimensions or measures from any source. All the data is stored in the
> cube (this is fixed data) and MSSQL (dynamic data), so we have decoupled
> tables with the dimension values.
>
>
> OK, with that context given, I'd like to clear up some doubts:
>
> - Am I able to flatten the table with all the possible dimension values into
> Cassandra, creating the PK over the dimension columns? Will this give me
> the "sensation" of pivoting data over the PK columns? If so, what if I
> want to choose the order of the columns, or add or remove some of them?
> - Is it possible to extend the values of a row dynamically? What we often do
> is join a row against a mapped external data value to extend the
> dimension's hierarchical value structure (i.e. state->Country->Continent).
>
> I know we can do some of these things in the core of our engine, like the
> dimension extension of the values or reducing columns, but as we are
> evaluating different technologies it is good to know.
>
> Regards!!
>
>
> --
> Ing. Ivaldi Andres
>


Cassandra Ussages

2016-02-28 Thread Andrés Ivaldi
Hello. At my work we are looking for new technologies for an analysis
engine, and we are evaluating different technologies; one of them is
Cassandra as our data repository.

Currently we can execute analysis queries against an OLAP cube and an RDBMS,
using MSSQL as our data repository. The cube is obsolete, and the SQL Server
engine is slow as a data repository.

I don't know much about Cassandra. I have read some books, and it looks like
a good fit for what we need, but there are some things that look like a problem
for us.

Our engine is designed to be scalable, flexible and dynamic: any user can
add new dimensions or measures from any source. All the data is stored in the
cube (this is fixed data) and MSSQL (dynamic data), so we have decoupled
tables with the dimension values.


OK, with that context given, I'd like to clear up some doubts:

- Am I able to flatten the table with all the possible dimension values into
Cassandra, creating the PK over the dimension columns? Will this give me
the "sensation" of pivoting data over the PK columns? If so, what if I
want to choose the order of the columns, or add or remove some of them?
- Is it possible to extend the values of a row dynamically? What we often do
is join a row against a mapped external data value to extend the
dimension's hierarchical value structure (i.e. state->Country->Continent).
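
The two questions above can be pictured with a small sketch. It is plain Python with made-up data, not Cassandra code: flatten turns a cube keyed by dimension tuples into flat rows (the dimension values together identify the row, much as a composite primary key might), and extend_hierarchy adds the parent levels from lookup tables. All names and sample values here are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical cube: (state, product) -> measure value.
cube: Dict[Tuple[str, str], float] = {
    ("Cordoba", "widgets"): 10.0,
    ("Texas", "widgets"): 7.5,
}

# Hypothetical external mappings used to extend the state dimension upward.
state_to_country = {"Cordoba": "Argentina", "Texas": "USA"}
country_to_continent = {"Argentina": "South America", "USA": "North America"}


def flatten(cells: Dict[Tuple[str, ...], float],
            dim_names: List[str], measure: str) -> List[dict]:
    """Return one flat row per cube cell, with one column per dimension."""
    rows = []
    for key, value in cells.items():
        row = dict(zip(dim_names, key))  # dimension columns
        row[measure] = value             # measure column
        rows.append(row)
    return rows


def extend_hierarchy(rows: List[dict]) -> List[dict]:
    """Add country and continent columns by looking up each row's state."""
    out = []
    for row in rows:
        country = state_to_country[row["state"]]
        out.append({**row,
                    "country": country,
                    "continent": country_to_continent[country]})
    return out


rows = extend_hierarchy(flatten(cube, ["state", "product"], "sales"))
for r in rows:
    print(r)
```

In this picture the hierarchy lookup happens once, before the rows are written, since Cassandra itself does not perform joins at query time.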

I know we can do some of these things in the core of our engine, like the
dimension extension of the values or reducing columns, but as we are
evaluating different technologies it is good to know.

Regards!!


-- 
Ing. Ivaldi Andres