Hi all,
I have a use case with a very large data set that I am not able to model in Cassandra.
Table name : MetricResult
Sample Data :
Metric=Sales, Time=Month, Period=Jan-10, Tag=U.S.A, Tag=Pen, Value=10
Metric=Sales, Time=Month, Period=Jan-10, Tag=U.S.A, Tag=Pencil, Value=20
Metric=Sales, Time=Month, Period=Feb-10, Tag=U.S.A, Tag=Pen, Value=30
Metric=Sales, Time=Month, Period=Feb-10, Tag=U.S.A, Tag=Pencil, Value=10
Metric=Sales, Time=Month, Period=Feb-10, Tag=India, Value=90
Metric=Sales, Time=Year, Period=2010, Tag=U.S.A, Value=70
Metric=Cost, Time=Year, Period=2010, Tag=CPU, Value=8000
Metric=Cost, Time=Year, Period=2010, Tag=RAM, Value=4000
Metric=Cost, Time=Year, Period=2011, Tag=CPU, Value=9000
Metric=Resource, Time=Week, Period=Week1-2013, Value=100
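So in the above case I have the following requirements:
Time-series data, i.e. the Time and Period columns
Dynamic columns, i.e. the Tag column (a row can have zero, one or several Tags)
Indexing on the dynamic columns, i.e. the Tag column
Aggregations: SUM, AVERAGE
Overwrite: if a value arrives again for the same Metric, Time, Period and Tag
combination, it replaces the existing value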
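
To make the overwrite and aggregation requirements concrete, here is a rough in-memory Python sketch. It is not a Cassandra schema, only an illustration of the behaviour I expect. The names `store`, `upsert`, `total` and `average` are made up for this example, and I am assuming a row is identified by Metric, Time, Period and the set of its Tags.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

# Assumed row identity: (metric, time, period, set of tags).
Key = Tuple[str, str, str, FrozenSet[str]]

# Hypothetical in-memory stand-in for the MetricResult table.
store: Dict[Key, float] = {}


def upsert(metric: str, time: str, period: str,
           tags: Iterable[str], value: float) -> None:
    """Write a value, replacing any earlier value stored under the same key."""
    store[(metric, time, period, frozenset(tags))] = value


def total(metric: str, time: str) -> float:
    """SUM of all values for one metric at one time granularity."""
    return sum(v for (m, t, _p, _tags), v in store.items()
               if m == metric and t == time)


def average(metric: str, time: str) -> float:
    """AVERAGE of all values for one metric at one time granularity."""
    values = [v for (m, t, _p, _tags), v in store.items()
              if m == metric and t == time]
    return sum(values) / len(values) if values else 0.0


upsert("Sales", "Month", "Jan-10", ["U.S.A", "Pen"], 10)
upsert("Sales", "Month", "Jan-10", ["U.S.A", "Pencil"], 20)
# Same Metric, Time, Period and Tags as the first call: replaces the 10.
upsert("Sales", "Month", "Jan-10", ["U.S.A", "Pen"], 15)
```

After these three calls the store holds two rows, not three, because the third write lands on the same key as the first.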
Queries I need to support:
--------------------------------------
a) Give data for Metric=Sales AND Time=Month
O/P: 5 rows
b) Give data for Metric=Sales AND Time=Month AND Period=Jan-10
O/P: 2 rows
c) Give data for Metric=Sales AND Tag=U.S.A
O/P: 5 rows
d) Give data for Metric=Sales AND Period=Jan-10 AND Tag=U.S.A AND Tag=Pen
O/P: 1 row
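
The expected results above can be checked with a small Python sketch that loads the sample data into memory and filters it. This only pins down the query semantics I am after (in particular, that several Tag conditions are ANDed together); it says nothing about how the data should be laid out in Cassandra. `Row`, `make_row`, `ROWS` and `query` are names invented for this example.

```python
from typing import FrozenSet, Iterable, List, NamedTuple, Optional


class Row(NamedTuple):
    metric: str
    time: str
    period: str
    tags: FrozenSet[str]
    value: int


def make_row(metric: str, time: str, period: str,
             tags: Iterable[str], value: int) -> Row:
    return Row(metric, time, period, frozenset(tags), value)


# The sample data from this post.
ROWS: List[Row] = [
    make_row("Sales", "Month", "Jan-10", ["U.S.A", "Pen"], 10),
    make_row("Sales", "Month", "Jan-10", ["U.S.A", "Pencil"], 20),
    make_row("Sales", "Month", "Feb-10", ["U.S.A", "Pen"], 30),
    make_row("Sales", "Month", "Feb-10", ["U.S.A", "Pencil"], 10),
    make_row("Sales", "Month", "Feb-10", ["India"], 90),
    make_row("Sales", "Year", "2010", ["U.S.A"], 70),
    make_row("Cost", "Year", "2010", ["CPU"], 8000),
    make_row("Cost", "Year", "2010", ["RAM"], 4000),
    make_row("Cost", "Year", "2011", ["CPU"], 9000),
    make_row("Resource", "Week", "Week1-2013", [], 100),
]


def query(metric: str, time: Optional[str] = None,
          period: Optional[str] = None,
          tags: Iterable[str] = ()) -> List[Row]:
    """Return rows matching the metric plus any optional filters.

    A row matches the tag filter only if it carries every requested tag.
    """
    wanted = frozenset(tags)
    return [
        r for r in ROWS
        if r.metric == metric
        and (time is None or r.time == time)
        and (period is None or r.period == period)
        and wanted <= r.tags
    ]


result_a = query("Sales", time="Month")
result_b = query("Sales", time="Month", period="Jan-10")
result_c = query("Sales", tags=["U.S.A"])
result_d = query("Sales", period="Jan-10", tags=["U.S.A", "Pen"])
```

A full scan like this is of course exactly what I cannot afford at the real data volume; the sketch is only meant to show which rows each query should return.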
This table can hold terabytes of data, and a single Metric and Period
combination can have millions of rows.
Please suggest how to design/model this table in Cassandra. If Cassandra has
limitations that prevent this, please suggest a technology better suited to it.
Thanks
Naresh