[jira] [Issue Comment Edited] (CASSANDRA-571) API for requesting sub-slices of a range of supercolumns

Caleb William Rackliffe (Issue Comment Edited) (JIRA) Sun, 13 Nov 2011 00:52:18 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13149253#comment-13149253
 ]


Caleb William Rackliffe edited comment on CASSANDRA-571 at 11/13/11 8:50 AM:
-----------------------------------------------------------------------------

Cassandra seems to perform well as a time-series database, but it could also 
serve as a "bi-temporal" time-series database if this feature were 
implemented.

For instance, let's say I want to store time-series data corresponding to major 
macro-economic measures like GDP, PPI, and CPI.  These measures are frequently 
revised months after they are first published, so I may want to know what they 
looked like "as of" a specific date.  A more concrete example might be that I 
want to know the past year of monthly CPI (consumer price index) numbers "as 
of" May 31st.  If the data point for May 1st is revised on June 1st, I can 
query CPI as of June 2nd and get a different series.

If I store the data along the axis of "as of" dates, I may have to replicate a 
great deal of data across columns.  If I decide to give every series its own 
CF, that may be unwieldy, but at least I could make my keys "observation dates" 
and my CF columns "as of" dates.
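
As a minimal sketch of that layout, the following uses plain Python dicts as a
stand-in for one column family per series; it is not a Cassandra client API.
The names `cpi`, `value_as_of`, and `series_as_of`, the year 2011, the
publication dates, and the CPI figures are all invented placeholders. Each row
key is an observation date, each column name is an "as of" date, and an "as of"
query picks, per row, the latest column at or before the requested date.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical stand-in for a per-series CF (values are made-up placeholders):
# row key = observation date, column name = "as of" date, value = figure.
cpi = {
    date(2011, 4, 1): {date(2011, 5, 15): 224.9},
    # The May 1st data point is first published in May, then revised on June 1st.
    date(2011, 5, 1): {date(2011, 5, 20): 225.9, date(2011, 6, 1): 226.1},
}


def value_as_of(columns, as_of):
    """Return the latest value published on or before as_of, or None."""
    published = sorted(columns)
    i = bisect_right(published, as_of)
    return columns[published[i - 1]] if i else None


def series_as_of(rows, start, end, as_of):
    """Build the series for observation dates in [start, end] as known on as_of."""
    result = {}
    for obs in sorted(rows):
        if start <= obs <= end:
            value = value_as_of(rows[obs], as_of)
            if value is not None:
                result[obs] = value
    return result


before = series_as_of(cpi, date(2011, 1, 1), date(2011, 12, 31), date(2011, 5, 31))
after = series_as_of(cpi, date(2011, 1, 1), date(2011, 12, 31), date(2011, 6, 2))
```

Here `before` and `after` differ only in the May 1st entry, which is the
behaviour described above. Only revisions are stored, so no data is replicated
across columns, but every query has to touch one row per observation date.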

Am I just overlooking some other way to model the data to make these sorts of 
queries reasonable?
                
      was (Author: maedhroz):
    Cassandra seems like it could perform well as a time-series database, but 
it could also be used as a "bi-temporal" time-series database if this feature 
were implemented.

For instance, let's say I want to store time-series data corresponding to major 
macro-economic measures like GDP, PPI, and CPI.  These measures are frequently 
revised months after they are first published, so I may want to know what they 
looked like "as of" a specific date.  A more concrete example might be that I 
want to know the past year of monthly CPI (consumer price index) numbers "as 
of" May 31st.  If the data point for May 1st is revised on June 1st, I can 
query CPI as of June 2nd and get a different series.

If I store the data along the axis of "as of" dates, I may have to replicate a 
great deal of data across columns.  If I decide to give every series its own 
CF, that may be unwieldy, but at least I could make my keys "observation dates" 
and my CF columns "as of" dates.

Am I just overlooking some other way to model the data to make these sorts of 
queries reasonable?
                  
> API for requesting sub-slices of a range of supercolumns
> --------------------------------------------------------
>
>                 Key: CASSANDRA-571
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-571
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Priority: Minor
>
> Suhail Doshi wrote in a comment to CASSANDRA-570 (a different issue):
> Ability to slice a column and specify an exact super column key, for example:
> column_1 {
>    sc1: {}
> }
> column_2 {
>    sc1: {}
>    sc2: {}
> }
> Be able to slice by "column_1" to "column_2" but instead of grabbing every 
> column, grab only super column "sc1" from each? The reasoning is that it's 
> terrible to have to slice by column and get *every* super column and have it 
> held in memory for the client application.
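
To pin down the result the request above asks for, here is a minimal sketch in
Python over plain dicts. It illustrates the intended semantics only and is not
a Cassandra API: `sub_slice`, its parameters, and the extra `column_3` entry
are hypothetical, and the filtering shown here would have to happen on the
server for the client to avoid holding every entry in memory.

```python
# Hypothetical in-memory model of the example: outer name -> {inner name -> value}.
# "column_3" is not in the original example; it is added to show the range bound.
row = {
    "column_1": {"sc1": {}},
    "column_2": {"sc1": {}, "sc2": {}},
    "column_3": {"sc2": {}},
}


def sub_slice(data, start, end, wanted):
    """For outer names in [start, end], keep only the inner names in wanted."""
    out = {}
    for name in sorted(data):
        if start <= name <= end:
            out[name] = {k: v for k, v in data[name].items() if k in wanted}
    return out


result = sub_slice(row, "column_1", "column_2", {"sc1"})
```

With this data, `result` contains `column_1` and `column_2`, each holding only
`sc1`; `sc2` and `column_3` are never returned.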

