It depends on what you mean by downsampling. For example, if time is a
clustering key in your table, you can order the data DESC inside partitions
and then run a SELECT with PER PARTITION LIMIT (the last 10 entries in each
partition, for example).
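
To make that first option concrete, here is a minimal Python sketch that
mimics what a DESC clustering order combined with PER PARTITION LIMIT gives
you. It does not talk to Cassandra. The row layout (a partition key, a
timestamp used as clustering key, a value) and the name latest_per_partition
are assumptions made up for illustration. On a table created WITH CLUSTERING
ORDER BY (ts DESC), the CQL would look roughly like
SELECT * FROM readings PER PARTITION LIMIT 10; where "readings" and "ts" are
hypothetical names.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical row layout: (partition key, timestamp, value).
Row = Tuple[str, int, float]


def latest_per_partition(rows: Iterable[Row], limit: int) -> Dict[str, List[Row]]:
    """Return the `limit` newest rows of each partition, newest first."""
    partitions: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        partitions[row[0]].append(row)  # group rows by partition key
    return {
        # Sorting by timestamp descending stands in for CLUSTERING ORDER BY DESC;
        # the slice stands in for PER PARTITION LIMIT.
        key: sorted(part, key=lambda r: r[1], reverse=True)[:limit]
        for key, part in partitions.items()
    }


if __name__ == "__main__":
    sample = [("a", 1, 0.5), ("a", 3, 0.7), ("a", 2, 0.6), ("b", 10, 1.0)]
    print(latest_per_partition(sample, 2))
```

Note that this keeps the most recent rows per partition; it does not
aggregate or thin out older data.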
But if you would like to extract a totally random subset of data from an
existing table (which would correspond more to downsampling), then you
need to do this in an analytical layer on top of Cassandra (such as Apache
Spark). As far as I know, you cannot do this in Cassandra itself.
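
As a rough illustration of the random-subset case, the sketch below keeps
each row independently with a fixed probability. It is plain Python over
rows that have already been read out of the table, not Spark code and not
anything Cassandra provides; the function name sample_rows and its
parameters are assumptions for illustration. In Spark you would use its own
sampling facilities instead of a hand-written loop.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def sample_rows(rows: Iterable[T], fraction: float, seed: int = 0) -> List[T]:
    """Keep each row independently with probability `fraction`."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    return [row for row in rows if rng.random() < fraction]


if __name__ == "__main__":
    # Hypothetical input: 1000 row ids already fetched from the table.
    subset = sample_rows(range(1000), fraction=0.1, seed=42)
    print(len(subset), "rows kept out of 1000")
```

The size of the result is only approximately fraction times the input size,
since every row is decided independently.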
On Fri, Feb 23, 2018 at 9:44 PM, Akash Gangil <akashg1...@gmail.com> wrote:
> Hi Valentina,
> In that case, are there any well-defined ways to downsample data in
> C*?
> On Fri, Feb 23, 2018 at 11:36 AM, Valentina Crisan <
> valentina.cri...@gmail.com> wrote:
>> As far as I know, MVs are not intended to have a different TTL than
>> their base tables. A patch was released at some point to disallow setting
>> a TTL on an MV (https://issues.apache.org/jira/browse/CASSANDRA-12868).
>> MVs should inherit the TTL of the base table.
>> On Fri, Feb 23, 2018 at 6:42 PM, Akash Gangil <akashg1...@gmail.com>
>> wrote:
>>> I had a couple of questions:
>>> 1. Can I create a materialized view on a table with a TTL longer than
>>> that of the base table? For example, my materialized view TTL is 1 month
>>> while my base table TTL is 1 week.
>>> 2. In the above scenario, since the data in my base table would be gone
>>> after a week, would it impact data in the materialized view?
>>> My use case is that I have some time series data, which is stored in the
>>> base table by_minute, and I want to downsample it to by_month. So my base
>>> table stores by_minute data but my materialized view stores by_week data.