[ 
https://issues.apache.org/jira/browse/CASSANALYTICS-102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANALYTICS-102:
------------------------------------
    Description: Propose adding a time range filter to the Bulk Reader. The 
filter can improve Bulk Reader performance, especially for tables using 
TimeWindowCompactionStrategy, when analytics users want to skip SSTables 
outside the required time window. Users will be able to set the start and end 
timestamps of the SSTables they are interested in through Spark options. 
Internally, a time range filter is created from those options and passed to 
SSTableReader. SSTables are filtered by their min and max timestamps, so the 
data files of excluded SSTables are not streamed.  (was: Creating this MTC 
request to add time range filter in Bulk Reader. This filter can improve the 
performance of bulk reader especially for tables using 
TimeWindowCompactionStrategy, when analytics users want to filter out SSTables 
outside the required time window. Analytics users will be able to set start and 
end timestamp of SSTable they are interested in with spark options. Internally 
a time range filter is created from the options and passed to SSTableReader. We 
filter out SSTables with min and max SSTable timestamp and avoid streaming data 
files.)

> Add TimeRangeFilter to filter out SSTables outside given time window
> --------------------------------------------------------------------
>
>                 Key: CASSANALYTICS-102
>                 URL: https://issues.apache.org/jira/browse/CASSANALYTICS-102
>             Project: Apache Cassandra Analytics
>          Issue Type: New Feature
>          Components: Reader
>            Reporter: Saranya Krishnakumar
>            Assignee: Saranya Krishnakumar
>            Priority: Normal
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Propose adding a time range filter to the Bulk Reader. The filter can 
> improve Bulk Reader performance, especially for tables using 
> TimeWindowCompactionStrategy, when analytics users want to skip SSTables 
> outside the required time window. Users will be able to set the start and 
> end timestamps of the SSTables they are interested in through Spark 
> options. Internally, a time range filter is created from those options and 
> passed to SSTableReader. SSTables are filtered by their min and max 
> timestamps, so the data files of excluded SSTables are not streamed.
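
The filtering described in the issue can be sketched roughly as follows. This
is a minimal illustration, not the actual Cassandra Analytics code: the class
name TimeRangeFilterSketch, the nested TimeRangeFilter, the option keys, and
the choice of an inclusive range are all assumptions made for the example. It
shows a filter built from string options and an overlap check against an
SSTable's min and max timestamps; an SSTable that does not overlap the
requested window would be skipped before its data file is streamed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; names and option keys are illustrative assumptions,
// not the real Cassandra Analytics API.
final class TimeRangeFilterSketch {
    // Assumed option keys; the actual Spark option names may differ.
    static final String START_OPTION = "sstable_start_timestamp";
    static final String END_OPTION = "sstable_end_timestamp";

    /** Inclusive time window [start, end], in the same unit as SSTable timestamps. */
    static final class TimeRangeFilter {
        final long start;
        final long end;

        TimeRangeFilter(long start, long end) {
            if (start > end) {
                throw new IllegalArgumentException("start must not exceed end");
            }
            this.start = start;
            this.end = end;
        }

        /** Builds a filter from string options; a missing bound leaves that side open. */
        static TimeRangeFilter fromOptions(Map<String, String> options) {
            long start = options.containsKey(START_OPTION)
                    ? Long.parseLong(options.get(START_OPTION)) : Long.MIN_VALUE;
            long end = options.containsKey(END_OPTION)
                    ? Long.parseLong(options.get(END_OPTION)) : Long.MAX_VALUE;
            return new TimeRangeFilter(start, end);
        }

        /** True when the SSTable's [min, max] timestamp span intersects the window. */
        boolean overlaps(long sstableMinTimestamp, long sstableMaxTimestamp) {
            return sstableMinTimestamp <= end && sstableMaxTimestamp >= start;
        }
    }

    public static void main(String[] args) {
        Map<String, String> options = new HashMap<>();
        options.put(START_OPTION, "1000");
        options.put(END_OPTION, "2000");
        TimeRangeFilter filter = TimeRangeFilter.fromOptions(options);

        System.out.println(filter.overlaps(500, 999));   // false: entirely before the window
        System.out.println(filter.overlaps(900, 1100));  // true: straddles the start
        System.out.println(filter.overlaps(2001, 3000)); // false: entirely after the window
    }
}
```

Because the check uses only per-SSTable min and max timestamps, an SSTable
that passes may still contain cells outside the window. The sketch only
decides which SSTables can be skipped entirely.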



