C. Scott Andreas updated CASSANDRA-5741:
    Component/s: Secondary Indexes

> Provide a way to disable automatic index rebuilds during bulk loading
> ---------------------------------------------------------------------
>                 Key: CASSANDRA-5741
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5741
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Secondary Indexes
>    Affects Versions: 1.2.6
>            Reporter: Jim Zamata
>            Priority: Major
>
> When using the BulkLoadOutputFormat, the actual streaming of the SSTables into
> Cassandra is fast, but the index rebuilds can take several minutes. Cassandra
> does not send the response until all of the rebuilds for a streaming
> session complete. Because the record writer streams the files in its close
> method, this causes the tasks to appear to hang at 100%. If the rebuilding
> process takes too long, the tasks can actually time out.
>
> Many SQL databases provide bulk insert utilities that disable index updates
> to allow large amounts of data to be added quickly. An option to disable
> automatic index rebuilds during bulk loading would serve a similar purpose.
>
> An alternative might be an option that would allow the session to return once
> the SSTables have been successfully imported, without waiting for the index
> builds to complete. However, I have noticed heavy CPU loads during the index
> rebuilds, so bulk load performance might be better if this step could be
> deferred until after all of the data is loaded.
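
The timing problem described above can be illustrated with a toy model. This is only a sketch, not Cassandra or Hadoop code: the Session type, the helper functions, and all of the durations and the timeout value are hypothetical and chosen purely for illustration. It compares how long a task waits for the response when the index rebuild blocks the streaming session with how long it waits when the rebuild is deferred.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """One streaming session (hypothetical durations, in seconds)."""
    stream_seconds: float  # time to stream the SSTables
    index_seconds: float   # time to rebuild secondary indexes for them


def ack_delay_blocking(s: Session) -> float:
    # Behaviour described in the report: the response is sent only
    # after the index rebuilds for the session complete.
    return s.stream_seconds + s.index_seconds


def ack_delay_deferred(s: Session) -> float:
    # Proposed alternative: respond once the SSTables are imported
    # and run the index rebuild later.
    return s.stream_seconds


def times_out(delay: float, task_timeout: float) -> bool:
    # A task that waits longer than its timeout is treated as failed.
    return delay > task_timeout


# Illustrative numbers only; not measurements.
TASK_TIMEOUT = 600.0
sessions = [Session(20.0, 300.0), Session(30.0, 700.0)]

for s in sessions:
    blocking = ack_delay_blocking(s)
    deferred = ack_delay_deferred(s)
    print(
        f"blocking={blocking:.0f}s timeout={times_out(blocking, TASK_TIMEOUT)} | "
        f"deferred={deferred:.0f}s timeout={times_out(deferred, TASK_TIMEOUT)}"
    )
```

In this model the wait under the blocking behaviour grows with the index rebuild time, so a slow rebuild alone can push a task past its timeout, while the deferred variant depends only on the streaming time. The model says nothing about the CPU cost of the rebuild itself, which the reporter raises as a separate concern.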

This message was sent by Atlassian JIRA

