[ 
https://issues.apache.org/jira/browse/CASSANDRA-18714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-18714:
----------------------------------------
    Description: 
{{CQLSSTableWriter}} currently has no way of writing any secondary indexes 
inline as it writes the core SSTable components. With SAI, this has become 
a tractable problem, and we should be able to enhance both it and 
{{SSTableImporter}} to handle cases where we might want to write SSTables 
somewhere in bulk (and in parallel) and then import them without waiting for 
index building on import. It would require the following changes:

1.) {{CQLSSTableWriter}} must accept 2i definitions on top of its current table 
schema definition. Once added to the schema, any {{ColumnFamilyStore}} 
instances opened will have those 2i defined in their index managers.

2.) All {{AbstractSSTableSimpleWriter}} instances must register index groups, 
allowing the proper {{SSTableFlushObservers}} to be attached to 
{{SSTableWriter}}. Once this is done, SAI (and any other SSTable-attached 
indexes) components will be built incrementally along w/ the SSTable data file, 
and will be finalized when the newly written SSTable is finalized.

3.) Provide an example (in a unit test?) of how a third-party tool might, 
assuming access to the right C* JAR, validate/checksum SAI components outside 
C* proper.

4.) {{SSTableImporter}} should have two new options:
    a.) an option that fails import if any SSTable-attached 2i must be built 
(i.e. has not already been built and brought along w/ the other new SSTable 
components)
    b.) an option that allows us to bypass full checksum validation on 
imported/already-built SSTable-attached indexes (assuming they have just been 
written by {{CQLSSTableWriter}})

  was:
{{CQLSSTableWriter}} currently has no way of writing any secondary indexes 
inline as it writes the core SSTable components. With SAI, this has become 
tractable problem, and we should be able to enhance both it and 
{{SSTableImporter}} to handle cases where we might want to write SSTables 
somewhere in bulk (and in parallel) and then import them without waiting for 
index building on import. It would require the following changes:

1.) {{CQLSSTableWriter}} must accept 2i definitions on top of its current table 
schema definition. Once added to the schema, any {{ColumnFamilyStore}} 
instances opened will have those 2i defined in their index managers.

2.) All {{AbstractSSTableSimpleWrite}}r instances must register index groups, 
allowing the proper {{SSTableFlushObservers}} to be attached to 
{{SSTableWriter}}. Once this is done, SAI (and any other SSTable-attached 
indexes) components will be built incrementally along w/ the SSTable data file, 
and will be finalized when the newly written SSTable is finalized.

3.) Provide an example (in a unit test?) of how a third-party tool might, 
assuming access to the right C* JAR, validate/checksum SAI components outside 
C* proper.

4.) {{SSTableImporter}} should have two new options:
    a.) an option that fails import if any SSTable-attached 2i must be built 
(i.e. has not already been built and brought along w/ the other new SSTable 
components)
    b.) an option that allows us to bypass full checksum validation on 
imported/already-built SSTable-attached indexes (assuming they have just been 
written by {{CQLSSTableWriter}})


> Expand CQLSSTableWriter to write SSTable-attached secondary indexes
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-18714
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18714
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Feature/SAI, Tool/bulk load
>            Reporter: Caleb Rackliffe
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>             Fix For: 5.x
>
>
> {{CQLSSTableWriter}} currently has no way of writing any secondary indexes 
> inline as it writes the core SSTable components. With SAI, this has become 
> a tractable problem, and we should be able to enhance both it and 
> {{SSTableImporter}} to handle cases where we might want to write SSTables 
> somewhere in bulk (and in parallel) and then import them without waiting for 
> index building on import. It would require the following changes:
> 1.) {{CQLSSTableWriter}} must accept 2i definitions on top of its current 
> table schema definition. Once added to the schema, any {{ColumnFamilyStore}} 
> instances opened will have those 2i defined in their index managers.
> 2.) All {{AbstractSSTableSimpleWriter}} instances must register index groups, 
> allowing the proper {{SSTableFlushObservers}} to be attached to 
> {{SSTableWriter}}. Once this is done, SAI (and any other SSTable-attached 
> indexes) components will be built incrementally along w/ the SSTable data 
> file, and will be finalized when the newly written SSTable is finalized.
> 3.) Provide an example (in a unit test?) of how a third-party tool might, 
> assuming access to the right C* JAR, validate/checksum SAI components outside 
> C* proper.
> 4.) {{SSTableImporter}} should have two new options:
>     a.) an option that fails import if any SSTable-attached 2i must be built 
> (i.e. has not already been built and brought along w/ the other new SSTable 
> components)
>     b.) an option that allows us to bypass full checksum validation on 
> imported/already-built SSTable-attached indexes (assuming they have just been 
> written by {{CQLSSTableWriter}})
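
To make item 2 concrete, here is a minimal, self-contained sketch of the observer idea: a writer forwards each appended row to registered observers, so an index is built incrementally alongside the data file and sealed when the writer finishes. Every type here ({{FlushObserver}}, {{ToyIndexBuilder}}, {{ToyWriter}}) is a hypothetical stand-in invented for illustration. None of it is Cassandra's actual API, and the real {{SSTableFlushObserver}} contract may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class FlushObserverSketch {
    /** Hypothetical stand-in for a flush observer; NOT Cassandra's real interface. */
    interface FlushObserver {
        void onRow(String key, String indexedValue);
        void onComplete();
    }

    /** Toy index: grows row by row and is sealed when the data file is finished. */
    static final class ToyIndexBuilder implements FlushObserver {
        private final Map<String, List<String>> postings = new TreeMap<>();
        private boolean complete;

        @Override
        public void onRow(String key, String indexedValue) {
            postings.computeIfAbsent(indexedValue, v -> new ArrayList<>()).add(key);
        }

        @Override
        public void onComplete() {
            complete = true;
        }

        boolean isComplete() {
            return complete;
        }

        List<String> lookup(String value) {
            return postings.getOrDefault(value, List.of());
        }
    }

    /** Toy writer: every appended row is forwarded to all registered observers. */
    static final class ToyWriter {
        private final List<FlushObserver> observers = new ArrayList<>();
        private final List<String> dataFile = new ArrayList<>();

        void register(FlushObserver observer) {
            observers.add(observer);
        }

        void append(String key, String value) {
            dataFile.add(key + "=" + value);
            for (FlushObserver o : observers) {
                o.onRow(key, value);
            }
        }

        /** Finalizing the writer finalizes the attached index components too. */
        void finish() {
            for (FlushObserver o : observers) {
                o.onComplete();
            }
        }

        int rowCount() {
            return dataFile.size();
        }
    }

    public static void main(String[] args) {
        ToyIndexBuilder index = new ToyIndexBuilder();
        ToyWriter writer = new ToyWriter();
        writer.register(index);
        writer.append("k1", "red");
        writer.append("k2", "blue");
        writer.append("k3", "red");
        writer.finish();
        System.out.println(index.lookup("red") + " complete=" + index.isComplete());
    }
}
```

The point of the sketch is only the shape of the interaction: the index never needs a second pass over the data, because it sees each row as it is written.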

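The two importer options in item 4 can be sketched as a small decision function. This is a toy model under stated assumptions: the flag names ({{failOnMissingIndex}}, {{validateIndexChecksum}}), the {{Outcome}} enum, and the use of a single CRC32 per index component are all hypothetical and chosen for illustration. They do not describe {{SSTableImporter}}'s real options or SAI's actual checksum layout.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class ImportOptionsSketch {
    /** Hypothetical outcomes of importing one SSTable. */
    enum Outcome {
        IMPORTED,
        IMPORTED_INDEX_BUILD_NEEDED,
        REJECTED_MISSING_INDEX,
        REJECTED_BAD_CHECKSUM
    }

    /** CRC32 over the whole component; a stand-in for "full checksum validation". */
    static long crc(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return crc.getValue();
    }

    /**
     * @param indexComponent        bytes of the pre-built index, or null if none came along
     * @param recordedChecksum      checksum recorded when the index was written
     * @param failOnMissingIndex    option (a): reject instead of building the index on import
     * @param validateIndexChecksum false models option (b): skip full validation
     */
    static Outcome importSSTable(byte[] indexComponent,
                                 long recordedChecksum,
                                 boolean failOnMissingIndex,
                                 boolean validateIndexChecksum) {
        if (indexComponent == null) {
            return failOnMissingIndex
                   ? Outcome.REJECTED_MISSING_INDEX
                   : Outcome.IMPORTED_INDEX_BUILD_NEEDED;
        }
        if (validateIndexChecksum && crc(indexComponent) != recordedChecksum) {
            return Outcome.REJECTED_BAD_CHECKSUM;
        }
        return Outcome.IMPORTED;
    }

    public static void main(String[] args) {
        byte[] index = "toy index bytes".getBytes(StandardCharsets.UTF_8);
        long checksum = crc(index);
        System.out.println(importSSTable(null, 0L, true, true));
        System.out.println(importSSTable(index, checksum, true, true));
        System.out.println(importSSTable(index, checksum + 1, true, false));
    }
}
```

Skipping validation trades safety for import speed, which is why the description limits it to indexes that have just been written by {{CQLSSTableWriter}}.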


--
This message was sent by Atlassian Jira
(v8.20.10#820010)

