[
https://issues.apache.org/jira/browse/CASSANDRA-10681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15011664#comment-15011664
]
Pavel Yaskevich edited comment on CASSANDRA-10681 at 11/18/15 8:17 PM:
-----------------------------------------------------------------------
Let me clarify a bit: it serializes a compilation of all of the indexes rather
than building them. That's what I have already mentioned: indexes are built
separately, but that's just the nature of the Index API. Since PerRowIndex is
no more, we have to build all of the indexes independently. This requires
passing through the data multiple times, but I don't think it's necessarily a
problem if we lift the assumption that indexes are built in one and only one
way (by merging sstables together and feeding the index a collated row) and
let API implementers decide how to build indexes based on the set of sstables.
When an SSTable is added via streaming, for example, CASSANDRA-10678 would
take care of creating indexes for it in the case of SASI, and the Indexer API
in the case of standard indexes, so I don't really see a problem there. In the
case of side loading, a new compaction task per index is going to be triggered
to build such indexes if necessary, but we can't really get around that.
> make index building pluggable via IndexBuildTask
> ------------------------------------------------
>
> Key: CASSANDRA-10681
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10681
> Project: Cassandra
> Issue Type: Sub-task
> Components: Local Write-Read Paths
> Reporter: Pavel Yaskevich
> Assignee: Pavel Yaskevich
> Priority: Minor
> Labels: sasi
> Fix For: 3.x
>
>
> Currently index building assumes one and only one way to build all of the
> indexes: through SecondaryIndexBuilder, which merges all of the sstables
> together, collates columns, etc. This works fine for built-in indexes but not
> for SASI, since it attaches to every SSTable individually. We need an
> "IndexBuildTask" interface (based on CompactionInfo.Holder) to be returned
> from Index on demand, to give SI interface implementers the power to decide
> how the build should work. This might be less efficient for CassandraIndex,
> since it effectively means that collation will have to be done multiple times
> on the same data, but it is nevertheless a good compromise for a clean
> interface to the outside world.
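
To illustrate the proposal, here is a minimal, self-contained sketch of how an
index could hand back its own build task. It does not use Cassandra's actual
classes: ProgressHolder is a hypothetical stand-in for CompactionInfo.Holder,
and Index, buildTaskFor, CountingTask, MergingIndex and PerSSTableIndex are
names assumed for illustration only. Only the IndexBuildTask name comes from
the description above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IndexBuildSketch {

    /** Hypothetical stand-in for CompactionInfo.Holder: reports build progress. */
    public interface ProgressHolder {
        long completed();
        long total();
    }

    /** Task returned by an index on demand; the index decides how the build runs. */
    public interface IndexBuildTask extends ProgressHolder {
        void build();
    }

    /** Hypothetical slice of an index interface that hands out its own build task. */
    public interface Index {
        String name();
        IndexBuildTask buildTaskFor(List<String> sstables);
    }

    /** Counts units of work; each unit stands in for one pass over some data. */
    public static final class CountingTask implements IndexBuildTask {
        private final long total;
        private long completed;

        CountingTask(long total) { this.total = total; }

        @Override public void build() {
            while (completed < total) completed++; // placeholder for real index writes
        }
        @Override public long completed() { return completed; }
        @Override public long total() { return total; }
    }

    /** Built-in style: merge all sstables and index the collated rows in one pass. */
    public static final class MergingIndex implements Index {
        @Override public String name() { return "merging"; }
        @Override public IndexBuildTask buildTaskFor(List<String> sstables) {
            return new CountingTask(sstables.isEmpty() ? 0 : 1);
        }
    }

    /** SASI style: one unit of work per sstable, since the index attaches to each. */
    public static final class PerSSTableIndex implements Index {
        @Override public String name() { return "per-sstable"; }
        @Override public IndexBuildTask buildTaskFor(List<String> sstables) {
            return new CountingTask(sstables.size());
        }
    }

    /** Runs every index's own task; each index passes over the data independently. */
    public static List<String> buildAll(List<Index> indexes, List<String> sstables) {
        List<String> report = new ArrayList<>();
        for (Index index : indexes) {
            IndexBuildTask task = index.buildTaskFor(sstables);
            task.build();
            report.add(index.name() + " " + task.completed() + "/" + task.total());
        }
        return report;
    }

    public static void main(String[] args) {
        List<String> sstables = Arrays.asList("sstable-1", "sstable-2", "sstable-3");
        List<Index> indexes = Arrays.asList(new MergingIndex(), new PerSSTableIndex());
        for (String line : buildAll(indexes, sstables)) {
            System.out.println(line);
        }
    }
}
```

The caller never needs to know whether an index merges sstables or works on
each one separately; it only asks for a task and runs it. The cost is the one
noted above: every index makes its own pass over the data.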