[ 
https://issues.apache.org/jira/browse/CASSANDRA-13075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15794760#comment-15794760
 ] 

Alex Petrov commented on CASSANDRA-13075:
-----------------------------------------

Good find! 

I might be misunderstanding the issue, but as far as I can tell, we have 
multiple alternatives if what we need is to ensure that 
{{Index.Indexer::begin}} and {{Index.Indexer::finish}} are called just once per 
partition:

  * piggyback on the {{readStatic}} boolean, which makes sure we index the 
static row just once
  * call them outside of the loop (since essentially we have both 
{{partitionColumns}} and {{writeGroup}} available before we have a partition 
page), so tying them to a page is optional

I may have misunderstood the part about {{PartitionIterators.getOnlyElement}}, 
but it also seems to me that this behaviour will be just fine if we take 
{{begin}} and {{finish}} out of the loop: {{insertRow}} for statics will be 
skipped on further iterations, and for partition rows it will also be skipped, 
since the partition is empty once the iterator is exhausted.
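
To make the second alternative concrete, here is a rough sketch of what I 
mean by taking the calls out of the paging loop. It is not the actual 
{{SecondaryIndexManager}} code: {{SketchIndexer}}, {{CountingIndexer}} and 
{{PagedIndexingSketch}} are made-up stand-ins, rows are plain strings, and 
static rows and the {{readStatic}} flag are not modelled. The trailing empty 
page plays the role of the empty partition we get back once the iterator is 
exhausted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for Index.Indexer, reduced to the calls discussed here.
interface SketchIndexer {
    void begin();
    void insertRow(String row);
    void finish();
}

// Counts callback invocations so the once-per-partition behaviour is observable.
final class CountingIndexer implements SketchIndexer {
    int begins;
    int finishes;
    final List<String> rows = new ArrayList<>();

    @Override public void begin() { begins++; }
    @Override public void insertRow(String row) { rows.add(row); }
    @Override public void finish() { finishes++; }
}

final class PagedIndexingSketch {
    // Each inner list is one page of the same partition. The last page may be
    // empty, like the empty partition returned for an exhausted iterator.
    static void indexPartition(List<List<String>> pages, SketchIndexer indexer) {
        indexer.begin();                 // once per partition, before paging starts
        for (List<String> page : pages) {
            for (String row : page) {    // an empty page contributes no rows
                indexer.insertRow(row);
            }
        }
        indexer.finish();                // once per partition, after the last page
    }

    public static void main(String[] args) {
        List<List<String>> pages = new ArrayList<>();
        pages.add(Arrays.asList("r1", "r2"));
        pages.add(Arrays.asList("r3"));
        pages.add(new ArrayList<>());    // trailing empty page

        CountingIndexer indexer = new CountingIndexer();
        indexPartition(pages, indexer);
        System.out.println("begin=" + indexer.begins
                + " finish=" + indexer.finishes
                + " rows=" + indexer.rows);
    }
}
```

With the calls hoisted like this, the number of pages (including the empty 
one at the end) no longer affects how often {{begin}} and {{finish}} run.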

> Indexer is not correctly invoked when building indexes over sstables
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-13075
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13075
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sergio Bossa
>            Assignee: Alex Petrov
>            Priority: Critical
>
> Following CASSANDRA-12796, {{SecondaryIndexManager#indexPartition()}} calls 
> each {{Indexer}} {{begin}} and {{finish}} methods multiple times per 
> partition (depending on the page size), as 
> {{PartitionIterators#getOnlyElement()}} returns an empty partition even when 
> the iterator is exhausted.
> This leads to bugs for {{Indexer}} implementations doing actual work in those 
>  methods, but even worse, it provides the {{Indexer}} the same input of an 
> empty partition containing only a non-live partition deletion, as the 
> {{Indexer#partitionDelete()}} method is *not* actually called.
> My proposed solution:
> 1) Stop the iteration before the empty partition is returned and ingested 
> into the {{Indexer}}.
> 2) Actually call the {{Indexer#partitionDelete()}} method inside 
> {{SecondaryIndexManager#indexPartition()}} (which requires using a filtered 
> iterator so it actually contains the deletion info).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
