[ 
https://issues.apache.org/jira/browse/PHOENIX-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15982230#comment-15982230
 ] 

Vincent Poon commented on PHOENIX-3806:
---------------------------------------

What I find is that if I create 100 versions of a row, 
NonTxIndexBuilder#createTimestampBatchesFromMutation creates 100 batches based 
on timestamp.

Then for each batch, we call addMutationsForBatch, which in step 3 does a loop 
and adds index updates for all timestamps up to the current batch's timestamp.  
All the index updates are DELETE updates.

What this means is that overall, the # of updates you have is the summation of 
the series 1...100.  So you have something like 5050 index updates, and because 
of the issues described above, you're sorting 5050 times.

And of course as you create more versions, the numbers quickly become 
unfeasible.

> IndexUpdateManager spending a lot of time sorting mutations on Index rebuild
> ----------------------------------------------------------------------------
>
>                 Key: PHOENIX-3806
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3806
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>
> Here's the stack trace. The Array contains 50001 Delete Mutations in this 
> case.
> It seems the code is sorting this over and over again.
> {code}
> Thread 170 (B.DefaultRpcServer.handler=67,queue=7,port=60020):
>   State: RUNNABLE
>   Blocked count: 220598
>   Waited count: 377933
>   Stack:
>     java.util.TimSort.binarySort(TimSort.java:296)
>     java.util.TimSort.sort(TimSort.java:239)
>     java.util.Arrays.sort(Arrays.java:1438)
>     
> org.apache.phoenix.hbase.index.covered.update.SortedCollection.iterator(SortedCollection.java:78)
>     
> org.apache.phoenix.hbase.index.covered.update.IndexUpdateManager.fixUpCurrentUpdates(IndexUpdateManager.java:128)
>     
> org.apache.phoenix.hbase.index.covered.update.IndexUpdateManager.addIndexUpdate(IndexUpdateManager.java:115)
>     
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.addCurrentStateMutationsForBatch(NonTxIndexBuilder.java:333)
>     
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.addUpdateForGivenTimestamp(NonTxIndexBuilder.java:258)
>     
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.addMutationsForBatch(NonTxIndexBuilder.java:231)
>     
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.batchMutationAndAddUpdates(NonTxIndexBuilder.java:109)
>     
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.getIndexUpdate(NonTxIndexBuilder.java:71)
>     
> org.apache.phoenix.hbase.index.builder.IndexBuildManager$1.call(IndexBuildManager.java:137)
>     
> org.apache.phoenix.hbase.index.builder.IndexBuildManager$1.call(IndexBuildManager.java:133)
>     java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     
> com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:293)
>     
> com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:61)
>     
> org.apache.phoenix.hbase.index.parallel.BaseTaskRunner.submit(BaseTaskRunner.java:58)
>     
> org.apache.phoenix.hbase.index.parallel.BaseTaskRunner.submitUninterruptible(BaseTaskRunner.java:99)
>     
> org.apache.phoenix.hbase.index.builder.IndexBuildManager.getIndexUpdate(IndexBuildManager.java:144)
>     
> org.apache.phoenix.hbase.index.Indexer.preBatchMutateWithExceptions(Indexer.java:324)
> Thread 169 (B.DefaultRpcServer.handler=66,queue=6,port=60020):
> {code}
> [~jamestaylor]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to