[
https://issues.apache.org/jira/browse/HBASE-15014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15064965#comment-15064965
]
Elliott Clark edited comment on HBASE-15014 at 12/18/15 11:34 PM:
------------------------------------------------------------------
bq.How does it address issue?
{noformat}
ArrayList.removeAll is O(n^2) overall when nearly every item needs to be
removed: for each element it takes O(n) to find the item in the list of edits
to remove (ArrayList.contains), plus the cost of copying the remaining items
into the correct positions in the array.
So for files where just about every edit has already been flushed, we have a
time complexity of about: O(n) [finding edits to remove] + O(n^2) [removing
edits].
The solution is to change the algorithm to make a single pass and collect the
elements to keep, so we're at O(n).
{noformat}
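To illustrate the difference, here is a minimal sketch of the two approaches. It is not the attached patch: the class name, the method names, and the {{alreadyFlushed}} predicate are hypothetical stand-ins, and plain list elements stand in for cells.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration only; none of these names come from WALSplitter.
final class KeepFilterSketch {
    private KeepFilterSketch() {}

    /** Old shape: collect the edits to skip, then removeAll them. */
    static <T> List<T> filterWithRemoveAll(List<T> edits, Predicate<T> alreadyFlushed) {
        List<T> result = new ArrayList<>(edits);
        List<T> skipped = new ArrayList<>();
        for (T edit : result) {
            if (alreadyFlushed.test(edit)) {
                skipped.add(edit);
            }
        }
        // removeAll calls skipped.contains(...) for every element of result.
        // contains on an ArrayList is a linear scan, so this is roughly
        // O(n * m) for n edits and m skipped edits, i.e. O(n^2) when almost
        // everything has already been flushed.
        result.removeAll(skipped);
        return result;
    }

    /** New shape: one pass that copies only the edits to keep, O(n). */
    static <T> List<T> filterSinglePass(List<T> edits, Predicate<T> alreadyFlushed) {
        List<T> kept = new ArrayList<>(edits.size());
        for (T edit : edits) {
            if (!alreadyFlushed.test(edit)) {
                kept.add(edit);
            }
        }
        return kept;
    }
}
```

Both methods return the same elements in the same order for inputs like the one described here; only the second avoids the repeated {{contains}} scans seen in the stack trace.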
bq.This has to be public?
Nope, I missed that they were in the same package. Let me fix that and make one
other tweak.
> Fix filterCellByStore in WALsplitter is awful for performance
> -------------------------------------------------------------
>
> Key: HBASE-15014
> URL: https://issues.apache.org/jira/browse/HBASE-15014
> Project: HBase
> Issue Type: Bug
> Reporter: Elliott Clark
> Assignee: Elliott Clark
> Priority: Critical
> Attachments: HBASE-15014.patch
>
>
> Testing the latest 1.2 I see this when there is a regionserver that crashes.
> {code}
> Thread 921 (RS_LOG_REPLAY_OPS-hbase2698:16020-0-Writer-1):
> State: RUNNABLE
> Blocked count: 6354
> Waited count: 6249
> Stack:
> org.apache.hadoop.hbase.KeyValue.equals(KeyValue.java:1128)
> java.util.ArrayList.indexOf(ArrayList.java:317)
> java.util.ArrayList.contains(ArrayList.java:300)
> java.util.ArrayList.batchRemove(ArrayList.java:720)
> java.util.ArrayList.removeAll(ArrayList.java:690)
> org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink.filterCellByStore(WALSplitter.java:1529)
> org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink.append(WALSplitter.java:1557)
> org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.writeBuffer(WALSplitter.java:1113)
> org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.doRun(WALSplitter.java:1105)
> org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.run(WALSplitter.java:1075)
> Thread 920 (RS_LOG_REPLAY_OPS-hbase2698:16020-0-Writer-0):
> State: TIMED_WAITING
> Blocked count: 17560
> Waited count: 19695
> Stack:
> java.lang.Object.wait(Native Method)
> org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.doRun(WALSplitter.java:1093)
> org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.run(WALSplitter.java:1075)
> Thread 919 (RS_LOG_REPLAY_OPS-hbase2698:16020-0):
> State: TIMED_WAITING
> Blocked count: 115
> Waited count: 976
> Stack:
> java.lang.Object.wait(Native Method)
> org.apache.hadoop.hbase.wal.WALSplitter$EntryBuffers.appendEntry(WALSplitter.java:944)
> org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:365)
> org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:236)
> org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:104)
> org.apache.hadoop.hbase.regionserver.handler.WALSplitterHandler.process(WALSplitterHandler.java:72)
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> java.lang.Thread.run(Thread.java:745)
> {code}
> This has been going on for >10 mins.