[
https://issues.apache.org/jira/browse/HBASE-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12837146#action_12837146
]
ryan rawson commented on HBASE-2248:
------------------------------------
Could you please tell me where your 4k of memory quote is coming from?
The clone() is part deep, part shallow. The KeyValues aren't being cloned, but
in every other way the clone is a deep clone: it copies all the nodes! That
could be literally a million nodes! The number of nodes depends on your data
size: a 64MB memstore can accommodate 1.3m values if your KeyValue size is ~50
bytes. It can get even larger: if the memstore multiplier kicks in during a
pending snapshot, you could have 4m+ nodes in a snapshot and an oversized kvset.
Clone is not really viable; it needs to be rolled back. Furthermore, it doesn't
provide atomic protection anyway.
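
To make the numbers above concrete, here is a minimal Java sketch. It is not
HBase code: the class name, the helper estimateEntries, and the String/byte[]
map standing in for the kvset are all assumptions for illustration. It shows
the back-of-envelope entry count and that ConcurrentSkipListMap.clone() builds
a new map structure while sharing the value objects.

```java
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical illustration class, not part of HBase.
class CloneCostSketch {
    // Rough estimate of how many entries fit in a memstore, ignoring
    // per-node overhead (an assumption that makes the real count lower).
    static long estimateEntries(long memstoreBytes, long bytesPerKeyValue) {
        return memstoreBytes / bytesPerKeyValue;
    }

    public static void main(String[] args) {
        // 64MB at ~50 bytes per KeyValue: about 1.3 million entries.
        System.out.println(estimateEntries(64L * 1024 * 1024, 50)); // 1342177

        // Stand-in for the kvset: String keys and byte[] values.
        ConcurrentSkipListMap<String, byte[]> kvset = new ConcurrentSkipListMap<>();
        byte[] value = new byte[] {1, 2, 3};
        kvset.put("row1", value);
        kvset.put("row2", new byte[] {4});

        ConcurrentSkipListMap<String, byte[]> copy = kvset.clone();

        // The values are shared, not copied (the "shallow" part).
        System.out.println(copy.get("row1") == value); // true

        // The map structure is new (the "deep" part): every entry was
        // re-inserted into the copy, so later writes do not show up in it.
        kvset.put("row3", new byte[] {5});
        System.out.println(copy.containsKey("row3")); // false
    }
}
```

The cost that matters here is the second part: building the copy visits every
entry, so the work grows with the memstore size, not with the scan size.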
> New MemStoreScanner copies memstore for each scan, makes short scans slow
> -------------------------------------------------------------------------
>
> Key: HBASE-2248
> URL: https://issues.apache.org/jira/browse/HBASE-2248
> Project: Hadoop HBase
> Issue Type: Bug
> Affects Versions: 0.20.3
> Reporter: Dave Latham
> Fix For: 0.20.4
>
> Attachments: threads.txt
>
>
> HBASE-2037 introduced a new MemStoreScanner which triggers a
> ConcurrentSkipListMap.buildFromSorted clone of the memstore and snapshot when
> starting a scan.
> After upgrading to 0.20.3, we noticed a big slowdown in our use of short
> scans. Some of our data represents a time series. The data is stored in time
> series order, MR jobs often insert/update new data at the end of the series,
> and queries usually have to pick up some or all of the series. These are
> often scans of 0-100 rows at a time. To load one page, we'll observe about
> 20 such scans being triggered concurrently, and they take 2 seconds to
> complete. Doing a thread dump of a region server shows many threads in
> ConcurrentSkipListMap.buildFromSorted, which traverses the entire map of key
> values to copy it.
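
The scan pattern described in the report can be sketched in Java as follows.
This is an illustration only, not the HBase MemStoreScanner and not a proposed
patch: the class and the two helper methods are hypothetical, and String keys
stand in for KeyValues. It contrasts copying the whole map before a short scan
with iterating a tailMap view, which touches only the rows it returns. Note
that the view's iterator is weakly consistent, so it gives no atomic snapshot
either.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical illustration class, not part of HBase.
class ShortScanSketch {
    // Copies the entire map first, then reads up to `limit` keys.
    static List<String> scanByClone(ConcurrentSkipListMap<String, String> map,
                                    String startKey, int limit) {
        ConcurrentSkipListMap<String, String> copy = map.clone(); // visits every entry
        return scanByView(copy, startKey, limit);
    }

    // Reads up to `limit` keys through a tailMap view; nothing is copied.
    static List<String> scanByView(ConcurrentSkipListMap<String, String> map,
                                   String startKey, int limit) {
        List<String> out = new ArrayList<>();
        for (String key : map.tailMap(startKey, true).keySet()) {
            if (out.size() >= limit) {
                break;
            }
            out.add(key);
        }
        return out;
    }

    public static void main(String[] args) {
        ConcurrentSkipListMap<String, String> map = new ConcurrentSkipListMap<>();
        for (int i = 0; i < 1000; i++) {
            map.put(String.format("row%05d", i), "v" + i);
        }
        // A short scan over the end of the series: both return the same rows,
        // but only scanByClone pays for all 1000 entries up front.
        System.out.println(scanByClone(map, "row00990", 100).size()); // 10
        System.out.println(scanByView(map, "row00990", 100).size());  // 10
    }
}
```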