[ https://issues.apache.org/jira/browse/HBASE-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Purtell resolved HBASE-2959.
-----------------------------------
Resolution: Won't Fix
Cleaning up issue by resolving as Won't Fix based on discussion.
> Scanning always starts at the beginning of a row
> ------------------------------------------------
>
> Key: HBASE-2959
> URL: https://issues.apache.org/jira/browse/HBASE-2959
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.20.4, 0.20.5, 0.20.6, 0.89.20100621
> Reporter: Benoit Sigoure
>
> In HBASE-2248, the code in {{HRegion#get}} was changed like so:
> {code}
> -  private void get(final Store store, final Get get,
> -      final NavigableSet<byte []> qualifiers, List<KeyValue> result)
> -  throws IOException {
> -    store.get(get, qualifiers, result);
> +  /*
> +   * Do a get based on the get parameter.
> +   */
> +  private List<KeyValue> get(final Get get) throws IOException {
> +    Scan scan = new Scan(get);
> +
> +    List<KeyValue> results = new ArrayList<KeyValue>();
> +
> +    InternalScanner scanner = null;
> +    try {
> +      scanner = getScanner(scan);
> +      scanner.next(results);
> +    } finally {
> +      if (scanner != null)
> +        scanner.close();
> +    }
> +    return results;
>    }
> {code}
> So instead of doing a {{get}} straight on the {{Store}}, we now open a
> scanner. The problem is that we eventually end up in {{ScanQueryMatcher}},
> whose constructor does: {{this.startKey =
> KeyValue.createFirstOnRow(scan.getStartRow());}}. This means that if we
> have a very wide row (thousands of columns), the scanner has to go
> through thousands of {{KeyValue}}s before finding the right entry, because
> it always starts from the beginning of the row, whereas the old code did a
> {{get}} straight on the {{Store}}.
> This problem was under the radar for a while because the overhead isn't too
> unreasonable, but later on, {{incrementColumnValue}} was changed to do a
> {{get}} under the hood. At StumbleUpon we do thousands of ICVs per second, so
> thousands of times per second we're scanning some really wide rows. When a
> row is contended, this results in all the IPC threads being stuck on
> acquiring a row lock, while one thread is doing the ICV (albeit slowly due to
> the excessive scanning). When all IPC threads are stuck, the region server
> is unable to serve more requests.
> As a nice side effect, fixing this bug will make {{get}} and
> {{incrementColumnValue}} faster, as well as the first call to {{next}} on a
> scanner.
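
The scanning cost described in the report can be illustrated with a small, self-contained sketch. It does not use any HBase classes: the row is modeled as a sorted set of column qualifiers, and the class and method names (WideRowSeek, buildRow, cellsVisitedFromRowStart, cellsVisitedWithDirectSeek) are hypothetical, chosen only for this illustration. It compares how many cells are touched when a lookup starts at the first cell of the row versus when it seeks straight to the requested qualifier.

```java
import java.util.TreeSet;

public class WideRowSeek {

    /** Hypothetical stand-in for one wide row: qualifiers kept in sorted order. */
    public static TreeSet<String> buildRow(int columns) {
        TreeSet<String> row = new TreeSet<>();
        for (int i = 0; i < columns; i++) {
            // Zero-padded so that lexicographic order matches numeric order.
            row.add(String.format("col%05d", i));
        }
        return row;
    }

    /** Start at the first cell of the row and walk forward until the target is reached. */
    public static int cellsVisitedFromRowStart(TreeSet<String> row, String target) {
        int visited = 0;
        for (String qualifier : row) {
            visited++;
            if (qualifier.compareTo(target) >= 0) {
                break;
            }
        }
        return visited;
    }

    /** Seek directly to the first cell at or after the target qualifier. */
    public static int cellsVisitedWithDirectSeek(TreeSet<String> row, String target) {
        int visited = 0;
        for (String qualifier : row.tailSet(target, true)) {
            visited++;
            if (qualifier.compareTo(target) >= 0) {
                break;
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        TreeSet<String> row = buildRow(5000);
        String target = "col04999"; // last column of the row, the worst case
        System.out.println("from row start: " + cellsVisitedFromRowStart(row, target));
        System.out.println("direct seek:    " + cellsVisitedWithDirectSeek(row, target));
    }
}
```

With 5000 columns and the last column as the target, the walk from the row start touches 5000 cells while the direct seek touches 1. This only models the shape of the problem; the real cost in a region server also depends on store files, block caching, and the memstore, none of which are represented here.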