[ 
https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334504#comment-14334504
 ] 

Lars Hofhansl commented on HBASE-13082:
---------------------------------------

bq. how many column families
Just one.

bq. when we have a scanner stuck down deep inside an HRegion for 20 minutes at 
a time.
I agree. The scanner must have been stuck at the StoreScanner level anyway (per 
my analysis above).

bq. Ain't a scanner stuck for 20 minutes a different issue altogether than the 
one being solved here? If a scan disappears for 20 minutes trying to pull out a 
row, can't we do something like the Jonathan Lawlor chunking patch only we have 
it time based? We return a partial – even if empty – if scanning for a full 
minute say?

The problem is that in this case the scanner produced no result at all. I 
suppose we can time the scan in the StoreScanner and return an empty result 
(just as we would when we have exhausted the region). RegionScanner will do the 
right thing. But then, when I do the patch, the RegionScannerImpl also needs to 
periodically release the lock to give other threads a chance to continue with a 
flush.
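
A rough sketch of the time-based idea, to make it concrete. This is not 
StoreScanner code: the class TimeBoundedScan, the Batch type, the String 
"cells" and the injected clock are all made up for illustration. The only point 
is that the loop checks a deadline and hands back whatever it has, even if that 
is nothing, together with a flag saying there may be more to scan.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.LongSupplier;
import java.util.function.Predicate;

/** Hypothetical sketch of a time-bounded scan; not HBase's StoreScanner. */
final class TimeBoundedScan {

    /** Outcome of one call: the matches found so far and whether input may remain. */
    static final class Batch {
        final List<String> rows;
        final boolean moreToScan;

        Batch(List<String> rows, boolean moreToScan) {
            this.rows = rows;
            this.moreToScan = moreToScan;
        }
    }

    /**
     * Scans cells until the input is exhausted or the time budget is used up.
     * The clock is injected (System::nanoTime in real use) so the behaviour
     * can be exercised deterministically.
     */
    static Batch next(Iterator<String> cells, Predicate<String> matches,
                      long budgetNanos, LongSupplier clock) {
        long deadline = clock.getAsLong() + budgetNanos;
        List<String> out = new ArrayList<>();
        while (cells.hasNext()) {
            // Overflow-safe deadline comparison, as recommended for nanoTime values.
            if (clock.getAsLong() - deadline >= 0) {
                // Time is up: return a partial (possibly empty) batch so the
                // caller gets control back instead of waiting for a match.
                return new Batch(out, true);
            }
            String cell = cells.next();
            if (matches.test(cell)) {
                out.add(cell);
            }
        }
        return new Batch(out, false); // input exhausted
    }
}
```

A caller would keep calling next() while moreToScan is true, and can use the 
gap between calls to release whatever lock it holds.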

bq. It would be cool if we could do the lock on a Store-basis, especially given 
we can now flush at the Store level.
Where we place the lock is not so much the issue. It's how often we lock and 
unlock it.
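
To illustrate what I mean by "how often", here is a toy comparison. Nothing in 
it is HBase code: BatchedLocking, its methods and the counter are hypothetical, 
and the inner loop is a stand-in for reading a cell. It takes one lock per 
batch of cells instead of one per cell, and still releases the lock between 
batches so a waiting flush can get in. I use a fair ReentrantLock here on the 
assumption that fairness is what we would want to keep the flush from starving.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch: lock once per batch of cells rather than once per cell. */
final class BatchedLocking {
    // Fair mode hands the lock to the longest-waiting thread, so a flush that
    // is queued behind a scan gets its turn when the current batch ends.
    private final ReentrantLock lock = new ReentrantLock(true);
    private long acquisitions = 0;

    /** Scans totalCells cells, acquiring the lock once per batchSize cells. */
    long scan(int totalCells, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        long seen = 0;
        for (int start = 0; start < totalCells; start += batchSize) {
            lock.lock();
            try {
                acquisitions++;
                int end = Math.min(totalCells, start + batchSize);
                for (int i = start; i < end; i++) {
                    seen++; // stand-in for reading one cell
                }
            } finally {
                lock.unlock(); // the gap between batches lets a flush proceed
            }
        }
        return seen;
    }

    /** Runs the given action (for example swapping in new files) under the same lock. */
    void flush(Runnable swapFiles) {
        lock.lock();
        try {
            swapFiles.run();
        } finally {
            lock.unlock();
        }
    }

    /** Number of lock acquisitions made by scan() so far. */
    long acquisitions() {
        lock.lock();
        try {
            return acquisitions;
        } finally {
            lock.unlock();
        }
    }
}
```

With 1000 cells, a batch size of 1 costs 1000 lock/unlock pairs while a batch 
size of 100 costs 10, for the same work and the same lock placement.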


> Coarsen StoreScanner locks to RegionScanner
> -------------------------------------------
>
>                 Key: HBASE-13082
>                 URL: https://issues.apache.org/jira/browse/HBASE-13082
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>         Attachments: 13082.txt
>
>
> Continuing where HBASE-10015 left off.
> We can avoid locking (and memory fencing) inside StoreScanner by deferring to 
> the lock already held by the RegionScanner.
> In tests this shows quite a scan improvement and reduced CPU (the fences make 
> the cores wait for memory fetches).
> There are some drawbacks too:
> * All calls to RegionScanner need to remain synchronized
> * Implementors of coprocessors need to be diligent in following the locking 
> contract. For example, Phoenix does not lock RegionScanner.nextRaw() as 
> required in the documentation (not picking on Phoenix, this one is my fault 
> as I told them it's OK)
> * possible starving of flushes and compaction with heavy read load. 
> RegionScanner operations would keep getting the locks and the 
> flushes/compactions would not be able to finalize the set of files.
> I'll have a patch soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
