[ 
https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14964031#comment-14964031
 ] 

stack commented on HBASE-13082:
-------------------------------

bq. When we mark a file as compacted the confusion can be like whether this 
file is a file created out of compaction.

Yeah, that was my first thought too... same as Anoop.

Is that a new Chore per Store per Region?  Every two minutes seems like a long 
time to hold on to files?

Yeah, this is a good point by [~anoop.hbase]:

bq. What about flush? .....So we wont change the store file's heap during the 
scan is not really possible?

The other thing to consider is getting bulk loaded files in here while scans 
are going on. Seems like they could go in on open of a new scan. That'll work 
nicely. A file will be bulk loaded but won't be seen by ongoing scans.
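A minimal sketch of that swap-on-scan-open idea (hypothetical class and method names, not the real HBase ones): the store publishes its file set as an immutable snapshot that is swapped atomically, so a bulk-loaded (or compacted) file becomes visible only to scans opened after the swap, never to scans already in flight.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch only: a store whose file set is an immutable list swapped as a
// whole. A scanner pins the snapshot it saw at open; later swaps (bulk
// load, compaction result) never mutate a list an ongoing scan holds.
class FileSnapshotStore {
    private volatile List<String> files = Collections.emptyList();

    // Bulk load or compaction publishes a new immutable file list.
    synchronized void addFile(String file) {
        List<String> next = new ArrayList<>(files);
        next.add(file);
        files = Collections.unmodifiableList(next);
    }

    // A new scanner captures the current snapshot; no locking on the
    // read path, just a volatile read.
    List<String> openScannerSnapshot() {
        return files;
    }
}
```

The point of the copy-on-write list is that the scan path needs no lock at all: an open scan keeps reading its pinned list while the store swaps in a new one underneath.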

The hard thing then is what to do about flush (we've been here before!).

On flush, what if we let all Scans complete before letting go the snapshot? 
More memory pressure, but a simpler implementation.

Otherwise, flush registers it has happened at the region level. Scanners check 
for flush events every-so-often (we already have checkpoints in the scan to 
ensure we don't go over size or time constraints... could check at these times) 
and when they find one, they swap in the flushed file. When all Scanners have 
done this, then we let go the snapshot. This might be a different sort of 
event to the one described in the doc here... where we swap in compacted files 
on new scan creation... but yeah, implementation would be cleaner if swap in of 
flushed files and compacted files all happened in the one manner.
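The flush handshake above could be sketched roughly like this (all names hypothetical, not HBase code): a flush bumps a region-level sequence number, each scanner compares its last-seen sequence at its existing size/time checkpoints, swaps in the flushed file, and acks; the memstore snapshot is let go only once every registered scanner has caught up.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: region-level flush-event sequence plus a scanner
// checkpoint/ack protocol for releasing the memstore snapshot.
class FlushEventRegion {
    private final AtomicLong flushSeq = new AtomicLong(0);
    private final Set<Scanner> scanners = ConcurrentHashMap.newKeySet();
    private volatile boolean snapshotHeld = false;

    class Scanner {
        private volatile long seenSeq = flushSeq.get();
        Scanner() { scanners.add(this); }

        // Called at the scanner's periodic checkpoint (the same places we
        // already check size/time limits). Returns true if a flush was
        // observed and the flushed file was swapped in.
        boolean checkpoint() {
            long current = flushSeq.get();
            if (current != seenSeq) {
                // Real code would swap the flushed file into the scanner's
                // heap here before continuing.
                seenSeq = current;
                maybeReleaseSnapshot();
                return true;
            }
            return false;
        }

        void close() {
            scanners.remove(this);
            maybeReleaseSnapshot();
        }
    }

    // Flush registers that it has happened at the region level.
    void flushCompleted() {
        snapshotHeld = true;          // memstore snapshot now pinned
        flushSeq.incrementAndGet();
        maybeReleaseSnapshot();       // no scanners -> release immediately
    }

    private void maybeReleaseSnapshot() {
        if (!snapshotHeld) return;
        long current = flushSeq.get();
        for (Scanner s : scanners) {
            if (s.seenSeq != current) return;  // someone still behind
        }
        snapshotHeld = false;         // all scanners acked; let go snapshot
    }

    boolean isSnapshotHeld() { return snapshotHeld; }
}
```

The same checkpoint could also pick up compacted-file swaps, which is the "one manner" unification the comment is after.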

> Coarsen StoreScanner locks to RegionScanner
> -------------------------------------------
>
>                 Key: HBASE-13082
>                 URL: https://issues.apache.org/jira/browse/HBASE-13082
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: ramkrishna.s.vasudevan
>         Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 
> 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, 
> HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, 
> gc.png, gc.png, hits.png, next.png, next.png
>
>
> Continuing where HBASE-10015 left off.
> We can avoid locking (and memory fencing) inside StoreScanner by deferring to 
> the lock already held by the RegionScanner.
> In tests this shows quite a scan improvement and reduced CPU (the fences make 
> the cores wait for memory fetches).
> There are some drawbacks too:
> * All calls to RegionScanner need to remain synchronized
> * Implementors of coprocessors need to be diligent in following the locking 
> contract. For example Phoenix does not lock RegionScanner.nextRaw() as 
> required in the documentation (not picking on Phoenix, this one is my fault 
> as I told them it's OK)
> * possible starvation of flushes and compactions under heavy read load. 
> RegionScanner operations would keep getting the locks and the 
> flushes/compactions would not be able to finalize the set of files.
> I'll have a patch soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
