[ 
https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15023832#comment-15023832
 ] 

stack commented on HBASE-13082:
-------------------------------

Nice explanation on compactedfiles

bq.    // Sorting may not be really needed here for the compacted files?

Yeah. They are just going to be deleted, right... No harm sorting though I 
suppose.. you'll delete oldest to newest?

Keep CompactedHFilesDischarger name. It ties his chore to this stuff going on 
in the StoreManager.


Could also have the method internally do these checks for null and check for 
zero count?

2351        if (copyCompactedfiles != null && copyCompactedfiles.size() != 0) {
2352          removeCompactedFiles(copyCompactedfiles);
2353        }

Then all in one place.

closeAfterCompaction seems misnamed. The return is whether there are references 
outstanding and if the file can be safely removed/closed? 

470       /**
471        * Closes and archives the compacted files under this store
472        */
473       void closeAndArchiveCompactedFiles() throws IOException;

We'll only close and archive if no references and if it is marked as compacted, 
right? Otherwise, we'll do it at a later place?

So, we are going to change this in another issue?

      compactedFile.markCompacted();

i.e. marking a file as compacted rather than telling StoreFileManager it is 
compacted?

So, the StoreFile has new attributes of current refcount and if compacted away. 
StoreFileManager has list of compacted files. StoreFileManager is in charge of 
the list of StoreFlies in a Store. It makes ruling on what to include in a 
Scan. It does clearFiles... and compaction. Shouldn't it be in charge of 
knowing when files are not to be included in a Scan/can be removed? When we 
mark a file compacted, we should do it on StoreFileManager?  Can it do the 
refcounting? When a Scan is done, tell the StoreFileManager so it can do the 
refcounting?

>From your earlier comment:

bq. We cannot have this in StorefileInfo because we only cache the Storefile 
(in the StorefileManager) and not the StorefileInfo. StoreFileInfos are created 
every time from the hfile path.

Can StoreFileManager then do refcounting and knowing what files are compacted? 
Would that be doable and put these attributes in one location?

This should be happening internal to StoreFileManager rather than out here in 
HStore?

        853           Collection<StoreFile> compactedfiles =
854               storeEngine.getStoreFileManager().clearCompactedFiles();
855           // clear the compacted files
856           if (compactedfiles != null && !compactedfiles.isEmpty()) {
857             removeCompactedFiles(compactedfiles);
858           }

Gets complicated here on the end in closeAndArchiveCompactedFiles where there 
is a lock for the Store being used to close out storefiles....  And, you are 
just following on from what is in here already. Ugh.

In StoreFile you have

  public boolean isCompacted() {

and then later you have on the Reader isCompactedAway. These methods should be 
named the same (And see above where hopefully, we don't have to do this on the 
StoreFile itself at all). Ditto for getRefCount (Does StoreFileManager know 
refcount?)

Looking at the additions to StoreFileManager, if compaction was kept internal 
to StoreFileManager, would you have to add these new methods?

Does StoreFileScanner have reference to StoreFileManager?

On how often to call checkResetHeap, we have to be prompt because we are 
carrying a snapshot until we do reset?








> Coarsen StoreScanner locks to RegionScanner
> -------------------------------------------
>
>                 Key: HBASE-13082
>                 URL: https://issues.apache.org/jira/browse/HBASE-13082
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: ramkrishna.s.vasudevan
>         Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 
> 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, 
> HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, 
> HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_1_WIP.patch, 
> HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, 
> HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, 
> HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, 
> LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png
>
>
> Continuing where HBASE-10015 left of.
> We can avoid locking (and memory fencing) inside StoreScanner by deferring to 
> the lock already held by the RegionScanner.
> In tests this shows quite a scan improvement and reduced CPU (the fences make 
> the cores wait for memory fetches).
> There are some drawbacks too:
> * All calls to RegionScanner need to be remain synchronized
> * Implementors of coprocessors need to be diligent in following the locking 
> contract. For example Phoenix does not lock RegionScanner.nextRaw() and 
> required in the documentation (not picking on Phoenix, this one is my fault 
> as I told them it's OK)
> * possible starving of flushes and compaction with heavy read load. 
> RegionScanner operations would keep getting the locks and the 
> flushes/compactions would not be able finalize the set of files.
> I'll have a patch soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to