stack created HBASE-18166:
-----------------------------

             Summary: [AMv2] We are splitting already-split files
                 Key: HBASE-18166
                 URL: https://issues.apache.org/jira/browse/HBASE-18166
             Project: HBase
          Issue Type: Bug
          Components: Region Assignment
    Affects Versions: 2.0.0
            Reporter: stack
            Assignee: stack
             Fix For: 2.0.0


Interesting issue. The change below adds a lag in cleaning up files after a 
compaction when there are ongoing Scanners (for read replicas/offheap).

HBASE-14970 Backport HBASE-13082 and its sub-jira to branch-1 - recommit (Ram)

What the lag means is that now that split is run from the HMaster in master 
branch, when it goes to get a listing of the files to split, it can pick up 
files that are for archiving but that have not been archived yet.  When it 
does, it goes ahead and splits them... making references of references.

It's a mess.

A while back I added asking the Region whether it is splittable. The Master 
calls this from SplitTableRegionProcedure during preparation. If the 
RegionServer asked for the split, it is sort of redundant work, given the RS 
asks itself whether any references remain; if any do, it'll wait before asking 
for a split. But if a user/client asks, then this isSplittable over RPC comes 
in handy.

I was thinking that isSplittable could return the list of files...

Or, easier: given we know a region is splittable by the time we go to split 
the files, I think master-side we can just skip any references found, presuming 
they are ready-for-archive.
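
A rough sketch of the "skip references" idea, just to make it concrete. This 
is not the patch and not HBase code: SplitFileFilter, StoreFileEntry and 
filesToSplit are hypothetical names made up for illustration, and the 
isReference flag stands in for however the master would actually tell a 
reference file from a plain hfile.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these types exist in HBase.
class SplitFileFilter {

    // Stand-in for one entry in a store file listing.
    static final class StoreFileEntry {
        final String name;
        final boolean isReference; // assumed to be known from the listing

        StoreFileEntry(String name, boolean isReference) {
            this.name = name;
            this.isReference = isReference;
        }
    }

    // Returns only the files that should be split. References are skipped
    // on the presumption that, since the region was reported splittable,
    // any reference still listed is just waiting to be archived.
    static List<StoreFileEntry> filesToSplit(List<StoreFileEntry> listing) {
        List<StoreFileEntry> result = new ArrayList<>();
        for (StoreFileEntry f : listing) {
            if (f.isReference) {
                continue; // never make a reference to a reference
            }
            result.add(f);
        }
        return result;
    }

    public static void main(String[] args) {
        List<StoreFileEntry> listing = Arrays.asList(
            new StoreFileEntry("abc123", false),
            new StoreFileEntry("def456.parentregion", true), // leftover reference
            new StoreFileEntry("0a1b2c", false));

        List<String> names = new ArrayList<>();
        for (StoreFileEntry f : filesToSplit(listing)) {
            names.add(f.name);
        }
        System.out.println(names); // prints [abc123, 0a1b2c]
    }
}
```

The filter only makes sense after the splittable check has passed; without 
that check, skipping a reference could silently drop data from the daughters.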

Will be back with a patch. Want to test on a cluster first. (The side effect 
is that regions are offline because the file at the end of the reference to a 
reference is removed ... and so the open fails.)





--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
