[ https://issues.apache.org/jira/browse/HBASE-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12659703#action_12659703 ]
Andrew Purtell commented on HBASE-1050:
---------------------------------------
Ran a new experiment. I have two largish tables, urls and content. I am
running scanners over content repeatedly, with a 300-second pause between
runs, and no scanners at all over urls. I then started up a crawler that
pounds the cluster with lots of new data. I'm seeing urls split, but content
does not. It stands to reason that both should.
[Later...]
After a restart, content went from ~1100 regions to ~1300 regions.
> Allow regions to split around scanners
> --------------------------------------
>
> Key: HBASE-1050
> URL: https://issues.apache.org/jira/browse/HBASE-1050
> Project: Hadoop HBase
> Issue Type: Improvement
> Components: client, regionserver
> Reporter: Andrew Purtell
> Assignee: stack
> Priority: Blocker
> Fix For: 0.19.0
>
>
> We have a number of scanners iterating over a table that also sees a lot of
> constant write activity. If the scans are too frequent, splitting is
> suppressed. Then, during a lull, a number of splits happen all at once,
> occasionally overwhelming DFS and causing file corruption.
> I wonder how much work it would be to split regions around scanners. Rather
> than waiting for scanner leases to expire, we could suspend/block the
> scanner, split the region, and then negotiate with the client to continue.