[
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16038521#comment-16038521
]
Mikhail Antonov commented on HBASE-18090:
-----------------------------------------
Thanks [[email protected]] and [~easyliangjob] for the reviews! I'll address them
shortly.
I made my patch off branch-1.3, so I'm not sure why you couldn't apply it
locally. Merge conflicts?
I found an issue with the current patch: if we try to open a region from
several tasks, we hit a race in this code:
{code}
at org.apache.hadoop.fs.FileSystem.createNewFile(FileSystem.java:1154)
at org.apache.hadoop.hbase.wal.WALSplitter.writeRegionSequenceIdFile(WALSplitter.java:740)
at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:876)
at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:802)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6708)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6669)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6640)
at org.apache.hadoop.hbase.client.ClientSideRegionScanner.<init>(ClientSideRegionScanner.java:60)
{code}
Why do we need to go through this code path if we know the region is in
read-only mode?
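To illustrate the kind of race involved, here is a minimal, self-contained sketch. It is not HBase or Hadoop code: it uses java.nio.file as a stand-in for the Hadoop FileSystem, and the class name, method names, and the marker file name are all hypothetical. It only shows the general idea that when several tasks try to create the same marker file, the creation can be made atomic and "already exists" can be treated as a benign outcome rather than a failure.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; not part of HBase or of the attached patch.
final class MarkerFileSketch {

    // Creates the marker atomically. Returns true if this caller created it,
    // false if another caller had already created it.
    static boolean createMarkerIfAbsent(Path marker) {
        try {
            Files.createFile(marker); // existence check and creation are one atomic step
            return true;
        } catch (FileAlreadyExistsException e) {
            return false; // another task won the race; nothing left to do
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Starts `tasks` threads that all try to create the same marker at once
    // and returns how many of them actually created it.
    static int countWinners(int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(tasks);
        try {
            // Assumed file name, chosen only for the example.
            Path marker = Files.createTempDirectory("region-open").resolve("42.seqid");
            CountDownLatch start = new CountDownLatch(1);
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                results.add(pool.submit(() -> {
                    start.await(); // line all tasks up before racing
                    return createMarkerIfAbsent(marker);
                }));
            }
            start.countDown();
            int winners = 0;
            for (Future<Boolean> r : results) {
                if (r.get()) {
                    winners++;
                }
            }
            return winners;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("tasks that created the marker: " + countWinners(8));
    }
}
```

In this sketch the losing tasks simply carry on. Whether a read-only region open should write such a file at all is the separate question raised above.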
> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --------------------------------------------------------------------------
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
> Issue Type: Improvement
> Components: mapreduce
> Affects Versions: 1.4.0
> Reporter: Mikhail Antonov
> Attachments: HBASE-18090-branch-1.3-v1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot.
> This places an unnecessary restriction: the region layout of the original
> table has to take into account the processing resources available to the MR
> job. Allowing multiple mappers per region (assuming a reasonably even key
> distribution) would be useful.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)