[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-28 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-18090:
--
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Resolving.

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 1.4.0, 1.3.2, 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1-v2.patch, 
> HBASE-18090-branch-1.3-v1.patch, HBASE-18090-branch-1.3-v2.patch, 
> HBASE-18090.branch-1.3.001.patch, HBASE-18090.branch-1.3.001.patch, 
> HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-28 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-18090:
--
Fix Version/s: 1.3.2
   1.4.0

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 1.4.0, 1.3.2, 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1-v2.patch, 
> HBASE-18090-branch-1.3-v1.patch, HBASE-18090-branch-1.3-v2.patch, 
> HBASE-18090.branch-1.3.001.patch, HBASE-18090.branch-1.3.001.patch, 
> HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-28 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-18090:
--
Attachment: HBASE-18090.branch-1.3.001.patch

Retry. The test passes for me locally. Also seems totally unrelated to what the 
patch is about.

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1-v2.patch, 
> HBASE-18090-branch-1.3-v1.patch, HBASE-18090-branch-1.3-v2.patch, 
> HBASE-18090.branch-1.3.001.patch, HBASE-18090.branch-1.3.001.patch, 
> HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-27 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-18090:
--
Attachment: HBASE-18090.branch-1.3.001.patch

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1-v2.patch, 
> HBASE-18090-branch-1.3-v1.patch, HBASE-18090-branch-1.3-v2.patch, 
> HBASE-18090.branch-1.3.001.patch, HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-16 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Release Note: 
In this task, we make it possible to run multiple mappers per region in the 
table snapshot. The following code is primary table snapshot mapper 
initializatio: 

TableMapReduceUtil.initTableSnapshotMapperJob(
  snapshotName, // The name of the snapshot (of a 
table) to read from
  scan,  // Scan instance to 
control CF and attribute selection
  mapper, // mapper
  outputKeyClass,   // mapper output key 
  outputValueClass,// mapper output value
  job,   // The current job to 
adjust
  true, // upload HBase jars and 
jars for any of the configured job classes via the distributed cache (tmpjars)
  restoreDir,   // a temporary directory to 
copy the snapshot files into
);

The job only run one map task per region in the table snapshot. With this 
feature, client can specify the desired num of mappers when init table snapshot 
mapper job:

TableMapReduceUtil.initTableSnapshotMapperJob(
  snapshotName, // The name of the snapshot (of a 
table) to read from
  scan,  // Scan instance to 
control CF and attribute selection
  mapper, // mapper
  outputKeyClass,   // mapper output key 
  outputValueClass,// mapper output value
  job,   // The current job to 
adjust
  true, // upload HBase jars and 
jars for any of the configured job classes via the distributed cache (tmpjars)
  restoreDir,   // a temporary directory to 
copy the snapshot files into
  splitAlgorithm, // splitAlgo algorithm to split, 
current split algorithms  support RegionSplitter.UniformSplit() and 
RegionSplitter.HexStringSplit()
  n // how many input splits to 
generate per one region
);

  was:
In this task, we make it possible to run multiple mappers per region in the 
table snapshot. The following code is primary table snapshot mapper 
initializatio: 

TableMapReduceUtil.initTableSnapshotMapperJob(
  snapshotName, // The name of the snapshot (of a 
table) to read from
  scan,  // Scan instance to 
control CF and attribute selection
  mapper, // mapper
  outputKeyClass,   // mapper output key 
  outputValueClass,// mapper output value
  job,   // The current job to 
adjust
  true, // upload HBase jars and 
jars for any of the configured job classes via the distributed cache (tmpjars)
  restoreDir,   // a temporary directory to 
copy the snapshot files into
);

The job only run one map task per region in the table snapshot. With this 
feature, client can specify the desired num of mappers when init table snapshot 
mapper job:

TableMapReduceUtil.initTableSnapshotMapperJob(
  snapshotName, // The name of the snapshot (of a 
table) to read from
  scan,  // Scan instance to 
control CF and attribute selection
  mapper, // mapper
  outputKeyClass,   // mapper output key 
  outputValueClass,// mapper output value
  job,   // The current job to 
adjust
  true, // upload HBase jars and 
jars for any of the configured job classes via the distributed cache (tmpjars)
  restoreDir,   // a temporary directory to 
copy the snapshot files into
  splitAlgorithm, // splitAlgo algorithm to split, 
current split algorithms only support RegionSplitter.UniformSplit() and 
RegionSplitter.HexStringSplit()
  n // how many input splits to 
generate per one region
);


> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: 

[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-16 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Release Note: 
In this task, we make it possible to run multiple mappers per region in the 
table snapshot. The following code is primary table snapshot mapper 
initializatio: 

TableMapReduceUtil.initTableSnapshotMapperJob(
  snapshotName, // The name of the snapshot (of a 
table) to read from
  scan,  // Scan instance to 
control CF and attribute selection
  mapper, // mapper
  outputKeyClass,   // mapper output key 
  outputValueClass,// mapper output value
  job,   // The current job to 
adjust
  true, // upload HBase jars and 
jars for any of the configured job classes via the distributed cache (tmpjars)
  restoreDir,   // a temporary directory to 
copy the snapshot files into
);

The job only run one map task per region in the table snapshot. With this 
feature, client can specify the desired num of mappers when init table snapshot 
mapper job:

TableMapReduceUtil.initTableSnapshotMapperJob(
  snapshotName, // The name of the snapshot (of a 
table) to read from
  scan,  // Scan instance to 
control CF and attribute selection
  mapper, // mapper
  outputKeyClass,   // mapper output key 
  outputValueClass,// mapper output value
  job,   // The current job to 
adjust
  true, // upload HBase jars and 
jars for any of the configured job classes via the distributed cache (tmpjars)
  restoreDir,   // a temporary directory to 
copy the snapshot files into
  splitAlgorithm, // splitAlgo algorithm to split, 
current split algorithms only support RegionSplitter.UniformSplit() and 
RegionSplitter.HexStringSplit()
  n // how many input splits to 
generate per one region
);

  was:
In this task, we make it possible to run multiple mappers per region in the 
table snapshot. 

The following code is primary table snapshot mapper initializatio: 
TableMapReduceUtil.initTableSnapshotMapperJob(
  snapshotName, // The name of the snapshot (of a 
table) to read from
  scan,  // Scan instance to 
control CF and attribute selection
  mapper, // mapper
  outputKeyClass,   // mapper output key 
  outputValueClass,// mapper output value
  job,   // The current job to 
adjust
  true, // upload HBase jars and 
jars for any of the configured job classes via the distributed cache (tmpjars)
  restoreDir,   // a temporary directory to 
copy the snapshot files into
);

The job only run one map task per region in the table snapshot. With this 
feature, client can specify the desired num of mappers when init table snapshot 
mapper job:

TableMapReduceUtil.initTableSnapshotMapperJob(
  snapshotName, // The name of the snapshot (of a 
table) to read from
  scan,  // Scan instance to 
control CF and attribute selection
  mapper, // mapper
  outputKeyClass,   // mapper output key 
  outputValueClass,// mapper output value
  job,   // The current job to 
adjust
  true, // upload HBase jars and 
jars for any of the configured job classes via the distributed cache (tmpjars)
  restoreDir,   // a temporary directory to 
copy the snapshot files into
  splitAlgorithm, // splitAlgo algorithm to split, 
current split algorithms only support RegionSplitter.UniformSplit() and 
RegionSplitter.HexStringSplit()
  n // how many input splits to 
generate per one region
);


> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  

[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-16 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Release Note: 
In this task, we make it possible to run multiple mappers per region in the 
table snapshot. 

The following code is primary table snapshot mapper initializatio: 
TableMapReduceUtil.initTableSnapshotMapperJob(
  snapshotName, // The name of the snapshot (of a 
table) to read from
  scan,  // Scan instance to 
control CF and attribute selection
  mapper, // mapper
  outputKeyClass,   // mapper output key 
  outputValueClass,// mapper output value
  job,   // The current job to 
adjust
  true, // upload HBase jars and 
jars for any of the configured job classes via the distributed cache (tmpjars)
  restoreDir,   // a temporary directory to 
copy the snapshot files into
);

The job only run one map task per region in the table snapshot. With this 
feature, client can specify the desired num of mappers when init table snapshot 
mapper job:

TableMapReduceUtil.initTableSnapshotMapperJob(
  snapshotName, // The name of the snapshot (of a 
table) to read from
  scan,  // Scan instance to 
control CF and attribute selection
  mapper, // mapper
  outputKeyClass,   // mapper output key 
  outputValueClass,// mapper output value
  job,   // The current job to 
adjust
  true, // upload HBase jars and 
jars for any of the configured job classes via the distributed cache (tmpjars)
  restoreDir,   // a temporary directory to 
copy the snapshot files into
  splitAlgorithm, // splitAlgo algorithm to split, 
current split algorithms only support RegionSplitter.UniformSplit() and 
RegionSplitter.HexStringSplit()
  n // how many input splits to 
generate per one region
);

  was:
In this task, we make it possible to run multiple mappers per region in the 
table snapshot. With this feature, client can specify the desired num of 
mappers when init table snapshot mapper job:

TableMapReduceUtil.initTableSnapshotMapperJob(
  snapshotName, // The name of the snapshot (of a 
table) to read from
  scan,  // Scan instance to 
control CF and attribute selection
  mapper, // mapper
  outputKeyClass,   // mapper output key 
  outputValueClass,// mapper output value
  job,   // The current job to 
adjust
  true, // upload HBase jars and 
jars for any of the configured job classes via the distributed cache (tmpjars)
  restoreDir,   // a temporary directory to 
copy the snapshot files into
  splitAlgorithm, // splitAlgo algorithm to split, 
current split algorithms only support RegionSplitter.UniformSplit() and 
RegionSplitter.HexStringSplit()
  n // how many input splits to 
generate per one region
);


> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1-v2.patch, 
> HBASE-18090-branch-1.3-v1.patch, HBASE-18090-branch-1.3-v2.patch, 
> HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-16 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Release Note: 
In this task, we make it possible to run multiple mappers per region in the 
table snapshot. With this feature, client can specify the desired num of 
mappers when init table snapshot mapper job:

TableMapReduceUtil.initTableSnapshotMapperJob(
  snapshotName, // The name of the snapshot (of a 
table) to read from
  scan,  // Scan instance to 
control CF and attribute selection
  mapper, // mapper
  outputKeyClass,   // mapper output key 
  outputValueClass,// mapper output value
  job,   // The current job to 
adjust
  true, // upload HBase jars and 
jars for any of the configured job classes via the distributed cache (tmpjars)
  restoreDir,   // a temporary directory to 
copy the snapshot files into
  splitAlgorithm, // splitAlgo algorithm to split, 
current split algorithms only support RegionSplitter.UniformSplit() and 
RegionSplitter.HexStringSplit()
  n // how many input splits to 
generate per one region
);

  was:
In this task, we make it possible to run multiple mappers per region in the 
table snapshot. With this feature, client can specify the desired num of 
mappers when init table snapshot mapper job:
{code}
TableMapReduceUtil.initTableSnapshotMapperJob(
  snapshotName, // The name of the snapshot (of a 
table) to read from
  scan,  // Scan instance to 
control CF and attribute selection
  mapper, // mapper
  outputKeyClass,   // mapper output key 
  outputValueClass,// mapper output value
  job,   // The current job to 
adjust
  true, // upload HBase jars and 
jars for any of the configured job classes via the distributed cache (tmpjars)
  restoreDir,   // a temporary directory to 
copy the snapshot files into
  splitAlgorithm, // splitAlgo algorithm to split, 
current split algorithms only support RegionSplitter.UniformSplit() and 
RegionSplitter.HexStringSplit()
  n // how many input splits to 
generate per one region
);
{code}


> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1-v2.patch, 
> HBASE-18090-branch-1.3-v1.patch, HBASE-18090-branch-1.3-v2.patch, 
> HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-16 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Release Note: 
In this task, we make it possible to run multiple mappers per region in the 
table snapshot. With this feature, client can specify the desired num of 
mappers when init table snapshot mapper job:
{code}
TableMapReduceUtil.initTableSnapshotMapperJob(
  snapshotName, // The name of the snapshot (of a 
table) to read from
  scan,  // Scan instance to 
control CF and attribute selection
  mapper, // mapper
  outputKeyClass,   // mapper output key 
  outputValueClass,// mapper output value
  job,   // The current job to 
adjust
  true, // upload HBase jars and 
jars for any of the configured job classes via the distributed cache (tmpjars)
  restoreDir,   // a temporary directory to 
copy the snapshot files into
  splitAlgorithm, // splitAlgo algorithm to split, 
current split algorithms only support RegionSplitter.UniformSplit() and 
RegionSplitter.HexStringSplit()
  n // how many input splits to 
generate per one region
);
{code}

  was:
In this task, we make it possible to run multiple mappers per region in the 
table snapshot. With this feature, client can specify the desired num of 
mappers when init table snapshot mapper job:

{code}
TableMapReduceUtil.initTableSnapshotMapperJob(
  snapshotName, // The name of the snapshot (of a 
table) to read from
  scan,  // Scan instance to 
control CF and attribute selection
  mapper, // mapper
  outputKeyClass,   // mapper output key 
  outputValueClass,// mapper output value
  job,   // The current job to 
adjust
  true, // upload HBase jars and 
jars for any of the configured job classes via the distributed cache (tmpjars)
  restoreDir,   // a temporary directory to 
copy the snapshot files into
  splitAlgorithm, // splitAlgo algorithm to split, 
current split algorithms only support RegionSplitter.UniformSplit() and 
RegionSplitter.HexStringSplit()
  n // how many input splits to 
generate per one region
);
{code}


> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1-v2.patch, 
> HBASE-18090-branch-1.3-v1.patch, HBASE-18090-branch-1.3-v2.patch, 
> HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-16 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Release Note: 
In this task, we make it possible to run multiple mappers per region in the 
table snapshot. With this feature, client can specify the desired num of 
mappers when init table snapshot mapper job:

{code}
TableMapReduceUtil.initTableSnapshotMapperJob(
  snapshotName, // The name of the snapshot (of a 
table) to read from
  scan,  // Scan instance to 
control CF and attribute selection
  mapper, // mapper
  outputKeyClass,   // mapper output key 
  outputValueClass,// mapper output value
  job,   // The current job to 
adjust
  true, // upload HBase jars and 
jars for any of the configured job classes via the distributed cache (tmpjars)
  restoreDir,   // a temporary directory to 
copy the snapshot files into
  splitAlgorithm, // splitAlgo algorithm to split, 
current split algorithms only support RegionSplitter.UniformSplit() and 
RegionSplitter.HexStringSplit()
  n // how many input splits to 
generate per one region
);
{code}

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1-v2.patch, 
> HBASE-18090-branch-1.3-v1.patch, HBASE-18090-branch-1.3-v2.patch, 
> HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-16 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Description: 
TableSnapshotInputFormat runs one map task per region in the table snapshot. 
This places unnecessary restriction that the region layout of the original 
table needs to take the processing resources available to MR job into 
consideration. Allowing to run multiple mappers per region (assuming reasonably 
even key distribution) would be useful.


  was:
TableSnapshotInputFormat runs one map task per region in the table snapshot. 
This places unnecessary restriction that the region layout of the original 
table needs to take the processing resources available to MR job into 
consideration. Allowing to run multiple mappers per region (assuming reasonably 
even key distribution) would be useful.

With this feature, client can specify the desired num of mappers when init 
table snapshot mapper job:

{code}
TableMapReduceUtil.initTableSnapshotMapperJob(snapshotName,)
{code}



> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1-v2.patch, 
> HBASE-18090-branch-1.3-v1.patch, HBASE-18090-branch-1.3-v2.patch, 
> HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-16 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Description: 
TableSnapshotInputFormat runs one map task per region in the table snapshot. 
This places unnecessary restriction that the region layout of the original 
table needs to take the processing resources available to MR job into 
consideration. Allowing to run multiple mappers per region (assuming reasonably 
even key distribution) would be useful.

With this feature, client can specify the desired num of mappers when init 
table snapshot mapper job:

{code}
TableMapReduceUtil.initTableSnapshotMapperJob(snapshotName,)
{code}


  was:
TableSnapshotInputFormat runs one map task per region in the table snapshot. 
This places unnecessary restriction that the region layout of the original 
table needs to take the processing resources available to MR job into 
consideration. Allowing to run multiple mappers per region (assuming reasonably 
even key distribution) would be useful.




> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1-v2.patch, 
> HBASE-18090-branch-1.3-v1.patch, HBASE-18090-branch-1.3-v2.patch, 
> HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.
> With this feature, client can specify the desired num of mappers when init 
> table snapshot mapper job:
> {code}
> TableMapReduceUtil.initTableSnapshotMapperJob(snapshotName,)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-16 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Description: 
TableSnapshotInputFormat runs one map task per region in the table snapshot. 
This places unnecessary restriction that the region layout of the original 
table needs to take the processing resources available to MR job into 
consideration. Allowing to run multiple mappers per region (assuming reasonably 
even key distribution) would be useful.



  was:TableSnapshotInputFormat runs one map task per region in the table 
snapshot. This places unnecessary restriction that the region layout of the 
original table needs to take the processing resources available to MR job into 
consideration. Allowing to run multiple mappers per region (assuming reasonably 
even key distribution) would be useful.


> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1-v2.patch, 
> HBASE-18090-branch-1.3-v1.patch, HBASE-18090-branch-1.3-v2.patch, 
> HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-16 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-18090:
--
Attachment: HBASE-18090-branch-1-v2.patch

Retry

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1-v2.patch, 
> HBASE-18090-branch-1.3-v1.patch, HBASE-18090-branch-1.3-v2.patch, 
> HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-15 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: (was: HBASE-18090-branch-1-v2.patch)

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-15 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: HBASE-18090-branch-1-v2.patch

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-14 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: (was: HBASE-18090-branch-1-v2.patch)

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-14 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: HBASE-18090-branch-1-v2.patch

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-13 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: HBASE-18090-branch-1-v2.patch

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-13 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: (was: HBASE-18090-branch-1-v2.patch)

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1.3-v1.patch, HBASE-18090-branch-1.3-v2.patch, 
> HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-12 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: HBASE-18090-branch-1-v2.patch

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-12 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: (was: HBASE-18090-branch-1-v2.patch)

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1.3-v1.patch, HBASE-18090-branch-1.3-v2.patch, 
> HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-12 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: (was: HBASE-18090-branch-1-v2.patch)

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-12 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: HBASE-18090-branch-1-v2.patch

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-12 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: (was: HBASE-18090-branch-1-v2.patch)

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-12 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: HBASE-18090-branch-1-v2.patch

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1-v2.patch, 
> HBASE-18090-branch-1.3-v1.patch, HBASE-18090-branch-1.3-v2.patch, 
> HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-09 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: HBASE-18090-branch-1-v2.patch

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1-v2.patch, HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-08 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: (was: HBASE-18090.branch-1.patch)

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1.3-v1.patch, HBASE-18090-branch-1.3-v2.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-08 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: HBASE-18090.branch-1.patch

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1.3-v1.patch, HBASE-18090-branch-1.3-v2.patch, 
> HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-08 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: (was: HBASE-18090.branch-1-V2.patch)

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1.3-v1.patch, HBASE-18090-branch-1.3-v2.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-11-08 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: HBASE-18090.branch-1-V2.patch

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1.3-v1.patch, HBASE-18090-branch-1.3-v2.patch, 
> HBASE-18090.branch-1-V2.patch, HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-10-18 Thread Ashu Pachauri (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashu Pachauri updated HBASE-18090:
--
Affects Version/s: (was: 3.0.0)

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch, 
> HBASE-18090-branch-1.3-v1.patch, HBASE-18090-branch-1.3-v2.patch, 
> HBASE-18090.branch-1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-10-05 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: HBASE-18090.branch-1.patch

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 3.0.0
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090.branch-1.patch, 
> HBASE-18090-V3-master.patch, HBASE-18090-V4-master.patch, 
> HBASE-18090-V5-master.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-10-02 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-18090:
--
Fix Version/s: (was: 2.0.0-alpha-4)
   (was: 3.0.0)
   2.0.0-beta-1

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 3.0.0
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-09-29 Thread Ashu Pachauri (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashu Pachauri updated HBASE-18090:
--
Fix Version/s: 2.0.0-alpha-4

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 3.0.0
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 3.0.0, 2.0.0-alpha-4
>
> Attachments: HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-09-27 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Affects Version/s: (was: 1.4.0)
   3.0.0

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 3.0.0
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 3.0.0
>
> Attachments: HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-09-27 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Fix Version/s: 3.0.0

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 1.4.0
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Fix For: 3.0.0
>
> Attachments: HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-09-27 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: HBASE-18090-V5-master.patch

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 1.4.0
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Attachments: HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-09-27 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: (was: HBASE-18090-V5-master.patch)

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 1.4.0
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Attachments: HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-09-26 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: HBASE-18090-V5-master.patch

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 1.4.0
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Attachments: HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-09-26 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: (was: HBASE-18090-V5-master.patch)

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 1.4.0
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Attachments: HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-09-26 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: HBASE-18090-V5-master.patch

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 1.4.0
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Attachments: HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch, HBASE-18090-V5-master.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-09-19 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: HBASE-18090-V4-master.patch

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 1.4.0
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Attachments: HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-09-19 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: (was: HBASE-18090-V4-master.patch)

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 1.4.0
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Attachments: HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090-V3-master.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-09-18 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: HBASE-18090-V4-master.patch

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 1.4.0
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Attachments: HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090-V3-master.patch, 
> HBASE-18090-V4-master.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-09-13 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: HBASE-18090-V3-master.patch

create patch for master branch

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 1.4.0
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Attachments: HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch, HBASE-18090-V3-master.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-09-08 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: HBASE-18090-branch-1.3-v2.patch

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 1.4.0
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Attachments: HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-09-08 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: (was: HBASE-18090-branch-1.3-v2.patch)

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 1.4.0
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Attachments: HBASE-18090-branch-1.3-v1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-09-08 Thread xinxin fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xinxin fan updated HBASE-18090:
---
Attachment: HBASE-18090-branch-1.3-v2.patch

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 1.4.0
>Reporter: Mikhail Antonov
>Assignee: xinxin fan
> Attachments: HBASE-18090-branch-1.3-v1.patch, 
> HBASE-18090-branch-1.3-v2.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-05-22 Thread Mikhail Antonov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Antonov updated HBASE-18090:

Status: Patch Available  (was: Open)

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 1.4.0
>Reporter: Mikhail Antonov
> Attachments: HBASE-18090-branch-1.3-v1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-05-22 Thread Mikhail Antonov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Antonov updated HBASE-18090:

Status: Open  (was: Patch Available)

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 1.4.0
>Reporter: Mikhail Antonov
> Attachments: HBASE-18090-branch-1.3-v1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-05-22 Thread Mikhail Antonov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Antonov updated HBASE-18090:

Status: Patch Available  (was: Open)

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 1.4.0
>Reporter: Mikhail Antonov
> Attachments: HBASE-18090-branch-1.3-v1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-18090) Improve TableSnapshotInputFormat to allow more multiple mappers per region

2017-05-22 Thread Mikhail Antonov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Antonov updated HBASE-18090:

Attachment: HBASE-18090-branch-1.3-v1.patch

patch for review

> Improve TableSnapshotInputFormat to allow more multiple mappers per region
> --
>
> Key: HBASE-18090
> URL: https://issues.apache.org/jira/browse/HBASE-18090
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 1.4.0
>Reporter: Mikhail Antonov
> Attachments: HBASE-18090-branch-1.3-v1.patch
>
>
> TableSnapshotInputFormat runs one map task per region in the table snapshot. 
> This places unnecessary restriction that the region layout of the original 
> table needs to take the processing resources available to MR job into 
> consideration. Allowing to run multiple mappers per region (assuming 
> reasonably even key distribution) would be useful.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)