[
https://issues.apache.org/jira/browse/HBASE-6372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427952#comment-13427952
]
Lars George commented on HBASE-6372:
------------------------------------
A few comments on the patch:
{code}
+ final static String EXPORT_BATCHING = "hbase.mapreduce.export.batch";
...
+ " -Dhbase.client.scanner.caching=100\n"
{code}
I see that you followed the config key naming already in use where you declared
your new key. But looking at the other keys being used, they are all over the
place. The one for scanner caching is the closest to what you are adding, so I
suggest we follow the same rules, i.e. name it
"hbase.mapreduce.export.batch".
{code}
+ try {
+ s.setBatch(batching);
+ } catch (Exception e) {
+ LOG.error("Batching is not set because : "+e.toString());
+ }
{code}
Why wrap the setBatch() call in a try/catch? None of the filters being used are
of the kind that triggers the runtime exception. We can add the try/catch later
if ever needed.
{code}
+ int batching = conf.getInt(EXPORT_BATCHING,-1);
{code}
Minor nit: there should be a space between the comma and the value.
{code}
+ LOG.error("Batching is not set because : "+e.toString());
{code}
Same minor nit: there are no spaces around the string concatenation operator.
{code}
+ System.err.println("For very wide rows consider set scan batching properties as below:\n"
{code}
Maybe rephrase a bit? For example: "For tables with very wide rows consider
setting the batch size as below:".
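Taken together, the points above might lead to something like the following sketch. It is not the patch itself: `java.util.Properties` stands in for the Hadoop `Configuration`, `ScanStub` is a hypothetical placeholder for HBase's `Scan` so the snippet compiles without HBase on the classpath, and the guard on -1 is an assumption of this sketch. The key name is the one declared in the patch.

```java
import java.util.Properties;

public class ExportBatchSketch {
  // Key name as declared in the patch under review.
  static final String EXPORT_BATCHING = "hbase.mapreduce.export.batch";

  // Hypothetical stand-in for org.apache.hadoop.hbase.client.Scan.
  static class ScanStub {
    private int batch = -1; // -1 means "no batching" in this sketch

    void setBatch(int batch) {
      this.batch = batch;
    }

    int getBatch() {
      return batch;
    }
  }

  // Reads the batch size (default -1) and applies it only when it was set.
  static ScanStub configureScan(Properties conf) {
    ScanStub s = new ScanStub();
    // Note the space after the comma, as requested in the review.
    int batching = Integer.parseInt(conf.getProperty(EXPORT_BATCHING, "-1"));
    if (batching != -1) {
      s.setBatch(batching); // plain call, no try/catch around it
    }
    return s;
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    conf.setProperty(EXPORT_BATCHING, "10");
    System.out.println("batch = " + configureScan(conf).getBatch());
  }
}
```

If a filter that conflicts with batching is ever added to the Export scan, the try/catch could be reintroduced around the single setBatch() call.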
> Add scanner batching to Export job
> ----------------------------------
>
> Key: HBASE-6372
> URL: https://issues.apache.org/jira/browse/HBASE-6372
> Project: HBase
> Issue Type: Improvement
> Components: mapreduce
> Affects Versions: 0.96.0, 0.94.2
> Reporter: Lars George
> Assignee: Shengsheng Huang
> Priority: Minor
> Labels: newbie
> Attachments: HBASE-6372.2.patch, HBASE-6372.3.patch, HBASE-6372.patch
>
>
> When a single row is too large for the RS heap then an OOME can take out the
> entire RS. Setting scanner batching in custom scans helps avoid this
> scenario, but for the supplied Export job this is not set.
> Similar to HBASE-3421 we can set the batching to a low number - or if needed
> make it a command line option.
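
As a rough illustration of the effect described above: with a batch size of N, a wide row is handed back in chunks of at most N cells instead of as one large result. The sketch below is plain Java with hypothetical names (`BatchingSketch`, `inBatches`) and does not use the HBase API; it only shows the chunking idea.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchingSketch {
  // Splits the cells of one row into chunks of at most `batch` cells,
  // mimicking how a batched scan returns a wide row piece by piece.
  static <T> List<List<T>> inBatches(List<T> cells, int batch) {
    if (batch <= 0) {
      throw new IllegalArgumentException("batch must be positive");
    }
    List<List<T>> chunks = new ArrayList<>();
    for (int i = 0; i < cells.size(); i += batch) {
      int end = Math.min(i + batch, cells.size());
      chunks.add(new ArrayList<>(cells.subList(i, end)));
    }
    return chunks;
  }

  public static void main(String[] args) {
    List<String> row = new ArrayList<>();
    for (int i = 0; i < 7; i++) {
      row.add("col" + i);
    }
    // Each inner list holds at most three cells of the same row.
    System.out.println(inBatches(row, 3));
  }
}
```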