[
https://issues.apache.org/jira/browse/MAPREDUCE-4522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15391345#comment-15391345
]
Tsuyoshi Ozawa commented on MAPREDUCE-4522:
-------------------------------------------
[~shyam_gav] Thanks for updating, and sorry for my delay. The following lines
exceed the 80-character line limit, so could you fix them?
{quote}
+ List<List<K>> subLists = Lists.partition(records,
context.getConfiguration().getInt(MR_DB_OUTPUTFORMAT_BATCH_SIZE,1000));
{quote}
{quote}
+ public static final String
MR_DB_OUTPUTFORMAT_BATCH_SIZE="mapreduce.output.dboutputformat.batch-size";
{quote}
{quote}
+ <description>The batch size of SQL statements that will be executed before
reporting progress. Default is 1000</description>
{quote}
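For context, the idea behind the quoted lines is to split the buffered records into fixed-size batches before executing them, so progress can be reported between batches. A minimal self-contained sketch of that partitioning step is below; it mirrors what Guava's {{Lists.partition}} does, but the class and method names here are illustrative, not the actual patch code:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSketch {
    // Split records into consecutive sublists of at most batchSize elements,
    // mirroring Guava's Lists.partition. In DBRecordWriter#close(), each
    // batch would be executed via statement.executeBatch() and followed by
    // a progress report so the task is not killed for inactivity.
    static <T> List<List<T>> partition(List<T> records, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < records.size(); i += batchSize) {
            batches.add(records.subList(i, Math.min(i + batchSize, records.size())));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> records = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            records.add(i);
        }
        // 1000 is the proposed default for
        // mapreduce.output.dboutputformat.batch-size.
        for (List<Integer> batch : partition(records, 1000)) {
            System.out.println("batch of " + batch.size());
        }
    }
}
```

With 2500 records and a batch size of 1000 this yields batches of 1000, 1000, and 500, so progress would be reported three times instead of once at the very end.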
> DBOutputFormat Times out on large batch inserts
> -----------------------------------------------
>
> Key: MAPREDUCE-4522
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4522
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: task-controller
> Affects Versions: 0.20.205.0
> Reporter: Nathan Jarus
> Assignee: Shyam Gavulla
> Labels: newbie
> Attachments: MAPREDUCE-4522.001.patch
>
>
> In DBRecordWriter#close(), progress is never updated. In large batch inserts,
> this can cause the reduce task to time out due to the amount of time it takes
> the SQL engine to process that insert.
> Potential solutions I can see:
> Don't batch inserts; do the insert when DBRecordWriter#write() is called
> (awful)
> Spin up a thread in DBRecordWriter#close() and update progress in that.
> (gross)
> I can provide code for either if you're interested.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]