[ https://issues.apache.org/jira/browse/CASSANDRA-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13213452#comment-13213452 ]
Erik Forsberg commented on CASSANDRA-3859:
------------------------------------------
bq. I am not seeing this on our end. Our job is running 50 reducers on our end,
and it certainly takes > timeout seconds (600 for us). It's progressing ...
Just to make sure we're measuring the same thing - are your reducers taking
more than 600 seconds *after* the creation of sstables has finished?
For us, the creation of sstables takes ~10 minutes. During that period the
job is consuming input, so Hadoop knows it's active. The loading phase that
follows takes much longer, and the task gets killed unless I set
mapred.task.timeout to a very high value.
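To illustrate the kind of fix that would make the high timeout unnecessary, here is a minimal sketch of a heartbeat around a blocking wait. It is not the code from the attached patches. The class and method names are hypothetical, and the `Runnable` stands in for whatever reports liveness to Hadoop (for example a call to `progress()` on the task context, assuming one is available at that point in the loader).

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Cassandra or Hadoop.
public class ProgressHeartbeat {
    /**
     * Blocks until {@code work} completes. Each time a poll interval elapses
     * without completion, {@code progress} is invoked so the caller can tell
     * the framework that the task is still alive.
     */
    public static <T> T awaitWithProgress(Future<T> work, Runnable progress, long pollMillis)
            throws InterruptedException, ExecutionException {
        while (true) {
            try {
                // Wait at most one poll interval for the long-running work.
                return work.get(pollMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException stillRunning) {
                // Not finished yet: report liveness, then keep waiting.
                progress.run();
            }
        }
    }
}
```

With a poll interval well below the task timeout, the loading phase could run for as long as it needs without the task being killed, as long as the streaming itself is still making headway.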
bq. Brandon, one thing I could think of, is if they are adding a lot of
batches, we don't actually call progress until the loop is over.
Hmm.. what is "a batch" in this context?
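Whatever a batch turns out to be, the concern in the quoted remark can be sketched generically: if progress is reported only after the loop, a long loop looks idle to Hadoop. The sketch below reports once per iteration instead. All names are hypothetical, and treating a batch as "one unit of work sent per iteration" is an assumption, since the thread does not define it.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration, not the actual OutputFormat code.
public class BatchProgress {
    /**
     * Sends every batch through {@code sender} and invokes {@code progress}
     * after each one, rather than once after the whole loop.
     * Returns the number of batches sent.
     */
    public static <B> int sendAll(List<B> batches, Consumer<B> sender, Runnable progress) {
        int sent = 0;
        for (B batch : batches) {
            sender.accept(batch);
            sent++;
            // Report inside the loop so a large number of batches
            // cannot exceed the task timeout unnoticed.
            progress.run();
        }
        return sent;
    }
}
```

If reporting per batch proved too chatty, reporting every N batches would be a reasonable variation on the same idea.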
> Add Progress Reporting to Cassandra OutputFormats
> -------------------------------------------------
>
> Key: CASSANDRA-3859
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3859
> Project: Cassandra
> Issue Type: Improvement
> Components: Hadoop, Tools
> Affects Versions: 1.1.0
> Reporter: Samarth Gahire
> Assignee: Brandon Williams
> Priority: Minor
> Labels: bulkloader, hadoop, mapreduce, sstableloader
> Fix For: 1.1.0
>
> Attachments: 0001-add-progress-reporting-to-BOF.txt,
> 0002-Add-progress-to-CFOF.txt
>
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> When using BulkOutputFormat to load data into Cassandra, we should report
> progress to the Hadoop job from within the sstable loader. If streaming for
> a particular task takes a long time and no progress is reported, Hadoop may
> kill the task with a timeout exception.