[ 
https://issues.apache.org/jira/browse/CASSANDRA-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210309#comment-13210309
 ] 

Erik Forsberg commented on CASSANDRA-3859:
------------------------------------------

Nope. Even with this patch applied, the task is still killed if the loading 
takes longer than mapred.task.timeout, so progress reporting is not working.

Would it be possible to add some counters instead of the progress reporting? 
That would be even more useful, e.g. Hadoop counters for reporting:

* Number of bytes of sstables generated, before compression
* Number of bytes of sstables generated, after applying hadoop-side compression 
(if enabled)
* Progress report on number of bytes streamed to servers.

I know I'm asking for a feature here, but it could also be an alternative 
solution to this problem.
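
A minimal sketch of how the three counters requested above might be wired up, 
assuming nothing about the actual patch. All names here (BulkLoadCounters, 
Metric, Sink, InMemorySink) are hypothetical and are not part of Cassandra or 
Hadoop. The Sink interface stands in for the Hadoop task context so the 
example runs without Hadoop on the classpath; in a real task, increment() 
would presumably delegate to context.getCounter(metric).increment(delta) and 
progress() to context.progress().

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper, not part of Cassandra or Hadoop.
final class BulkLoadCounters {

    // Hypothetical counter names mirroring the three metrics in the list above.
    enum Metric { SSTABLE_BYTES_UNCOMPRESSED, SSTABLE_BYTES_COMPRESSED, BYTES_STREAMED }

    // Assumed stand-in for the Hadoop task context (counters plus liveness signal).
    interface Sink {
        void increment(Metric metric, long delta);
        void progress();
    }

    // In-memory Sink used only to make this sketch runnable on its own.
    static final class InMemorySink implements Sink {
        private final Map<Metric, Long> totals = new EnumMap<>(Metric.class);
        private int progressCalls = 0;

        @Override
        public void increment(Metric metric, long delta) {
            totals.merge(metric, delta, Long::sum);
        }

        @Override
        public void progress() {
            progressCalls++;
        }

        long get(Metric metric) {
            return totals.getOrDefault(metric, 0L);
        }

        int progressCalls() {
            return progressCalls;
        }
    }

    private final Sink sink;

    BulkLoadCounters(Sink sink) {
        this.sink = sink;
    }

    // Record one finished sstable: its size before and after compression.
    void sstableWritten(long rawBytes, long compressedBytes) {
        sink.increment(Metric.SSTABLE_BYTES_UNCOMPRESSED, rawBytes);
        sink.increment(Metric.SSTABLE_BYTES_COMPRESSED, compressedBytes);
        sink.progress();
    }

    // Record bytes streamed since the last call and signal that the task is alive.
    void bytesStreamed(long delta) {
        sink.increment(Metric.BYTES_STREAMED, delta);
        sink.progress();
    }

    public static void main(String[] args) {
        InMemorySink sink = new InMemorySink();
        BulkLoadCounters counters = new BulkLoadCounters(sink);
        counters.sstableWritten(1000L, 400L);
        counters.bytesStreamed(250L);
        counters.bytesStreamed(150L);
        for (Metric m : Metric.values()) {
            System.out.println(m + " = " + sink.get(m));
        }
    }
}
```

Calling bytesStreamed() from the streaming loop, rather than once at the end, 
is the point of the design: each call both updates a counter and signals 
liveness, so a long-running stream would not sit silent for the whole task 
timeout.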
                
> Add Progress Reporting to Cassandra OutputFormats
> -------------------------------------------------
>
>                 Key: CASSANDRA-3859
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3859
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Hadoop, Tools
>    Affects Versions: 1.1.0
>            Reporter: Samarth Gahire
>            Assignee: Brandon Williams
>            Priority: Minor
>              Labels: bulkloader, hadoop, mapreduce, sstableloader
>             Fix For: 1.1.0
>
>         Attachments: 0001-add-progress-reporting-to-BOF.txt, 
> 0002-Add-progress-to-CFOF.txt
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> When BulkOutputFormat is used to load data into Cassandra, the sstable 
> loader should report progress to the Hadoop job. Otherwise, if streaming 
> for a particular task takes a long time and no progress is reported, the 
> job may kill the task with a timeout exception. 

        

Reply via email to