bert Passek created CASSANDRA-4229:
--------------------------------------
Summary: Infinite MapReduce Task while reading via ColumnFamilyInputFormat
Key: CASSANDRA-4229
URL: https://issues.apache.org/jira/browse/CASSANDRA-4229
Project: Cassandra
Issue Type: Bug
Components: Hadoop
Affects Versions: 1.1.0
Environment: Debian Squeeze
Reporter: bert Passek
Attachments: screenshot.jpg
Hi,
we recently upgraded Cassandra from version 1.0.9 to 1.1.0. Since then we
cannot run any Hadoop job that reads data from Cassandra via
ColumnFamilyInputFormat.
A map task is created and never finishes. We are trying to read from a
super column family with roughly 1000 row keys.
This is the counter output from the job interface. The job has already
processed more than 17 million map input records:
Counter                                       Map       Reduce          Total
Map input records                      17.273.127            0     17.273.127
Reduce shuffle bytes                            0          391            391
Spilled Records                             3.288            0          3.288
Map output bytes                      639.849.351            0    639.849.351
CPU time spent (ms)                       792.750        7.600        800.350
Total committed heap usage (bytes)    354.680.832   48.955.392    403.636.224
Combine input records                  17.039.783            0     17.039.783
SPLIT_RAW_BYTES                               212            0            212
Reduce input records                            0            0              0
Reduce input groups                             0            0              0
Combine output records                      3.288            0          3.288
Physical memory (bytes) snapshot      510.275.584   96.370.688    606.646.272
Reduce output records                           0            0              0
Virtual memory (bytes) snapshot     1.826.496.512  934.473.728  2.760.970.240
Map output records                     17.273.126            0     17.273.126
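To put the counters in perspective, here is a minimal sketch that compares the reported map input records with the roughly 1000 row keys we expect. It is not part of our job: the class and method names are made up for illustration, and it only assumes that the counter is printed with '.' as the thousands separator, as in the output above.

```java
public class CounterCheck {
    // Parses a counter value printed with '.' as the thousands separator,
    // e.g. "17.273.127" from the job output.
    static long parseGrouped(String text) {
        return Long.parseLong(text.replace(".", ""));
    }

    // Returns how many times the observed record count exceeds the
    // expected one (integer division, so the result is rounded down).
    static long overshootFactor(long observed, long expected) {
        if (expected <= 0) {
            throw new IllegalArgumentException("expected must be positive");
        }
        return observed / expected;
    }

    public static void main(String[] args) {
        long mapInputRecords = parseGrouped("17.273.127"); // from the counters
        long expectedRows = 1000;                          // approximate row key count
        System.out.println(overshootFactor(mapInputRecords, expectedRows));
    }
}
```

With the numbers above, the map task has read each expected row many thousands of times over, which suggests the reader keeps returning the same rows instead of reaching the end of its split.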
We have to kill the job and go back to version 1.0.9, because 1.1.0 is
not usable for reading from Cassandra.
Best regards
Bert Passek