[jira] [Commented] (CASSANDRA-3589) Degraded performance of sstable-generator api and sstable-loader utility in cassandra 1.0.x
[ https://issues.apache.org/jira/browse/CASSANDRA-3589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13207466#comment-13207466 ]

Peter Schuller commented on CASSANDRA-3589:
---

I just realized something. I wasn't looking into this ticket, but I had a separate realization while investigating streaming performance: ever since Cassandra moved to single-pass streaming (CASSANDRA-2677), streaming easily becomes CPU bound. If the mbps numbers earlier in this ticket (35, 19) are mega*bytes*, they are well within what you might reasonably expect from being CPU bound.

I have yet to look into how the bulk loader works, but is it possible that it goes through the same path, such that it spends CPU time re-creating the sstable on reception? (This may be obvious to folks; I have never actually looked at the bulk loader support before.)

Degraded performance of sstable-generator api and sstable-loader utility in cassandra 1.0.x
---
Key: CASSANDRA-3589
URL: https://issues.apache.org/jira/browse/CASSANDRA-3589
Project: Cassandra
Issue Type: Bug
Components: Tools
Affects Versions: 1.0.0
Reporter: Samarth Gahire
Priority: Minor

We are using the Sstable-Generation API and the Sstable-Loader utility. Whenever a newer version of Cassandra is released, I test it for sstable generation and loading, measuring the time taken by both processes. Up to Cassandra 0.8.7 there was no significant change in the time taken. But in all Cassandra 1.0.x releases I have seen 3-4 times degraded performance in generation and 2 times degraded performance in loading. Because of this we are not upgrading Cassandra to the latest version: we process some terabytes of data every day, so the time taken is very important.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
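The unit question raised in this comment can be made concrete with a little arithmetic. The following is a minimal sketch, assuming "mbps" means either megabits or megabytes per second (the ticket does not say which); the function name is illustrative only, and the version labels are taken from the measurements reported elsewhere in this ticket.

```python
def as_megabytes_per_second(value, unit_is_megabits):
    """Convert a reported 'mbps' figure to megabytes per second."""
    # There are 8 bits per byte; a figure already in megabytes is returned unchanged.
    return value / 8 if unit_is_megabits else float(value)


# Figures reported in the ticket for disk writes during sstable generation.
for label, reported in [("cassandra-0.8.7", 35), ("cassandra-1.0.5", 19)]:
    if_bits = as_megabytes_per_second(reported, unit_is_megabits=True)
    if_bytes = as_megabytes_per_second(reported, unit_is_megabits=False)
    print(f"{label}: {if_bits:.3f} MB/s if megabits, {if_bytes:.3f} MB/s if megabytes")
```

The two readings differ by a factor of eight, which is why the comment's CPU-bound reasoning is stated only for the megabytes case.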
[jira] [Commented] (CASSANDRA-3589) Degraded performance of sstable-generator api and sstable-loader utility in cassandra 1.0.x
[ https://issues.apache.org/jira/browse/CASSANDRA-3589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169343#comment-13169343 ]

Samarth Gahire commented on CASSANDRA-3589:
---

I used the apache-cassandra-2011-12-14_03-12-43-bin.tar.gz binary, downloaded from https://builds.apache.org/job/Cassandra/1255/artifact/cassandra/build/ to test sstable generation. I hope this is the proper binary for testing; please correct me if I am wrong. There is a massive improvement. Thank you so much for the fix. I am eagerly waiting for the sstable-loader fix.

Degraded performance of sstable-generator api and sstable-loader utility in cassandra 1.0.x
---
Key: CASSANDRA-3589
URL: https://issues.apache.org/jira/browse/CASSANDRA-3589
Project: Cassandra
Issue Type: Bug
Components: Tools
Affects Versions: 1.0.0
Reporter: Samarth Gahire
Assignee: Sylvain Lebresne
Priority: Minor
[jira] [Commented] (CASSANDRA-3589) Degraded performance of sstable-generator api and sstable-loader utility in cassandra 1.0.x
[ https://issues.apache.org/jira/browse/CASSANDRA-3589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13168321#comment-13168321 ]

Samarth Gahire commented on CASSANDRA-3589:
---

No, I do not have secondary indexes on any of the column families. I have done a fair comparison and still see some performance hit in the sstable-loader utility.
[jira] [Commented] (CASSANDRA-3589) Degraded performance of sstable-generator api and sstable-loader utility in cassandra 1.0.x
[ https://issues.apache.org/jira/browse/CASSANDRA-3589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13168467#comment-13168467 ]

Jonathan Ellis commented on CASSANDRA-3589:
---

Have you been able to benchmark Sylvain's patch?
[jira] [Commented] (CASSANDRA-3589) Degraded performance of sstable-generator api and sstable-loader utility in cassandra 1.0.x
[ https://issues.apache.org/jira/browse/CASSANDRA-3589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13168010#comment-13168010 ]

Sylvain Lebresne commented on CASSANDRA-3589:
---

First, for generation: the problem is CASSANDRA-3618. I've opened that as a separate ticket, with a more accurate description, because it doesn't only affect the sstable generator. Note that in simple tests I saw up to a ~2x difference between 0.8 and 1.0, but never 3-4x. That could be because what you do for sstable generation exacerbates the problem more than my tests do, but it would still be great if you could try the patch on CASSANDRA-3618 and report whether it fully solves the sstable generation problem for you.

I have yet to check the sstable loading performance problem, but a quick question first: do you have secondary indexes on the column family you're loading data into?
[jira] [Commented] (CASSANDRA-3589) Degraded performance of sstable-generator api and sstable-loader utility in cassandra 1.0.x
[ https://issues.apache.org/jira/browse/CASSANDRA-3589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13166011#comment-13166011 ]

Samarth Gahire commented on CASSANDRA-3589:
---

I have checked the CPU and I/O utilization:

1) While generating sstables, CPU utilization is around 80% with cassandra-0.8.7 and around 90-95% with cassandra-1.0.5.
2) While generating sstables, we see disk writes every 20-25 seconds. cassandra-0.8.7 writes to disk at around 35mbps, while cassandra-1.0.5 writes at 19mbps.

Apart from this I can't see any difference. Let me know if you need additional information.
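For scale, the two write rates reported in this comment can be compared directly. This is a rough sketch, assuming the figures are sustained megabytes per second (an assumption: the writes are reported as bursts every 20-25 seconds, so real averages are likely lower) and using a hypothetical 1 TB daily volume; the function names are illustrative.

```python
def slowdown(old_rate, new_rate):
    """Ratio of the old write rate to the new one (greater than 1 means slower now)."""
    return old_rate / new_rate


def hours_to_write(total_mb, rate_mb_per_s):
    """Hours needed to write total_mb megabytes at a sustained rate."""
    return total_mb / rate_mb_per_s / 3600


ONE_TB_IN_MB = 1024 * 1024  # hypothetical daily volume, not a figure from the ticket

print(f"write-rate slowdown: {slowdown(35, 19):.2f}x")
for label, rate in [("cassandra-0.8.7", 35), ("cassandra-1.0.5", 19)]:
    print(f"{label}: {hours_to_write(ONE_TB_IN_MB, rate):.1f} hours per TB")
```

Under these assumptions the write-rate ratio comes out at roughly 1.8x.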
[jira] [Commented] (CASSANDRA-3589) Degraded performance of sstable-generator api and sstable-loader utility in cassandra 1.0.x
[ https://issues.apache.org/jira/browse/CASSANDRA-3589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13165139#comment-13165139 ]

Samarth Gahire commented on CASSANDRA-3589:
---

Actually, I am just measuring the total time taken by our program to generate the sstables. I see this difference when I change the Cassandra jar included in the program's classpath. I will check CPU and I/O utilisation and let you know as soon as possible.
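The measurement described in this comment (same program, different Cassandra jar on the classpath, total wall time compared) could be scripted along the following lines. This is only a sketch under stated assumptions: the jar paths, application classpath, and main class in the usage comment are hypothetical and do not come from the ticket, and the reporter's actual harness is not shown in the thread.

```python
import os
import subprocess
import time


def build_command(cassandra_jar, app_classpath, main_class):
    """Build a `java -cp` command line with the chosen Cassandra jar first."""
    classpath = os.pathsep.join([cassandra_jar, app_classpath])
    return ["java", "-cp", classpath, main_class]


def wall_time(command):
    """Run a command to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True)  # raises if the program exits non-zero
    return time.perf_counter() - start


def compare(jars, app_classpath, main_class):
    """Time the same program once per Cassandra jar; returns {jar: seconds}."""
    return {
        jar: wall_time(build_command(jar, app_classpath, main_class))
        for jar in jars
    }


# Hypothetical usage (all names below are placeholders):
# compare(
#     ["lib/apache-cassandra-0.8.7.jar", "lib/apache-cassandra-1.0.5.jar"],
#     "build/classes",
#     "GenerateSSTables",
# )
```

A single run per jar is noisy; repeating each run and keeping the same input data and output disk makes the comparison fairer.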