[jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess it's efficiency (precision)
[ https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726617#comment-13726617 ] Yuki Morishita commented on CASSANDRA-2698: --- Benedict, Thanks for the update. bq. 1) I'm not sure what you mean by not serializing those I meant you don't have to send back to the coordinator. Changing serialized format means we have to bump up messaging version defined in MessagingService. 2.0 got feature freeze, so I think it's better to wait until next minor version for message change. Also, I looked at the change made to MerkleTree#differenceHelper, and I'm still not sure how row count helps improve logic. What is the difference from just using hash value? bq. 5) One thing we might want to consider changing is the format of the EstimatedHistogram ranges in the log messages Yeah, especially -1 in (-1, 0] feels weird. How about omitting lower bound from the label and output like: {code} ~0: xxx ~10: xxx ~20: xxx {code} nit: you should surround whole output logic in Validator#compele with `logger.isDebugEnabled` check Instrument repair to be able to assess it's efficiency (precision) -- Key: CASSANDRA-2698 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Benedict Priority: Minor Labels: lhf Attachments: nodetool_repair_and_cfhistogram.tar.gz, patch_2698_v1.txt, patch.diff, patch-rebased.diff, patch.taketwo.alpha.diff, patch.taketwo.forreview.diff Some reports indicate that repair sometime transfer huge amounts of data. One hypothesis is that the merkle tree precision may deteriorate too much at some data size. To check this hypothesis, it would be reasonably to gather statistic during the merkle tree building of how many rows each merkle tree range account for (and the size that this represent). It is probably an interesting statistic to have anyway. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess it's efficiency (precision)
[ https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726641#comment-13726641 ] Benedict commented on CASSANDRA-2698: - Good points. As to the changes to differenceHelper(), the row count permits not breaking up contiguous ranges of differences that happen to be separated by unpopulated leaves (using just the hash to determine if the data was populated I realised was dangerous, as you cannot disambiguate between no rows and a non-zero number of empty rows), which in my previous patch was generating a lot of ugly log messages. After sending my patch last night I must admit I began to doubt the sense of keeping the changes in, and was probably the hangover of wanting to retain what I could from the previous patch. I think kill your babies is the mantra to apply here, as it doesn't serve any purpose at the moment, and if we don't intend to send counts over the wire would be actively dangerous. I'll strip out those changes, modify the messages and and fire over another patch. That said, I'd prefer to emit the lower bound as well so we know the starting point; ~100: xxx doesn't tell you if the distribution is 0-100, or 99-100, which might be useful information. This is only helpful for the first item, so could emit only for that, but for neatness I'd probably retain it for all; since we're dealing with integers there's an easy fix of just bumping both start/end by 1 and swapping the brackets. Instrument repair to be able to assess it's efficiency (precision) -- Key: CASSANDRA-2698 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Benedict Priority: Minor Labels: lhf Attachments: nodetool_repair_and_cfhistogram.tar.gz, patch_2698_v1.txt, patch.diff, patch-rebased.diff, patch.taketwo.alpha.diff, patch.taketwo.forreview.diff Some reports indicate that repair sometime transfer huge amounts of data. One hypothesis is that the merkle tree precision may deteriorate too much at some data size. To check this hypothesis, it would be reasonably to gather statistic during the merkle tree building of how many rows each merkle tree range account for (and the size that this represent). It is probably an interesting statistic to have anyway. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess it's efficiency (precision)
[ https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726665#comment-13726665 ] Benedict commented on CASSANDRA-2698: - Of course the other option is to always emit the range [0..lb] even if its not populated to demarcate the lb - which is your preference? Instrument repair to be able to assess it's efficiency (precision) -- Key: CASSANDRA-2698 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Benedict Priority: Minor Labels: lhf Attachments: nodetool_repair_and_cfhistogram.tar.gz, patch_2698_v1.txt, patch.diff, patch-rebased.diff, patch.taketwo.alpha.diff, patch.taketwo.forreview.diff Some reports indicate that repair sometime transfer huge amounts of data. One hypothesis is that the merkle tree precision may deteriorate too much at some data size. To check this hypothesis, it would be reasonably to gather statistic during the merkle tree building of how many rows each merkle tree range account for (and the size that this represent). It is probably an interesting statistic to have anyway. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess it's efficiency (precision)
[ https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13725604#comment-13725604 ] Benedict commented on CASSANDRA-2698: - Hi Yuki, Had some fun rebasing, but think everything looks good now. A few things to note: 1) I'm not sure what you mean by not serializing those - for correctness I serialize all of the data in a node. Do you want me to change the serialization methods to not send these values? I don't log them the other end, but I would prefer they were sent to ensure no surprises for users of the data, and also because of some optimisations to difference() that rely on knowing the number of rows for each sub-tree. It's not a tremendous amount of data after all. 2) I've modified DifferencerTest, and created two versions of the testDifference() method - one that tests differences on an empty tree, and one which tests a tree that has been populated with rows. Previously only the former was tested. This is because the changes I made to difference() for my previous patch, which I have retained and which ensures contiguous ranges are emitted where possible, treats the entire empty tree as one contiguous difference range (since the only non-empty sub-range in the tree is different), which was breaking the previous test. This test now works with the fully populated tree, and the previous test now confirms that the whole tree is considered different when it is empty. It's possible you may want to not deploy these improvements in this patch, but it seems a good idea to me whilst it's being modified, and given that I'd made the change already. Since we're not logging the ranges themselves at this time it won't have any direct impact, but it will be useful if that ever changes, and might help with future debugging. 3) I've updated the MerkleTreeTest methods to test the serialization and difference changes, and introduced a new HistogramBuilderTest 4) The histogram is built differently from my first patch, and is described in HistogramBuilder. Basically rather than creating neat linear ranges, I calculate the mean and create ranges that are multiples of the standard deviation either side of the mean, up to min/max (or, in this case, 3 stdevs, plus one range to min/max) 5) One thing we might want to consider changing is the format of the EstimatedHistogram ranges in the log messages. I've reproduced faithfully the boundary conventions of the EstimatedHistogram, but this is not a user friendly convention - it has an exclusive lower bound and inclusive upper bound, as opposed to the typical opposite convention. As such we get ranges like (-1, 0] to represent the range containing only 0, as opposed to [0, 1) Think that's everything. Should respond quickly to queries at the moment, so drop me a line if you have any questions. Instrument repair to be able to assess it's efficiency (precision) -- Key: CASSANDRA-2698 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Benedict Priority: Minor Labels: lhf Attachments: nodetool_repair_and_cfhistogram.tar.gz, patch_2698_v1.txt, patch.diff, patch-rebased.diff, patch.taketwo.alpha.diff Some reports indicate that repair sometime transfer huge amounts of data. One hypothesis is that the merkle tree precision may deteriorate too much at some data size. To check this hypothesis, it would be reasonably to gather statistic during the merkle tree building of how many rows each merkle tree range account for (and the size that this represent). It is probably an interesting statistic to have anyway. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess it's efficiency (precision)
[ https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724263#comment-13724263 ] Benedict commented on CASSANDRA-2698: - Thanks Yuki, sounds like my changes are fine assuming they test okay. I have a bit of a furlow between my various trips now (which I had rather optimistically expected to find time to test this on) so should be able to get a patch over in the next couple of days or so. Instrument repair to be able to assess it's efficiency (precision) -- Key: CASSANDRA-2698 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Benedict Priority: Minor Labels: lhf Attachments: nodetool_repair_and_cfhistogram.tar.gz, patch_2698_v1.txt, patch.diff, patch-rebased.diff, patch.taketwo.alpha.diff Some reports indicate that repair sometime transfer huge amounts of data. One hypothesis is that the merkle tree precision may deteriorate too much at some data size. To check this hypothesis, it would be reasonably to gather statistic during the merkle tree building of how many rows each merkle tree range account for (and the size that this represent). It is probably an interesting statistic to have anyway. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess it's efficiency (precision)
[ https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13704655#comment-13704655 ] Yuki Morishita commented on CASSANDRA-2698: --- Hi Benedict, Sorry for late reply. I think calculating number of rows and those size per range part is fine. One thing to point out is that we don't need to serialize those and return to the initiator, just log locally like you do is enough for now. p.s. ActiveRepairService is broken up to o.a.c.repair package, so be careful when rebasing. Instrument repair to be able to assess it's efficiency (precision) -- Key: CASSANDRA-2698 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Benedict Priority: Minor Labels: lhf Attachments: nodetool_repair_and_cfhistogram.tar.gz, patch_2698_v1.txt, patch.diff, patch-rebased.diff, patch.taketwo.alpha.diff Some reports indicate that repair sometime transfer huge amounts of data. One hypothesis is that the merkle tree precision may deteriorate too much at some data size. To check this hypothesis, it would be reasonably to gather statistic during the merkle tree building of how many rows each merkle tree range account for (and the size that this represent). It is probably an interesting statistic to have anyway. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess it's efficiency (precision)
[ https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689804#comment-13689804 ] Benedict commented on CASSANDRA-2698: - Hi Yuki, {quote} how many rows each merkle tree range account for (and the size that this represent) {quote} I take it two histograms, one of row distribution, the other of size (of row) distribution, is what you're looking for? I've attached a new patch which is *not* complete, in that I have not tested it and may want to change a few of the final details (such as, possibly, where and what is logged with the histogram), but before I iron out those kinks I wanted to run past the main crux of the changes to see if it's what you're looking for. Simply put, the merkle tree ranges now retain both a sizeOfRange (ie size of rows added) and rowsInRange (ie number of rows added). Merkle tree now exposes two histogramXXX() methods which use these, and which as of now are logged in Validator.complete(). As it stands, I serialise both these new values over the wire with any merkle tree, to ensure no unexpected behaviour for future users of the class, and as such I also retained my TreeDifference changes to the merkle tree, which reports the size and row count of each side of a difference. These latter two changes may be slightly controversial, so want to run them past you, as well as confirm the basic information I'm printing is what you're looking for. Cheers! Instrument repair to be able to assess it's efficiency (precision) -- Key: CASSANDRA-2698 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Benedict Priority: Minor Labels: lhf Attachments: nodetool_repair_and_cfhistogram.tar.gz, patch_2698_v1.txt, patch.diff, patch-rebased.diff Some reports indicate that repair sometime transfer huge amounts of data. One hypothesis is that the merkle tree precision may deteriorate too much at some data size. To check this hypothesis, it would be reasonably to gather statistic during the merkle tree building of how many rows each merkle tree range account for (and the size that this represent). It is probably an interesting statistic to have anyway. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess it's efficiency (precision)
[ https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689854#comment-13689854 ] Benedict commented on CASSANDRA-2698: - Just noticed that patch doesn't include the HistogramBuilder class - after I posted I noticed I needed to pull in the latest remote changes, which unfortunately use 1.7 syntax, and since I'm on an old eclipse my problems window exploded. It's late here so I didn't/don't want to faff around too much, but if you have trouble let me know and I'll upload another patch after upgrading eclipse. The patch uploaded should be easy to quickly scan with my description to let me know if I'm still barking up the wrong tree, or if there are any changes you disagree with in principle. Instrument repair to be able to assess it's efficiency (precision) -- Key: CASSANDRA-2698 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Benedict Priority: Minor Labels: lhf Attachments: nodetool_repair_and_cfhistogram.tar.gz, patch_2698_v1.txt, patch.diff, patch-rebased.diff, patch.taketwo.alpha.diff Some reports indicate that repair sometime transfer huge amounts of data. One hypothesis is that the merkle tree precision may deteriorate too much at some data size. To check this hypothesis, it would be reasonably to gather statistic during the merkle tree building of how many rows each merkle tree range account for (and the size that this represent). It is probably an interesting statistic to have anyway. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess it's efficiency (precision)
[ https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13684430#comment-13684430 ] Benedict commented on CASSANDRA-2698: - Hi Yuki, Sorry for the glacial response. {quote} If you do that, you should create your own class with labels and array since you're not using default offsets nor other histogram related methods. It confused me at first why you are doing addToIndex to EstimatedHistogram. {quote} Agreed. I was a little reticent to introduce a different histogram class, but it is a little ugly as it stands. {quote} But looking at this from the begining again, what we want to see is if we have Merkle tree of evenly distributed keys(or rows) in each hash. You can use EstimatedHistogram or your own to show that. For now, just use logger to log that distribution at the end of Merkle Tree creation with corresponding repair session Id is fine, instead of sending stats back to the coordinator. {quote} It sounds like all you want logged is the number of rows per merkle leaf, to see if the tree is roughly balanced? In which case I misinterpreted the goal entirely, though it makes sense now I understand more how the repair works. Is it worth leaving in the streaming of the leaf sizes with the merkle tree so that the deltas can be logged in future, should that be desired? I'll strip out the logging of the size of the ranges being streamed for now as well. Instrument repair to be able to assess it's efficiency (precision) -- Key: CASSANDRA-2698 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Benedict Priority: Minor Labels: lhf Attachments: nodetool_repair_and_cfhistogram.tar.gz, patch_2698_v1.txt, patch.diff, patch-rebased.diff Some reports indicate that repair sometime transfer huge amounts of data. One hypothesis is that the merkle tree precision may deteriorate too much at some data size. To check this hypothesis, it would be reasonably to gather statistic during the merkle tree building of how many rows each merkle tree range account for (and the size that this represent). It is probably an interesting statistic to have anyway. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess it's efficiency (precision)
[ https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13629454#comment-13629454 ] Jonathan Ellis commented on CASSANDRA-2698: --- Jake has an Eclipse formatter at https://github.com/tjake/cassandra-style-eclipse. Not sure if it's 100% accurate but Jake's patches look okay to me so it's probably close. :) Instrument repair to be able to assess it's efficiency (precision) -- Key: CASSANDRA-2698 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Benedict Priority: Minor Labels: lhf Attachments: nodetool_repair_and_cfhistogram.tar.gz, patch_2698_v1.txt, patch.diff, patch-rebased.diff Some reports indicate that repair sometime transfer huge amounts of data. One hypothesis is that the merkle tree precision may deteriorate too much at some data size. To check this hypothesis, it would be reasonably to gather statistic during the merkle tree building of how many rows each merkle tree range account for (and the size that this represent). It is probably an interesting statistic to have anyway. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess it's efficiency (precision)
[ https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626520#comment-13626520 ] Benedict commented on CASSANDRA-2698: - Hi Yuki, Without in some way collecting (or at least sampling) the size of the differences, I don't know what bucket sizes to use. Since I need to reinsert all the records once I've decided this anyway, I need to retain them all, which I chose to do in EstimatedHistogram as they do, in effect, constitute a histogram. I also sample the largest records which I figure could be useful for debugging purposes (though that was just a guess). I don't see why 1000s of items is a major issue? I agree that logging is suboptimal for this data. Presumably similar data for other tasks may be optionally logged in future, and so I would guess this should form part of a discussion about metric logging? {quote} fix coding style (especially whitespace) to match other code. {quote} Do you have an Eclipse formatter profile I could use for your coding convention? I did my best to keep it correct manually, but it is difficult to spot differences in an unfamiliar convention. Whitespace should be comparatively easy though. {quote} EstimatedHistogram#testGroupBy is failing. {quote} Noted - will fix and resubmit {quote} comparator in Arrays#sort in EstimatedHistogram#logSummary has the same conditions in both if and else if. {quote} Thanks, good spot. I'm surprised Eclipse didn't warn me. Instrument repair to be able to assess it's efficiency (precision) -- Key: CASSANDRA-2698 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Benedict Priority: Minor Labels: lhf Attachments: nodetool_repair_and_cfhistogram.tar.gz, patch_2698_v1.txt, patch.diff, patch-rebased.diff Some reports indicate that repair sometime transfer huge amounts of data. One hypothesis is that the merkle tree precision may deteriorate too much at some data size. To check this hypothesis, it would be reasonably to gather statistic during the merkle tree building of how many rows each merkle tree range account for (and the size that this represent). It is probably an interesting statistic to have anyway. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess it's efficiency (precision)
[ https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626208#comment-13626208 ] Yuki Morishita commented on CASSANDRA-2698: --- [~benedict] hmm, it is not clear to me why you create EstimatedHistogram of size of differences. Sometimes I see more than 1000 of differences for large clusters. You should just create it with reasonable bucket count. You don't have to keep every size for every range. Using logging for outputting statistic is fine at this point, but I think we should come up with other way so that it is easy to see all the related logs and statistics about certain repair session. I don't have specific idea yet though. (Maybe another system cf similar to Tracing?) nits: - fix coding style (especially whitespace) to match other code. - EstimatedHistogram#testGroupBy is failing. - comparator in Arrays#sort in EstimatedHistogram#logSummary has the same conditions in both if and else if. Instrument repair to be able to assess it's efficiency (precision) -- Key: CASSANDRA-2698 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Benedict Priority: Minor Labels: lhf Attachments: nodetool_repair_and_cfhistogram.tar.gz, patch_2698_v1.txt, patch.diff, patch-rebased.diff Some reports indicate that repair sometime transfer huge amounts of data. One hypothesis is that the merkle tree precision may deteriorate too much at some data size. To check this hypothesis, it would be reasonably to gather statistic during the merkle tree building of how many rows each merkle tree range account for (and the size that this represent). It is probably an interesting statistic to have anyway. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess it's efficiency (precision)
[ https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13616569#comment-13616569 ] Benedict commented on CASSANDRA-2698: - Hi Yuki, The patch was created some time ago, and there were some minor renames/changes to MerkleTree and AntiEntropyService in the meantime. I've pulled the latest changes, merged, and regenerated the patch. This is against the main trunk. Instrument repair to be able to assess it's efficiency (precision) -- Key: CASSANDRA-2698 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Benedict Priority: Minor Labels: lhf Attachments: nodetool_repair_and_cfhistogram.tar.gz, patch_2698_v1.txt, patch.diff, patch-rebased.diff Some reports indicate that repair sometime transfer huge amounts of data. One hypothesis is that the merkle tree precision may deteriorate too much at some data size. To check this hypothesis, it would be reasonably to gather statistic during the merkle tree building of how many rows each merkle tree range account for (and the size that this represent). It is probably an interesting statistic to have anyway. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess it's efficiency (precision)
[ https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13612683#comment-13612683 ] Yuki Morishita commented on CASSANDRA-2698: --- Hi, Benedict, Patch is not applicable to any branch. Can you rebase and re-upload? When you do, please tell us which branch the patch is for. Instrument repair to be able to assess it's efficiency (precision) -- Key: CASSANDRA-2698 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Benedict Priority: Minor Labels: lhf Attachments: nodetool_repair_and_cfhistogram.tar.gz, patch_2698_v1.txt, patch.diff Some reports indicate that repair sometime transfer huge amounts of data. One hypothesis is that the merkle tree precision may deteriorate too much at some data size. To check this hypothesis, it would be reasonably to gather statistic during the merkle tree building of how many rows each merkle tree range account for (and the size that this represent). It is probably an interesting statistic to have anyway. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess it's efficiency (precision)
[ https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13609625#comment-13609625 ] Benedict commented on CASSANDRA-2698: - Hi, I've uploaded a patch for this issue (patch.diff - apologies for the potentially future-clashing name). Logging is performed in two places, both on the source (not requesting) node of any comparison: 1) On the requesting node in AntiEntropyService.Difference.run(), after the MerkleTree difference is calculated and before the StreamingRepairTask is created 2) On the source node, on which StreamingRepairTask is run, in StreamOut.createPendingFiles() In both cases we log, at debug level, a sample of the largest ranges followed by a histogram of the range size distribution. The first is achieved by inserting each range directly into an EstimatedHistogram, on which we call the new logSummary() method; the second by calling the new groupByFrequency() method on that same histogram, to yield a histogram based on the frequency of sizes present in the original (on which we simply call log()). In case 1, we construct the MerkleTree to include a size taken from the AbstractCompactedRow we compute the hash from, and use this in MerkleTree.difference to estimate the size of mismatching ranges. This tends to underestimate, versus that reported by StreamOut, by around 15%. One design decision of note here: instead of modifying AbstractCompactedRow to return a size (which would be invasive and in some cases incur an unnecessary penalty) we use a custom implementation of MessageDigest that counts the number of bytes provided to it. Case 2 is much simpler, as we already have the ranges and their sizes available to us. There are some other changes, particularly in MerkleTree, with some refactoring/renames/new subclasses as part of updating MerkleTree.difference(). In particular, TreeDifference is returned instead of TreeRange (to accommodate the extra size information), and it is used generally in place of it within this method tree where applicable; hash() and hashHelper() have also been renamed to find() and findHelper(), with a new hash() implementation depending on find(). I'm sure there are other minutiae, but hopefully nothing too opaque. If you need any clarification, feel free to ask. Instrument repair to be able to assess it's efficiency (precision) -- Key: CASSANDRA-2698 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Priority: Minor Labels: lhf Attachments: nodetool_repair_and_cfhistogram.tar.gz, patch_2698_v1.txt, patch.diff Some reports indicate that repair sometime transfer huge amounts of data. One hypothesis is that the merkle tree precision may deteriorate too much at some data size. To check this hypothesis, it would be reasonably to gather statistic during the merkle tree building of how many rows each merkle tree range account for (and the size that this represent). It is probably an interesting statistic to have anyway. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess it's efficiency (precision)
[ https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13575928#comment-13575928 ] Ivan Sobolev commented on CASSANDRA-2698: - Hi, guys, it appears, the thing would be useful for me too. I'm going to give it a try. I'll follow the plan summarized by [~jbellis]. If you feel that I should know something you know - please shout :) Instrument repair to be able to assess it's efficiency (precision) -- Key: CASSANDRA-2698 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Priority: Minor Labels: lhf Attachments: nodetool_repair_and_cfhistogram.tar.gz Some reports indicate that repair sometime transfer huge amounts of data. One hypothesis is that the merkle tree precision may deteriorate too much at some data size. To check this hypothesis, it would be reasonably to gather statistic during the merkle tree building of how many rows each merkle tree range account for (and the size that this represent). It is probably an interesting statistic to have anyway. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess it's efficiency (precision)
[ https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13408329#comment-13408329 ] Jonathan Ellis commented on CASSANDRA-2698: --- Following the comments at the top, you want two things here: # A histogram of TreeRange row counts # for each pair of merkle tree, the number of ranges that differs and the corresponding streamed size of the data 1. is easy: add an EstimatedHistogram to the MerkleTree class, and when the ranges are finished computing, you'd iterate over each and add its row count to the histogram 2. is a bit more involved: you want to extend the logging done by Differencer to include the given information, which is going to involve poking into the guts of (probably) MerkleTree.difference and performStreamingRepair. I agree though that repair is an intimidating part of the code base. If you want to start with something simpler, that's fine too. Instrument repair to be able to assess it's efficiency (precision) -- Key: CASSANDRA-2698 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Priority: Minor Labels: lhf Attachments: nodetool_repair_and_cfhistogram.tar.gz Some reports indicate that repair sometime transfer huge amounts of data. One hypothesis is that the merkle tree precision may deteriorate too much at some data size. To check this hypothesis, it would be reasonably to gather statistic during the merkle tree building of how many rows each merkle tree range account for (and the size that this represent). It is probably an interesting statistic to have anyway. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess it's efficiency (precision)
[ https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13406548#comment-13406548 ] Jason Wee commented on CASSANDRA-2698: -- Hello Jonathan, I read through this ticket and trying to understand the context. Unfortunately, at this stage, I am clueless where to start. However, attach is the output of nodetool repair and nodetool cfhistogram which may help in this ticket context imo. Please let me know if the attachment helps or I should probably go to trivial bug first so that I can have more time to understand the code as well as the development culture and procedures here. Instrument repair to be able to assess it's efficiency (precision) -- Key: CASSANDRA-2698 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Priority: Minor Labels: lhf Attachments: nodetool_repair_and_cfhistogram.tar.gz Some reports indicate that repair sometime transfer huge amounts of data. One hypothesis is that the merkle tree precision may deteriorate too much at some data size. To check this hypothesis, it would be reasonably to gather statistic during the merkle tree building of how many rows each merkle tree range account for (and the size that this represent). It is probably an interesting statistic to have anyway. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess it's efficiency (precision)
[ https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13405822#comment-13405822 ] Jason Wee commented on CASSANDRA-2698: -- Hello, I've been using cassandra and develop client application to interfacing (crud) to the cassandra. It has been a great software and I would like to contribute back to cassandra and I've read http://wiki.apache.org/cassandra/HowToContribute which link me to https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=truejqlQuery=project+%3D+12310865+AND+labels+%3D+lhf+AND+status+!%3D+resolved . Since this is a fresh for me, I'm not sure if this is a right place to start contributing and I hope you can response where and how should I able to contribute. Thank you. Instrument repair to be able to assess it's efficiency (precision) -- Key: CASSANDRA-2698 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Priority: Minor Labels: lhf Some reports indicate that repair sometime transfer huge amounts of data. One hypothesis is that the merkle tree precision may deteriorate too much at some data size. To check this hypothesis, it would be reasonably to gather statistic during the merkle tree building of how many rows each merkle tree range account for (and the size that this represent). It is probably an interesting statistic to have anyway. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess it's efficiency (precision)
[ https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13405990#comment-13405990 ] Jonathan Ellis commented on CASSANDRA-2698: --- Hi Jason, you've come to the right place. Fire away. Instrument repair to be able to assess it's efficiency (precision) -- Key: CASSANDRA-2698 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Priority: Minor Labels: lhf Some reports indicate that repair sometime transfer huge amounts of data. One hypothesis is that the merkle tree precision may deteriorate too much at some data size. To check this hypothesis, it would be reasonably to gather statistic during the merkle tree building of how many rows each merkle tree range account for (and the size that this represent). It is probably an interesting statistic to have anyway. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess it's efficiency (precision)
[ https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13095348#comment-13095348 ] Sylvain Lebresne commented on CASSANDRA-2698: - An EstimatedHistogram would be just fine. That plus for each pair of merkle tree, the number of ranges that differs and the corresponding streamed size of the data would be a very good start imho. I think the only thing we need to figure out for this patch is where it makes the most sense to record that data. What I mean here is that the merkle trees are computed on each node participating in a repair (and thus that is where the EstimatedHistogram can be computed), while the computing of the differences is only done on the coordinator. But on an ideal world, it would seem more useful to store those information together (for a given repair) because they are related. Instrument repair to be able to assess it's efficiency (precision) -- Key: CASSANDRA-2698 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Priority: Minor Labels: lhf Some reports indicate that repair sometime transfer huge amounts of data. One hypothesis is that the merkle tree precision may deteriorate too much at some data size. To check this hypothesis, it would be reasonably to gather statistic during the merkle tree building of how many rows each merkle tree range account for (and the size that this represent). It is probably an interesting statistic to have anyway. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira