[jira] [Commented] (CASSANDRA-11090) Hinted Handoff loop
[ https://issues.apache.org/jira/browse/CASSANDRA-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15505264#comment-15505264 ]

Aleksey Yeschenko commented on CASSANDRA-11090:
---

That doesn't seem like a problem to me - that's the way it should work. The original issue was causing empty hint files to be created and deleted every 10 seconds, this seems to be processing non-empty hint files every minute, as intended.

> Hinted Handoff loop
> ---
>
> Key: CASSANDRA-11090
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11090
> Project: Cassandra
> Issue Type: Bug
> Reporter: Blake Eggleston
> Assignee: Aleksey Yeschenko
> Priority: Minor
> Fix For: 3.0.4, 3.4
>
>
> After the hints executor finishes sending hints, another hints file is
> created for the same host, which is then processed on the next go around.
> This continues indefinitely. The new hint files are empty, so there shouldn't
> be any network traffic. However, there's still unnecessary hint activity, and
> the hint crc file isn't deleted in the 3.0 branch after hints are dispatched
> (but is in trunk), so any hint activity will trigger ~8640 files being
> created per day until the node is restarted. Restarting the node fixes the
> problem, after the existing hint files are processed.
> This can be duplicated on cassandra-3.0 to trunk with this script:
> https://gist.github.com/bdeggleston/13fbb9e70c0c0bd277c7

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11090) Hinted Handoff loop
[ https://issues.apache.org/jira/browse/CASSANDRA-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15429960#comment-15429960 ]

Cameron Zemek commented on CASSANDRA-11090:
---

In both 3.5 and 3.7 we are seeing a hinted handoff loop:

Aug 04 07:20:33 ip-10-222-104-29.ec2.internal cassandra[31062]: INFO o.a.c.hints.HintsDispatchExecutor Finished hinted handoff of file 4566c9ad-69df-45fd-84d2-6386072d3efa-1470295200416-1.hints to endpoint 4566c9ad-69df-45fd-84d2-6386072d3efa
Aug 04 07:21:03 ip-10-222-104-29.ec2.internal cassandra[31062]: INFO o.apache.cassandra.hints.HintsStore Deleted hint file 4566c9ad-69df-45fd-84d2-6386072d3efa-1470295230416-1.hints
Aug 04 07:21:03 ip-10-222-104-29.ec2.internal cassandra[31062]: INFO o.a.c.hints.HintsDispatchExecutor Finished hinted handoff of file 4566c9ad-69df-45fd-84d2-6386072d3efa-1470295230416-1.hints to endpoint 4566c9ad-69df-45fd-84d2-6386072d3efa
Aug 04 07:21:33 ip-10-222-104-29.ec2.internal cassandra[31062]: INFO o.apache.cassandra.hints.HintsStore Deleted hint file 4566c9ad-69df-45fd-84d2-6386072d3efa-1470295260416-1.hints

and it just keeps repeating this. We have to do a rolling restart of the cluster to resolve it. We are waiting for it to happen again so we can collect more information on the cause of this loop.
[jira] [Commented] (CASSANDRA-11090) Hinted Handoff loop
[ https://issues.apache.org/jira/browse/CASSANDRA-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15138952#comment-15138952 ]

Aleksey Yeschenko commented on CASSANDRA-11090:
---

Committed as [1f626087c8819b75f17fcbe757603fc0026d3cc1|https://github.com/apache/cassandra/commit/1f626087c8819b75f17fcbe757603fc0026d3cc1] to 3.0 and merged with trunk, thanks.
[jira] [Commented] (CASSANDRA-11090) Hinted Handoff loop
[ https://issues.apache.org/jira/browse/CASSANDRA-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15138157#comment-15138157 ]

Blake Eggleston commented on CASSANDRA-11090:
---

ah I missed that. +1 then
[jira] [Commented] (CASSANDRA-11090) Hinted Handoff loop
[ https://issues.apache.org/jira/browse/CASSANDRA-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15138023#comment-15138023 ]

Aleksey Yeschenko commented on CASSANDRA-11090:
---

I could, but it would change the semantics around closing the current writer. Right now we only close in {{flush()}} if we are forced to switch the writer if it exceeds the max configured hint file size. With this change, we would do it unconditionally.
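To illustrate the closing semantics described in this comment, here is a minimal, self-contained sketch. It is not the Cassandra code: `FakeStore`, `MAX_FILE_SIZE`, and the use of a hint count instead of a byte size are all assumptions made for illustration. It only shows the idea that the writer is closed when the size limit is hit with hints still left to write, and is otherwise left open.

```java
import java.util.Arrays;
import java.util.Iterator;

public class WriterRolloverSketch {
    // Hypothetical limit, counted in hints per file (the real limit is a configured file size).
    static final int MAX_FILE_SIZE = 2;

    // Hypothetical stand-in for a per-host hints store; it only counts opens and closes.
    static class FakeStore {
        int opened = 0;
        int closed = 0;
        int inCurrent = 0;
        boolean open = false;

        // Appends hints to the current writer until the size limit is reached.
        void write(Iterator<String> hints) {
            if (!open) {
                open = true;
                opened++;
                inCurrent = 0;
            }
            while (hints.hasNext() && inCurrent < MAX_FILE_SIZE) {
                hints.next();
                inCurrent++;
            }
        }

        void closeWriter() {
            if (open) {
                open = false;
                closed++;
            }
        }
    }

    // Close the writer only when the limit was reached and hints remain;
    // a flush that fits under the limit leaves the writer open.
    static void flush(Iterator<String> hints, FakeStore store) {
        while (hints.hasNext()) {
            store.write(hints);
            if (hints.hasNext())
                store.closeWriter(); // roll over to a new file on the next iteration
        }
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        flush(Arrays.asList("h1", "h2", "h3").iterator(), store);
        System.out.println("opened=" + store.opened + " closed=" + store.closed);
        // prints "opened=2 closed=1"
    }
}
```

In this sketch, three hints with a limit of two produce two files and one close: the first file is closed on rollover, while the second stays open for later flushes. Closing unconditionally would also close the second file.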
[jira] [Commented] (CASSANDRA-11090) Hinted Handoff loop
[ https://issues.apache.org/jira/browse/CASSANDRA-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15138007#comment-15138007 ]

Blake Eggleston commented on CASSANDRA-11090:
---

Functionally, this is fine, but you're checking iterator.hasNext twice. Could you move the flushInternal call under the original hasNext check, so it looks like this:

{code}
if (!iterator.hasNext())
    break;
flushInternal(iterator, store);
{code}
[jira] [Commented] (CASSANDRA-11090) Hinted Handoff loop
[ https://issues.apache.org/jira/browse/CASSANDRA-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15137343#comment-15137343 ]

Aleksey Yeschenko commented on CASSANDRA-11090:
---

Pushed a simple fix here - checking for iterator emptiness before proceeding to create a file.

||branch||testall||dtest||
|[11090-3.0|https://github.com/iamaleksey/cassandra/tree/11090-3.0]|[testall|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-11090-3.0-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-11090-3.0-dtest]|
|[11090-3.4|https://github.com/iamaleksey/cassandra/tree/11090-3.4]|[testall|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-11090-3.4-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-11090-3.4-dtest]|
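The idea of checking for iterator emptiness before creating a file can be sketched roughly as follows. This is an illustrative stand-in, not the committed patch: `FakeStore`, `flushUnguarded`, and `flushGuarded` are hypothetical names, and hints are modelled as plain strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class FlushGuardSketch {
    // Hypothetical stand-in for a per-host hints store; it records each file it creates.
    static class FakeStore {
        final List<List<String>> files = new ArrayList<>();

        void writeNewFile(Iterator<String> hints) {
            List<String> file = new ArrayList<>();
            hints.forEachRemaining(file::add);
            files.add(file);
        }
    }

    // Without a guard, every flush creates a file, even when nothing is buffered.
    static void flushUnguarded(Iterator<String> hints, FakeStore store) {
        store.writeNewFile(hints);
    }

    // With the guard, an empty iterator returns early and no file is created.
    static void flushGuarded(Iterator<String> hints, FakeStore store) {
        if (!hints.hasNext())
            return;
        store.writeNewFile(hints);
    }

    public static void main(String[] args) {
        FakeStore unguarded = new FakeStore();
        FakeStore guarded = new FakeStore();
        // Three flush ticks with nothing buffered.
        for (int tick = 0; tick < 3; tick++) {
            flushUnguarded(Collections.<String>emptyIterator(), unguarded);
            flushGuarded(Collections.<String>emptyIterator(), guarded);
        }
        System.out.println(unguarded.files.size()); // prints 3
        System.out.println(guarded.files.size());   // prints 0
    }
}
```

With the guard in place, periodic flushes that find nothing buffered leave no empty files behind for the dispatcher to process and delete.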
[jira] [Commented] (CASSANDRA-11090) Hinted Handoff loop
[ https://issues.apache.org/jira/browse/CASSANDRA-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15137155#comment-15137155 ]

Blake Eggleston commented on CASSANDRA-11090:
---

Ok, looks like the crc problem is gone in the current 3.0 branch. The main issue I'm seeing, though, is that any hint activity starts an infinite loop of empty hint file creation / processing. In the script I linked in the description, some data is inserted while there's a node down, creating a hints file. When the node is brought up, the hints are transmitted, as you'd expect. After that, every 10 seconds, another hints file (empty this time) is created for that node, and processed / deleted. It's less of an issue if the crc problem is fixed, but it will still create a lot of noise in the logs for moderately sized clusters.
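The ~8640 files per day figure in the issue description follows from one file being created every 10 seconds: 86400 seconds per day / 10 seconds = 8640. A small check of that arithmetic, with `filesPerDay` as a hypothetical helper name:

```java
import java.time.Duration;

public class EmptyHintFileRate {
    // Number of files created per day when one file is created every `interval`.
    static long filesPerDay(Duration interval) {
        return Duration.ofDays(1).getSeconds() / interval.getSeconds();
    }

    public static void main(String[] args) {
        System.out.println(filesPerDay(Duration.ofSeconds(10))); // prints 8640
    }
}
```

That count is per node and per target host, so the total could be larger on a cluster where several hosts have triggered the loop.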
[jira] [Commented] (CASSANDRA-11090) Hinted Handoff loop
[ https://issues.apache.org/jira/browse/CASSANDRA-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15136934#comment-15136934 ]

Aleksey Yeschenko commented on CASSANDRA-11090:
---

bq. the hint crc file isn't deleted in the 3.0 branch after hints are dispatched (but is in trunk)

CASSANDRA-10947 is definitely there in both 3.0 and trunk (so 3.0.3, 3.3, future 3.4). Are you certain you've tested with a recent cassandra-3.0 branch?