[jira] [Comment Edited] (CASSANDRA-15098) Endpoints no longer owning tokens are not removed for vnode
[ https://issues.apache.org/jira/browse/CASSANDRA-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16889202#comment-16889202 ]

Jay Zhuang edited comment on CASSANDRA-15098 at 7/19/19 10:07 PM:
--
Rebased the code and passed tests, please review:

| Branch | uTest | jvm-dTest | dTest | dTest vnode |
| [15098-3.0|https://github.com/instagram/cassandra/tree/15098-3.0] | [#107 passed|https://circleci.com/gh/Instagram/cassandra/107] | [#108 failed|https://circleci.com/gh/Instagram/cassandra/108], known issue: CASSANDRA-15239 | [#110 failed|https://circleci.com/gh/Instagram/cassandra/110], passed locally, known issue: CASSANDRA-14595 | [#109 failed|https://circleci.com/gh/Instagram/cassandra/109], passed locally, known issue: CASSANDRA-14595 |
| [15098-3.11|https://github.com/instagram/cassandra/tree/15098-3.11] | [#100 passed|https://circleci.com/gh/Instagram/cassandra/100] | [#99 passed|https://circleci.com/gh/Instagram/cassandra/99] | [#111 failed|https://circleci.com/gh/Instagram/cassandra/111], passed locally, known issue: CASSANDRA-14595 | [#112 failed|https://circleci.com/gh/Instagram/cassandra/112], passed locally, known issue: CASSANDRA-14595 |
| [15098-trunk|https://github.com/instagram/cassandra/tree/15098-trunk] | [#104 failed|https://circleci.com/gh/Instagram/cassandra/104], passed locally and re-run passed [#117|https://circleci.com/gh/Instagram/cassandra/117] | [#105 passed|https://circleci.com/gh/Instagram/cassandra/105] | [#114 passed|https://circleci.com/gh/Instagram/cassandra/114] | [#113 passed|https://circleci.com/gh/Instagram/cassandra/113] |

> Endpoints no longer owning tokens are not removed for vnode
> ---
>
> Key: CASSANDRA-15098
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15098
> Project: Cassandra
> Issue Type: Bug
> Components: Cluster/Gossip
> Reporter: Jay Zhuang
> Assignee: Jay Zhuang
> Priority: Normal
>
> The logic here to remove endpoints no longer owning tokens does not work
> for multiple tokens (vnode):
> https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/service/StorageService.java#L2505
> It is also very expensive to copy the TokenMetadata for every check.

--
This message was sent by Atlassian JIRA (v7.6.14#76016)
-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15098) Endpoints no longer owning tokens are not removed for vnode
[ https://issues.apache.org/jira/browse/CASSANDRA-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16889202#comment-16889202 ]

Jay Zhuang commented on CASSANDRA-15098:

Rebased the code and passed tests, please review:

| Branch | uTest | jvm-dTest | dTest | dTest vnode |
| [15098-3.0|https://github.com/instagram/cassandra/tree/15098-3.0] | [#107 passed|https://circleci.com/gh/Instagram/cassandra/107] | [#108 failed|https://circleci.com/gh/Instagram/cassandra/108], known issue: CASSANDRA-15239 | [#110 failed|https://circleci.com/gh/Instagram/cassandra/110], passed locally, known issue: CASSANDRA-14595 | [#109 failed|https://circleci.com/gh/Instagram/cassandra/109], passed locally, known issue: CASSANDRA-14595 |
| [15098-3.11|https://github.com/instagram/cassandra/tree/15098-3.11] | [#100 passed|https://circleci.com/gh/Instagram/cassandra/100] | [#99 passed|https://circleci.com/gh/Instagram/cassandra/99] | [#111 failed|https://circleci.com/gh/Instagram/cassandra/111], passed locally, known issue: CASSANDRA-14595 | [#112 failed|https://circleci.com/gh/Instagram/cassandra/112], passed locally, known issue: CASSANDRA-14595 |
| [15098-trunk|https://github.com/instagram/cassandra/tree/15098-trunk] | [#104 failed|https://circleci.com/gh/Instagram/cassandra/104], passed locally and re-run passed [#117|https://circleci.com/gh/Instagram/cassandra/117] | [#105 passed|https://circleci.com/gh/Instagram/cassandra/105] | [#114 passed|https://circleci.com/gh/Instagram/cassandra/114] | [#113 passed|https://circleci.com/gh/Instagram/cassandra/113] |

> Endpoints no longer owning tokens are not removed for vnode
> ---
>
> Key: CASSANDRA-15098
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15098
> Project: Cassandra
> Issue Type: Bug
> Components: Cluster/Gossip
> Reporter: Jay Zhuang
> Assignee: Jay Zhuang
> Priority: Normal
>
> The logic here to remove endpoints no longer owning tokens does not work
> for multiple tokens (vnode):
> https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/service/StorageService.java#L2505
> It is also very expensive to copy the TokenMetadata for every check.
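The two complaints in the description above (the check misses endpoints that own several tokens, and the metadata is copied for every check) can be illustrated with a small stand-alone sketch. This is not the code in StorageService or the patch under review: the class name, the method name, and the use of a plain token-to-endpoint map are all assumptions made for illustration. The idea shown is to apply a whole batch of token changes first and then decide in a single pass which previous owners are left with no tokens.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; not Cassandra's StorageService or TokenMetadata. */
final class TokenOwnershipSketch {
    /**
     * Assigns every token in {@code tokens} to {@code newOwner} and returns the
     * endpoints that lost a token in this update and now own no tokens at all.
     */
    static Set<String> reassign(Map<Long, String> tokenToEndpoint, Set<Long> tokens, String newOwner) {
        // Previous owners that lose at least one token in this update.
        Set<String> candidates = new HashSet<>();
        for (Long token : tokens) {
            String previous = tokenToEndpoint.put(token, newOwner);
            if (previous != null && !previous.equals(newOwner)) {
                candidates.add(previous);
            }
        }
        // With vnodes an endpoint may still own other tokens, so losing one token
        // is not enough. One pass over the updated map settles all candidates at
        // once instead of copying the ownership data for each check.
        candidates.removeAll(new HashSet<>(tokenToEndpoint.values()));
        return candidates;
    }

    public static void main(String[] args) {
        Map<Long, String> ring = new HashMap<>();
        ring.put(1L, "A");
        ring.put(2L, "A");
        ring.put(3L, "B");

        Set<Long> moved = new HashSet<>();
        moved.add(1L);
        // "A" still owns token 2, so nothing is reported as removable here.
        System.out.println(reassign(ring, moved, "C"));
    }
}
```

In this sketch an endpoint is reported only when its last token has been taken over, which is the behaviour the issue title asks for in the vnode case.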
[jira] [Created] (CASSANDRA-15240) Reinstate support for native libraries for in-JVM dtests
Jon Meredith created CASSANDRA-15240:

Summary: Reinstate support for native libraries for in-JVM dtests
Key: CASSANDRA-15240
URL: https://issues.apache.org/jira/browse/CASSANDRA-15240
Project: Cassandra
Issue Type: Improvement
Components: Test/dtest
Reporter: Jon Meredith

While working on CASSANDRA-15170, native libraries for libc functions, epoll support, and openssl were observed holding GC roots to the instance class loaders when in-JVM dtest {{with(NETWORK)}} support was enabled.

The solution for CASSANDRA-15170 was to disable native libraries to get everything working, but this is not ideal because in-JVM tests will not be testing the real code on that platform.

One proposed solution from [~ifesdjeen] and [~benedict] is to introduce an additional classloader per Cassandra version that can be used for loading native libraries, and to share that {{CassandraVersionClassLoader}} among all instances of that version, enabling the {{InstanceClassLoader}} to be garbage collected.

{noformat}
CLibrary
com.sun.jna.Native.registeredClasses
com.sun.jna.Native.options
com.sun.jna.Native.registredLibraries

Netty
io.netty.channel.ChannelException
io.netty.channel.unix.DatagramSocketAddress
io.netty.channel.unix.PeerCredentials
io.netty.internal.tcnative.CertificateCallbackTask
io.netty.internal.tcnative.CertificateVerifierTask
io.netty.internal.tcnative.SSLPrivateKeyMethodDecryptTask
io.netty.internal.tcnative.SSLPrivateKeyMethodSignTask
io.netty.internal.tcnative.SSLPrivateKeyMethodTask
io.netty.internal.tcnative.SSLTask
{noformat}
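The proposal above relies on ordinary classloader delegation: a class defined by a shared parent loader is defined once, and per-instance child loaders only refer to it. The following is a minimal sketch of that mechanism, assuming plain parent-first delegation. It is not the in-JVM dtest implementation; {{SharedNativeLoaderSketch}}, {{NativeBinding}} and {{newInstanceLoader}} are hypothetical names, and the real {{InstanceClassLoader}} may apply different delegation rules.

```java
import java.net.URL;
import java.net.URLClassLoader;

/** Hypothetical sketch of a shared per-version loader with per-instance children. */
final class SharedNativeLoaderSketch {
    /** Stand-in for a class that binds native code and should be defined once per version. */
    static final class NativeBinding {}

    /** Creates a per-instance loader that delegates to the shared per-version loader. */
    static URLClassLoader newInstanceLoader(ClassLoader sharedPerVersion) {
        // No URLs of its own: everything resolves through the parent in this sketch.
        return new URLClassLoader(new URL[0], sharedPerVersion);
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the loader that defined this class plays the role of the shared loader.
        ClassLoader shared = SharedNativeLoaderSketch.class.getClassLoader();
        try (URLClassLoader a = newInstanceLoader(shared);
             URLClassLoader b = newInstanceLoader(shared)) {
            Class<?> viaA = a.loadClass(NativeBinding.class.getName());
            Class<?> viaB = b.loadClass(NativeBinding.class.getName());
            // Both instance loaders see the same Class object, defined by the shared loader,
            // so the class holds no reference back to either instance loader.
            System.out.println(viaA == viaB);
            System.out.println(viaA.getClassLoader() == shared);
        }
    }
}
```

Because the native-bound class is owned by the shared loader, static state inside it cannot pin an individual instance loader, which is the property the proposal needs for garbage collection.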
[jira] [Created] (CASSANDRA-15239) [flaky in-mem dtest] nodeDownDuringMove - org.apache.cassandra.distributed.test.GossipTest
Jay Zhuang created CASSANDRA-15239:
--
Summary: [flaky in-mem dtest] nodeDownDuringMove - org.apache.cassandra.distributed.test.GossipTest
Key: CASSANDRA-15239
URL: https://issues.apache.org/jira/browse/CASSANDRA-15239
Project: Cassandra
Issue Type: Bug
Components: Test/dtest
Reporter: Jay Zhuang

The in-mem dtest fails from time to time:
{noformat}
nodeDownDuringMove - org.apache.cassandra.distributed.test.GossipTest
java.lang.RuntimeException: java.lang.IllegalStateException: Unable to contact any seeds!
{noformat}
[https://circleci.com/gh/Instagram/cassandra/98]

More details:
{noformat}
Testcase: nodeDownDuringMove(org.apache.cassandra.distributed.test.GossipTest): Caused an ERROR
java.lang.IllegalStateException: Unable to contact any seeds!
java.lang.RuntimeException: java.lang.IllegalStateException: Unable to contact any seeds!
    at org.apache.cassandra.distributed.impl.IsolatedExecutor.waitOn(IsolatedExecutor.java:166)
    at org.apache.cassandra.distributed.impl.IsolatedExecutor.lambda$sync$4(IsolatedExecutor.java:69)
    at org.apache.cassandra.distributed.impl.Instance.startup(Instance.java:322)
    at org.apache.cassandra.distributed.impl.AbstractCluster$Wrapper.startup(AbstractCluster.java:148)
    at org.apache.cassandra.distributed.test.GossipTest.nodeDownDuringMove(GossipTest.java:96)
Caused by: java.lang.IllegalStateException: Unable to contact any seeds!
    at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1261)
    at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:921)
    at org.apache.cassandra.distributed.impl.Instance.lambda$startup$6(Instance.java:301)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:83)
    at java.lang.Thread.run(Thread.java:748)

Test org.apache.cassandra.distributed.test.GossipTest FAILED
{noformat}
[jira] [Updated] (CASSANDRA-15238) Verifier does not detect out-of-order cells (while Scrubber does)
[ https://issues.apache.org/jira/browse/CASSANDRA-15238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Giacomo Lo Giusto updated CASSANDRA-15238:
--
Authors: Giacomo Lo Giusto
Bug Category: Parent values: Correctness(12982)Level 1 values: Persistent Corruption / Loss(12986)

> Verifier does not detect out-of-order cells (while Scrubber does)
> -
>
> Key: CASSANDRA-15238
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15238
> Project: Cassandra
> Issue Type: Bug
> Components: Tool/nodetool
> Reporter: Giacomo Lo Giusto
> Priority: Normal
> Attachments: verifier.patch
>
> Hello,
> This change was tested only for version {{*2.2.13*}}.
> We noticed the {{nodetool verify -e}} command was not able to detect corrupt
> {{SSTables}} that exhibited out-of-order cells within a row.
> This is in contrast to the {{nodetool scrub}} command, which was able to
> detect and scrub such corrupted data files.
> The proposed changes (see attached patch) include:
> * Reusing Scrub's {{OrderCheckerIterator}} in the Verifier (for its
> _extended_ use).
> * Added logging to make it easier to debug the cause of a verification
> failure and which key first showed the issue.
> * Added unit tests for the Verifier ({{VerifyTest.java}}).
> (Some other unrelated tests were sometimes failing on our end and were
> therefore changed to make them more deterministic.)
> Please let me know if the change has value and is correct and safe for all
> possible configurations. Should we introduce an extra flag to enable the
> extra cell ordering check?
> In the {{Verifier}} code there was this line (line 189) that seemed to suggest
> that the newly introduced check was in fact an intended behavior all along,
> although we could not replicate this behavior either in unit tests or with
> our production data:
> {code:java}
> //mimic the scrub read path
> new SSTableIdentityIterator(sstable, dataFile, key, true);
> {code}
> Thanks in advance for your feedback and consideration,
> Giacomo
[jira] [Updated] (CASSANDRA-15238) Verifier does not detect out-of-order cells (while Scrubber does)
[ https://issues.apache.org/jira/browse/CASSANDRA-15238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Giacomo Lo Giusto updated CASSANDRA-15238:
--
Attachment: verifier.patch

> Verifier does not detect out-of-order cells (while Scrubber does)
> -
>
> Key: CASSANDRA-15238
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15238
> Project: Cassandra
> Issue Type: Bug
> Components: Tool/nodetool
> Reporter: Giacomo Lo Giusto
> Priority: Normal
> Attachments: verifier.patch
>
> Hello,
> This change was tested only for version {{*2.2.13*}}.
> We noticed the {{nodetool verify -e}} command was not able to detect corrupt
> {{SSTables}} that exhibited out-of-order cells within a row.
> This is in contrast to the {{nodetool scrub}} command, which was able to
> detect and scrub such corrupted data files.
> The proposed changes (see attached patch) include:
> * Reusing Scrub's {{OrderCheckerIterator}} in the Verifier (for its
> _extended_ use).
> * Added logging to make it easier to debug the cause of a verification
> failure and which key first showed the issue.
> * Added unit tests for the Verifier ({{VerifyTest.java}}).
> (Some other unrelated tests were sometimes failing on our end and were
> therefore changed to make them more deterministic.)
> Please let me know if the change has value and is correct and safe for all
> possible configurations. Should we introduce an extra flag to enable the
> extra cell ordering check?
> In the {{Verifier}} code there was this line (line 189) that seemed to suggest
> that the newly introduced check was in fact an intended behavior all along,
> although we could not replicate this behavior either in unit tests or with
> our production data:
> {code:java}
> //mimic the scrub read path
> new SSTableIdentityIterator(sstable, dataFile, key, true);
> {code}
> Thanks in advance for your feedback and consideration,
> Giacomo
[jira] [Updated] (CASSANDRA-15235) Have Verifier check whether cells are out-of-order
[ https://issues.apache.org/jira/browse/CASSANDRA-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Giacomo Lo Giusto updated CASSANDRA-15235:
--
Resolution: Abandoned
Status: Resolved (was: Triage Needed)

Created CASSANDRA-15238 instead

> Have Verifier check whether cells are out-of-order
> --
>
> Key: CASSANDRA-15235
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15235
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Giacomo Lo Giusto
> Priority: Normal
> Attachments: verifier.patch
>
> Hello,
> This change was tested only for version {{*2.2.13*}}.
> We noticed the {{nodetool verify -e}} command was not able to detect corrupt
> {{SSTables}} that exhibited out-of-order cells within a row.
> This is in contrast to the {{nodetool scrub}} command, which was able to
> detect and scrub such corrupted data files.
> The proposed changes (see attached patch) include:
> * Reusing Scrub's {{OrderCheckerIterator}} in the Verifier (for its
> _extended_ use).
> * Added logging to make it easier to debug the cause of a verification
> failure and which key first showed the issue.
> * Added unit tests for the Verifier ({{VerifyTest.java}}).
> (Some other unrelated tests were sometimes failing on our end and were
> therefore changed to make them more deterministic.)
> Please let me know if the change has value and is correct and safe for all
> possible configurations. Should we introduce an extra flag to enable the
> extra cell ordering check?
> In the {{Verifier}} code there was this line (line 189) that seemed to suggest
> that the newly introduced check was in fact an intended behavior all along,
> although we could not replicate this behavior either in unit tests or with
> our production data:
> {code:java}
> //mimic the scrub read path
> new SSTableIdentityIterator(sstable, dataFile, key, true);
> {code}
> Thanks in advance for your feedback and consideration,
> Giacomo
[jira] [Updated] (CASSANDRA-15238) Verifier does not detect out-of-order cells (while Scrubber does)
[ https://issues.apache.org/jira/browse/CASSANDRA-15238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Giacomo Lo Giusto updated CASSANDRA-15238:
--
Impacts: (was: None)

> Verifier does not detect out-of-order cells (while Scrubber does)
> -
>
> Key: CASSANDRA-15238
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15238
> Project: Cassandra
> Issue Type: Bug
> Components: Tool/nodetool
> Reporter: Giacomo Lo Giusto
> Priority: Normal
> Attachments: verifier.patch
>
> Hello,
> This change was tested only for version {{*2.2.13*}}.
> We noticed the {{nodetool verify -e}} command was not able to detect corrupt
> {{SSTables}} that exhibited out-of-order cells within a row.
> This is in contrast to the {{nodetool scrub}} command, which was able to
> detect and scrub such corrupted data files.
> The proposed changes (see attached patch) include:
> * Reusing Scrub's {{OrderCheckerIterator}} in the Verifier (for its
> _extended_ use).
> * Added logging to make it easier to debug the cause of a verification
> failure and which key first showed the issue.
> * Added unit tests for the Verifier ({{VerifyTest.java}}).
> (Some other unrelated tests were sometimes failing on our end and were
> therefore changed to make them more deterministic.)
> Please let me know if the change has value and is correct and safe for all
> possible configurations. Should we introduce an extra flag to enable the
> extra cell ordering check?
> In the {{Verifier}} code there was this line (line 189) that seemed to suggest
> that the newly introduced check was in fact an intended behavior all along,
> although we could not replicate this behavior either in unit tests or with
> our production data:
> {code:java}
> //mimic the scrub read path
> new SSTableIdentityIterator(sstable, dataFile, key, true);
> {code}
> Thanks in advance for your feedback and consideration,
> Giacomo
[jira] [Created] (CASSANDRA-15238) Verifier does not detect out-of-order cells (while Scrubber does)
Giacomo Lo Giusto created CASSANDRA-15238:
-
Summary: Verifier does not detect out-of-order cells (while Scrubber does)
Key: CASSANDRA-15238
URL: https://issues.apache.org/jira/browse/CASSANDRA-15238
Project: Cassandra
Issue Type: Bug
Components: Tool/nodetool
Reporter: Giacomo Lo Giusto

Hello,

This change was tested only for version {{*2.2.13*}}.

We noticed the {{nodetool verify -e}} command was not able to detect corrupt {{SSTables}} that exhibited out-of-order cells within a row. This is in contrast to the {{nodetool scrub}} command, which was able to detect and scrub such corrupted data files.

The proposed changes (see attached patch) include:
* Reusing Scrub's {{OrderCheckerIterator}} in the Verifier (for its _extended_ use).
* Added logging to make it easier to debug the cause of a verification failure and which key first showed the issue.
* Added unit tests for the Verifier ({{VerifyTest.java}}).

(Some other unrelated tests were sometimes failing on our end and were therefore changed to make them more deterministic.)

Please let me know if the change has value and is correct and safe for all possible configurations. Should we introduce an extra flag to enable the extra cell ordering check?

In the {{Verifier}} code there was this line (line 189) that seemed to suggest that the newly introduced check was in fact an intended behavior all along, although we could not replicate this behavior either in unit tests or with our production data:
{code:java}
//mimic the scrub read path
new SSTableIdentityIterator(sstable, dataFile, key, true);
{code}

Thanks in advance for your feedback and consideration,
Giacomo
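The kind of check described above can be sketched generically as an iterator wrapper that compares each element with its predecessor. This is only an illustration under stated assumptions: it is not Cassandra's {{OrderCheckerIterator}} and not the attached patch, the names {{OrderCheckSketch}}, {{OrderCheckingIterator}} and {{isOrdered}} are hypothetical, and treating equal neighbours as acceptable is a choice made for the sketch.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;

/** Hypothetical sketch of an order check over a stream of cells. */
final class OrderCheckSketch {
    /** Wraps an iterator and throws as soon as an element sorts before its predecessor. */
    static final class OrderCheckingIterator<T> implements Iterator<T> {
        private final Iterator<T> delegate;
        private final Comparator<? super T> comparator;
        private T previous;
        private boolean hasPrevious;

        OrderCheckingIterator(Iterator<T> delegate, Comparator<? super T> comparator) {
            this.delegate = delegate;
            this.comparator = comparator;
        }

        @Override
        public boolean hasNext() {
            return delegate.hasNext();
        }

        @Override
        public T next() {
            T current = delegate.next();
            // Assumption of this sketch: equal neighbours are allowed, only a strict
            // decrease counts as out-of-order.
            if (hasPrevious && comparator.compare(previous, current) > 0) {
                throw new IllegalStateException("Out-of-order element " + current + " after " + previous);
            }
            previous = current;
            hasPrevious = true;
            return current;
        }
    }

    /** Drains the iterator and reports whether every element was in order. */
    static <T> boolean isOrdered(Iterator<T> cells, Comparator<? super T> comparator) {
        Iterator<T> checked = new OrderCheckingIterator<>(cells, comparator);
        try {
            while (checked.hasNext()) {
                checked.next();
            }
            return true;
        } catch (IllegalStateException outOfOrder) {
            return false;
        }
    }

    public static void main(String[] args) {
        Comparator<Integer> order = Comparator.naturalOrder();
        System.out.println(isOrdered(Arrays.asList(1, 2, 2, 5).iterator(), order));
        System.out.println(isOrdered(Arrays.asList(1, 3, 2).iterator(), order));
    }
}
```

A verifier built this way fails on the first misplaced element, which also makes it straightforward to log the key at which the problem was first seen.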
[jira] [Commented] (CASSANDRA-15225) FileUtils.close() does not handle non-IOException
[ https://issues.apache.org/jira/browse/CASSANDRA-15225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888607#comment-16888607 ]

Liudmila Kornilova commented on CASSANDRA-15225:

Hi [~n.v.harikrishna],
Done. All IOExceptions except the first one are now also added as suppressed exceptions (see [updated commit|https://github.com/apache/cassandra/pull/332/files]).

> FileUtils.close() does not handle non-IOException
> -
>
> Key: CASSANDRA-15225
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15225
> Project: Cassandra
> Issue Type: Bug
> Components: Local/SSTable
> Reporter: Benedict
> Assignee: Liudmila Kornilova
> Priority: Normal
> Labels: pull-request-available
> Fix For: 3.0.x, 3.11.x, 4.0.x
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> This can lead to {{close}} not being invoked on remaining items
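The behaviour discussed in this thread (keep closing the remaining items whatever is thrown, rethrow the first failure, attach later ones as suppressed) can be sketched as follows. This is a generic illustration, not the code in the linked pull request: {{CloseAllSketch}} and {{closeAll}} are hypothetical names, and the sketch accepts {{AutoCloseable}} and declares {{throws Exception}} for simplicity, which may differ from the real {{FileUtils.close()}} signature.

```java
import java.util.Arrays;

/** Hypothetical sketch of closing several resources without skipping any. */
final class CloseAllSketch {
    /**
     * Closes every item even if an earlier close fails with any Throwable.
     * The first failure is rethrown; later failures are attached as suppressed.
     */
    static void closeAll(Iterable<? extends AutoCloseable> items) throws Exception {
        Throwable first = null;
        for (AutoCloseable item : items) {
            if (item == null) {
                continue;
            }
            try {
                item.close();
            } catch (Throwable t) {
                // Catching Throwable, not only IOException, keeps the loop going.
                if (first == null) {
                    first = t;
                } else if (t != first) {
                    first.addSuppressed(t);
                }
            }
        }
        if (first instanceof Exception) {
            throw (Exception) first;
        }
        if (first instanceof Error) {
            throw (Error) first;
        }
        if (first != null) {
            throw new RuntimeException(first);
        }
    }

    public static void main(String[] args) {
        AutoCloseable failing = () -> { throw new IllegalStateException("not an IOException"); };
        AutoCloseable fine = () -> System.out.println("closed the remaining item");
        try {
            closeAll(Arrays.asList(failing, fine));
        } catch (Exception e) {
            System.out.println("rethrown: " + e.getMessage());
        }
    }
}
```

The point of the sketch is the catch clause: with a narrower catch, a runtime exception from one item would end the loop and leave the remaining items open.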