[
https://issues.apache.org/jira/browse/CASSANDRA-15347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16964853#comment-16964853
]
Doug Rohrer edited comment on CASSANDRA-15347 at 11/1/19 3:15 PM:
------------------------------------------------------------------
This set of PRs allows the in-jvm dtest framework to support native protocol
clients, which allows for testing of the Java client and other use-cases where
it makes sense to test from "outside" (Spark, for example).
Four PRs for different Cassandra versions:
2.2 [changes|https://github.com/apache/cassandra/pull/377]
[Circle|https://circleci.com/workflow-run/19f5082f-eedc-4d8e-8d33-558848fddc77]
3.0 [changes|https://github.com/apache/cassandra/pull/376]
[Circle|https://circleci.com/workflow-run/ddf5b452-2a51-4d3a-9cd4-d4b279e0f280]
3.11 [changes|https://github.com/apache/cassandra/pull/375]
[Circle|https://circleci.com/workflow-run/59c4d1b6-c0c2-4179-b719-a8c041c849ff]
Trunk [changes|https://github.com/apache/cassandra/pull/374]
[Circle|https://circleci.com/workflow-run/fea3a793-bf13-4652-8b88-d29e1b513254]
The changes are more extensive than just "Add Native Transport Support." Once
we started allowing connectivity via the native transport, I ran into several
reliability issues with the tests; these may already have been causing some
level of instability. Some changes were also made to speed up test execution.
These changes include:
- Setting {{auto_bootstrap}} to false by default for in-jvm dtests. Because the
cluster is empty, there was no reason to wait for instances to bootstrap before
starting tests. The wait could slow down test execution, and it caused some
test timing issues where requests could be made before the instance was fully
ready. Tests that may need {{auto_bootstrap}} later can always set it explicitly.
- It was possible, especially in {{trunk}}, for tests to fail to create the
initial keyspace requested in {{DistributedTestBase.init}} because of a race
between a hard-coded 60-second timeout in {{MigrationManager}}
({{MIGRATION_DELAY_IN_MS}}) and an identical hard-coded 60-second wait timeout
in the {{SchemaChangeMonitor}}. This could occur if the instance where the
schema change was submitted did not yet see one or more other instances in its
live member list when first gossiping the schema change. Two changes were made
to alleviate this issue:
** Extend the {{SchemaChangeMonitor}}'s delay to 70 seconds to accommodate the
{{MigrationManager}}'s 60-second delay
** To avoid the root cause, and the potential 70-second delay if tests hit the
race, I also added a new monitor, {{LiveMemberAgreementMonitor}}, which waits
for all instances to agree that the live member count equals the expected
number of running instances before moving on from {{Cluster.startup}}. This
adds a very minor potential delay to cluster startup as we wait for the members
to all see each other, but completely avoids the possibility that the
subsequent schema change will be delayed by up to 60 seconds.
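As a rough illustration of the live-member agreement wait described above, here is a minimal, self-contained Java sketch. It is not the code from the PRs: the class name {{LiveMemberAgreement}}, the method {{awaitAgreement}}, and the use of {{IntSupplier}} to stand in for "ask this instance how many live members it sees" are all hypothetical, chosen only to show the polling-until-agreement idea.

```java
import java.util.List;
import java.util.function.IntSupplier;

/** Hypothetical sketch; NOT the LiveMemberAgreementMonitor from the patch. */
public final class LiveMemberAgreement {
    private LiveMemberAgreement() {}

    /**
     * Polls until every supplier reports exactly {@code expected} live members,
     * or until the timeout elapses. Each supplier stands in (as an assumption)
     * for one instance's view of the live member count.
     *
     * @return true if all views agreed in time; false on timeout or interrupt
     */
    public static boolean awaitAgreement(List<IntSupplier> liveMemberCounts,
                                         int expected,
                                         long timeoutMillis,
                                         long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            boolean agreed = true;
            for (IntSupplier count : liveMemberCounts) {
                if (count.getAsInt() != expected) {
                    agreed = false; // at least one instance has a stale view
                    break;
                }
            }
            if (agreed) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up waiting for agreement
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
    }
}
```

In this sketch a caller would run the wait once at the end of cluster startup and fail fast if it returns false, so that a later schema change is only submitted after every instance already sees its peers.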
There are a few other minor changes/refactorings that were picked up from
Alex's original patch for this change, which was never submitted to C*. He
was kind enough to help me put this together and has done some early code
review as well. A new test, {{NativeTransportTest}}, was added to cover the
native transport functionality, and a new {{ResourceLeakTest}} makes sure we
aren't introducing any cross-classloader references that would block
collection of classes and exhaust Java's metaspace.
> Add client testing capabilities to in-jvm tests
> -----------------------------------------------
>
> Key: CASSANDRA-15347
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15347
> Project: Cassandra
> Issue Type: Bug
> Components: Test/dtest
> Reporter: Alex Petrov
> Assignee: Doug Rohrer
> Priority: Normal
> Labels: patch-available, pull-request-available
> Time Spent: 40m
> Remaining Estimate: 0h
>
> Allow testing native transport code path using in-jvm tests.