[
https://issues.apache.org/jira/browse/CASSANDRA-18739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17756903#comment-17756903
]
Andres de la Peña edited comment on CASSANDRA-18739 at 8/21/23 1:10 PM:
------------------------------------------------------------------------
The patch for 3.11 looks good, and 3.0 doesn't seem to be affected. As for CI
results:
* {{test_dead_sync_initiator}} is CASSANDRA-17702
* {{test_multiple_concurrent_repairs}} doesn't have a ticket but it is [on
Butler|https://butler.cassandra.apache.org/#/ci/upstream/workflow/Cassandra-3.11/failure/repair_tests.repair_test/TestRepair/test_multiple_concurrent_repairs]
* {{test_readrepair}} doesn't have a ticket but it is [on
Butler|https://ci-cassandra.apache.org/job/Cassandra-3.11/484/testReport/dtest-novnode.consistency_test/TestConsistency/test_readrepair/]
* {{readWriteDuringBootstrapTest}} is CASSANDRA-17139
* {{testReprepareMixedVersionWithoutReset}} is CASSANDRA-18021
* {{test_compactionstats}} might be a timeout?
So I guess the only unknown test failures are
[{{test_multi_dc_replace_with_rf1}}|https://app.circleci.com/pipelines/github/instaclustr/cassandra/2973/workflows/6c1a7eb6-583a-4c51-8703-9555fb4e86ff/jobs/103830/tests]
in 5.0 and maybe
[{{test_compactionstats}}|https://app.circleci.com/pipelines/github/instaclustr/cassandra/2977/workflows/4b077cb3-fd1c-4606-9c9d-62508240c1a6/jobs/104227/tests]
in 3.11. [~smiklosovic] have you seen those before?
> UDF functions fail to load on rolling restart
> ---------------------------------------------
>
> Key: CASSANDRA-18739
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18739
> Project: Cassandra
> Issue Type: Bug
> Components: Feature/UDF
> Reporter: Claude Warren
> Assignee: Claude Warren
> Priority: High
> Fix For: 4.0.x, 4.1.x, 5.x
>
> Attachments: udf_error.cql, udf_error_data.cql
>
> Time Spent: 1h 40m
> Remaining Estimate: 0h
>
> UDFs fail to reload properly after a rolling restart.
> h3. *Symptom:*
> An NPE is thrown when the UDF is used after the restart.
> h3. *Steps to recreate:*
> # Create a cluster as per the cql file.
> # Populate the cluster with data.cql.
> # Execute SELECT city_measurements(city, measurement, 16.5) AS m FROM current
> # Expect min and max values for cities.
> # Perform a rolling restart on one server.
> # When the server is back up, execute SELECT city_measurements(city, measurement, 16.5) AS m FROM current
> # Expect: error result with NPE message.
> {*}Analysis{*}:
> During system restart, SchemaKeyspace.fetchNonSystemKeyspaces() is called.
> When a keyspace with a UDF is loaded, the SchemaKeyspace method
> createUDFFromRow() is called. This in turn calls UDFunction.create(), which
> eventually reaches the UDFunction constructor, where
> Schema.instance.getKeyspaceMetadata() is called with the keyspace of the UDF
> name as the argument. However, that keyspace is still being constructed and
> is not yet in the instance, so the method returns null for the
> KeyspaceMetadata. That null KeyspaceMetadata is then used in the udfContext.
> Later, when the UDF is called and needs to call a method on the
> keyspaceMetadata, such as udfContext.newUDTValue(), whose implementation uses
> keyspaceMetadata.types, a NullPointerException is thrown.
> I have verified this affects versions 4.0, 4.1 and trunk. I have not verified
> 3.x but I suspect it is the same there.
> I modified the UDFunction constructor to assert that the metadata was not
> null and received the following stack trace:
> ERROR [main] 2023-08-09 11:44:46,408 CassandraDaemon.java:911 - Exception encountered during startup
> java.lang.AssertionError: No metadata for temperatures.city_measurements_sfunc
> at org.apache.cassandra.cql3.functions.UDFunction.<init>(UDFunction.java:240)
> at org.apache.cassandra.cql3.functions.JavaBasedUDFunction.<init>(JavaBasedUDFunction.java:195)
> at org.apache.cassandra.cql3.functions.UDFunction.create(UDFunction.java:276)
> at org.apache.cassandra.schema.SchemaKeyspace.createUDFFromRow(SchemaKeyspace.java:1182)
> at org.apache.cassandra.schema.SchemaKeyspace.fetchUDFs(SchemaKeyspace.java:1131)
> at org.apache.cassandra.schema.SchemaKeyspace.fetchFunctions(SchemaKeyspace.java:1119)
> at org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspace(SchemaKeyspace.java:859)
> at org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspacesWithout(SchemaKeyspace.java:848)
> at org.apache.cassandra.schema.SchemaKeyspace.fetchNonSystemKeyspaces(SchemaKeyspace.java:836)
> at org.apache.cassandra.schema.Schema.loadFromDisk(Schema.java:132)
> at org.apache.cassandra.schema.Schema.loadFromDisk(Schema.java:121)
> at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:287)
> at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:765)
> at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:889)
>
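To illustrate the analysis above, here is a minimal, self-contained Java sketch of the load-order problem. It does not use any Cassandra classes: KeyspaceMeta, Udf and the SCHEMA map are hypothetical stand-ins for KeyspaceMetadata, UDFunction and Schema.instance, and the real code paths are considerably more involved.

```java
import java.util.HashMap;
import java.util.Map;

public class UdfLoadOrderSketch {
    // Hypothetical stand-in for KeyspaceMetadata; only the 'types' map matters here.
    static final class KeyspaceMeta {
        final String name;
        final Map<String, String> types = new HashMap<>();

        KeyspaceMeta(String name) {
            this.name = name;
        }
    }

    // Hypothetical stand-in for Schema.instance: a registry of fully loaded keyspaces.
    static final Map<String, KeyspaceMeta> SCHEMA = new HashMap<>();

    // Hypothetical stand-in for UDFunction: it captures the keyspace metadata at construction.
    static final class Udf {
        final KeyspaceMeta keyspaceMeta;

        Udf(String keyspace) {
            // The lookup returns null if the keyspace has not been registered yet.
            this.keyspaceMeta = SCHEMA.get(keyspace);
        }

        String newUdtValue(String typeName) {
            // Dereferences the captured metadata; throws NPE if it was null at construction.
            return keyspaceMeta.types.get(typeName);
        }
    }

    // Returns true if the first use of the UDF fails with a NullPointerException.
    static boolean failsWithNpe() {
        SCHEMA.clear();
        // The UDF is created while its keyspace is still being loaded...
        Udf udf = new Udf("temperatures");
        // ...and the keyspace is only registered afterwards, which is too late.
        KeyspaceMeta ks = new KeyspaceMeta("temperatures");
        ks.types.put("measurement", "hypothetical-udt");
        SCHEMA.put("temperatures", ks);
        try {
            udf.newUdtValue("measurement");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("NPE on first use: " + failsWithNpe());
    }
}
```

The point of the sketch is that the null is captured silently at load time and only surfaces later, when the function is first used.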
> {{*Possible solution:*}}
> *Version 4.x*
> Create a KeyspaceMetadata.Builder class that accepts the types, tables
> and views but uses a builder for the functions.
> Add a KeyspaceMetadata constructor to accept the KeyspaceMetadata.Builder so
> that the function builder keyspaceMetadata value can be set correctly during
> construction of the KeyspaceMetadata.
> Modify SchemaKeyspace.fetchKeyspace(string) so that it uses the
> KeyspaceMetadata.Builder.
>
> *Version 5.x*
> Similar to 4.x, except that the KeyspaceMetadata.Builder will need
> builders for Views and Tables, because the functions necessary to construct
> those objects will not be available until the KeyspaceMetadata.Builder
> constructs it.
>
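A rough sketch of the builder idea described above, again using hypothetical stand-in classes (KeyspaceMeta, Udf) rather than the real Cassandra API. The assumption here is that the builder stores function factories instead of finished functions, and the KeyspaceMeta constructor hands itself to each factory, so no schema lookup is needed while the keyspace is being built. The actual patch may be structured differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class KeyspaceBuilderSketch {
    // Hypothetical stand-in for UDFunction: receives its keyspace metadata directly.
    static final class Udf {
        final String name;
        final KeyspaceMeta keyspaceMeta;

        Udf(String name, KeyspaceMeta keyspaceMeta) {
            this.name = name;
            this.keyspaceMeta = keyspaceMeta;
        }

        String newUdtValue(String typeName) {
            return keyspaceMeta.types.get(typeName);
        }
    }

    // Hypothetical stand-in for KeyspaceMetadata.
    static final class KeyspaceMeta {
        final String name;
        final Map<String, String> types;
        final List<Udf> functions;

        private KeyspaceMeta(Builder b) {
            this.name = b.name;
            // Types are assigned first so they are visible to functions built below.
            this.types = Collections.unmodifiableMap(new HashMap<>(b.types));
            List<Udf> built = new ArrayList<>();
            for (Function<KeyspaceMeta, Udf> factory : b.functionFactories) {
                // Each function gets a reference to the keyspace under construction.
                built.add(factory.apply(this));
            }
            this.functions = Collections.unmodifiableList(built);
        }

        static final class Builder {
            private final String name;
            private final Map<String, String> types = new HashMap<>();
            private final List<Function<KeyspaceMeta, Udf>> functionFactories = new ArrayList<>();

            Builder(String name) {
                this.name = name;
            }

            Builder type(String typeName, String definition) {
                types.put(typeName, definition);
                return this;
            }

            Builder function(String functionName) {
                // Defer creating the function until the keyspace object exists.
                functionFactories.add(ks -> new Udf(functionName, ks));
                return this;
            }

            KeyspaceMeta build() {
                return new KeyspaceMeta(this);
            }
        }
    }

    static KeyspaceMeta buildExample() {
        return new KeyspaceMeta.Builder("temperatures")
                .type("measurement", "hypothetical-udt")
                .function("city_measurements_sfunc")
                .build();
    }

    public static void main(String[] args) {
        KeyspaceMeta ks = buildExample();
        Udf udf = ks.functions.get(0);
        System.out.println(udf.newUdtValue("measurement"));
    }
}
```

One caveat of this design: the keyspace reference escapes its constructor before the functions list is assigned, so a factory must not read that list during construction.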