[
https://issues.apache.org/jira/browse/CASSANDRA-16259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17239181#comment-17239181
]
Benjamin Lerer edited comment on CASSANDRA-16259 at 11/26/20, 1:10 PM:
-----------------------------------------------------------------------
While working on a fix for this bug I hit another issue that impact the 3.0,
3.X and 4.0 versions. On some scenario the number of cells for a row is counted
twice.
We count the number of cells per partition using the
{{BigTableWriter.StatsCollector}} which will increment the number of cells when
{{onPartitionClose}} is called. {{onPartitionClose}} is called when the
transformed {{UnfilteredRowIterator}} is closed by the {{try-with-resources}}
in
[BigTableWriter.append|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/format/big/BigTableWriter.java#L214].
Unfortunately, If the {{UnfilteredRowIterator}} is an instance of
{{UnfilteredRows}}
[Transformation.apply|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/transform/Transformation.java#L138]
will not create a wrapper around the {{UnfilteredRowIterator}} but will simply
add the {{Transformation}} to the list of transformations associated to that
iterator.
When a memtable is flush the {{UnfilteredRowIterator}} is opened in a
{{try-with-resources}} in
[FlushRunnable:writeSortedContent|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Memtable.java#L473]
before being passed to {{BigTableWriter|append}}. If the
{{UnfilteredRowIterator}} is an instance of {{UnfilteredRows}},
{{StatsCollector#onPartitionClose}} will be called twice and the cells for the
row will be counted twice.
The patch contains a change that ensure that we do not hit that problem by
ensuring that if we call {{close}} on a class inheriting from {{BaseIterator}}
only the first close will be processed and the following one will be ignored.
|| Branches || Test results ||
| [trunk|https://github.com/apache/cassandra/pull/835] |
[j8|https://app.circleci.com/pipelines/github/blerer/cassandra/72/workflows/77c51c04-5a76-4530-99bd-cb1604f58931],
[j11|https://app.circleci.com/pipelines/github/blerer/cassandra/72/workflows/ab7be933-239c-41fb-a907-fa76a859ed99]
|
| [3.11|https://github.com/apache/cassandra/pull/836] |
[j8|https://app.circleci.com/pipelines/github/blerer/cassandra/69/workflows/9fb4aca3-3b41-4798-a921-07b4c7026b7e]
|
| [3.0|https://github.com/apache/cassandra/pull/837] |
[j8|https://app.circleci.com/pipelines/github/blerer/cassandra/70/workflows/b0ef5b25-0a4f-4ddb-92cd-b4c76f7908ea]
|
| [2.2|https://github.com/apache/cassandra/pull/838] |
[j8|https://app.circleci.com/pipelines/github/blerer/cassandra/71/workflows/d24c6595-9c8c-4893-86e0-ef3b15966f02]
|
was (Author: blerer):
While working on a fix for this bug I hit another issue that impact the 3.0,
3.X and 4.0 versions. On some scenario the number of cells for a row is counted
twice.
We count the number of cells per partition using the
{{BigTableWriter.StatsCollector}} which will increment the number of cells when
{{onPartitionClose}} is called. {{onPartitionClose}} is called when the
transformed {{UnfilteredRowIterator}} is closed by the {{try-with-resources}}
in
[BigTableWriter.append|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/format/big/BigTableWriter.java#L214].
Unfortunately, If the {{UnfilteredRowIterator}} is an instance of
{{UnfilteredRows}}
[Transformation.apply|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/transform/Transformation.java#L138]
will not create a wrapper around the {{UnfilteredRowIterator}} but will simply
add the {{Transformation}} to the list of transformations associated to that
iterator.
When a memtable is flush the {{UnfilteredRowIterator}} is opened in a
{{try-with-resources}} in
[FlushRunnable:writeSortedContent|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Memtable.java#L473]
before being passed to {{BigTableWriter|append}}. If the
{{UnfilteredRowIterator}} is an instance of {{UnfilteredRows}},
{{StatsCollector#onPartitionClose}} will be called twice and the cells for the
row will be counted twice.
The patch contains a change that ensure that we do not hit that problem by
ensuring that if we call {{close}} on a class inheriting from {{BaseIterator}}
only the first close will be processed and the following one will be ignored.
|| Branches || Test results ||
| [trunk|https://github.com/apache/cassandra/pull/835] |
[j8|https://app.circleci.com/pipelines/github/blerer/cassandra/72/workflows/77c51c04-5a76-4530-99bd-cb1604f5893],
[j11|https://app.circleci.com/pipelines/github/blerer/cassandra/72/workflows/ab7be933-239c-41fb-a907-fa76a859ed99]
|
| [3.11|https://github.com/apache/cassandra/pull/836] |
[j8|https://app.circleci.com/pipelines/github/blerer/cassandra/69/workflows/9fb4aca3-3b41-4798-a921-07b4c7026b7e]
|
| [3.0|https://github.com/apache/cassandra/pull/837] |
[j8|https://app.circleci.com/pipelines/github/blerer/cassandra/70/workflows/b0ef5b25-0a4f-4ddb-92cd-b4c76f7908ea]
|
| [2.2|https://github.com/apache/cassandra/pull/838] |
[j8|https://app.circleci.com/pipelines/github/blerer/cassandra/71/workflows/d24c6595-9c8c-4893-86e0-ef3b15966f02]
|
> tablehistograms cause ArrayIndexOutOfBoundsException
> ----------------------------------------------------
>
> Key: CASSANDRA-16259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16259
> Project: Cassandra
> Issue Type: Bug
> Components: Observability/Metrics
> Reporter: Justin Montgomery
> Assignee: Benjamin Lerer
> Priority: Normal
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.0-beta
>
>
> After upgrading some nodes in our cluster from 3.11.8 to 3.11.9 an error
> appeared on the upgraded nodes when trying to access *tablehistograms*. The
> same command run on our .8 nodes return as expected, only the upgraded .9
> nodes fail. Not all tables fail when queried, but about 90% of them do.
> We use Datastax MCAC which appears to query histograms every 30 seconds, this
> outputs to the system.log:
> {noformat}
> WARN [insights-3-1] 2020-11-09 01:11:22,331 UnixSocketClient.java:830 -
> Error reporting:
> java.lang.ArrayIndexOutOfBoundsException: 115
> at
> org.apache.cassandra.metrics.TableMetrics.combineHistograms(TableMetrics.java:261)
> ~[apache-cassandra-3.11.9.jar:3.11.9]
> at
> org.apache.cassandra.metrics.TableMetrics.access$000(TableMetrics.java:48)
> ~[apache-cassandra-3.11.9.jar:3.11.9]
> at
> org.apache.cassandra.metrics.TableMetrics$11.getValue(TableMetrics.java:376)
> ~[apache-cassandra-3.11.9.jar:3.11.9]
> at
> org.apache.cassandra.metrics.TableMetrics$11.getValue(TableMetrics.java:373)
> ~[apache-cassandra-3.11.9.jar:3.11.9]
> at
> com.datastax.mcac.UnixSocketClient.writeMetric(UnixSocketClient.java:839)
> [datastax-mcac-agent.jar:na]
> at
> com.datastax.mcac.UnixSocketClient.access$700(UnixSocketClient.java:78)
> [datastax-mcac-agent.jar:na]
> at
> com.datastax.mcac.UnixSocketClient$2.lambda$onGaugeAdded$0(UnixSocketClient.java:626)
> ~[datastax-mcac-agent.jar:na]
> at
> com.datastax.mcac.UnixSocketClient.writeGroup(UnixSocketClient.java:819)
> [datastax-mcac-agent.jar:na]
> at
> com.datastax.mcac.UnixSocketClient.lambda$restartMetricReporting$2(UnixSocketClient.java:798)
> [datastax-mcac-agent.jar:na]
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> ~[na:1.8.0_272]
> at
> io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:126)
> ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
> at
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399)
> ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:307)
> ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
> at
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
> ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
> at
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
> ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
> at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_272]{noformat}
> Manually trying a histogram from the CLI:
> {noformat}
> $ nodetool tablehistograms logdata log_height_index
> error: 115
> -- StackTrace --
> java.lang.ArrayIndexOutOfBoundsException: 115
> at
> org.apache.cassandra.metrics.TableMetrics.combineHistograms(TableMetrics.java:261)
> at
> org.apache.cassandra.metrics.TableMetrics.access$000(TableMetrics.java:48)
> at
> org.apache.cassandra.metrics.TableMetrics$11.getValue(TableMetrics.java:376)
> at
> org.apache.cassandra.metrics.TableMetrics$11.getValue(TableMetrics.java:373)
> at
> org.apache.cassandra.metrics.CassandraMetricsRegistry$JmxGauge.getValue(CassandraMetricsRegistry.java:250)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:72)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:276)
> at
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
> at
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
> at
> com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
> at
> com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:83)
> at
> com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206)
> at
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647)
> at
> com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678)
> at
> com.sun.jmx.remote.security.MBeanServerAccessController.getAttribute(MBeanServerAccessController.java:320)
> at
> javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1445)
> at
> javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
> at
> javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309)
> at java.security.AccessController.doPrivileged(Native Method)
> at
> javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1408)
> at
> javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:639)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357)
> at sun.rmi.transport.Transport$1.run(Transport.java:200)
> at sun.rmi.transport.Transport$1.run(Transport.java:197)
> at java.security.AccessController.doPrivileged(Native Method)
> at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
> at
> sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573)
> at
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834)
> at
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688)
> at java.security.AccessController.doPrivileged(Native Method)
> at
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:687)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]