[ https://issues.apache.org/jira/browse/CASSANDRA-16259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17231379#comment-17231379 ]

Benjamin Lerer edited comment on CASSANDRA-16259 at 11/13/20, 11:02 AM:
------------------------------------------------------------------------

{quote}If I understand the change within CASSANDRA-15164 right, then the 
storage format of SSTable statistics has changed, which should also bump the 
SSTable version, shouldn't it?{quote}

The storage format did not change. The encoding was already 
{{<number_of_buckets><bucket_value>\*}}. The old code is able to read the 
statistics produced by the new one. The problem is only at the metric level, 
where C* tries to merge SSTable histograms that have different numbers of 
buckets using some buggy code.
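
To make the failure mode concrete, here is a minimal sketch (not the actual Cassandra code; the bucket counts below are made up for the example) of what goes wrong when bucketed histograms of different lengths are merged into an array sized from only one of them, as in the {{combineHistograms}} frames of the stack traces below:
{code:java}
public class CombineHistogramsSketch
{
    // Naive merge: the result array is sized from the first histogram only.
    static long[] combine(long[][] histograms)
    {
        long[] combined = new long[histograms[0].length];
        for (long[] buckets : histograms)
            for (int i = 0; i < buckets.length; i++)
                combined[i] += buckets[i]; // fails once i reaches combined.length
        return combined;
    }

    public static void main(String[] args)
    {
        long[] oldSSTableBuckets = new long[90];   // illustrative pre-upgrade bucket count
        long[] newSSTableBuckets = new long[165];  // illustrative post-upgrade bucket count
        try
        {
            combine(new long[][]{ oldSSTableBuckets, newSSTableBuckets });
        }
        catch (ArrayIndexOutOfBoundsException e)
        {
            // Mirrors the "ArrayIndexOutOfBoundsException: <bucket index>" seen in the report.
            System.out.println("Merge failed: " + e);
        }
    }
}
{code}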

When you scrub the old SSTables, they are recreated with the new number of 
buckets, ensuring that you will not hit the TableMetrics bug again.
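
As a hedged illustration of that workaround (the keyspace and table names are simply the ones from the report, used here as an example):
{noformat}
# Rewrites the on-disk SSTables, and with them the statistics histograms,
# using the current number of buckets.
nodetool scrub logdata log_height_index
{noformat}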



> tablehistograms cause ArrayIndexOutOfBoundsException
> ----------------------------------------------------
>
>                 Key: CASSANDRA-16259
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16259
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Observability/Metrics
>            Reporter: Justin Montgomery
>            Assignee: Benjamin Lerer
>            Priority: Normal
>             Fix For: 2.2.x, 3.0.x, 3.11.x, 4.0-beta
>
>
> After upgrading some nodes in our cluster from 3.11.8 to 3.11.9, an error 
> appeared on the upgraded nodes when trying to access *tablehistograms*. The 
> same command run on our .8 nodes returns as expected; only the upgraded .9 
> nodes fail. Not all tables fail when queried, but about 90% of them do.
> We use DataStax MCAC, which appears to query histograms every 30 seconds; this 
> outputs the following to system.log:
> {noformat}
> WARN  [insights-3-1] 2020-11-09 01:11:22,331 UnixSocketClient.java:830 - Error reporting:
> java.lang.ArrayIndexOutOfBoundsException: 115
>     at org.apache.cassandra.metrics.TableMetrics.combineHistograms(TableMetrics.java:261) ~[apache-cassandra-3.11.9.jar:3.11.9]
>     at org.apache.cassandra.metrics.TableMetrics.access$000(TableMetrics.java:48) ~[apache-cassandra-3.11.9.jar:3.11.9]
>     at org.apache.cassandra.metrics.TableMetrics$11.getValue(TableMetrics.java:376) ~[apache-cassandra-3.11.9.jar:3.11.9]
>     at org.apache.cassandra.metrics.TableMetrics$11.getValue(TableMetrics.java:373) ~[apache-cassandra-3.11.9.jar:3.11.9]
>     at com.datastax.mcac.UnixSocketClient.writeMetric(UnixSocketClient.java:839) [datastax-mcac-agent.jar:na]
>     at com.datastax.mcac.UnixSocketClient.access$700(UnixSocketClient.java:78) [datastax-mcac-agent.jar:na]
>     at com.datastax.mcac.UnixSocketClient$2.lambda$onGaugeAdded$0(UnixSocketClient.java:626) ~[datastax-mcac-agent.jar:na]
>     at com.datastax.mcac.UnixSocketClient.writeGroup(UnixSocketClient.java:819) [datastax-mcac-agent.jar:na]
>     at com.datastax.mcac.UnixSocketClient.lambda$restartMetricReporting$2(UnixSocketClient.java:798) [datastax-mcac-agent.jar:na]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_272]
>     at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:126) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
>     at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
>     at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:307) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
>     at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
>     at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
>     at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_272]{noformat}
> Manually trying a histogram from the CLI:
> {noformat}
> $ nodetool tablehistograms logdata log_height_index
> error: 115
> -- StackTrace --
> java.lang.ArrayIndexOutOfBoundsException: 115
>       at org.apache.cassandra.metrics.TableMetrics.combineHistograms(TableMetrics.java:261)
>       at org.apache.cassandra.metrics.TableMetrics.access$000(TableMetrics.java:48)
>       at org.apache.cassandra.metrics.TableMetrics$11.getValue(TableMetrics.java:376)
>       at org.apache.cassandra.metrics.TableMetrics$11.getValue(TableMetrics.java:373)
>       at org.apache.cassandra.metrics.CassandraMetricsRegistry$JmxGauge.getValue(CassandraMetricsRegistry.java:250)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:72)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:276)
>       at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
>       at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
>       at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
>       at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:83)
>       at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206)
>       at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647)
>       at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678)
>       at com.sun.jmx.remote.security.MBeanServerAccessController.getAttribute(MBeanServerAccessController.java:320)
>       at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1445)
>       at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
>       at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1408)
>       at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:639)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357)
>       at sun.rmi.transport.Transport$1.run(Transport.java:200)
>       at sun.rmi.transport.Transport$1.run(Transport.java:197)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
>       at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573)
>       at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834)
>       at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:687)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
> {noformat}


