Author: zznate
Date: Thu Oct 18 01:21:41 2018
New Revision: 1844194

CASSANDRA-14827 - Add content and supporting directory structure for blog post


 Thu Oct 18 01:21:41 2018
@@ -0,0 +1,260 @@
+<!DOCTYPE html>
+  <meta charset="utf-8">
+  <meta http-equiv="X-UA-Compatible" content="IE=edge">
+  <meta name="viewport" content="width=device-width, initial-scale=1">
+  <meta name="description" content="As of September 1st, the Apache Cassandra 
community has shifted the focus of Cassandra 4.0 development from new feature 
work to testing, validation, and hard...">
+  <meta name="keywords" content="cassandra, apache, apache cassandra, 
distributed storage, key value store, scalability, bigtable, dynamo" />
+  <meta name="robots" content="index,follow" />
+  <meta name="language" content="en" />  
+  <title>Finding Bugs in Cassandra&#39;s Internals with Property-based Testing</title>
+  <link rel="canonical" 
+  <link rel="stylesheet" 
+  <link rel="stylesheet" href="./../../../../css/style.css">
+  <link rel="stylesheet" 
+  <link type="application/atom+xml" rel="alternate" 
href=""; title="Apache Cassandra Website" />
+  <body>
+    <!-- breadcrumbs -->
+<div class="topnav">
+  <div class="container breadcrumb-container">
+    <ul class="breadcrumb">
+      <li>
+        <div class="dropdown">
+          <img class="asf-logo" src="./../../../../img/asf_feather.png" />
+          <a data-toggle="dropdown" href="#">Apache Software Foundation <span 
+          <ul class="dropdown-menu" role="menu" aria-labelledby="dLabel">
+            <li><a href="";>Apache Homepage</a></li>
+            <li><a href="";>License</a></li>
+            <li><a 
+            <li><a 
+            <li><a href="";>Security</a></li>
+          </ul>
+        </div>
+      </li>
+      <li><a href="./../../../../">Apache Cassandra</a></li>
+        <li>Finding Bugs in Cassandra's Internals with Property-based Testing</li>
+    </ul>
+  </div>
+  <!-- navbar -->
+  <nav class="navbar navbar-default navbar-static-top" role="navigation">
+    <div class="container">
+      <div class="navbar-header">
+        <button type="button" class="navbar-toggle collapsed" 
data-toggle="collapse" data-target="#cassandra-menu" aria-expanded="false">
+          <span class="sr-only">Toggle navigation</span>
+          <span class="icon-bar"></span>
+          <span class="icon-bar"></span>
+          <span class="icon-bar"></span>
+        </button>
+        <a class="navbar-brand" href="./../../../../"><img 
src="./../../../../img/cassandra_logo.png" alt="Apache Cassandra logo" /></a>
+      </div><!-- /.navbar-header -->
+      <div id="cassandra-menu" class="collapse navbar-collapse">
+        <ul class="nav navbar-nav navbar-right">
+          <li><a href="./../../../../">Home</a></li>
+          <li><a href="./../../../../download/">Download</a></li>
+          <li><a href="./../../../../doc/">Documentation</a></li>
+          <li><a href="./../../../../community/">Community</a></li>
+          <li>
+            <a href="./../../../../blog">Blog</a>                    
+        </li>
+        </ul>
+      </div><!-- /#cassandra-menu -->
+    </div>
+  </nav><!-- /.navbar -->
+</div><!-- /.topnav -->
+    <div class="content">
+  <div class="container">
+  <h2>Finding Bugs in Cassandra's Internals with Property-based Testing</h2>
+    <p>Posted on October 17, 2018 by the Apache Cassandra Community</p>
+    <h5><a href="/blog">&laquo; Back to the Apache Cassandra Blog</a></h5>
+    <hr />
+  <p>As of September 1st, the Apache Cassandra community has shifted the focus 
of Cassandra 4.0 development from new feature work to testing, validation, and 
hardening, with the goal of releasing a stable 4.0 that every Cassandra user, 
from small deployments to large corporations, can deploy with confidence. There 
are several projects and methodologies that the community is undertaking to 
this end. One of these is the adoption of property-based testing, which was <a 
 introduced here</a>. This post will take a look at a specific use of this 
approach and how it found a bug in a new feature meant to ensure data integrity 
between the client and Cassandra.</p>
+<h4 id="detecting-corruption-is-a-property">Detecting Corruption is a Property</h4>
+<p>In this post, we demonstrate property-based testing in Cassandra through 
the integration of the <a 
href="";>QuickTheories</a> library 
introduced as part of the work done for <a 
+<p>This ticket modifies the framing of Cassandra’s native client protocol to 
include checksums in addition to the existing, optional compression. Clients 
can opt in to this new feature to retain data integrity across the many hops 
between themselves and Cassandra. This is meant to address cases where hardware 
and protocol-level checksums fail (due to underlying hardware issues), a 
case that has been seen in production. A description of the protocol changes 
can be found in the ticket, but for the purposes of this discussion the salient 
part is that two checksums are added: one that covers the length(s) of the data 
(if compressed, there are two lengths), and one for the data itself. Before 
merging this feature, property-based testing using QuickTheories was used to 
uncover a bug in the calculation of the checksum over the lengths. This bug 
could have led to silent corruption at worst or unexpected errors during 
deserialization at best.</p>
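+<p>The two-checksum idea can be sketched with the JDK alone. The layout below (a 4-byte length, a checksum over that length, the payload, and a checksum over the payload) is an assumption made for illustration and is simpler than the framing described in the ticket; <code class="highlighter-rouge">FrameSketch</code>, <code class="highlighter-rouge">encode</code>, and <code class="highlighter-rouge">decode</code> are hypothetical names, not Cassandra APIs.</p>

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class FrameSketch {
    // Hypothetical layout: [length:int][crc(length):int][payload][crc(payload):int]

    public static int crc(byte[] bytes) {
        CRC32 crc = new CRC32();
        crc.update(bytes, 0, bytes.length);
        return (int) crc.getValue();
    }

    public static byte[] intBytes(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    public static byte[] encode(byte[] payload) {
        ByteBuffer out = ByteBuffer.allocate(4 + 4 + payload.length + 4);
        out.putInt(payload.length);
        out.putInt(crc(intBytes(payload.length))); // covers all four length bytes
        out.put(payload);
        out.putInt(crc(payload));
        return out.array();
    }

    public static byte[] decode(byte[] frame) {
        ByteBuffer in = ByteBuffer.wrap(frame);
        int length = in.getInt();
        if (in.getInt() != crc(intBytes(length)))
            throw new IllegalStateException("length checksum mismatch");
        // The length has been verified, so it is safe to use it for allocation.
        byte[] payload = new byte[length];
        in.get(payload);
        if (in.getInt() != crc(payload))
            throw new IllegalStateException("payload checksum mismatch");
        return payload;
    }

    public static void main(String[] args) {
        byte[] frame = encode("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decode(frame), StandardCharsets.UTF_8));
        frame[1] ^= 0x01; // flip one bit inside the length field
        try {
            decode(frame);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

+<p>Checking the length before using it is the point of the separate length checksum: a corrupted length would otherwise drive the read of the payload.</p>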
+<p>The test used to find this bug is shown below. This example tests the 
property that when a frame is corrupted, that corruption should be caught by 
checksum comparison. The test is wrapped inside of a standard JUnit test case 
but, once called by JUnit, execution is handed over to QuickTheories to 
generate and execute hundreds of examples. These examples are dictated by the 
types of input that should be generated (the arguments to <code 
class="highlighter-rouge">forAll</code>). The execution of each individual 
example is done by <code class="highlighter-rouge">checkAssert</code> and its 
argument, the <code class="highlighter-rouge">roundTripWithCorruption</code> function.</p>
+@Test
+public void corruptionCausesFailure()
+{
+    qt().withExamples(500)
+        .forAll(inputWithCorruptablePosition(),
+                integers().between(0, Byte.MAX_VALUE).map(Integer::byteValue),
+                compressors(),
+                checksumTypes())
+        .checkAssert(this::roundTripWithCorruption);
+}
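+<p>To make the division of labour concrete, here is a hand-rolled sketch of what a "generate many examples, then assert" loop does. It is not how QuickTheories is implemented and it does no shrinking; <code class="highlighter-rouge">MiniProperty</code> and <code class="highlighter-rouge">check</code> are hypothetical names that use only the JDK.</p>

```java
import java.util.Random;
import java.util.function.Consumer;
import java.util.function.Function;

public class MiniProperty {
    /**
     * Runs the property against `examples` values drawn from `gen`.
     * Returns null if every example passed, otherwise a description of the first failure.
     */
    public static <T> String check(long seed, int examples,
                                   Function<Random, T> gen, Consumer<T> property) {
        Random random = new Random(seed); // the same seed replays the same examples
        for (int i = 1; i <= examples; i++) {
            T value = gen.apply(random);
            try {
                property.accept(value);
            } catch (AssertionError | RuntimeException e) {
                return "Property falsified after " + i + " example(s): " + value
                       + " (seed " + seed + ")";
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // A deliberately false property: "every generated int is below 90".
        String failure = MiniProperty.<Integer>check(42L, 500, r -> r.nextInt(100), n -> {
            if (n >= 90) throw new AssertionError(n + " is not below 90");
        });
        System.out.println(failure);
    }
}
```

+<p>The generator plays the role of the arguments to <code class="highlighter-rouge">forAll</code>, and the property plays the role of the function handed to <code class="highlighter-rouge">checkAssert</code>.</p>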
+<p>The <code class="highlighter-rouge">roundTripWithCorruption</code> function 
is a generalization of a unit test that worked similarly but for a single case. 
It is given an input to transform and a position in the transformed output to 
insert corruption, as well as what byte to write to the corrupted position. The 
additional arguments (the compressor and checksum type) are used to ensure 
coverage of Cassandra’s various compression and checksumming implementations.</p>
+private void roundTripWithCorruption(Pair&lt;String, Integer&gt; inputAndCorruptablePosition,
+                                     byte corruptionValue,
+                                     Compressor compressor,
+                                     ChecksumType checksum) {
+    String input = inputAndCorruptablePosition.left;
+    ByteBuf expectedBuf = Unpooled.wrappedBuffer(input.getBytes());
+    int byteToCorrupt = inputAndCorruptablePosition.right;
+    ChecksummingTransformer transformer = new 
ChecksummingTransformer(checksum, DEFAULT_BLOCK_SIZE, compressor);
+    ByteBuf outbound = transformer.transformOutbound(expectedBuf);
+    // make sure we're actually expecting to produce some corruption
+    if (outbound.getByte(byteToCorrupt) == corruptionValue)
+        return;
+    if (byteToCorrupt &gt;= outbound.writerIndex())
+        return;
+    try {
+        int oldIndex = outbound.writerIndex();
+        outbound.writerIndex(byteToCorrupt);
+        outbound.writeByte(corruptionValue);
+        outbound.writerIndex(oldIndex);
+        ByteBuf inbound = transformer.transformInbound(outbound, FLAGS);
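+        // no exception means the corruption went undetected; the test then
+        // passes only if the decoded content still matches the input
+        expectedBuf.readerIndex(0);
+        Assert.assertEquals(expectedBuf, inbound);
+    } catch(ProtocolException e) {
+        // expected outcome: the checksum comparison caught the corruption
+        return;
+    }
+}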
+<p>The remaining piece is how those arguments are generated — the arguments 
to <code class="highlighter-rouge">forAll</code> mentioned above. Each argument 
is a function that returns an input generator. For each example, an input is 
pulled from each generator and passed to <code 
class="highlighter-rouge">roundTripWithCorruption</code>.  The <code 
class="highlighter-rouge">compressors()</code> and <code 
class="highlighter-rouge">checksumTypes()</code> generators aren’t copied here. 
They can be found in the <a 
 and are based on built-in generator methods, provided by QuickTheories, that 
select a value from a list of values. The second argument, <code 
class="highlighter-rouge">integers().between(0, 
Byte.MAX_VALUE).map(Integer::byteValue)</code>, generates non-negative numbers 
that fit into a single byte. These numbers will be passed as the <code 
class="highlighter-rouge">corruptionValue</code> argument.</p>
+<p>The <code class="highlighter-rouge">inputWithCorruptablePosition</code> 
generator, copied below, generates strings to use as input to the 
transformation function and a position within the output byte stream to 
corrupt. Because compression prevents knowledge of the output size of the 
frame, the generator tries to choose a somewhat reasonable position to corrupt 
by limiting the choice to the size of the generated string (it’s uncommon for 
compression to generate a larger string and the implementation discards the 
compressed value if it does). It also avoids corrupting the first two bytes of 
the stream which are not covered by a checksum and therefore can be corrupted 
without being caught. The function above ensures that corruption is actually 
introduced and that corrupting a position larger than the size of the output 
does not occur.</p>
+private Gen&lt;Pair&lt;String, Integer&gt;&gt; inputWithCorruptablePosition()
+{
+    return inputs().flatMap(s -&gt; integers().between(2, s.length() + 2)
+                   .map(i -&gt; Pair.create(s, i)));
+}
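+<p>The key move in this generator is that the second value is drawn from a range that depends on the first. A JDK-only sketch of that dependency follows; <code class="highlighter-rouge">DependentGenSketch</code> and <code class="highlighter-rouge">randomInput</code> are hypothetical stand-ins (the real <code class="highlighter-rouge">inputs()</code> generator is not shown in this post), and both bounds are treated as inclusive, which is an assumption about <code class="highlighter-rouge">between</code>.</p>

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;
import java.util.Random;

public class DependentGenSketch {
    // Hypothetical stand-in for inputs(): a lowercase string of 1..maxLength characters.
    public static String randomInput(Random random, int maxLength) {
        int length = 1 + random.nextInt(maxLength);
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++)
            sb.append((char) ('a' + random.nextInt(26)));
        return sb.toString();
    }

    // Draw the string first, then a position whose bounds depend on that string.
    public static Map.Entry<String, Integer> inputWithPosition(Random random, int maxLength) {
        String s = randomInput(random, maxLength);
        int low = 2;                  // skip the first two bytes of the stream
        int high = s.length() + 2;    // upper bound tied to the generated input
        int position = low + random.nextInt(high - low + 1);
        return new SimpleEntry<>(s, position);
    }

    public static void main(String[] args) {
        Random random = new Random(1L);
        for (int i = 0; i < 5; i++)
            System.out.println(inputWithPosition(random, 8));
    }
}
```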
+<p>With all those pieces in place, running the test before the bug was 
fixed fails with the following output.</p>
+java.lang.AssertionError: Property falsified after 2 example(s) 
+Smallest found falsifying value(s) :-
+{(c,3), 0, null, Adler32}
+Cause was :-
+java.lang.IndexOutOfBoundsException: readerIndex(10) + length(16711681) 
exceeds writerIndex(15): UnpooledHeapByteBuf(ridx: 10, widx: 15, cap: 54/54)
+    at 
+    at 
+    at io.netty.buffer.AbstractByteBuf.readBytes(
+    at 
+    at 
+    ...
+Other found falsifying value(s) :- 
+{(c,3), 0, null, CRC32}
+{(c,3), 1, null, CRC32}
+{(c,3), 9, null, CRC32}
+{(c,3), 11, null, CRC32}
+{(c,3), 36, null, CRC32}
+{(c,3), 50, null, CRC32}
+{(c,3), 74, null, CRC32}
+{(c,3), 99, null, CRC32}
+Seed was 179207634899674
+<p>The output shows more than a single failing example. This is because 
QuickTheories, like most property-based testing libraries, comes with a 
shrinker, which performs the task of taking a failure and minimizing its 
inputs. This aids in debugging because there are multiple failing examples to 
look at, and shrinking often removes noise in the process. Additionally, a seed value is 
provided so the same series of tests and failures can be generated again — 
another useful feature when debugging. In this case, the library generated an 
example that contains a single byte of input, which will corrupt the fourth 
byte in the output stream by setting it to zero, using no compression, and 
using Adler32 for checksumming. It can be seen from the other failing examples 
that using CRC32 also fails. This is due to improper calculation of the 
checksum, regardless of the algorithm. In particular, the checksum was only 
calculated over the least significant byte of each length rather than all eight 
bytes. By corrupting the fourth byte of the output stream (the first length’s 
second-most significant byte not covered by the calculation), an invalid length 
is read and later used.</p>
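+<p>The post does not show the faulty code, so the following is only one way a bug of this shape can arise in Java: <code class="highlighter-rouge">java.util.zip.Checksum.update(int)</code> consumes just the low eight bits of its argument, so passing a whole <code class="highlighter-rouge">int</code> length to it silently ignores the upper bytes. <code class="highlighter-rouge">LengthChecksumSketch</code> is a hypothetical name, and the value 16711681 is borrowed from the length reported in the output above purely for illustration.</p>

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

public class LengthChecksumSketch {
    // Flawed: update(int) treats its argument as a single byte (the low eight bits).
    public static long lowByteOnly(int length) {
        CRC32 crc = new CRC32();
        crc.update(length);
        return crc.getValue();
    }

    // Covers all four bytes of the length.
    public static long allBytes(int length) {
        CRC32 crc = new CRC32();
        crc.update(ByteBuffer.allocate(4).putInt(length).array(), 0, 4);
        return crc.getValue();
    }

    public static void main(String[] args) {
        int original = 1;           // 0x00000001
        int corrupted = 0x00FF0001; // 16711681: same low byte, different upper byte
        System.out.println(lowByteOnly(original) == lowByteOnly(corrupted)); // true
        System.out.println(allBytes(original) == allBytes(corrupted));       // false
    }
}
```

+<p>With the flawed variant both lengths produce the same checksum, so the corrupted length passes verification and is later used to read from the buffer.</p>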
+<h4 id="where-to-find-more">Where to Find More</h4>
+<p>Property-based testing is a broad topic, much of which is not covered by 
this post. In addition to Cassandra, it has been used successfully in several 
places including <a 
href="";>car</a> <a 
+systems</a> and <a href="";>suppliers’ 
products</a>, <a href="";>GNOME 
Glib</a>, <a 
consensus</a>, and other <a 
href="";>distributed</a> <a 
href="";>databases</a>. It can also be 
combined with other approaches such as fault-injection and memory leak 
detection. Stateful models can also be built to generate a series of commands 
instead of running each example on one generated set of inputs. Our goal is to 
evangelize this approach within the Cassandra developer community and encourage 
more testing of this kind as part of our work to deliver the most stable major 
release of Cassandra yet.</p>
+  </div>
+    <hr />
+  <div class="container">
+    <div class="col-md-4 social-blk">
+      <span class="social">
+        <a href="";
+           class="twitter-follow-button"
+           data-show-count="false" data-size="large">Follow @cassandra</a>
+        <script>!function(d,s,id){var 
 'script', 'twitter-wjs');</script>
+        <a href="";
+           class="twitter-hashtag-button"
+           data-size="large"
+           data-related="ApacheCassandra">Tweet #cassandra</a>
+        <script>!function(d,s,id){var 
 'script', 'twitter-wjs');</script>
+      </span>
+      <a class="subscribe-rss icon-link" href="/feed.xml" title="Subscribe to 
Blog via RSS">
+          <span><i class="fa fa-rss"></i></span>
+      </a>
+    </div>
+    <div class="col-md-8 trademark">
+      <p>&copy; 2016 <a href="";>The Apache Software 
+      Apache, the Apache feather logo, and Apache Cassandra are trademarks of 
The Apache Software Foundation.
+      <p>
+    </div>
+  </div><!-- /.container -->
+<!-- Javascript. Placed here so pages load faster -->
+<script src="./../../../../js/underscore-min.js"></script>
+<script type="text/javascript">
+  var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl."; : 
+  document.write(unescape("%3Cscript src='" + gaJsHost + 
"' type='text/javascript'%3E%3C/script%3E"));
+  try {
+    var pageTracker = _gat._getTracker("UA-11583863-1");
+    pageTracker._trackPageview();
+  } catch(err) {}
+  </body>

Modified: cassandra/site/publish/blog/index.html
--- cassandra/site/publish/blog/index.html (original)
+++ cassandra/site/publish/blog/index.html Thu Oct 18 01:21:41 2018
@@ -102,6 +102,15 @@
     <ul class="blog-post-listing">
         <li class="blog-post">
+          <h4><a 
Bugs in Cassandra's Internals with Property-based Testing</a></h4>
+          <p>Posted on October 17, 2018 by the Apache Cassandra Community</p>
+          <p>As of September 1st, the Apache Cassandra community has shifted 
the focus of Cassandra 4.0 development from new feature work to testing, 
validation, and hardening, with the goal of releasing a stable 4.0 that every 
Cassandra user, from small deployments to large corporations, can deploy with 
confidence. There are several projects and methodologies that the community is 
undertaking to this end. One of these is the adoption of property-based 
testing, which was <a 
 introduced here</a>. This post will take a look at a specific use of this 
approach and how it found a bug in a new feature meant to ensure data integrity 
between the client and Cassandra.</p>
+          <h5><a 
href="/blog/2018/10/17/finding_bugs_with_property_based_testing.html">Read more 
+        </li>
+        <li class="blog-post">
           <h4><a href="/blog/2018/08/21/testing_apache_cassandra.html">Testing 
Apache Cassandra 4.0</a></h4>
           <p>Posted on August 21, 2018 by the Apache Cassandra Community</p>
           <p>With the goal of ensuring reliability and stability in Apache 
Cassandra 4.0, the project’s committers have voted to freeze new features on 
September 1 to concentrate on testing and validation before cutting a stable 
beta. Towards that goal, the community is investing in methodologies that can 
be performed at scale to exercise edge cases in the largest Cassandra clusters. 
The result, we hope, is to make Apache Cassandra 4.0 the best-tested and most 
reliable major release right out of the gate.</p>

Modified: cassandra/site/publish/feed.xml
--- cassandra/site/publish/feed.xml (original)
+++ cassandra/site/publish/feed.xml Thu Oct 18 01:21:41 2018
@@ -1,5 +1,109 @@
-<?xml version="1.0" encoding="utf-8"?><feed 
xmlns=""; ><generator uri=""; 
href=""; rel="self" 
type="application/atom+xml" /><link href=""; 
rel="alternate" type="text/html" 
 type="html">Apache Cassandra Website</title><subtitle>The Apache Cassandra 
database is the right choice when you need scalability and high availability 
without compromising performance. Linear scalability and proven fault-tolerance 
on commodity hardware or cloud infrastructure make it the perfect platform for 
mission-critical data. Cassandra's support for replicating across multiple 
datacenters is best-in-class, providing lower latency for your users and the 
peace of mind of knowing that you can survive regional outages.
-</subtitle><entry><title type="html">Testing Apache Cassandra 4.0</title><link 
 rel="alternate" type="text/html" title="Testing Apache Cassandra 4.0" 
 the goal of ensuring reliability and stability in Apache Cassandra 4.0, the 
project’s committers have voted to freeze new features on September 1 to 
concentrate on testing and validation before cutting a stable beta. Towards 
that goal, the community is investing in methodologies that can be performed at 
scale to exercise edge cases in the largest Cassandra clusters. The result, we 
hope, is to make Apache Cassandra 4.0 the best-tested and most reliable major 
release right out of the gate.&lt;/p&gt;
+<?xml version="1.0" encoding="utf-8"?><feed 
xmlns=""; ><generator uri=""; 
href=""; rel="self" 
type="application/atom+xml" /><link href=""; 
rel="alternate" type="text/html" 
 type="html">Apache Cassandra Website</title><subtitle>The Apache Cassandra 
database is the right choice when you need scalability and high availability 
without compromising performance. Linear scalability and proven fault-tolerance 
on commodity hardware or cloud infrastructure make it the perfect platform for 
mission-critical data. Cassandra's support for replicating across multiple 
datacenters is best-in-class, providing lower latency for your users and the 
peace of mind of knowing that you can survive regional outages.
+</subtitle><entry><title type="html">Finding Bugs in Cassandra’s Internals 
with Property-based Testing</title><link 
 rel="alternate" type="text/html" title="Finding Bugs in Cassandra's Internals 
with Property-based Testing" 
 of September 1st, the Apache Cassandra community has shifted the focus of 
Cassandra 4.0 development from new feature work to testing, validation, and 
hardening, with the goal of releasing a stable 4.0 that every Cassandra user, 
from small deployments to large corporations, can deploy with confidence. There 
are several projects and methodologies that the community is undertaking to 
this end. One of these is the adoption of 
property-based testing, which was &lt;a 
 introduced here&lt;/a&gt;. This post will take a look at a specific use of 
this approach and how it found a bug in a new feature meant to ensure data 
integrity between the client and Cassandra.&lt;/p&gt;
+&lt;h4 id=&quot;detecting-corruption-is-a-property&quot;&gt;Detecting 
Corruption is a Property&lt;/h4&gt;
+&lt;p&gt;In this post, we demonstrate property-based testing in Cassandra 
through the integration of the &lt;a 
 library introduced as part of the work done for &lt;a 
+&lt;p&gt;This ticket modifies the framing of Cassandra’s native client 
protocol to include checksums in addition to the existing, optional 
compression. Clients can opt-in to this new feature to retain data integrity 
across the many hops between themselves and Cassandra. This is meant to address 
cases where hardware and protocol level checksums fail (due to underlying 
hardware issues) — a case that has been seen in production. A description of 
the protocol changes can be found in the ticket but for the purposes of this 
discussion the salient part is that two checksums are added: one that covers 
the length(s) of the data (if compressed there are two lengths), and one for 
the data itself. Before merging this feature, property-based testing using 
QuickTheories was used to uncover a bug in the calculation of the checksum over 
the lengths. This bug could have led to silent corruption at worst or 
unexpected errors during deserialization at best.&lt;/p&gt;
+&lt;p&gt;The test used to find this bug is shown below. This example tests the 
property that when a frame is corrupted, that corruption should be caught by 
checksum comparison. The test is wrapped inside of a standard JUnit test case 
but, once called by JUnit, execution is handed over to QuickTheories to 
generate and execute hundreds of examples. These examples are dictated by the 
types of input that should be generated (the arguments to &lt;code 
class=&quot;highlighter-rouge&quot;&gt;forAll&lt;/code&gt;). The execution of 
each individual example is done by &lt;code 
class=&quot;highlighter-rouge&quot;&gt;checkAssert&lt;/code&gt; and its 
argument, the &lt;code 
+public void corruptionCausesFailure()
+    qt().withExamples(500)
+        .forAll(inputWithCorruptablePosition(),
+                integers().between(0, Byte.MAX_VALUE).map(Integer::byteValue),
+                compressors(),
+                checksumTypes())
+        .checkAssert(this::roundTripWithCorruption);
+&lt;p&gt;The &lt;code 
function is a generalization of a unit test that worked similarly but for a 
single case. It is given an input to transform and a position in the 
transformed output to insert corruption, as well as what byte to write to the 
corrupted position. The additional arguments (the compressor and checksum type) 
are used to ensure coverage of Cassandra’s various compression and 
checksumming implementations.&lt;/p&gt;
+private void roundTripWithCorruption(Pair&amp;lt;String, Integer&amp;gt; inputAndCorruptablePosition,
+                                     byte corruptionValue,
+                                     Compressor compressor,
+                                     ChecksumType checksum) {
+    String input = inputAndCorruptablePosition.left;
+    ByteBuf expectedBuf = Unpooled.wrappedBuffer(input.getBytes());
+    int byteToCorrupt = inputAndCorruptablePosition.right;
+    ChecksummingTransformer transformer = new 
ChecksummingTransformer(checksum, DEFAULT_BLOCK_SIZE, compressor);
+    ByteBuf outbound = transformer.transformOutbound(expectedBuf);
+    // make sure we're actually expecting to produce some corruption
+    if (outbound.getByte(byteToCorrupt) == corruptionValue)
+        return;
+    if (byteToCorrupt &amp;gt;= outbound.writerIndex())
+        return;
+    try {
+        int oldIndex = outbound.writerIndex();
+        outbound.writerIndex(byteToCorrupt);
+        outbound.writeByte(corruptionValue);
+        outbound.writerIndex(oldIndex);
+        ByteBuf inbound = transformer.transformInbound(outbound, FLAGS);
+        // verify that the content was actually corrupted
+        expectedBuf.readerIndex(0);
+        Assert.assertEquals(expectedBuf, inbound);
+    } catch(ProtocolException e) {
+       return;
+    }
+&lt;p&gt;The remaining piece is how those arguments are generated — the 
arguments to &lt;code 
class=&quot;highlighter-rouge&quot;&gt;forAll&lt;/code&gt; mentioned above. 
Each argument is a function that returns an input generator. For each example, 
an input is pulled from each generator and passed to &lt;code 
The &lt;code class=&quot;highlighter-rouge&quot;&gt;compressors()&lt;/code&gt; 
and &lt;code class=&quot;highlighter-rouge&quot;&gt;checksumTypes()&lt;/code&gt; 
generators aren’t copied here. They can be found in the &lt;a 
 and are based on built-in generator methods, provided by QuickTheories, that 
select a value from a list of values. The second argument, &lt;code 
Byte.MAX_VALUE).map(Integer::byteValue)&lt;/code&gt;, generates non-negative 
numbers that fit into a single byte. These numbers will be passed as the 
&lt;code class=&quot;highlighter-rouge&quot;&gt;corruptionValue&lt;/code&gt; 
+&lt;p&gt;The &lt;code 
 generator, copied below, generates strings to use as input to the 
transformation function and a position within the output byte stream to 
corrupt. Because compression prevents knowledge of the output size of the 
frame, the generator tries to choose a somewhat reasonable position to corrupt 
by limiting the choice to the size of the generated string (it’s uncommon for 
compression to generate a larger string and the implementation discards the 
compressed value if it does). It also avoids corrupting the first two bytes of 
the stream which are not covered by a checksum and therefore can be corrupted 
without being caught. The function above ensures that corruption is actually 
introduced and that corrupting a position larger than the size of the output 
does not occur.&lt;/p&gt;
+private Gen&amp;lt;Pair&amp;lt;String, Integer&amp;gt;&amp;gt; inputWithCorruptablePosition()
+    return inputs().flatMap(s -&amp;gt; integers().between(2, s.length() + 2)
+                   .map(i -&amp;gt; Pair.create(s, i)));
+&lt;p&gt;With all those pieces in place, running the test before the bug 
was fixed fails with the following output.&lt;/p&gt;
+java.lang.AssertionError: Property falsified after 2 example(s) 
+Smallest found falsifying value(s) :-
+{(c,3), 0, null, Adler32}
+Cause was :-
+java.lang.IndexOutOfBoundsException: readerIndex(10) + length(16711681) 
exceeds writerIndex(15): UnpooledHeapByteBuf(ridx: 10, widx: 15, cap: 54/54)
+    at 
+    at 
+    at io.netty.buffer.AbstractByteBuf.readBytes(
+    at 
+    at 
+    ...
+Other found falsifying value(s) :- 
+{(c,3), 0, null, CRC32}
+{(c,3), 1, null, CRC32}
+{(c,3), 9, null, CRC32}
+{(c,3), 11, null, CRC32}
+{(c,3), 36, null, CRC32}
+{(c,3), 50, null, CRC32}
+{(c,3), 74, null, CRC32}
+{(c,3), 99, null, CRC32}
+Seed was 179207634899674
+&lt;p&gt;The output shows more than a single failing example. This is because 
QuickTheories, like most property-based testing libraries, comes with a 
shrinker, which performs the task of taking a failure and minimizing its 
inputs. This aids in debugging because there are multiple failing examples to 
look at, and shrinking often removes noise in the process. Additionally, a seed value is 
provided so the same series of tests and failures can be generated again — 
another useful feature when debugging. In this case, the library generated an 
example that contains a single byte of input, which will corrupt the fourth 
byte in the output stream by setting it to zero, using no compression, and 
using Adler32 for checksumming. It can be seen from the other failing examples 
that using CRC32 also fails. This is due to improper calculation of the 
checksum, regardless of the algorithm. In particular, the checksum was only 
calculated over the least significant byte of each length rather than all eight 
bytes. By corrupting the fourth byte of the output stream (the first length’s 
second-most significant byte not covered by the calculation), an invalid length 
is read and later used.&lt;/p&gt;
+&lt;h4 id=&quot;where-to-find-more&quot;&gt;Where to Find More&lt;/h4&gt;
+&lt;p&gt;Property-based testing is a broad topic, much of which is not covered 
by this post. In addition to Cassandra, it has been used successfully in 
several places including &lt;a 
&lt;a href=&quot;;&gt;operating
+systems&lt;/a&gt; and &lt;a 
products&lt;/a&gt;, &lt;a 
Glib&lt;/a&gt;, &lt;a 
 consensus&lt;/a&gt;, and other &lt;a 
href=&quot;;&gt;databases&lt;/a&gt;. It 
can also be combined with other approaches such as fault-injection and memory 
leak detection. Stateful models can also be built to generate a series of 
commands instead of running each example on one generated set of inputs. Our 
goal is to evangelize this approach within the Cassandra developer community 
and encourage more testing of this kind as part of our work to deliver the most 
stable major release of Cassandra yet.&lt;/p&gt;</content><author><name>the 
Apache Cassandra Community</name></author><summary type="html">As of September 1st, the Apache Cassandra 
community has shifted the focus of Cassandra 4.0 development from new feature 
work to testing, validation, and hardening, with the goal of releasing a stable 
4.0 that every Cassandra user, from small deployments to large corporations, 
can deploy with confidence. There are several projects and methodologies that 
the community is undertaking to this end. One of these is the adoption of 
property-based testing, which was previously introduced here. This post will 
take a look at a specific use of this approach and how it found a bug in a new 
feature meant to ensure data integrity between the client and 
Cassandra.</summary></entry><entry><title type="html">Testing Apache Cassandra 
 rel="alternate" type="text/html" title="Testing Apache Cassandra 4.0" 
 the goal of ensuring reliability and stability in Apache Cassandra 4.0, the 
project’s committers have voted to freeze new features on September 1 to 
concentrate on testing and validation before cutting a stable beta. Towards 
that goal, the community is investing in methodologies that can be performed at 
scale to exercise edge cases in the largest Cassandra clusters. The result, we 
hope, is to make Apache Cassandra 4.0 the best-tested and most reliable major 
release right out of the gate.&lt;/p&gt;
 &lt;p&gt;In the interests of communication (and hopefully more participation), 
here’s a look at some of the approaches being used to test Apache Cassandra 

 Thu Oct 18 01:21:41 2018
@@ -0,0 +1,116 @@
+layout: post
+title: "Finding Bugs in Cassandra's Internals with Property-based Testing"
+date:   2018-10-17 00:00:00 -0700
+author: the Apache Cassandra Community
+categories: blog
+As of September 1st, the Apache Cassandra community has shifted the focus of 
Cassandra 4.0 development from new feature work to testing, validation, and 
hardening, with the goal of releasing a stable 4.0 that every Cassandra user, 
from small deployments to large corporations, can deploy with confidence. There 
are several projects and methodologies that the community is undertaking to 
this end. One of these is the adoption of property-based testing, which was 
[previously introduced 
 This post will take a look at a specific use of this approach and how it found 
a bug in a new feature meant to ensure data integrity between the client and 
+#### Detecting Corruption is a Property
+In this post, we demonstrate property-based testing in Cassandra through the 
integration of the 
[QuickTheories]( library 
introduced as part of the work done for 
+This ticket modifies the framing of Cassandra's native client protocol to 
include checksums in addition to the existing, optional compression. Clients 
can opt-in to this new feature to retain data integrity across the many hops 
between themselves and Cassandra. This is meant to address cases where hardware 
and protocol level checksums fail (due to underlying hardware issues) — a 
case that has been seen in production. A description of the protocol changes 
can be found in the ticket but for the purposes of this discussion the salient 
part is that two checksums are added: one that covers the length(s) of the data 
(if compressed there are two lengths), and one for the data itself. Before 
merging this feature, property-based testing using QuickTheories was used to 
uncover a bug in the calculation of the checksum over the lengths. This bug 
could have led to silent corruption at worst or unexpected errors during 
deserialization at best.
+The test used to find this bug is shown below. This example tests the property 
that when a frame is corrupted, that corruption should be caught by checksum 
comparison. The test is wrapped inside of a standard JUnit test case but, once 
called by JUnit, execution is handed over to QuickTheories to generate and 
execute hundreds of examples. These examples are dictated by the types of input 
that should be generated (the arguments to `forAll`). The execution of each 
individual example is done by `checkAssert` and its argument, the 
`roundTripWithCorruption` function.
+@Test
+public void corruptionCausesFailure() {
+    // generate 500 examples; each one is passed to roundTripWithCorruption
+    qt().withExamples(500)
+        .forAll(inputWithCorruptablePosition(),
+                integers().between(0, Byte.MAX_VALUE).map(Integer::byteValue),
+                compressors(),
+                checksumTypes())
+        .checkAssert(this::roundTripWithCorruption);
+}
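To make that division of labor concrete, here is a hand-rolled sketch of what a property check does conceptually: draw inputs from a seeded source, run the property on each one, and report the first falsifying input. It uses only the JDK. The names `PropertyLoopSketch` and `firstCounterexample` are invented for this example; this is not how QuickTheories is implemented, and it omits shrinking entirely.

```java
import java.util.Optional;
import java.util.Random;
import java.util.function.Function;
import java.util.function.Predicate;

final class PropertyLoopSketch {

    // Runs `property` against `examples` generated inputs and returns the
    // first input that falsifies it, if any. The same seed replays the same inputs.
    static <T> Optional<T> firstCounterexample(long seed,
                                               int examples,
                                               Function<Random, T> generator,
                                               Predicate<T> property) {
        Random random = new Random(seed);
        for (int i = 0; i < examples; i++) {
            T input = generator.apply(random);
            if (!property.test(input))
                return Optional.of(input);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Property: every value drawn from [0, Byte.MAX_VALUE] is a non-negative byte.
        Optional<Integer> result = PropertyLoopSketch.<Integer>firstCounterexample(
            42L, 500, r -> r.nextInt(Byte.MAX_VALUE + 1), i -> i.byteValue() >= 0);
        System.out.println("counterexample present: " + result.isPresent());
    }
}
```

A library such as QuickTheories adds a good deal on top of this loop, including composable generators and the shrinking of failing inputs.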
+The `roundTripWithCorruption` function is a generalization of a unit test that 
worked similarly but for a single case. It is given an input to transform and a 
position in the transformed output to insert corruption, as well as what byte 
to write to the corrupted position. The additional arguments (the compressor 
and checksum type) are used to ensure coverage of Cassandra's various 
compression and checksumming implementations.
+private void roundTripWithCorruption(Pair<String, Integer> inputAndCorruptablePosition,
+                                     byte corruptionValue,
+                                     Compressor compressor,
+                                     ChecksumType checksum) {
+    String input = inputAndCorruptablePosition.left;
+    ByteBuf expectedBuf = Unpooled.wrappedBuffer(input.getBytes());
+    int byteToCorrupt = inputAndCorruptablePosition.right;
+    ChecksummingTransformer transformer =
+        new ChecksummingTransformer(checksum, DEFAULT_BLOCK_SIZE, compressor);
+    ByteBuf outbound = transformer.transformOutbound(expectedBuf);
+    // skip positions that fall outside the transformed output
+    if (byteToCorrupt >= outbound.writerIndex())
+        return;
+    // make sure we're actually expecting to produce some corruption
+    if (outbound.getByte(byteToCorrupt) == corruptionValue)
+        return;
+    try {
+        int oldIndex = outbound.writerIndex();
+        outbound.writerIndex(byteToCorrupt);
+        outbound.writeByte(corruptionValue);
+        outbound.writerIndex(oldIndex);
+        ByteBuf inbound = transformer.transformInbound(outbound, FLAGS);
+        // if no error was raised, the round trip must still yield the original input
+        expectedBuf.readerIndex(0);
+        Assert.assertEquals(expectedBuf, inbound);
+    } catch(ProtocolException e) {
+        // the corruption was detected, which is the expected outcome
+        return;
+    }
+}
+The remaining piece is how those arguments are generated — the arguments to 
`forAll` mentioned above. Each argument is a function that returns an input 
generator. For each example, an input is pulled from each generator and passed 
to `roundTripWithCorruption`. The `compressors()` and `checksumTypes()` generators 
aren't copied here. They can be found in the 
 and are based on built-in generator methods, provided by QuickTheories, that 
select a value from a list of values. The second argument, 
`integers().between(0, Byte.MAX_VALUE).map(Integer::byteValue)`, generates 
non-negative numbers that fit into a single byte. These numbers will be passed 
as the `corruptionValue` argument.
+The `inputWithCorruptablePosition` generator, copied below, generates strings 
to use as input to the transformation function and a position within the output 
byte stream to corrupt. Because compression prevents knowledge of the output 
size of the frame, the generator tries to choose a somewhat reasonable position 
to corrupt by limiting the choice to the size of the generated string (it's 
uncommon for compression to generate a larger string and the implementation 
discards the compressed value if it does). It also avoids corrupting the first 
two bytes of the stream, which are not covered by a checksum and therefore can 
be corrupted without being caught. The function above ensures that corruption 
is actually introduced and that corrupting a position larger than the size of 
the output does not occur.
+private Gen<Pair<String, Integer>> inputWithCorruptablePosition() {
+    // pick an input first, then a position that depends on its length
+    return inputs().flatMap(s -> integers().between(2, s.length() + 2)
+                   .map(i -> Pair.create(s, i)));
+}
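The key idea in this generator is that the second value depends on the first: the position can only be chosen once the string is known. A plain-JDK sketch of that dependency, without QuickTheories, might look like the following. The names `DependentGeneratorSketch`, `randomInput`, and `inputWithPosition` are hypothetical, and the lowercase-only alphabet is an arbitrary choice for the example rather than what `inputs()` produces.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;
import java.util.Random;

final class DependentGeneratorSketch {

    // Step 1: a random lowercase string of 1..maxLength characters.
    static String randomInput(Random random, int maxLength) {
        int length = 1 + random.nextInt(maxLength);
        StringBuilder builder = new StringBuilder(length);
        for (int i = 0; i < length; i++)
            builder.append((char) ('a' + random.nextInt(26)));
        return builder.toString();
    }

    // Step 2 depends on step 1: a position in [2, input.length() + 2], inclusive,
    // so the first two bytes are never selected.
    static Map.Entry<String, Integer> inputWithPosition(Random random, int maxLength) {
        String input = randomInput(random, maxLength);
        int position = 2 + random.nextInt(input.length() + 1);
        return new SimpleImmutableEntry<>(input, position);
    }

    public static void main(String[] args) {
        Random random = new Random(179207634899674L);
        for (int i = 0; i < 5; i++)
            System.out.println(inputWithPosition(random, 8));
    }
}
```

In QuickTheories, `flatMap` expresses the same two-step dependency while keeping the result a generator that the library can drive.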
+With all those pieces in place, running the test before the bug was fixed 
fails with the following output.
+java.lang.AssertionError: Property falsified after 2 example(s) 
+Smallest found falsifying value(s) :-
+{(c,3), 0, null, Adler32}
+Cause was :-
+java.lang.IndexOutOfBoundsException: readerIndex(10) + length(16711681) 
exceeds writerIndex(15): UnpooledHeapByteBuf(ridx: 10, widx: 15, cap: 54/54)
+    at 
+    at 
+    at io.netty.buffer.AbstractByteBuf.readBytes(
+    at 
+    at 
+    ...
+Other found falsifying value(s) :- 
+{(c,3), 0, null, CRC32}
+{(c,3), 1, null, CRC32}
+{(c,3), 9, null, CRC32}
+{(c,3), 11, null, CRC32}
+{(c,3), 36, null, CRC32}
+{(c,3), 50, null, CRC32}
+{(c,3), 74, null, CRC32}
+{(c,3), 99, null, CRC32}
+Seed was 179207634899674
+The output shows more than a single failing example. This is because 
QuickTheories, like most property-based testing libraries, comes with a 
shrinker, which performs the task of taking a failure and minimizing its 
inputs. This aids debugging because there are several small failing examples to 
look at, often with the noise removed in the process. Additionally, a seed value is 
provided so the same series of tests and failures can be generated again, 
another useful feature when debugging. In this case, the library generated an 
example that contains a single byte of input, which will corrupt the fourth 
byte in the output stream by setting it to zero, using no compression, and 
using Adler32 for checksumming. It can be seen from the other failing examples 
that using CRC32 also fails. This is due to improper calculation of the 
checksum, regardless of the algorithm. In particular, the checksum was only 
calculated over the least significant byte of each length rather than all eight 
bytes. By corrupting the fourth byte of the output stream (the first length's second-most 
significant byte not covered by the calculation), an invalid length is read and 
later used.
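This class of mistake is easy to reproduce with the JDK's own checksum classes, where `Checksum.update(int)` consumes only the low eight bits of its argument. The sketch below contrasts that call with checksumming all four bytes of an `int` length. The class and method names are invented for illustration, and whether Cassandra's code used this exact call is an assumption, not something stated in this post. The corrupted value, 16711681 (`0x00FF0001`), is the length reported in the exception above; treating 1 as the uncorrupted length is a guess based on the single-character input.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

final class LengthChecksumSketch {

    // Flawed: Checksum.update(int) uses only the low eight bits of `length`.
    static long lowByteOnly(int length) {
        CRC32 crc = new CRC32();
        crc.update(length);
        return crc.getValue();
    }

    // Covers all four bytes of the length.
    static long allBytes(int length) {
        byte[] bytes = ByteBuffer.allocate(Integer.BYTES).putInt(length).array();
        CRC32 crc = new CRC32();
        crc.update(bytes, 0, bytes.length);
        return crc.getValue();
    }

    public static void main(String[] args) {
        int original = 1;            // 0x00000001
        int corrupted = 0x00FF0001;  // same low byte, different upper byte

        // true: the flawed checksum cannot tell the two lengths apart
        System.out.println(lowByteOnly(original) == lowByteOnly(corrupted));
        // false: checksumming every byte exposes the corruption
        System.out.println(allBytes(original) == allBytes(corrupted));
    }
}
```

Because the flawed checksum still matches, the corrupted length passes verification and is only rejected later, when the buffer read runs past the end of the frame.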
+#### Where to Find More
+Property-based testing is a broad topic, much of which is not covered by this 
post. In addition to Cassandra, it has been used successfully in several places 
including [car]( [operating
+systems]( and [suppliers' 
products](, [GNOME 
Glib](, [distributed 
consensus](, and other 
[databases]( It can also be combined with 
other approaches such as fault-injection and memory leak detection. Stateful 
models can also be built to generate a series of commands instead of running 
each example on one generated set of inputs. Our goal is to evangelize this 
approach within the Cassandra developer community and encourage more testing of 
this kind as part of our work to deliver the most stable major release of 
Cassandra yet.
