blambov commented on code in PR #2836:
URL: https://github.com/apache/cassandra/pull/2836#discussion_r1371546291
##########
src/java/org/apache/cassandra/db/compaction/unified/Controller.java:
##########
@@ -75,6 +85,32 @@ public class Controller
CassandraRelevantProperties.UCS_TARGET_SSTABLE_SIZE.getSizeInBytes();
static final long MIN_TARGET_SSTABLE_SIZE = 1L << 20;
+ /**
+ * Provision for growth of the constructed SSTables as the size of the
data grows. By default the target SSTable
+ * size is fixed for all levels. In some scenarios is may be better to
reduce the overall number of SSTables when
+ * the data size becomes larger to avoid using too much memory and
processing for the corresponding structures.
+ * The setting enables such control and determines how much we reduce the
growth of the number of split points as
+ * the data size grows. The number specifies the sstable growth part, and
the difference from 1 is the shard count
+ * growth component, which is a multiplier applied to the logarithm of the
data size, before it is rounded and
+ * applied as an exponent in the number of split points. In other words,
the given value applies as a negative
+ * exponent in the calculation of the number of split points.
+ * <p>
+ * Using 0 (the default) applies no correction to the number of split
points, resulting in SSTables close to the
Review Comment:
This patch changes the default to 1/3.
Actually, could we introduce the extra machinery in one commit, and change
the default in a separate one? (Both committed with this ticket, just
separately so it's easy to roll back just one.)
##########
src/java/org/apache/cassandra/db/compaction/unified/Controller.java:
##########
@@ -202,36 +249,180 @@ public int getThreshold(int index) {
/**
* Calculate the number of shards to split the local token space in for
the given sstable density.
* This is calculated as a power-of-two multiple of baseShardCount, so
that the expected size of resulting sstables
- * is between sqrt(0.5) * targetSSTableSize and sqrt(2) *
targetSSTableSize, with a minimum of baseShardCount shards
- * for smaller sstables.
- *
+ * is between sqrt(0.5) and sqrt(2) times the target size, which is
calculated from targetSSTableSize to grow
+ * at the given sstableGrowthModifier of the exponential growth of the
density.
+ * <p>
+ * Additionally, if a minimum sstable size is set, we can go below the
baseShardCount when that would result in
+ * sstables smaller than that minimum. Note that in the case of a
non-power-of-two base count this will cause
+ * smaller sstables to not be aligned with the ones whose size is enough
for the base count.
+ * <p>
* Note that to get the sstables resulting from this splitting within the
bounds, the density argument must be
* normalized to the span that is being split. In other words, if no disks
are defined, the density should be
* scaled by the token coverage of the locally-owned ranges. If multiple
data directories are defined, the density
- * should be scaled by the token coverage of the respective data
directory. That is localDensity = size / span,
+ * should be scaled by the token coverage of the respective data
directory. That is, localDensity = size / span,
* where the span is normalized so that span = 1 when the data covers the
range that is being split.
*/
public int getNumShards(double localDensity)
{
- // How many we would have to aim for the target size. Divided by the
base shard count, so that we can ensure
- // the result is a multiple of it by multiplying back below.
- double count = localDensity / (targetSSTableSize * INVERSE_SQRT_2 *
baseShardCount);
- if (count > MAX_SHARD_SPLIT)
- count = MAX_SHARD_SPLIT;
- assert !(count < 0); // Must be positive, 0 or NaN, which should
translate to baseShardCount
-
- // Make it a power of two multiple of the base count so that split
points for lower levels remain split points
- // for higher.
- // The conversion to int and highestOneBit round down, for which we
compensate by using the sqrt(0.5) multiplier
- // applied above.
- // Setting the bottom bit to 1 ensures the result is at least
baseShardCount.
- int shards = baseShardCount * Integer.highestOneBit((int) count | 1);
- logger.debug("Shard count {} for density {}, {} times target {}",
- shards,
- FBUtilities.prettyPrintBinary(localDensity, "B", " "),
- localDensity / targetSSTableSize,
- FBUtilities.prettyPrintBinary(targetSSTableSize, "B", "
"));
- return shards;
+ int shards;
+ // Check the minimum size first.
+ long minSize = getMinSstableSizeBytes();
+ if (minSize > 0)
+ {
+ double count = localDensity / minSize;
+ // Minimum size only applies if it is smaller than the base count.
+ // Note: the minimum size cannot be larger than the target size's
minimum.
+ if (count < baseShardCount)
+ {
+ // Make it a power of two, rounding down so that sstables are
greater in size than the min.
+ // Setting the bottom bit to 1 ensures the result is at least
1.
+ shards = Integer.highestOneBit((int) count | 1);
+ // If baseShardCount is not a power of 2, split only to powers
of two that are divisors of baseShardCount so boundaries match higher levels
+ if (!isPowerOf2(baseShardCount))
+ {
+ shards = getLargestPowerOf2Divisor(baseShardCount, shards);
+ }
+ if (logger.isDebugEnabled())
+ logger.debug("Shard count {} for density {}, {} times min
size {}",
+ shards,
+ FBUtilities.prettyPrintBinary(localDensity,
"B", " "),
+ localDensity / minSize,
+ FBUtilities.prettyPrintBinary(minSize, "B", "
"));
+
+ return shards;
+ }
+ }
+
+ if (sstableGrowthModifier == 1)
+ {
+ shards = baseShardCount;
+ logger.debug("Shard count {} for density {} in fixed shards mode",
+ shards,
+ FBUtilities.prettyPrintBinary(localDensity, "B", "
"));
+ return shards;
+ }
+ else if (sstableGrowthModifier == 0)
+ {
+ // How many we would have to aim for the target size. Divided by
the base shard count, so that we can ensure
+ // the result is a multiple of it by multiplying back below.
Adjusted by sqrt(0.5) to calculate the split
+ // points needed for the minimum size.
+ double count = localDensity / (targetSSTableSize * INVERSE_SQRT_2
* baseShardCount);
+ if (count > MAX_SHARD_SPLIT)
+ count = MAX_SHARD_SPLIT;
+ assert !(count < 0); // Must be positive, 0 or NaN, which
should translate to baseShardCount
+
+ // Make it a power of two multiple of the base count so that split
points for lower levels remain split points for higher.
+ // The conversion to int and highestOneBit round down, for which
we compensate by using the sqrt(0.5) multiplier
+ // already applied in the count.
+ // Setting the bottom bit to 1 ensures the result is at least
baseShardCount.
+ shards = baseShardCount * Integer.highestOneBit((int) count | 1);
+
+ if (logger.isDebugEnabled())
+ logger.debug("Shard count {} for density {}, {} times target
{}",
+ shards,
+ FBUtilities.prettyPrintBinary(localDensity, "B",
" "),
+ localDensity / targetSSTableSize,
+ FBUtilities.prettyPrintBinary(targetSSTableSize,
"B", " "));
+ return shards;
+ }
+ else
+ {
+ // How many we would have to aim for the target size. Divided by
the base shard count, so that we can ensure
+ // the result is a multiple of it by multiplying back below.
+ double count = localDensity / (targetSSTableSize * baseShardCount);
+ // Take a logarithm of the count (in base 2), and adjust it by the
given growth modifier.
+ // Adjust by 0.5 to round the exponent so that the result falls
between targetSSTableSize * sqrt(0.5) and
+ // targetSSTableSize * sqrt(2). Finally, make sure the exponent is
at least 0 and not greater than the
+ // fixed maximum.
+ // Note: This code also works correctly for the special cases of
sstableGrowthModifier == 0 and 1,
+ // but the above code avoids the floating point arithmetic
for these common cases.
+ // Note: We use log instead of getExponent because we also need
the non-integer part of the logarithm
+ // in order to apply the growth modifier correctly.
+ final double countLog = Math.log(count);
+ double pow = countLog * INVERSE_LOG_2 * (1 -
sstableGrowthModifier) + 0.5;
+ if (pow >= MAX_SHARD_SHIFT)
+ shards = baseShardCount << MAX_SHARD_SHIFT;
+ else if (pow >= 0)
+ shards = baseShardCount << (int) pow;
+ else
+ shards = baseShardCount; // this also covers the case of
pow == NaN
+
+ if (logger.isDebugEnabled())
+ {
+ long targetSize = (long) (targetSSTableSize *
Math.exp(countLog * sstableGrowthModifier));
+ logger.debug("Shard count {} for density {}, {} times target
{}",
+ shards,
+ FBUtilities.prettyPrintBinary(localDensity, "B",
" "),
+ localDensity / targetSize,
+ FBUtilities.prettyPrintBinary(targetSize, "B", "
"));
+ }
+ return shards;
+ }
+ }
+
+ private boolean isPowerOf2(int n)
+ {
+ return (int)(Math.ceil((Math.log(n) / Math.log(2)))) ==
(int)(Math.floor(((Math.log(n) / Math.log(2)))));
Review Comment:
An easier (and more precise) way to do this is `return (x & (x-1)) == 0`.
##########
src/java/org/apache/cassandra/db/compaction/unified/Controller.java:
##########
@@ -53,6 +55,14 @@ public class Controller
private final static String DEFAULT_SCALING_PARAMETERS =
CassandraRelevantProperties.UCS_SCALING_PARAMETER.getString();
+ /**
+ * The minimum sstable size. Sharded writers split sstables over shard
only if they are at least as large as the
+ * minimum size.
+ */
+ static final String MIN_SSTABLE_SIZE_OPTION = "min_sstable_size";
+
+ static final long DEFAULT_MIN_SSTABLE_SIZE =
FBUtilities.parseHumanReadableBytes(System.getProperty(PREFIX +
MIN_SSTABLE_SIZE_OPTION, "100MiB"));
Review Comment:
All these properties must be retrieved through `CassandraRelevantProperties`
(see e.g. `DEFAULT_SCALING_PARAMETER` above).
##########
src/java/org/apache/cassandra/db/compaction/unified/Controller.java:
##########
@@ -53,6 +55,14 @@ public class Controller
private final static String DEFAULT_SCALING_PARAMETERS =
CassandraRelevantProperties.UCS_SCALING_PARAMETER.getString();
+ /**
+ * The minimum sstable size. Sharded writers split sstables over shard
only if they are at least as large as the
+ * minimum size.
+ */
+ static final String MIN_SSTABLE_SIZE_OPTION = "min_sstable_size";
Review Comment:
We currently have special code to change the base shard count to 1 for
system tables (see `DEFAULT_BASE_SHARD_COUNT` below) because they are expected
to be small. The min size makes this unnecessary; can we remove that special
case as part of the commit that changes the default min size?
##########
src/java/org/apache/cassandra/db/compaction/unified/Controller.java:
##########
@@ -202,36 +249,180 @@ public int getThreshold(int index) {
/**
* Calculate the number of shards to split the local token space in for
the given sstable density.
* This is calculated as a power-of-two multiple of baseShardCount, so
that the expected size of resulting sstables
- * is between sqrt(0.5) * targetSSTableSize and sqrt(2) *
targetSSTableSize, with a minimum of baseShardCount shards
- * for smaller sstables.
- *
+ * is between sqrt(0.5) and sqrt(2) times the target size, which is
calculated from targetSSTableSize to grow
+ * at the given sstableGrowthModifier of the exponential growth of the
density.
+ * <p>
+ * Additionally, if a minimum sstable size is set, we can go below the
baseShardCount when that would result in
+ * sstables smaller than that minimum. Note that in the case of a
non-power-of-two base count this will cause
+ * smaller sstables to not be aligned with the ones whose size is enough
for the base count.
+ * <p>
* Note that to get the sstables resulting from this splitting within the
bounds, the density argument must be
* normalized to the span that is being split. In other words, if no disks
are defined, the density should be
* scaled by the token coverage of the locally-owned ranges. If multiple
data directories are defined, the density
- * should be scaled by the token coverage of the respective data
directory. That is localDensity = size / span,
+ * should be scaled by the token coverage of the respective data
directory. That is, localDensity = size / span,
* where the span is normalized so that span = 1 when the data covers the
range that is being split.
*/
public int getNumShards(double localDensity)
{
- // How many we would have to aim for the target size. Divided by the
base shard count, so that we can ensure
- // the result is a multiple of it by multiplying back below.
- double count = localDensity / (targetSSTableSize * INVERSE_SQRT_2 *
baseShardCount);
- if (count > MAX_SHARD_SPLIT)
- count = MAX_SHARD_SPLIT;
- assert !(count < 0); // Must be positive, 0 or NaN, which should
translate to baseShardCount
-
- // Make it a power of two multiple of the base count so that split
points for lower levels remain split points
- // for higher.
- // The conversion to int and highestOneBit round down, for which we
compensate by using the sqrt(0.5) multiplier
- // applied above.
- // Setting the bottom bit to 1 ensures the result is at least
baseShardCount.
- int shards = baseShardCount * Integer.highestOneBit((int) count | 1);
- logger.debug("Shard count {} for density {}, {} times target {}",
- shards,
- FBUtilities.prettyPrintBinary(localDensity, "B", " "),
- localDensity / targetSSTableSize,
- FBUtilities.prettyPrintBinary(targetSSTableSize, "B", "
"));
- return shards;
+ int shards;
+ // Check the minimum size first.
+ long minSize = getMinSstableSizeBytes();
+ if (minSize > 0)
+ {
+ double count = localDensity / minSize;
+ // Minimum size only applies if it is smaller than the base count.
+ // Note: the minimum size cannot be larger than the target size's
minimum.
+ if (count < baseShardCount)
+ {
+ // Make it a power of two, rounding down so that sstables are
greater in size than the min.
+ // Setting the bottom bit to 1 ensures the result is at least
1.
+ shards = Integer.highestOneBit((int) count | 1);
+ // If baseShardCount is not a power of 2, split only to powers
of two that are divisors of baseShardCount so boundaries match higher levels
+ if (!isPowerOf2(baseShardCount))
+ {
+ shards = getLargestPowerOf2Divisor(baseShardCount, shards);
+ }
+ if (logger.isDebugEnabled())
+ logger.debug("Shard count {} for density {}, {} times min
size {}",
+ shards,
+ FBUtilities.prettyPrintBinary(localDensity,
"B", " "),
+ localDensity / minSize,
+ FBUtilities.prettyPrintBinary(minSize, "B", "
"));
+
+ return shards;
+ }
+ }
+
+ if (sstableGrowthModifier == 1)
+ {
+ shards = baseShardCount;
+ logger.debug("Shard count {} for density {} in fixed shards mode",
+ shards,
+ FBUtilities.prettyPrintBinary(localDensity, "B", "
"));
+ return shards;
+ }
+ else if (sstableGrowthModifier == 0)
+ {
+ // How many we would have to aim for the target size. Divided by
the base shard count, so that we can ensure
+ // the result is a multiple of it by multiplying back below.
Adjusted by sqrt(0.5) to calculate the split
+ // points needed for the minimum size.
+ double count = localDensity / (targetSSTableSize * INVERSE_SQRT_2
* baseShardCount);
+ if (count > MAX_SHARD_SPLIT)
+ count = MAX_SHARD_SPLIT;
+ assert !(count < 0); // Must be positive, 0 or NaN, which
should translate to baseShardCount
+
+ // Make it a power of two multiple of the base count so that split
points for lower levels remain split points for higher.
+ // The conversion to int and highestOneBit round down, for which
we compensate by using the sqrt(0.5) multiplier
+ // already applied in the count.
+ // Setting the bottom bit to 1 ensures the result is at least
baseShardCount.
+ shards = baseShardCount * Integer.highestOneBit((int) count | 1);
+
+ if (logger.isDebugEnabled())
+ logger.debug("Shard count {} for density {}, {} times target
{}",
+ shards,
+ FBUtilities.prettyPrintBinary(localDensity, "B",
" "),
+ localDensity / targetSSTableSize,
+ FBUtilities.prettyPrintBinary(targetSSTableSize,
"B", " "));
+ return shards;
+ }
+ else
+ {
+ // How many we would have to aim for the target size. Divided by
the base shard count, so that we can ensure
+ // the result is a multiple of it by multiplying back below.
+ double count = localDensity / (targetSSTableSize * baseShardCount);
+ // Take a logarithm of the count (in base 2), and adjust it by the
given growth modifier.
+ // Adjust by 0.5 to round the exponent so that the result falls
between targetSSTableSize * sqrt(0.5) and
+ // targetSSTableSize * sqrt(2). Finally, make sure the exponent is
at least 0 and not greater than the
+ // fixed maximum.
+ // Note: This code also works correctly for the special cases of
sstableGrowthModifier == 0 and 1,
+ // but the above code avoids the floating point arithmetic
for these common cases.
+ // Note: We use log instead of getExponent because we also need
the non-integer part of the logarithm
+ // in order to apply the growth modifier correctly.
+ final double countLog = Math.log(count);
+ double pow = countLog * INVERSE_LOG_2 * (1 -
sstableGrowthModifier) + 0.5;
+ if (pow >= MAX_SHARD_SHIFT)
+ shards = baseShardCount << MAX_SHARD_SHIFT;
+ else if (pow >= 0)
+ shards = baseShardCount << (int) pow;
+ else
+ shards = baseShardCount; // this also covers the case of
pow == NaN
+
+ if (logger.isDebugEnabled())
+ {
+ long targetSize = (long) (targetSSTableSize *
Math.exp(countLog * sstableGrowthModifier));
+ logger.debug("Shard count {} for density {}, {} times target
{}",
+ shards,
+ FBUtilities.prettyPrintBinary(localDensity, "B",
" "),
+ localDensity / targetSize,
+ FBUtilities.prettyPrintBinary(targetSize, "B", "
"));
+ }
+ return shards;
+ }
+ }
+
+ private boolean isPowerOf2(int n)
+ {
+ return (int)(Math.ceil((Math.log(n) / Math.log(2)))) ==
(int)(Math.floor(((Math.log(n) / Math.log(2)))));
+ }
+
+ /**
+ * @param n number of shards needs to be divisible by n
+ * @param shards the maximum number of shards
+ * @return the largest power of 2 that is a divisor of n (e.g. when n =
10, return 2) and less than shards
+ */
+ private int getLargestPowerOf2Divisor(int n, int shards)
Review Comment:
The largest power-of-2 divisor of `n` can be found via
`Integer.numberOfTrailingZeros(n)`.
I.e. this can be implemented as `Math.min(shards, 1 <<
Integer.numberOfTrailingZeros(n))`.
This would also work without an explicit `isPowerOf2` check.
##########
src/java/org/apache/cassandra/db/compaction/unified/Controller.java:
##########
@@ -202,36 +249,180 @@ public int getThreshold(int index) {
/**
* Calculate the number of shards to split the local token space in for
the given sstable density.
* This is calculated as a power-of-two multiple of baseShardCount, so
that the expected size of resulting sstables
- * is between sqrt(0.5) * targetSSTableSize and sqrt(2) *
targetSSTableSize, with a minimum of baseShardCount shards
- * for smaller sstables.
- *
+ * is between sqrt(0.5) and sqrt(2) times the target size, which is
calculated from targetSSTableSize to grow
+ * at the given sstableGrowthModifier of the exponential growth of the
density.
+ * <p>
+ * Additionally, if a minimum sstable size is set, we can go below the
baseShardCount when that would result in
+ * sstables smaller than that minimum. Note that in the case of a
non-power-of-two base count this will cause
+ * smaller sstables to not be aligned with the ones whose size is enough
for the base count.
+ * <p>
* Note that to get the sstables resulting from this splitting within the
bounds, the density argument must be
* normalized to the span that is being split. In other words, if no disks
are defined, the density should be
* scaled by the token coverage of the locally-owned ranges. If multiple
data directories are defined, the density
- * should be scaled by the token coverage of the respective data
directory. That is localDensity = size / span,
+ * should be scaled by the token coverage of the respective data
directory. That is, localDensity = size / span,
* where the span is normalized so that span = 1 when the data covers the
range that is being split.
*/
public int getNumShards(double localDensity)
{
- // How many we would have to aim for the target size. Divided by the
base shard count, so that we can ensure
- // the result is a multiple of it by multiplying back below.
- double count = localDensity / (targetSSTableSize * INVERSE_SQRT_2 *
baseShardCount);
- if (count > MAX_SHARD_SPLIT)
- count = MAX_SHARD_SPLIT;
- assert !(count < 0); // Must be positive, 0 or NaN, which should
translate to baseShardCount
-
- // Make it a power of two multiple of the base count so that split
points for lower levels remain split points
- // for higher.
- // The conversion to int and highestOneBit round down, for which we
compensate by using the sqrt(0.5) multiplier
- // applied above.
- // Setting the bottom bit to 1 ensures the result is at least
baseShardCount.
- int shards = baseShardCount * Integer.highestOneBit((int) count | 1);
- logger.debug("Shard count {} for density {}, {} times target {}",
- shards,
- FBUtilities.prettyPrintBinary(localDensity, "B", " "),
- localDensity / targetSSTableSize,
- FBUtilities.prettyPrintBinary(targetSSTableSize, "B", "
"));
- return shards;
+ int shards;
+ // Check the minimum size first.
+ long minSize = getMinSstableSizeBytes();
+ if (minSize > 0)
+ {
+ double count = localDensity / minSize;
+ // Minimum size only applies if it is smaller than the base count.
+ // Note: the minimum size cannot be larger than the target size's
minimum.
+ if (count < baseShardCount)
+ {
+ // Make it a power of two, rounding down so that sstables are
greater in size than the min.
+ // Setting the bottom bit to 1 ensures the result is at least
1.
+ shards = Integer.highestOneBit((int) count | 1);
+ // If baseShardCount is not a power of 2, split only to powers
of two that are divisors of baseShardCount so boundaries match higher levels
+ if (!isPowerOf2(baseShardCount))
+ {
+ shards = getLargestPowerOf2Divisor(baseShardCount, shards);
+ }
+ if (logger.isDebugEnabled())
+ logger.debug("Shard count {} for density {}, {} times min
size {}",
+ shards,
+ FBUtilities.prettyPrintBinary(localDensity,
"B", " "),
+ localDensity / minSize,
+ FBUtilities.prettyPrintBinary(minSize, "B", "
"));
+
+ return shards;
+ }
+ }
+
+ if (sstableGrowthModifier == 1)
+ {
+ shards = baseShardCount;
+ logger.debug("Shard count {} for density {} in fixed shards mode",
+ shards,
+ FBUtilities.prettyPrintBinary(localDensity, "B", "
"));
+ return shards;
+ }
+ else if (sstableGrowthModifier == 0)
+ {
+ // How many we would have to aim for the target size. Divided by
the base shard count, so that we can ensure
+ // the result is a multiple of it by multiplying back below.
Adjusted by sqrt(0.5) to calculate the split
+ // points needed for the minimum size.
+ double count = localDensity / (targetSSTableSize * INVERSE_SQRT_2
* baseShardCount);
+ if (count > MAX_SHARD_SPLIT)
+ count = MAX_SHARD_SPLIT;
+ assert !(count < 0); // Must be positive, 0 or NaN, which
should translate to baseShardCount
+
+ // Make it a power of two multiple of the base count so that split
points for lower levels remain split points for higher.
+ // The conversion to int and highestOneBit round down, for which
we compensate by using the sqrt(0.5) multiplier
+ // already applied in the count.
+ // Setting the bottom bit to 1 ensures the result is at least
baseShardCount.
+ shards = baseShardCount * Integer.highestOneBit((int) count | 1);
+
+ if (logger.isDebugEnabled())
+ logger.debug("Shard count {} for density {}, {} times target
{}",
+ shards,
+ FBUtilities.prettyPrintBinary(localDensity, "B",
" "),
+ localDensity / targetSSTableSize,
+ FBUtilities.prettyPrintBinary(targetSSTableSize,
"B", " "));
+ return shards;
+ }
+ else
+ {
+ // How many we would have to aim for the target size. Divided by
the base shard count, so that we can ensure
+ // the result is a multiple of it by multiplying back below.
+ double count = localDensity / (targetSSTableSize * baseShardCount);
+ // Take a logarithm of the count (in base 2), and adjust it by the
given growth modifier.
+ // Adjust by 0.5 to round the exponent so that the result falls
between targetSSTableSize * sqrt(0.5) and
+ // targetSSTableSize * sqrt(2). Finally, make sure the exponent is
at least 0 and not greater than the
+ // fixed maximum.
+ // Note: This code also works correctly for the special cases of
sstableGrowthModifier == 0 and 1,
+ // but the above code avoids the floating point arithmetic
for these common cases.
+ // Note: We use log instead of getExponent because we also need
the non-integer part of the logarithm
+ // in order to apply the growth modifier correctly.
+ final double countLog = Math.log(count);
+ double pow = countLog * INVERSE_LOG_2 * (1 -
sstableGrowthModifier) + 0.5;
+ if (pow >= MAX_SHARD_SHIFT)
+ shards = baseShardCount << MAX_SHARD_SHIFT;
+ else if (pow >= 0)
+ shards = baseShardCount << (int) pow;
+ else
+ shards = baseShardCount; // this also covers the case of
pow == NaN
+
+ if (logger.isDebugEnabled())
+ {
+ long targetSize = (long) (targetSSTableSize *
Math.exp(countLog * sstableGrowthModifier));
+ logger.debug("Shard count {} for density {}, {} times target
{}",
+ shards,
+ FBUtilities.prettyPrintBinary(localDensity, "B",
" "),
+ localDensity / targetSize,
+ FBUtilities.prettyPrintBinary(targetSize, "B", "
"));
+ }
+ return shards;
+ }
+ }
+
+ private boolean isPowerOf2(int n)
+ {
+ return (int)(Math.ceil((Math.log(n) / Math.log(2)))) ==
(int)(Math.floor(((Math.log(n) / Math.log(2)))));
+ }
+
+ /**
+ * @param n number of shards needs to be divisible by n
+ * @param shards the maximum number of shards
+ * @return the largest power of 2 that is a divisor of n (e.g. when n =
10, return 2) and less than shards
+ */
+ private int getLargestPowerOf2Divisor(int n, int shards)
+ {
+ int tmp = 1;
+ int largestDivisor = 1;
+ while (tmp <= shards)
+ {
+ if (n % tmp == 0)
+ {
+ largestDivisor = tmp;
+ }
+ else
+ {
+ return largestDivisor;
+ }
+ tmp = tmp * 2;
+ }
+ return largestDivisor;
+ }
+
+ /**
+ * Return the sstable size in bytes.
+ *
+ * This is either set by the user in the options or calculated by rounding
up the first flush size to 50 MB.
+ *
+ * @return the minimum sstable size in bytes.
+ */
+ public long getMinSstableSizeBytes()
+ {
+ if (minSSTableSize >= 0)
+ return minSSTableSize;
+
+ synchronized (this)
Review Comment:
We should only apply a minimum size if the user has specified it, there is
no need to fall back to the flush size when none is given.
##########
src/java/org/apache/cassandra/db/compaction/unified/Controller.java:
##########
@@ -202,36 +249,180 @@ public int getThreshold(int index) {
/**
* Calculate the number of shards to split the local token space in for
the given sstable density.
* This is calculated as a power-of-two multiple of baseShardCount, so
that the expected size of resulting sstables
- * is between sqrt(0.5) * targetSSTableSize and sqrt(2) *
targetSSTableSize, with a minimum of baseShardCount shards
- * for smaller sstables.
- *
+ * is between sqrt(0.5) and sqrt(2) times the target size, which is
calculated from targetSSTableSize to grow
+ * at the given sstableGrowthModifier of the exponential growth of the
density.
+ * <p>
+ * Additionally, if a minimum sstable size is set, we can go below the
baseShardCount when that would result in
+ * sstables smaller than that minimum. Note that in the case of a
non-power-of-two base count this will cause
+ * smaller sstables to not be aligned with the ones whose size is enough
for the base count.
+ * <p>
* Note that to get the sstables resulting from this splitting within the
bounds, the density argument must be
* normalized to the span that is being split. In other words, if no disks
are defined, the density should be
* scaled by the token coverage of the locally-owned ranges. If multiple
data directories are defined, the density
- * should be scaled by the token coverage of the respective data
directory. That is localDensity = size / span,
+ * should be scaled by the token coverage of the respective data
directory. That is, localDensity = size / span,
* where the span is normalized so that span = 1 when the data covers the
range that is being split.
*/
public int getNumShards(double localDensity)
{
- // How many we would have to aim for the target size. Divided by the
base shard count, so that we can ensure
- // the result is a multiple of it by multiplying back below.
- double count = localDensity / (targetSSTableSize * INVERSE_SQRT_2 *
baseShardCount);
- if (count > MAX_SHARD_SPLIT)
- count = MAX_SHARD_SPLIT;
- assert !(count < 0); // Must be positive, 0 or NaN, which should
translate to baseShardCount
-
- // Make it a power of two multiple of the base count so that split
points for lower levels remain split points
- // for higher.
- // The conversion to int and highestOneBit round down, for which we
compensate by using the sqrt(0.5) multiplier
- // applied above.
- // Setting the bottom bit to 1 ensures the result is at least
baseShardCount.
- int shards = baseShardCount * Integer.highestOneBit((int) count | 1);
- logger.debug("Shard count {} for density {}, {} times target {}",
- shards,
- FBUtilities.prettyPrintBinary(localDensity, "B", " "),
- localDensity / targetSSTableSize,
- FBUtilities.prettyPrintBinary(targetSSTableSize, "B", "
"));
- return shards;
+ int shards;
+ // Check the minimum size first.
+ long minSize = getMinSstableSizeBytes();
+ if (minSize > 0)
+ {
+ double count = localDensity / minSize;
+ // Minimum size only applies if it is smaller than the base count.
+ // Note: the minimum size cannot be larger than the target size's
minimum.
+ if (count < baseShardCount)
+ {
+ // Make it a power of two, rounding down so that sstables are
greater in size than the min.
+ // Setting the bottom bit to 1 ensures the result is at least
1.
+ shards = Integer.highestOneBit((int) count | 1);
+ // If baseShardCount is not a power of 2, split only to powers
of two that are divisors of baseShardCount so boundaries match higher levels
+ if (!isPowerOf2(baseShardCount))
+ {
+ shards = getLargestPowerOf2Divisor(baseShardCount, shards);
+ }
+ if (logger.isDebugEnabled())
+ logger.debug("Shard count {} for density {}, {} times min
size {}",
+ shards,
+ FBUtilities.prettyPrintBinary(localDensity,
"B", " "),
+ localDensity / minSize,
+ FBUtilities.prettyPrintBinary(minSize, "B", "
"));
+
+ return shards;
+ }
+ }
+
+ if (sstableGrowthModifier == 1)
+ {
+ shards = baseShardCount;
+ logger.debug("Shard count {} for density {} in fixed shards mode",
+ shards,
+ FBUtilities.prettyPrintBinary(localDensity, "B", "
"));
+ return shards;
+ }
+ else if (sstableGrowthModifier == 0)
+ {
+ // How many we would have to aim for the target size. Divided by
the base shard count, so that we can ensure
+ // the result is a multiple of it by multiplying back below.
Adjusted by sqrt(0.5) to calculate the split
+ // points needed for the minimum size.
+ double count = localDensity / (targetSSTableSize * INVERSE_SQRT_2
* baseShardCount);
+ if (count > MAX_SHARD_SPLIT)
+ count = MAX_SHARD_SPLIT;
+ assert !(count < 0); // Must be positive, 0 or NaN, which
should translate to baseShardCount
+
+ // Make it a power of two multiple of the base count so that split
points for lower levels remain split points for higher.
+ // The conversion to int and highestOneBit round down, for which
we compensate by using the sqrt(0.5) multiplier
+ // already applied in the count.
+ // Setting the bottom bit to 1 ensures the result is at least
baseShardCount.
+ shards = baseShardCount * Integer.highestOneBit((int) count | 1);
+
+ if (logger.isDebugEnabled())
+ logger.debug("Shard count {} for density {}, {} times target
{}",
+ shards,
+ FBUtilities.prettyPrintBinary(localDensity, "B",
" "),
+ localDensity / targetSSTableSize,
+ FBUtilities.prettyPrintBinary(targetSSTableSize,
"B", " "));
+ return shards;
+ }
+ else
+ {
+ // How many we would have to aim for the target size. Divided by
the base shard count, so that we can ensure
+ // the result is a multiple of it by multiplying back below.
+ double count = localDensity / (targetSSTableSize * baseShardCount);
+ // Take a logarithm of the count (in base 2), and adjust it by the
given growth modifier.
+ // Adjust by 0.5 to round the exponent so that the result falls
between targetSSTableSize * sqrt(0.5) and
+ // targetSSTableSize * sqrt(2). Finally, make sure the exponent is
at least 0 and not greater than the
+ // fixed maximum.
+ // Note: This code also works correctly for the special cases of
sstableGrowthModifier == 0 and 1,
+ // but the above code avoids the floating point arithmetic
for these common cases.
Review Comment:
Nit: We can add "imprecise" before "floating point arithmetic" as the loss
of precision is the main reason to avoid this path.
##########
src/java/org/apache/cassandra/db/compaction/unified/Controller.java:
##########
@@ -202,36 +249,180 @@ public int getThreshold(int index) {
/**
* Calculate the number of shards to split the local token space in for
the given sstable density.
* This is calculated as a power-of-two multiple of baseShardCount, so
that the expected size of resulting sstables
- * is between sqrt(0.5) * targetSSTableSize and sqrt(2) *
targetSSTableSize, with a minimum of baseShardCount shards
- * for smaller sstables.
- *
+ * is between sqrt(0.5) and sqrt(2) times the target size, which is
calculated from targetSSTableSize to grow
+ * at the given sstableGrowthModifier of the exponential growth of the
density.
+ * <p>
+ * Additionally, if a minimum sstable size is set, we can go below the
baseShardCount when that would result in
+ * sstables smaller than that minimum. Note that in the case of a
non-power-of-two base count this will cause
+ * smaller sstables to not be aligned with the ones whose size is enough
for the base count.
Review Comment:
This is not true as we will only split to divisors of `baseShardCount`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]