yifan-c commented on code in PR #34:
URL:
https://github.com/apache/cassandra-analytics/pull/34#discussion_r1465746084
##########
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/bulkwriter/token/RangeUtils.java:
##########
@@ -84,58 +79,78 @@ public static List<Range<BigInteger>>
split(Range<BigInteger> range, int nrSplit
{
Preconditions.checkArgument(range.lowerEndpoint().compareTo(range.upperEndpoint()) <= 0,
"RangeUtils assume ranges are not wrap-around");
+ Preconditions.checkArgument(range.lowerBoundType() == BoundType.OPEN
+ && range.upperBoundType() == BoundType.CLOSED,
+ "Input must be an open-closed range");
if (range.isEmpty())
{
return Collections.emptyList();
}
+ if (nrSplits == 1 || sizeOf(range).equals(BigInteger.ONE))
+ {
+ // no split required; exit early
+ return Collections.singletonList(range);
+ }
+
Preconditions.checkArgument(nrSplits >= 1, "nrSplits must be greater than or equal to 1");
// Make sure split size is not 0
BigInteger splitSize = sizeOf(range).divide(BigInteger.valueOf(nrSplits));
- if (splitSize.compareTo(BigInteger.ZERO) == 0)
+ boolean isTinyRange = splitSize.compareTo(BigInteger.ZERO) == 0; // a tiny range that cannot be split this many times
+ if (isTinyRange)
{
splitSize = BigInteger.ONE;
}
// Start from range lower endpoint and spit ranges of size splitSize, until we cross the range
- BigInteger nextLowerEndpoint = range.lowerBoundType() == BoundType.CLOSED
- ? range.lowerEndpoint()
- : range.lowerEndpoint().add(BigInteger.ONE);
+ BigInteger lowerEndpoint = range.lowerEndpoint();
List<Range<BigInteger>> splits = new ArrayList<>();
- while (range.contains(nextLowerEndpoint))
+ for (int i = 0; i < nrSplits; i++)
{
- BigInteger upperEndpoint = nextLowerEndpoint.add(splitSize);
- splits.add(range.intersection(Range.closedOpen(nextLowerEndpoint, upperEndpoint)));
Review Comment:
Calculating the `intersection` is not necessary. We do not want to add two
identical results from the intersection calculation.
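
One way to read the point above, as a minimal sketch rather than the code from the PR: if the final split is clamped to the range's upper endpoint, no `intersection` call is needed and no duplicate or empty split can be added. The class `RangeSplitSketch` and its `BigInteger[]` pair representation of an open-closed range are hypothetical stand-ins for `RangeUtils.split` and Guava's `Range<BigInteger>`, chosen so the example depends only on the JDK.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for RangeUtils.split; an assumption for illustration, not the PR's code.
final class RangeSplitSketch
{
    // Each returned element is {lowerExclusive, upperInclusive}, i.e. the open-closed range (lower, upper].
    static List<BigInteger[]> split(BigInteger lower, BigInteger upper, int nrSplits)
    {
        if (lower.compareTo(upper) > 0)
        {
            throw new IllegalArgumentException("wrap-around ranges are not supported");
        }
        if (nrSplits < 1)
        {
            throw new IllegalArgumentException("nrSplits must be greater than or equal to 1");
        }

        // Number of tokens contained in (lower, upper]
        BigInteger size = upper.subtract(lower);
        if (size.signum() == 0)
        {
            return Collections.emptyList();
        }

        // A tiny range cannot be split nrSplits times; fall back to splits of size 1
        BigInteger splitSize = size.divide(BigInteger.valueOf(nrSplits));
        if (splitSize.signum() == 0)
        {
            splitSize = BigInteger.ONE;
        }

        List<BigInteger[]> splits = new ArrayList<>();
        BigInteger lowerEndpoint = lower;
        for (int i = 0; i < nrSplits; i++)
        {
            BigInteger upperEndpoint = lowerEndpoint.add(splitSize);
            // Clamp the final split to the range's upper endpoint instead of intersecting
            boolean isLast = i == nrSplits - 1 || upperEndpoint.compareTo(upper) >= 0;
            if (isLast)
            {
                upperEndpoint = upper;
            }
            splits.add(new BigInteger[]{ lowerEndpoint, upperEndpoint });
            if (isLast)
            {
                break; // the range is fully covered; adding more would only duplicate
            }
            lowerEndpoint = upperEndpoint;
        }
        return splits;
    }
}
```

Under these assumptions, splitting (0, 10] three ways yields (0, 3], (3, 6] and (6, 10], with the remainder absorbed by the last split. Asking for five splits of (0, 2] yields only (0, 1] and (1, 2].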
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]