aherbert commented on code in PR #877:
URL: https://github.com/apache/commons-lang/pull/877#discussion_r845025971
##########
src/main/java/org/apache/commons/lang3/Conversion.java:
##########
@@ -447,16 +447,16 @@ public static char binaryBeMsb0ToHexDigit(boolean[] src,
int srcPos) {
src = paddedSrc;
srcPos = 0;
Review Comment:
This method appears to have been rewritten from a version that used `src`
directly to a version that pads `src` to 4 values. However after the rewrite
all the redundant length checks have not been removed. Looking at the git
history the original version also had the padding. So this is a strange legacy
bit of code from 10 years ago.
IIUC the `srcPos` is set to zero and then never changed after this point. So
its use could be removed, e.g. `srcPos + 3` is the same as `3`. Likewise all
the checks against length are not required as `paddedSrc` is length 4. So you
have removed some of them but not the rest starting at line 461.
However a further look into this finds that this method and
`binaryToHexDigit` are essentially the same method but with the endianness
inverted. One method starts at the given position for the the lsb and
increments the position up to 3 times to read the boolean bits. The other
starts at the provided lsb counting **from the end** of the input array (this
is not explicitly documented but is demonstrated in the javadoc) and decrements
up to 3 times. The later method does this by extracting the range into a padded
array of 4, the full logic of which has some issues when the provided srcPos is
not inside the array (this is not tested in the unit tests).
I think it makes more sense to remove the padding to 4 characters and leave
the length checks in place. The method should be the same as `binaryToHexDigit`
but with the indexing reversed. Thus:
```Java
boolean[] src = ...;
int srcPos = ...;
boolean[] reversed = src.clone();
ArrayUtils.reverse(reversed);
binaryToHexDigit(src, srcPos) == binaryBeMsb0ToHexDigit(reversed, srcPos);
```
I've tested the above and it seems to be the intention of the the method:
```Java
@Test
public void testBinaryToHexDigitReverse() {
SplittableRandom rng = new SplittableRandom();
boolean[] x = new boolean[8];
for (int i = 0; i < 100; i++) {
Conversion.longToBinary(rng.nextLong(), 0, x, 0, 8);
for (int j = 1; j <= 8; j++) {
boolean[] a = Arrays.copyOf(x, j);
boolean[] b = a.clone();
ArrayUtils.reverse(b);
for (int k = 0; k < j; k++) {
Assertions.assertEquals(Conversion.binaryToHexDigit(a, k),
Conversion.binaryBeMsb0ToHexDigit(b,
k));
}
}
}
}
```
The unit tests for these methods do not cover using a bad input `srcPos`
outside the array. In the case of `binaryToHexDigit` it would throw an IOOBE
when using `src[srcPos]`. In the case of `binaryBeMsb0ToHexDigit` the padding
logic may throw an exception but I have not checked.
I would update both methods with a check that `srcPos` is inside the input
array, something like the `Objects.checkIndex` method from JDK 9:
```Java
if (Integer.compareUnsigned(srcPos, src.length) >= 0) {
throw new IndexOutOfBoundsException(srcPos + " is not within array
length " + src.length);
}
```
This would make the check on `src.length` redundant as 0 is not within an
array of length 0.
The updated method becomes:
```Java
public static char binaryBeMsb0ToHexDigit(boolean[] src, int srcPos) {
if (src.length == 0) {
throw new IllegalArgumentException("Cannot convert an empty array.");
}
// JDK 9: Objects.checkIndex(int index, int length)
if (Integer.compareUnsigned(srcPos, src.length) >= 0) {
throw new IndexOutOfBoundsException(srcPos + " is not within array
length " + src.length);
}
// Little-endian bit 0 position
final int pos = src.length - 1 - srcPos;
if (3 <= pos && src[pos - 3]) {
if (src[pos - 2]) {
if (src[pos - 1]) {
return src[pos] ? 'f' : 'e';
}
return src[pos] ? 'd' : 'c';
}
if (src[pos - 1]) {
return src[pos] ? 'b' : 'a';
}
return src[pos] ? '9' : '8';
}
if (2 <= pos && src[pos - 2]) {
if (src[pos - 1]) {
return src[pos] ? '7' : '6';
}
return src[pos] ? '5' : '4';
}
if (1 <= pos && src[pos - 1]) {
return src[pos] ? '3' : '2';
}
return src[pos] ? '1' : '0';
```
This passes the existing unit tests and the extra test for bit reversal
shown above. The check for a zero length array is redundant with the index test
but I left it in as it would be a breaking change. The existing unit tests do
expect an IllegalArgumentException for this case.
It has the advantage that it is clearly the same as `binaryToHexDigit` but
working from the end of the array backwards, i.e. the input array is big-endian
most-significant-byte at position 0 (BeMsb0)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]