aherbert commented on code in PR #877:
URL: https://github.com/apache/commons-lang/pull/877#discussion_r845025971


##########
src/main/java/org/apache/commons/lang3/Conversion.java:
##########
@@ -447,16 +447,16 @@ public static char binaryBeMsb0ToHexDigit(boolean[] src, 
int srcPos) {
         src = paddedSrc;
         srcPos = 0;

Review Comment:
   This method appears to have been rewritten from a version that used `src` 
directly to a version that pads `src` to 4 values. However after the rewrite 
all the redundant length checks have not been removed. Looking at the git 
history the original version also had the padding. So this is a strange legacy 
bit of code from 10 years ago.
   
   IIUC the `srcPos` is set to zero and then never changed after this point. So 
its use could be removed, e.g. `srcPos + 3` is the same as `3`. Likewise all 
the checks against length are not required as `paddedSrc` is length 4. So you 
have removed some of them but not the rest starting at line 461.
   
   However a further look into this finds that this method and 
`binaryToHexDigit` are essentially the same method but with the endianness 
inverted. One method starts at the given position for the the lsb and 
increments the position up to 3 times to read the boolean bits. The other 
starts at the provided lsb counting **from the end** of the input array (this 
is not explicitly documented but is demonstrated in the javadoc) and decrements 
up to 3 times. The later method does this by extracting the range into a padded 
array of 4, the full logic of which has some issues when the provided srcPos is 
not inside the array (this is not tested in the unit tests).
   
   I think it makes more sense to remove the padding to 4 characters and leave 
the length checks in place. The method should be the same as `binaryToHexDigit` 
but with the indexing reversed. Thus:
   ```Java
   boolean[] src = ...;
   int srcPos = ...;
   boolean[] reversed = src.clone();
   ArrayUtils.reverse(reversed);
   binaryToHexDigit(src, srcPos) == binaryBeMsb0ToHexDigit(reversed, srcPos);
   ```
   I've tested the above and it seems to be the intention of the the method:
   ```Java
   @Test
   public void testBinaryToHexDigitReverse() {
       SplittableRandom rng = new SplittableRandom();
       boolean[] x = new boolean[8];
       for (int i = 0; i < 100; i++) {
           Conversion.longToBinary(rng.nextLong(), 0, x, 0, 8);
           for (int j = 1; j <= 8; j++) {
               boolean[] a = Arrays.copyOf(x, j);
               boolean[] b = a.clone();
               ArrayUtils.reverse(b);
               for (int k = 0; k < j; k++) {
                   Assertions.assertEquals(Conversion.binaryToHexDigit(a, k), 
                                           Conversion.binaryBeMsb0ToHexDigit(b, 
k));
               }
           }
       }
   }
   ```
   
   The unit tests for these methods do not cover using a bad input `srcPos` 
outside the array. In the case of `binaryToHexDigit` it would throw an IOOBE 
when using `src[srcPos]`. In the case of `binaryBeMsb0ToHexDigit` the padding 
logic may throw an exception but I have not checked.
   
   I would update both methods with a check that `srcPos` is inside the input 
array, something like the `Objects.checkIndex` method from JDK 9:
   ```Java
         if (Integer.compareUnsigned(srcPos, src.length) >= 0) {
           throw new IndexOutOfBoundsException(srcPos + " is not within array 
length " + src.length);
         }
   ```
   This would make the check on `src.length` redundant as 0 is not within an 
array of length 0.
   
   The updated method becomes:
   ```Java
   public static char binaryBeMsb0ToHexDigit(boolean[] src, int srcPos) {
       if (src.length == 0) {
           throw new IllegalArgumentException("Cannot convert an empty array.");
       }
       // JDK 9: Objects.checkIndex(int index, int length)
       if (Integer.compareUnsigned(srcPos, src.length) >= 0) {
           throw new IndexOutOfBoundsException(srcPos + " is not within array 
length " + src.length);
       }
       // Little-endian bit 0 position
       final int pos = src.length - 1 - srcPos;
       if (3 <= pos && src[pos - 3]) {
           if (src[pos - 2]) {
               if (src[pos - 1]) {
                   return src[pos] ? 'f' : 'e';
               }
               return src[pos] ? 'd' : 'c';
           }
           if (src[pos - 1]) {
               return src[pos] ? 'b' : 'a';
           }
           return src[pos] ? '9' : '8';
       }
       if (2 <= pos && src[pos - 2]) {
           if (src[pos - 1]) {
               return src[pos] ? '7' : '6';
           }
           return src[pos] ? '5' : '4';
       }
       if (1 <= pos && src[pos - 1]) {
           return src[pos] ? '3' : '2';
       }
       return src[pos] ? '1' : '0';
   ```
   
   This passes the existing unit tests and the extra test for bit reversal 
shown above. The check for a zero length array is redundant with the index test 
but I left it in as it would be a breaking change. The existing unit tests do 
expect an IllegalArgumentException for this case.
   
   It has the advantage that it is clearly the same as `binaryToHexDigit` but 
working from the end of the array backwards, i.e. the input array is big-endian 
most-significant-byte at position 0 (BeMsb0)
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to