rdblue commented on a change in pull request #828:
URL: https://github.com/apache/iceberg/pull/828#discussion_r434854008



##########
File path: spark/src/test/java/org/apache/iceberg/spark/data/RandomData.java
##########
@@ -442,4 +474,226 @@ private static BigInteger randomUnscaled(int precision, 
Random random) {
 
     return new BigInteger(sb.toString());
   }
+
+  private static class DictionaryEncodedDataGenerator extends 
RandomDataGenerator {
+
+    private DictionaryEncodedDataGenerator(Schema schema, long seed) {
+      super(schema, seed);
+    }
+
+    @Override
+    public Object primitive(Type.PrimitiveType primitive) {
+      Object result = generateDictionaryEncodablePrimitive(primitive, random);
+      return super.getPrimitive(primitive, result);
+    }
+
+    @SuppressWarnings("checkstyle:CyclomaticComplexity")

Review comment:
       This implementation seems over-complicated with a lot of unnecessary 
`switch` statements. I think checkstyle was right.
   
   I think you can simplify most of the implementations for types by just 
converting the choice variable (maybe name it better?)
   
   ```java
       @Override
       protected Object randomValue(Type.PrimitiveType primitive, Random 
random) {
         // 3 choices
         int choice = random.nextInt(3);
         switch (primitive.typeId()) {
           case BOOLEAN:
             return true; // doesn't really matter for booleans since they are 
not dictionary encoded
           case INTEGER:
           case DATE:
             return choice;
           case FLOAT:
             return (float) choice;
           case DOUBLE:
             return (double) choice;
           case LONG:
           case TIME:
           case TIMESTAMP:
             return (long) choice;
           case STRING:
             return UTF8String.fromString(String.valueOf(choice));
           case FIXED:
             byte[] fixed = new byte[((Types.FixedType) primitive).length()];
             Arrays.fill(fixed, (byte) choice);
             return fixed;
           case BINARY:
             byte[] binary = new byte[choice + 1];
             Arrays.fill(binary, (byte) choice);
             return binary;
           case DECIMAL:
             Types.DecimalType type = (Types.DecimalType) primitive;
             BigInteger unscaled = new BigInteger(String.valueOf(choice + 1));
             return Decimal.apply(new BigDecimal(unscaled, type.scale()));
           default:
             throw new IllegalArgumentException(
                 "Cannot generate random value for unknown type: " + primitive);
         }
       }
   ```
   
   This also uses `Arrays.fill` like I suggested below and updates binary to 
test different lengths.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to