Dan Burkert has posted comments on this change.

Change subject: KUDU-1065: [java client] Flexible Partition Pruning

Patch Set 1:



Line 311:    * Not for public consumption: use either predicates or primary key 
bounds instead.
> If Impala is no longer using these two, should we just remove them?
I tried that, but they are used internally.  I did move them from public to 
package private, so they shouldn't be accessible.


Line 142:   /** The partition pruner */
> Not a useful comment, just drop it.

PS1, Line 260:     if (!table.getPartitionSchema().isSimpleRangePartitioning() 
             :         (startPrimaryKey.length != 0 ||
             :          endPrimaryKey.length != 0) &&
             :         PARTITION_PRUNE_WARN.getAndSet(false)) {
             :       LOG.warn("Starting full table scan. " +
             :                "In the future this scan may be automatically 
optimized with partition pruning.");
> Is this still relevant?

File java/kudu-client/src/main/java/org/apache/kudu/client/KeyEncoder.java:

Line 348
> This is still a parameter though.

Line 31: import java.io.ByteArrayOutputStream;
> Not used anymore?

Line 45:   private final ByteVec buf = ByteVec.create();
> Not really seeing the point of this class member given that the first thing
I agree, I was just trying to preserve the existing structure of KeyEncoder.  
I'll switch it over to be a non-instantiable utility class.  I don't think 
preserving ByteVec's is particularly important for performance, it should be a 
very shortlived object.  In fact there are a lot of cases where a KeyEncoder 
was getting constructed just to call one of these methods once, so I think it 
will be a total wash.

Line 63:   public static int getHashBucket(PartialRow row, HashBucketSchema 
hashSchema) {
> Javadoc?

File java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java:

Line 328:           List <LocatedTablet> newTablets = table.getTabletsLocations(
> Nit: List<LocatedTablet>

File java/kudu-client/src/main/java/org/apache/kudu/client/PartitionPruner.java:

PS1, Line 292: { break; }
> Nit: why the braces if the break is on the same line as the condition?

PS1, Line 360: Step 1:
> In this function there's only Step 1.

Line 398:   private static byte[] pushPredicatesIntoUpperBoundRangeKey(Schema 
> I wonder if these push methods would be clearer if combined with a bool (or
I don't disagree, but these are pretty straight translations of existing 
methods in the C++ side.  I think keeping the methods in-sync is more important 
than anything, since this is some pretty dense code.  They are different in 
pretty subtle ways.  It is true that they are always called as a pair, though.

Line 433:         throw new IllegalArgumentException("unexpected predicate type 
can not be pushed into key");
> Nit: can you add the predicate as you did in pushPredicatesIntoLowerBoundRa

File java/kudu-client/src/main/java/org/apache/kudu/util/ByteVec.java:

PS1, Line 60:   /**
            :    * Creates a new vector.
            :    * @return the new vector.
            :    */
            :   public static ByteVec create() {
            :     return new ByteVec(DEFAULT_CAPACITY);
            :   }
            :   /**
            :    * Creates a new vector with the specified capacity.
            :    * @param capacity the initial capacity of the vector
            :    * @return a new vector with the specified capacity
            :    */
            :   public static ByteVec withCapacity(int capacity) {
            :     return new ByteVec(capacity);
            :   }
            :   /**
            :    * Wrap an existing array with a vector.
            :    * The array should not be modified after this call.
            :    * @param data the initial data for the vector
            :    * @return a vector wrapping the data
            :    */
            :   public static ByteVec wrap(byte[] data) {
            :     return new ByteVec(data);
            :   }
> You find these more useful than exposing the constructors directly? You cou
Yah I'm a fan of static ctor functions.  I can change it if you prefer proper 
ctors, though.

Line 115:   public void reserve(int additional) {
> Seems like the first two checks can be removed; they are duplicated in rese
The checks are necessary here because 'additional' is transformed before being 
passed in.  I've changed this to not call into reserveExact.

Line 116:     if (additional < 0) throw new IllegalArgumentException("negative 
> Nit: how about Preconditions.checkArgument(additional > 0) instead? Elsewhe

PS1, Line 130: len
> Shouldn't this be 'additional'?

PS1, Line 131:  >
> Shouldn't this be >= ?

Line 177:     if (index >= len) throw new IndexOutOfBoundsException();
> May want to include a helpful message in the exception.

PS1, Line 178: data[index] = value;
> Won't this do the bounds check for you?
no, the bounds check is against len, not data.length.

PS1, Line 181:   /**
             :    * Appends the bytes from another byte array to this vec.
             :    * @param values the values to append
             :    * @param offset the offset to append from
             :    * @param len the number of bytes to append
             :    */
> Nit: reword a bit to make it clear that we're appending some subset of the 

Line 194:    * Appends the bytes from another byte array to this vec.
> Nit: likewise, here you should say something like "Appends all the bytes...

PS1, Line 216:     if (index >= len) throw new IndexOutOfBoundsException();
             :     return data[index];
> Nit: add a helpful message, or let Java do the bounds check for you.

PS1, Line 229:    * The vector should not be concurrently modified while the 
iterator is in use.
> Will a CME be thrown if someone modifies during iteration?

Line 232:   public Iterator iterator() {
> How about having ByteVec inherit Iterable<Byte> and then have the nested It
The whole point of this was that it's an iterator of primitive bytes, not Bytes 
(and asList could be used to get that functionality).  I'm actually going to 
remove this since it's not used, and I can't see why it would ever be.  It was 
important in kudu-ts where this class was originally lifted from.

Line 281:   public int hashCode() {
> Maybe use Guava's Objects.hashCode(...) to simplify?
that doesn't handle arrays properly.

PS1, Line 290:     ByteVec clone = new ByteVec(0);
             :     clone.data = Arrays.copyOf(data, data.length);
             :     clone.len = len;
             :     return clone;
> How about:


> Copyright

Line 17:   @Test
> Can you reorder the contents of this class so that the "helper" methods pre

File java/kudu-client/src/test/java/org/apache/kudu/util/TestByteVec.java:

Line 47:   private void checkByteVec(List<Byte> vals) {
> How about testing wrap or withCapacity?

PS1, Line 55: assertEquals(vals, vec.asList());
> This is going to rely on the correctness of ByteVec.equals(), but presumabl
No, this is actually calling into the equals of whatever type of list asList 

PS1, Line 66: copy.truncate(vals.size() / 2);
> For symmetry with the assertEquals() you just called, perhaps assert that v

File java/kudu-client/src/test/resources/flags:

Line 1
> Sneaky sneaky. At least mention it in your commit description?

To view, visit http://gerrit.cloudera.org:8080/4299
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib27b54841d87cf854175ab8cdfa8798b337d71f9
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <d...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <d...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-HasComments: Yes

Reply via email to