Attila Bukor has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14050 )

Change subject: KUDU-1938 [java] Add support for CHAR/VARCHAR pt 4
......................................................................


Patch Set 7:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/14050/2/java/kudu-client/src/main/java/org/apache/kudu/ColumnSchema.java
File java/kudu-client/src/main/java/org/apache/kudu/ColumnSchema.java:

http://gerrit.cloudera.org:8080/#/c/14050/2/java/kudu-client/src/main/java/org/apache/kudu/ColumnSchema.java@121
PS2, Line 121:     this.typeSize = type.getSize(typeAttributes);
             :     this.wireType = wireType;
             :     this.comment = comment;
             :   }
             :
             :   /**
             :    * Get the column's Type
             :    * @return the type
             :    */
             :   public Type getType() {
             :     return type;
             :   }
             :
             :   /**
             :    * Get the column's name
             :    * @return A string representation of the name
             :    */
> hmm, I can't remember if we do anything special for Decimal. I thought the
I couldn't find any validation in your Decimal patch and I checked the 
server-side and don't see anything there. There wasn't anything besides basic 
validations like authz, nullability that aren't type-specific.


http://gerrit.cloudera.org:8080/#/c/14050/2/java/kudu-client/src/main/java/org/apache/kudu/client/PartialRow.java
File java/kudu-client/src/main/java/org/apache/kudu/client/PartialRow.java:

http://gerrit.cloudera.org:8080/#/c/14050/2/java/kudu-client/src/main/java/org/apache/kudu/client/PartialRow.java@642
PS2, Line 642:    *
> I think with the name change it's clear enough.
Done


http://gerrit.cloudera.org:8080/#/c/14050/4/java/kudu-client/src/main/java/org/apache/kudu/client/PartialRow.java
File java/kudu-client/src/main/java/org/apache/kudu/client/PartialRow.java:

http://gerrit.cloudera.org:8080/#/c/14050/4/java/kudu-client/src/main/java/org/apache/kudu/client/PartialRow.java@692
PS4, Line 692:    * @param val value to add
> If KeyEncoder can use the string version I think that makes sense to conver
Done


http://gerrit.cloudera.org:8080/#/c/14050/7/java/kudu-client/src/main/java/org/apache/kudu/client/PartialRow.java
File java/kudu-client/src/main/java/org/apache/kudu/client/PartialRow.java:

http://gerrit.cloudera.org:8080/#/c/14050/7/java/kudu-client/src/main/java/org/apache/kudu/client/PartialRow.java@762
PS7, Line 762:       String substring = val.substring(0, length);
> Is it technically correct to substring by characters while the string is st
I think so. If we converted to bytes first, we'd have to do the same kind of 
manual byte juggling that we do in char_utils.cc for the C++ version of 
KuduPartialRow.

We do the truncation with String.substring() which uses characters instead of 
bytes, so we should be fine there. Then we get the byte representation with 
Bytes.fromString() which returns String.getBytes(CharsetUtil.UTF_8) so the 
return byte[] is already converted to the UTF-8 representation of the string.

Then we check if the string is a valid UTF8 string with checkUtf8 - to be 
honest this might be unnecessary as the byte[] array returned should always be 
UTF-8 encoded. I'm not 100% sure i's unnecessary though and I couldn't figure 
out a way to test it, based on what I found on the topic, the characters that 
can be encoded with UTF-8 and UTF-16 are the same (Unicode), only the 
representation differs - which on the other hand makes it more likely that the 
checkUTF8 stuff is unnecessary. Maybe I should remove it?

Please double check my thought process.


http://gerrit.cloudera.org:8080/#/c/14050/7/java/kudu-client/src/main/java/org/apache/kudu/client/PartialRow.java@764
PS7, Line 764:         substring = substring.replaceAll(" +$", "");
> What happens in the case where a CHAR column has trailing spaces to so that
We already truncated everything beyond max length and we still need to remove 
leftover trailing spaces. For example if we have a CHAR(5) and the input is "A  
      A", it should be truncated to "A". If we move this replaceAll above the 
val.substring() line, we would get "A    " instead.


http://gerrit.cloudera.org:8080/#/c/14050/7/java/kudu-client/src/main/java/org/apache/kudu/client/PartialRow.java@769
PS7, Line 769:     } else {
> This else statement  is the same as the end the if statement. Can this be r
Done



--
To view, visit http://gerrit.cloudera.org:8080/14050
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I03edf5e65409e895512d5cd81a607180632e8995
Gerrit-Change-Number: 14050
Gerrit-PatchSet: 7
Gerrit-Owner: Attila Bukor <abu...@apache.org>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <abu...@apache.org>
Gerrit-Reviewer: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Fri, 16 Aug 2019 20:31:31 +0000
Gerrit-HasComments: Yes

Reply via email to