Ian Maxon has posted comments on this change.

Change subject: ASTERIXDB-1102: VarSize Encoding to store length of String and 
ByteArray
......................................................................


Patch Set 5: Code-Review+1

(8 comments)

Few thoughts... very cool though overall. I should try this on the data that 
caused this to be raised in the first place, hopefully it fails in a different 
way now ;)

https://asterix-gerrit.ics.uci.edu/#/c/449/5/hyracks/hyracks-data/hyracks-data-std/src/main/java/org/apache/hyracks/data/std/primitive/UTF8StringPointable.java
File 
hyracks/hyracks-data/hyracks-data-std/src/main/java/org/apache/hyracks/data/std/primitive/UTF8StringPointable.java:

Line 65:     public static UTF8StringPointable generateUTF8Pointable(String 
testString) {
Is there a reason here this parameter is named testString?


Line 316:             GrowableArray out) throws IOException {
I agree with your comments here. If we throw an RTE when this happens, is it 
breaking things? If upstream things are relying on undefined behavior we should 
take note of it while we have the chance.


https://asterix-gerrit.ics.uci.edu/#/c/449/5/hyracks/hyracks-dataflow-common/src/main/java/org/apache/hyracks/dataflow/common/data/marshalling/UTF8StringSerializerDeserializer.java
File 
hyracks/hyracks-dataflow-common/src/main/java/org/apache/hyracks/dataflow/common/data/marshalling/UTF8StringSerializerDeserializer.java:

Line 35: 
Ah ok, I think this was the Singleton thing you were referring to. Is there a 
reason here to have this method, then? Maybe I'm being dense but from just 
looking at it, I don't see the point of having a method that just calls the 
default constructor, if this is suppose to come out of a Factory instead.


https://asterix-gerrit.ics.uci.edu/#/c/449/5/hyracks/hyracks-util/src/main/java/org/apache/hyracks/util/bytes/Base64Parser.java
File 
hyracks/hyracks-util/src/main/java/org/apache/hyracks/util/bytes/Base64Parser.java:

Line 23: 
So one question I have, is that is there no allocation-free library we might be 
able to use for this instead of maintaining our own base64 conversions? An 
example is what we do with UUIDs, though I think that is maybe a bad one since 
it isn't allocation free...


Line 106:      */
I guess I can't think of a clean way to avoid having this method duplicate code 
off the top of my head, but it kind of feels like there might be? I can think 
of some ugly things that involved boxed numeric types but obviously we want to 
avoid that...


https://asterix-gerrit.ics.uci.edu/#/c/449/5/hyracks/hyracks-util/src/main/java/org/apache/hyracks/util/encoding/VarLenIntEncoderDecoder.java
File 
hyracks/hyracks-util/src/main/java/org/apache/hyracks/util/encoding/VarLenIntEncoderDecoder.java:

Line 133: 
Is there a reason this is commented out? Is it just not used?


https://asterix-gerrit.ics.uci.edu/#/c/449/5/hyracks/hyracks-util/src/main/java/org/apache/hyracks/util/string/UTF8StringUtil.java
File 
hyracks/hyracks-util/src/main/java/org/apache/hyracks/util/string/UTF8StringUtil.java:

Line 27: 
Note from discussion: This is really just migrated from asterix-fuzzyjoin and 
hyracks-storage-am-invertedindex


Line 143:         // a little speedup for ascii chars
>From discussion, let's just cut this and always return Character.toLowerCase();


-- 
To view, visit https://asterix-gerrit.ics.uci.edu/449
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7e95df0f06984b784ebac2c84b97e56a50207d27
Gerrit-PatchSet: 5
Gerrit-Project: hyracks
Gerrit-Branch: master
Gerrit-Owner: Jianfeng Jia <[email protected]>
Gerrit-Reviewer: Ian Maxon <[email protected]>
Gerrit-Reviewer: Jenkins <[email protected]>
Gerrit-Reviewer: Preston Carman <[email protected]>
Gerrit-Reviewer: Taewoo Kim <[email protected]>
Gerrit-Reviewer: Till Westmann <[email protected]>
Gerrit-HasComments: Yes

Reply via email to