Preston Carman has posted comments on this change. Change subject: ASTERIXDB-1102: VarSize Encoding to store length of String and ByteArray ......................................................................
Patch Set 7: (4 comments) Here are a few of my comments. Also have the same static question I posted on asterix code review. Lets talk before you do any huge changes. https://asterix-gerrit.ics.uci.edu/#/c/449/7/hyracks/hyracks-data/hyracks-data-std/src/main/java/org/apache/hyracks/data/std/primitive/UTF8StringLowercasePointable.java File hyracks/hyracks-data/hyracks-data-std/src/main/java/org/apache/hyracks/data/std/primitive/UTF8StringLowercasePointable.java: Line 29: public final class UTF8StringLowercasePointable extends AbstractPointable implements IHashable, IComparable { Why do we need a new type of pointable for special case of strings? https://asterix-gerrit.ics.uci.edu/#/c/449/7/hyracks/hyracks-data/hyracks-data-std/src/main/java/org/apache/hyracks/data/std/primitive/UTF8StringPointable.java File hyracks/hyracks-data/hyracks-data-std/src/main/java/org/apache/hyracks/data/std/primitive/UTF8StringPointable.java: Line 177: public int ignoreCaseCompareTo(UTF8StringPointable other) { Do you think these helper functions are common between all users of Hyracks? Is this the right place for these functions? (just a question) https://asterix-gerrit.ics.uci.edu/#/c/449/7/hyracks/hyracks-storage-am-lsm-invertedindex/src/main/java/org/apache/hyracks/storage/am/lsm/invertedindex/tokenizers/AbstractUTF8StringBinaryTokenizer.java File hyracks/hyracks-storage-am-lsm-invertedindex/src/main/java/org/apache/hyracks/storage/am/lsm/invertedindex/tokenizers/AbstractUTF8StringBinaryTokenizer.java: Line 59: //TODO: This UTF8Tokenizer strongly rely on the Asterix data format, I think this would be a requirement to share the use of this function. At least for removing the tag for the data type, I think the length may be useful to keep the UTF8StringPointable format. https://asterix-gerrit.ics.uci.edu/#/c/449/7/hyracks/hyracks-util/src/main/java/org/apache/hyracks/util/encoding/VarLenIntEncoderDecoder.java File hyracks/hyracks-util/src/main/java/org/apache/hyracks/util/encoding/VarLenIntEncoderDecoder.java: Line 136: * public void encode(int val) { while(val > ENCODE_MASK) { bytes[pos++] = Remove if unused. (also has tabs.) -- To view, visit https://asterix-gerrit.ics.uci.edu/449 To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7e95df0f06984b784ebac2c84b97e56a50207d27 Gerrit-PatchSet: 7 Gerrit-Project: hyracks Gerrit-Branch: master Gerrit-Owner: Jianfeng Jia <[email protected]> Gerrit-Reviewer: Ian Maxon <[email protected]> Gerrit-Reviewer: Jenkins <[email protected]> Gerrit-Reviewer: Jianfeng Jia <[email protected]> Gerrit-Reviewer: Preston Carman <[email protected]> Gerrit-Reviewer: Taewoo Kim <[email protected]> Gerrit-Reviewer: Till Westmann <[email protected]> Gerrit-HasComments: Yes
