Jianfeng Jia has posted comments on this change. Change subject: ASTERIXDB-1102: VarSize Encoding to store length of String and ByteArray ......................................................................
Patch Set 4: (10 comments) https://asterix-gerrit.ics.uci.edu/#/c/450/4/asterix-app/src/main/java/org/apache/asterix/file/ExternalIndexingOperations.java File asterix-app/src/main/java/org/apache/asterix/file/ExternalIndexingOperations.java: Line 238: FilesIndexDescription filesIndexDescription = new FilesIndexDescription(); > Why change from a static reference? Because this FileIndexDescription now contains the non-static serializer/deserializer. So we need to use it as a Factory pattern. https://asterix-gerrit.ics.uci.edu/#/c/450/4/asterix-app/src/test/resources/runtimets/only.xml File asterix-app/src/test/resources/runtimets/only.xml: Line 1: <?xml version="1.0" encoding="UTF-8" standalone="yes"?> > Keep the license header. Done https://asterix-gerrit.ics.uci.edu/#/c/450/4/asterix-app/src/test/resources/runtimets/queries/boolean/and_01/and_01.3.query.aql File asterix-app/src/test/resources/runtimets/queries/boolean/and_01/and_01.3.query.aql: Line 21: let $x := boolean("true") > Are these queries testing the same thing? By specifying boolean(), it seems Yes, they are not doing the same thing. The previous test is wrong. It was supposed to test boolean, but it was testing string instead. https://asterix-gerrit.ics.uci.edu/#/c/450/4/asterix-app/src/test/resources/runtimets/queries/string/matches11/matches11.3.query.aql File asterix-app/src/test/resources/runtimets/queries/string/matches11/matches11.3.query.aql: Line 28: matches("hello",null), > Does this test the null issue that was raised on the mailing list? Is this Yes, it is related to the null-element-list issue. I add the null to work around that issue and make the matches test work. https://asterix-gerrit.ics.uci.edu/#/c/450/4/asterix-app/src/test/resources/runtimets/queries/string/startwith02/startwith02.3.query.aql File asterix-app/src/test/resources/runtimets/queries/string/startwith02/startwith02.3.query.aql: Line 27: null, > Same question about null. Yes, I add this null to walk around that null-element-list issue which break the expected test result. https://asterix-gerrit.ics.uci.edu/#/c/450/4/asterix-external-data/src/main/java/org/apache/asterix/external/indexing/dataflow/FileIndexTupleTranslator.java File asterix-external-data/src/main/java/org/apache/asterix/external/indexing/dataflow/FileIndexTupleTranslator.java: Line 44: private final FilesIndexDescription filesIndexDescription = new FilesIndexDescription(); > why change from static? Because some of its members are not static anymore. https://asterix-gerrit.ics.uci.edu/#/c/450/4/asterix-external-data/src/main/java/org/apache/asterix/external/library/java/JObjectAccessors.java File asterix-external-data/src/main/java/org/apache/asterix/external/library/java/JObjectAccessors.java: Line 241: //v = new String(b, s+1, l, "UTF-8"); > Can this comment be removed? Done https://asterix-gerrit.ics.uci.edu/#/c/450/4/asterix-external-data/src/main/java/org/apache/asterix/external/library/java/JObjectUtil.java File asterix-external-data/src/main/java/org/apache/asterix/external/library/java/JObjectUtil.java: Line 395: fieldNames[i] = new AStringSerializerDeserializer().deserialize(dis).getStringValue(); > Why change? The long answer is in the summary comments. Short answer for this, the AStringSerializerDeserializer now contains the stateful member. It is not thread-safe to using the Singleton. https://asterix-gerrit.ics.uci.edu/#/c/450/4/asterix-om/src/main/java/org/apache/asterix/dataflow/data/nontagged/serde/AStringSerializerDeserializer.java File asterix-om/src/main/java/org/apache/asterix/dataflow/data/nontagged/serde/AStringSerializerDeserializer.java: Line 34: private final UTF8StringSerializerDeserializer utf8SerDer = new UTF8StringSerializerDeserializer(); > Can this be kept static? Seems like that would be a good idea. Yes, it can be still static, but it will generate objects for sure. The previous Java outputstream will generate new String during every put and get, which is a lot... Or, an alternate way is to let the caller handle the object creation, which will make the code a lot messier. Now, since we have our own serializer we can control the object generation to only once to parse millions of records. The "down" side is it isn't static method anymore because we want to save the object creation during the different calls. Actually, if we stick to factory pattern it won't be any issues. However, these INSTANCE-like design introduces the singleton pattern, which actually doesn't compatible with the factory pattern. I think the source of problem is this singleton pattern, not the static method. So I removed the INSTANCE to let the caller use it as a factory way( initials once, uses many times). https://asterix-gerrit.ics.uci.edu/#/c/450/4/asterix-runtime/src/main/java/org/apache/asterix/runtime/evaluators/functions/records/RecordFieldsUtil.java File asterix-runtime/src/main/java/org/apache/asterix/runtime/evaluators/functions/records/RecordFieldsUtil.java: Line 80: @SuppressWarnings("unchecked") > Why not? Aha, this is the only place that initialize the Serder statically. Since now the Serializer/Der have the states, it not thread-safe anymore, which took me a day to fix one test case. Luckily, 99% of usage is not static. Basically, unless we are directly calling a static method, we can't assume it is thread-safe. So even without my changes, I think it will still be better to create a local object. -- To view, visit https://asterix-gerrit.ics.uci.edu/450 To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings Gerrit-MessageType: comment Gerrit-Change-Id: I41fff780f5c071742ef10129d83c8f945d5886d7 Gerrit-PatchSet: 4 Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Owner: Jianfeng Jia <[email protected]> Gerrit-Reviewer: Ian Maxon <[email protected]> Gerrit-Reviewer: Jenkins <[email protected]> Gerrit-Reviewer: Jianfeng Jia <[email protected]> Gerrit-Reviewer: Preston Carman <[email protected]> Gerrit-Reviewer: Taewoo Kim <[email protected]> Gerrit-Reviewer: Till Westmann <[email protected]> Gerrit-HasComments: Yes
