Dmitry Lychagin has posted comments on this change. Change subject: [NO ISSUE][FUN] Implement object-remove() ......................................................................
Patch Set 2: (23 comments) https://asterix-gerrit.ics.uci.edu/#/c/2693/2/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/objects/object_remove/object_remove.3.query.sqlpp File asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/objects/object_remove/object_remove.3.query.sqlpp: Line 40: select value object_remove(o, "lang") let's avoid subqueries in cases "t5" and "t6". The problem is that object_remove() will work on records produced by each subquery. Those records I believe will be of closed types in this case. Instead we can run object_remove on a variable bound to an item in each dataset ( u or m ). so , select value object_remove(u, lang) from TwitterUsers u Line 45: order by 1 "order by 1" is different in SQL++. It will treat "1" as a constant expression, not as a reference to the first column. Instead use a path expression on the variable bound to the input dataset. o.name in this case. https://asterix-gerrit.ics.uci.edu/#/c/2693/2/asterixdb/asterix-doc/src/main/markdown/builtins/8_record.md File asterixdb/asterix-doc/src/main/markdown/builtins/8_record.md: PS2, Line 295: A should this be lower case? Line 301: * `missing` if the argument `field_name` is missing, missing also also returned if input_object is missing Line 302: * `null` if the argument `field_name` is null or any other non-string value, `null` (in back quotes) Line 303: * `null` if the argument `input_object` is `null` or any other non-object value we usually have one list item per return type, so let's combine this two `null` lines into one https://asterix-gerrit.ics.uci.edu/#/c/2693/2/asterixdb/asterix-om/src/main/java/org/apache/asterix/om/functions/BuiltinFunctions.java File asterixdb/asterix-om/src/main/java/org/apache/asterix/om/functions/BuiltinFunctions.java: Line 225: public static final FunctionIdentifier RECORD_REMOVE_STRICT = we don't need object-remove-string. we do have object-concat-strict as an internal function to support SELECT t.* use-case, so it's an exception. object-remove (i.e. non-strict) is sufficient. In the future it'll need to generate warnings if there's a type mismatch in its arguments. https://asterix-gerrit.ics.uci.edu/#/c/2693/2/asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/evaluators/functions/records/RecordRemoveEvalFactory.java File asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/evaluators/functions/records/RecordRemoveEvalFactory.java: Line 55: private final boolean failOnArgTypeMismatch; as I mentioned earlier we don't need object_remove_strict(), so can remote this field. Line 78: private final IPointable[] argsPointables; arg pointables are of different types. the second one needs to be UTF8StringPointable (see my comments below), so it's better to create two pointable fields here, instead of a single array of generic IPointalbe type with two elements Line 93: if (hasStringFieldToRemoveArg(tuple) && hasValidObjectArg(tuple)) { when using auto-generated byte code (addGenerated()) we need to evaluate both arguments here before doing any if-then-else statements. Otherwise our byte-code generator does not work properly. Line 100: final IPointable pointable = new VoidPointable(); Pointable allocation needs to be done in the constructor, to avoid creating objects (pointables) for each input tuple. Line 106: if (typeTag == ATypeTag.SERIALIZED_MISSING_TYPE_TAG) { without object_remove_strict() to support, missing/null handling will be done by the generated bytecode if we use addGenerated() in FunctionCollection Line 121: final IPointable pointable = new VoidPointable(); same. we need to create pointable in the constructor to avoid object allocation for each input tuple Line 150: return UTF8StringPointable.ofAStringPointable(argsPointables[FIELD_NAME_ARG_IDX]).toString(); we try to avoid conversion to java.lang.String at runtime. The comparison operation can be done on bytes directly. See ARecordCaster.compare(), PointableHelper.compareStringBinValues(), UTF8StringPointable.compareTo() for examples. Line 155: final RecordBuilder outRecordBuilder = new RecordBuilder(); create RecordBuilder instance if the constructor and just reset here. Line 164: if (!utf8StringPointable.toString().equals(fieldToRemove)) { see my comment above on getFieldToRemove(). we can compare bytes directly instead of converting both to java.lang.String. Line 171: private ARecordVisitablePointable getInputRecordVisitablePointable() throws HyracksDataException { create instance of ARecordVisitablePointable in the constructor Line 174: if (hasDerivedType(recType.getFieldTypes())) { recType is available in the constructor, so you can compute hasDerivedType() there instead of doing it here for each input tuple. Line 190: final ARecordVisitablePointable openRecordPointable = create instance in the constructor and reuse here. Line 192: final ACastVisitor castVisitor = new ACastVisitor(); create instance in the constructor and reuse here Line 193: final Triple<IVisitablePointable, IAType, Boolean> castVisitorArg = create Triple instance in the constructor and reuse here https://asterix-gerrit.ics.uci.edu/#/c/2693/2/asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/functions/FunctionCollection.java File asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/functions/FunctionCollection.java: Line 649: fc.add(RecordRemoveDescriptor.FACTORY); we can use addGenerated() here, because this function returns missing if any input is missing, otherwise null if any input is null https://asterix-gerrit.ics.uci.edu/#/c/2693/2/hyracks-fullstack/hyracks/hyracks-data/hyracks-data-std/src/main/java/org/apache/hyracks/data/std/primitive/UTF8StringPointable.java File hyracks-fullstack/hyracks/hyracks-data/hyracks-data-std/src/main/java/org/apache/hyracks/data/std/primitive/UTF8StringPointable.java: Line 656: public static UTF8StringPointable ofAStringPointable(IPointable aStringPointable) { I suspect these methods will no longer be needed once we use byte comparison for strings in the evaluator. -- To view, visit https://asterix-gerrit.ics.uci.edu/2693 To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8d4acfa0ef00ccdcb95e189b989a16f06acf0119 Gerrit-PatchSet: 2 Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Owner: Murtadha Hubail <[email protected]> Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Dmitry Lychagin <[email protected]> Gerrit-Reviewer: Jenkins <[email protected]> Gerrit-Reviewer: Murtadha Hubail <[email protected]> Gerrit-Reviewer: Till Westmann <[email protected]> Gerrit-Reviewer: abdullah alamoudi <[email protected]> Gerrit-HasComments: Yes
