>From Ian Maxon <[email protected]>: Attention is currently required from: Ali Alsuliman, Calvin Thomas Dani.
Ian Maxon has posted comments on this change by Calvin Thomas Dani. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20126?usp=email ) Change subject: [NO ISSUE] Vector Distance KNN Function ...................................................................... Patch Set 17: (13 comments) Commit Message: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20126/comment/8f768fa7_591f10e1?usp=email : PS17, Line 7: NO ISSUE add the issue here https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20126/comment/1413ae56_9f83be79?usp=email : PS17, Line 9: no it does add functions, so yes https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20126/comment/6348aa65_1c7d9aad?usp=email : PS17, Line 14: vector_distance_constant() i removed this because it was basically entirely duplicated. if we need this, it can't be like that https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20126/comment/b762605b_3113dc6a?usp=email : PS17, Line 17: KNN this really isn't KNN, right? maybe we will use it as part of KNN but this itself isn't File asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/vector/VectorQueries.xml: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20126/comment/dd67a161_c3f83867?usp=email : PS15, Line 1: <!-- we need tests for manhattan, dot, l2 File asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/vector/distance-functions/vector_distance.1.ddl.sqlpp: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20126/comment/99cb305e_8d05f142?usp=email : PS15, Line 1: /* these files shouldn't be executable File asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/evaluators/functions/vector/VectorDistanceArrScalarEvaluator.java: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20126/comment/a5282868_c5c17859?usp=email : PS17, Line 76: private static final UTF8StringPointable EUCLIDEAN_DISTANCE_L2 = UTF8StringPointable.generateUTF8Pointable("l2"); : private static final UTF8StringPointable EUCLIDEAN_DISTANCE = : UTF8StringPointable.generateUTF8Pointable("euclidean"); : private static final UTF8StringPointable EUCLIDEAN_DISTANCE_L2_SQUARED = : UTF8StringPointable.generateUTF8Pointable("l2_squared"); : private static final UTF8StringPointable EUCLIDEAN_DISTANCE_SQUARED = : UTF8StringPointable.generateUTF8Pointable("euclidean_squared"); : private static final UTF8StringPointable MANHATTAN_FORMAT = : UTF8StringPointable.generateUTF8Pointable("manhattan distance"); : private static final UTF8StringPointable COSINE_FORMAT = UTF8StringPointable.generateUTF8Pointable("cosine"); : private static final UTF8StringPointable DOT_PRODUCT_FORMAT = UTF8StringPointable.generateUTF8Pointable("dot"); delete https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20126/comment/ea450178_9b637b0d?usp=email : PS17, Line 99: private static final Map<Integer, DistanceFunction> DISTANCE_MAP = Map.of(MANHATTAN_FORMAT.hash(), : VectorDistanceArrCalculation::manhattan, EUCLIDEAN_DISTANCE.hash(), VectorDistanceArrCalculation::euclidean, : EUCLIDEAN_DISTANCE_L2.hash(), VectorDistanceArrCalculation::euclidean, EUCLIDEAN_DISTANCE_SQUARED.hash(), : VectorDistanceArrCalculation::euclidean_squared, EUCLIDEAN_DISTANCE_L2_SQUARED.hash(), : VectorDistanceArrCalculation::euclidean_squared, COSINE_FORMAT.hash(), VectorDistanceArrCalculation::cosine, : DOT_PRODUCT_FORMAT.hash(), VectorDistanceArrCalculation::dot); delete https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20126/comment/7b6e4956_de94c409?usp=email : PS17, Line 116: 3 2 https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20126/comment/241dd31e_059e62d1?usp=email : PS17, Line 123: vectorPool = new ListObjectPool((IObjectFactory<double[], Integer>) double[]::new); this is better than it was but it's still bad. we need to consider the adversarial case where every vector is a different length. there needs to be a version of ListObjectPool that is capped at some size. the "happy" path here is basically reusing the same 2 double[] for the entire query, but there's no guarantee. https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20126/comment/e5de37bc_95ad713b?usp=email : PS17, Line 176: listAccessor1 maybe left/right would be better naming? https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20126/comment/f23d2902_2f6b4e50?usp=email : PS17, Line 189: if(!isConstant[0]) { : vectorPool.free(primitiveArray1); : } : if(!isConstant[1]) { : vectorPool.free(primitiveArray2); : } probably change this so that the array is given as a parameter to createPrimitiveList, and maybe change the name to fillVectorFromArray or something https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20126/comment/1b5a6774_c58a03b0?usp=email : PS17, Line 237: return getValueFromTag(typeTag, data, offset); what happens in an array of ANY when the types are not the same? should this be allowed? -- To view, visit https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20126?usp=email To unsubscribe, or for help writing mail filters, visit https://asterix-gerrit.ics.uci.edu/settings?usp=email Gerrit-MessageType: comment Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Change-Id: I45176e7da13ed222f48bb36bd1f0ab4074a0bcb9 Gerrit-Change-Number: 20126 Gerrit-PatchSet: 17 Gerrit-Owner: Calvin Thomas Dani <[email protected]> Gerrit-Reviewer: Ali Alsuliman <[email protected]> Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Ian Maxon <[email protected]> Gerrit-Reviewer: Jenkins <[email protected]> Gerrit-Attention: Calvin Thomas Dani <[email protected]> Gerrit-Attention: Ali Alsuliman <[email protected]> Gerrit-Comment-Date: Tue, 02 Dec 2025 01:00:50 +0000 Gerrit-HasComments: Yes Gerrit-Has-Labels: No
