Wenhai Li has posted comments on this change.

Change subject: Applied the multiway fuzzyjoin based on the prefix-based join 
and the selectFuzzyJoin testCases.
......................................................................


Patch Set 38:

(18 comments)

Since the newly introduced rules inside FuzzyJoinRuleCollection is triggered 
before its following ABC rule collections (remember ABC F ABC?). You know, we 
configured CondPushDownAndJoinInferenceRuleCollection as noDfs to avoid the 
dead loop of the iterative RemoveUnusedVars/AggsRule after 
PushSelectIntoJoinRule and its related rules. I guess the variable has been 
substituted by FuzzyCollections before the second round of ABC, which results 
in this inconsistencies. But after we check that difference, we find they are 
isomorphic, i.e., the only difference aims at the variable number $101 -> $102, 
does it matter?

https://asterix-gerrit.ics.uci.edu/#/c/1076/38/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/compiler/provider/DefaultRuleSetFactory.java
File 
asterixdb/asterix-algebra/src/main/java/org/apache/asterix/compiler/provider/DefaultRuleSetFactory.java:

Line 54:         defaultLogicalRewrites
> This line should not be changed. Let's keep it as it is.
Refer to below, it was wrong. Most importantly, I cannot turn off the automatic 
grammatical checking of IntelliJ.


https://asterix-gerrit.ics.uci.edu/#/c/1076/38/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/FuzzyJoinRule.java
File 
asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/FuzzyJoinRule.java:

Line 421:             throw new AlgebricksException(e);
> I think this should be a CompilationException and it needs to have an error
Done


Line 428:         } catch (AsterixException e) {
> We will not use AsterixException. So, after this patch set: https://asterix
Done


Line 429:             throw new AlgebricksException(e);
> I think this should be a CompilationException and it needs to have an error
Done


https://asterix-gerrit.ics.uci.edu/#/c/1076/38/asterixdb/asterix-app/src/test/resources/optimizerts/queries/fj-dblp-csx-hybrid.aql
File 
asterixdb/asterix-app/src/test/resources/optimizerts/queries/fj-dblp-csx-hybrid.aql:

Line 20:  * Description    : Multiple fuzzy join on three tables, with a star 
join condition.
> tables -> datasets
Done


https://asterix-gerrit.ics.uci.edu/#/c/1076/38/asterixdb/asterix-app/src/test/resources/optimizerts/queries/fj-dblp-csx-selflink.aql
File 
asterixdb/asterix-app/src/test/resources/optimizerts/queries/fj-dblp-csx-selflink.aql:

Line 21:  *                  The base table DBLP will fuzzy join with CSX and 
propagate the results
> table -> dataset
Done


https://asterix-gerrit.ics.uci.edu/#/c/1076/38/asterixdb/asterix-app/src/test/resources/optimizerts/queries/fj-dblp-csx-simple.aql
File 
asterixdb/asterix-app/src/test/resources/optimizerts/queries/fj-dblp-csx-simple.aql:

Line 19: drop dataverse fj-dblp-csx if exists;
> Please put a comment here - what this test does.
Done


Line 26:   id: int32, 
> Remove this white space.
Done


Line 34:   id: int32, 
> Remove this white space.
Done


https://asterix-gerrit.ics.uci.edu/#/c/1076/38/asterixdb/asterix-app/src/test/resources/optimizerts/queries/fj-dblp-csx-star.aql
File 
asterixdb/asterix-app/src/test/resources/optimizerts/queries/fj-dblp-csx-star.aql:

Line 21:  *                  The CSX and DBLP tables are used twice and will be 
propagated onto
> table -> dataset
Done


https://asterix-gerrit.ics.uci.edu/#/c/1076/38/asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.1.1/word-jaccard.1.ddl.aql
File 
asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.1.1/word-jaccard.1.ddl.aql:

Line 21:  *                  We expect the join to be transformed into an 
prefix-based fuzzy join following with a < select.
> an -> a
"an 'less than' select" ?


https://asterix-gerrit.ics.uci.edu/#/c/1076/38/hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/logical/visitors/IsomorphismUtilities.java
File 
hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/logical/visitors/IsomorphismUtilities.java:

Line 81:     // Iteratively get all the produced variables from the downstream 
operators of root.
> This returns an operator that produced the given PK variable. Right? Please
Done


Line 82:     private static ILogicalOperator 
extractPKProduction(ILogicalOperator root, LogicalVariable pk)
> extractPKProduction -> getOpThatProducePK,
Done


Line 92:                 validate = true;
> Rather than using validate variable, you can just break here. Please check.
Done


Line 96:                     validate = true;
> Rather than using validate variable, you can just break here. Please check.
Done


Line 107:     public static void mergeHomogeneousPK(ILogicalOperator op, 
List<LogicalVariable> pks) throws AlgebricksException {
> pks -> PKvars,
Done


Line 112:                 throw new AlgebricksException("Illegal variable 
production.");
> Please follow the new exception handling proposal.
How about the AlgebricksException standard? I noticed almost all the 
AlgebricksExceptions use this form, any more suggestion?


Line 116:         Map<LogicalVariable, LogicalVariable> variableMapping = new 
HashMap<>();
> What's key and value? Can we put a comment?
Done


-- 
To view, visit https://asterix-gerrit.ics.uci.edu/1076
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8736f104905eeda763d39709e002c2b9629278cc
Gerrit-PatchSet: 38
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Wenhai Li <[email protected]>
Gerrit-Reviewer: Chen Li <[email protected]>
Gerrit-Reviewer: Jenkins <[email protected]>
Gerrit-Reviewer: Taewoo Kim <[email protected]>
Gerrit-Reviewer: Wenhai Li <[email protected]>
Gerrit-HasComments: Yes

Reply via email to