[ 
https://issues.apache.org/jira/browse/ASTERIXDB-1736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15698708#comment-15698708
 ] 

Taewoo Kim commented on ASTERIXDB-1736:
---------------------------------------

Removed Grace Hash Join.

> Grace Hash Join and Hybrid Hash Join are not being used.
> --------------------------------------------------------
>
>                 Key: ASTERIXDB-1736
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1736
>             Project: Apache AsterixDB
>          Issue Type: Improvement
>            Reporter: Taewoo Kim
>            Assignee: Taewoo Kim
>
> As the title says, Grace Hash Join and Hybrid Hash Join are not being used. I 
> suggest that we remove these two join methods. Here are my findings for these 
> two joins. 
> 1) Grace Hash Join
> GraceHashJoinOperatorDescriptor is referenced from only two places:
> org.apache.hyracks.examples.tpch.client.join and
> TPCHCustomerOrderHashJoinTest.
> The first is a Hyracks example (tpch.client) and the second is a unit test.
> The optimizer never chooses this join during compilation, so it is
> currently unused.
> 2) Hybrid Hash Join
> During compilation, the optimizer decides whether to use Hybrid Hash Join
> or Optimized Hybrid Hash Join.
> If a hash function family is set for every key variable, it uses the
> Optimized Hybrid Hash Join.
> If not, it uses the Hybrid Hash Join. In practice, however, the Hybrid
> Hash Join path is never chosen. Let's check the code.
> {code:title=HybridHashJoinPOperator.java|borderStyle=solid}   
>         IBinaryHashFunctionFamily[] hashFunFamilies =
>                 JobGenHelper.variablesToBinaryHashFunctionFamilies(keysLeftBranch, env, context);
>
>         ...
>
>         boolean optimizedHashJoin = true;
>         for (IBinaryHashFunctionFamily family : hashFunFamilies) {
>             if (family == null) {
>                 optimizedHashJoin = false;
>                 break;
>             }
>         }
>         if (optimizedHashJoin) {
>             opDesc = generateOptimizedHashJoinRuntime(context, inputSchemas, keysLeft, keysRight, hashFunFamilies,
>                     comparatorFactories, predEvaluatorFactory, recDescriptor, spec);
>         } else {
>             opDesc = generateHashJoinRuntime(context, inputSchemas, keysLeft, keysRight, hashFunFactories,
>                     comparatorFactories, predEvaluatorFactory, recDescriptor, spec);
>         }
> {code}
>         
> As we can see, optimizedHashJoin is set to false only when the hash function
> family of at least one key variable is null.
> So how is the hash function family assigned for each key variable?
> {code:title=JobGenHelper.java|borderStyle=solid}
>     public static IBinaryHashFunctionFamily[] variablesToBinaryHashFunctionFamilies(
>             Collection<LogicalVariable> varLogical, IVariableTypeEnvironment env, JobGenContext context)
>                     throws AlgebricksException {
>         IBinaryHashFunctionFamily[] funFamilies = new IBinaryHashFunctionFamily[varLogical.size()];
>         int i = 0;
>         IBinaryHashFunctionFamilyProvider bhffProvider = context.getBinaryHashFunctionFamilyProvider();
>         for (LogicalVariable var : varLogical) {
>             Object type = env.getVarType(var);
>             funFamilies[i++] = bhffProvider.getBinaryHashFunctionFamily(type);
>         }
>         return funFamilies;
>     }
> {code}
> For each variable type, we ask the provider for a hash function family. In
> the current codebase, AqlBinaryHashFunctionFamilyProvider is the only class
> that implements IBinaryHashFunctionFamilyProvider, and it returns
> AMurmurHash3BinaryHashFunctionFamily for every type.
> So the hash function family can never be null, and the Hybrid Hash Join
> branch above is unreachable.
> {code:title= AqlBinaryHashFunctionFamilyProvider.java|borderStyle=solid}
> public class AqlBinaryHashFunctionFamilyProvider implements IBinaryHashFunctionFamilyProvider, Serializable {
>     private static final long serialVersionUID = 1L;
>     public static final AqlBinaryHashFunctionFamilyProvider INSTANCE = new AqlBinaryHashFunctionFamilyProvider();
>     private AqlBinaryHashFunctionFamilyProvider() {
>     }
>     @Override
>     public IBinaryHashFunctionFamily getBinaryHashFunctionFamily(Object type) throws AlgebricksException {
>         // AMurmurHash3BinaryHashFunctionFamily converts numeric type to
>         // double type before doing hash()
>         return AMurmurHash3BinaryHashFunctionFamily.INSTANCE;
>     }
> }
> {code}
>  
>     
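
The argument in the quoted description can be reproduced in isolation. The following is a minimal sketch, not AsterixDB code: HashFamily, HashFamilyProvider, ALWAYS_MURMUR, familiesFor and chooseJoin are hypothetical stand-ins for IBinaryHashFunctionFamily, IBinaryHashFunctionFamilyProvider, AqlBinaryHashFunctionFamilyProvider, variablesToBinaryHashFunctionFamilies and the selection logic in HybridHashJoinPOperator. It assumes, as the description states, that the only provider returns the same non-null family for every type.

```java
import java.util.Arrays;
import java.util.List;

public class JoinSelectionSketch {
    // Hypothetical stand-in for IBinaryHashFunctionFamily.
    interface HashFamily {}

    // Hypothetical stand-in for IBinaryHashFunctionFamilyProvider.
    interface HashFamilyProvider {
        HashFamily familyFor(Object type);
    }

    static final HashFamily MURMUR = new HashFamily() {};

    // Mirrors the behavior described for the only existing provider:
    // the same non-null family is returned for every type.
    static final HashFamilyProvider ALWAYS_MURMUR = type -> MURMUR;

    // Mirrors the loop in variablesToBinaryHashFunctionFamilies:
    // one provider lookup per key type.
    static HashFamily[] familiesFor(List<?> keyTypes, HashFamilyProvider provider) {
        HashFamily[] families = new HashFamily[keyTypes.size()];
        int i = 0;
        for (Object type : keyTypes) {
            families[i++] = provider.familyFor(type);
        }
        return families;
    }

    // Mirrors the selection logic: fall back to the plain hybrid hash join
    // only if some key has no hash function family.
    static String chooseJoin(HashFamily[] families) {
        for (HashFamily family : families) {
            if (family == null) {
                return "HybridHashJoin";
            }
        }
        return "OptimizedHybridHashJoin";
    }

    public static void main(String[] args) {
        List<Object> keyTypes = Arrays.asList("int32", "string", "double");
        // Prints "OptimizedHybridHashJoin" for any list of key types.
        System.out.println(chooseJoin(familiesFor(keyTypes, ALWAYS_MURMUR)));
    }
}
```

The fallback branch is taken only when a null family is passed in by hand, which no provider in this sketch ever produces.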



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
