GitHub user chunhui-shi commented on a diff in the pull request:
https://github.com/apache/drill/pull/430#discussion_r57202442
--- Diff:
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/HashHelper.java
---
@@ -17,47 +17,77 @@
*/
package org.apache.drill.exec.expr.fn.impl;
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.common.config.DrillConfig;
+import org.apache.drill.common.exceptions.DrillConfigurationException;
+
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
-public class HashHelper {
+public abstract class HashHelper {
static final org.slf4j.Logger logger =
org.slf4j.LoggerFactory.getLogger(HashHelper.class);
+ public static final String defaultHashClassName = new
String("org.apache.drill.exec.expr.fn.impl.MurmurHash3");
+ static final String HASH_CLASS_PROP = "drill.exec.hash.class";
+ static String actualHashClassName = defaultHashClassName;
+ static DrillHash hashCall = new MurmurHash3();
+ static {
- /** taken from mahout **/
- public static int hash(ByteBuffer buf, int seed) {
- // save byte order for later restoration
-
- int m = 0x5bd1e995;
- int r = 24;
+ try {
+ DrillConfig config = DrillConfig.create();
+ String configuredClassName = config.getString(HASH_CLASS_PROP);
+ if(configuredClassName != null && configuredClassName != "") {
+ actualHashClassName = configuredClassName;
+ hashCall = config.getInstanceOf(HASH_CLASS_PROP, DrillHash.class);
+ }
+ logger.debug("HashHelper initializes with " + actualHashClassName);
+ }
+ catch(Exception ex){
+ logger.error("Could not initialize Hash %s", ex.getMessage());
+ }
+ }
- int h = seed ^ buf.remaining();
+ public static String getHashClassName(){
+ return actualHashClassName;
+ }
- while (buf.remaining() >= 4) {
- int k = buf.getInt();
+ public static int hash32(int val, long seed) {
+ double converted = val;
+ return hash32(converted, seed);
+ }
+ public static int hash32(long val, long seed) {
+ double converted = val;
+ return hash32(converted, seed);
+ }
+ public static int hash32(float val, long seed){
+ double converted = val;
+ return hash32(converted, seed);
+ }
- k *= m;
- k ^= k >>> r;
- k *= m;
+ public static int hash32(double val, long seed){
+ return hashCall.hash32(val, seed);
+ }
- h *= m;
- h ^= k;
- }
+ public static int hash32(int start, int end, DrillBuf buffer, int seed){
+ return hashCall.hash32(start, end, buffer, seed);
--- End diff --
I looked at the JIT-compiled (C2) output at runtime. You are right that the
call through 'hashCall' is not fully optimized away: the compiled code adds an
extra three-instruction comparison (checking whether it is hashCall) before
the inlined call, as shown below:
0x00007f8bb75f38ac: mov $0x7c5f3b438,%r10 ; {oop(a 'java/lang/Class' = 'org/apache/drill/exec/expr/fn/impl/HashHelper')}
0x00007f8bb75f38b6: cmp 0x68(%r10),%r12d
0x00007f8bb75f38ba: je 0x00007f8bb75f38d6 ;*invokevirtual hash32
; - org.apache.drill.exec.expr.fn.impl.HashHelper::hash32@7 (line 80)
So I will be removing the indirection. Thanks.
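
To make the two shapes concrete, here is a minimal sketch of a call routed through a non-final static field versus a direct static call. The names HashDispatchSketch, Hasher, MixHasher, hashViaField and hashDirect are hypothetical and chosen only for illustration, and the mixing function is a simple stand-in, not Drill's MurmurHash3 and not the actual change to HashHelper.

```java
public class HashDispatchSketch {

    /** Hypothetical stand-in for the DrillHash type used in the diff. */
    interface Hasher {
        int hash32(double val, long seed);
    }

    /** Illustrative mixer only; this is NOT Drill's MurmurHash3. */
    static final class MixHasher implements Hasher {
        static int mix(double val, long seed) {
            long h = Double.doubleToLongBits(val) ^ seed;
            h ^= h >>> 33;
            h *= 0xff51afd7ed558ccdL;
            h ^= h >>> 33;
            return (int) h;
        }

        @Override
        public int hash32(double val, long seed) {
            return mix(val, seed);
        }
    }

    // Non-final static field, like 'hashCall' in the diff. Because it can
    // change, the compiled code has to read it at the call site, which is
    // where the extra compare reported in the comment shows up.
    static Hasher hashCall = new MixHasher();

    /** Indirect path: goes through the static field. */
    static int hashViaField(double val, long seed) {
        return hashCall.hash32(val, seed);
    }

    /** Direct path: a plain static call with no field to read. */
    static int hashDirect(double val, long seed) {
        return MixHasher.mix(val, seed);
    }

    public static void main(String[] args) {
        // Both paths compute the same value; only the dispatch differs.
        int viaField = hashViaField(42.0, 0L);
        int direct = hashDirect(42.0, 0L);
        System.out.println(viaField == direct);
    }
}
```

The direct form gives up the configurable hash class, which is the trade-off behind removing the indirection.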