GitHub user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/20568#discussion_r168870192
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/FeatureHasher.scala ---
@@ -218,4 +221,32 @@ object FeatureHasher extends DefaultParamsReadable[FeatureHasher] {
@Since("2.3.0")
override def load(path: String): FeatureHasher = super.load(path)
+
+ private val seed = OldHashingTF.seed
+
+ /**
+ * Calculate a hash code value for the term object using
+ * Austin Appleby's MurmurHash 3 algorithm (MurmurHash3_x86_32).
+ * This is the default hash algorithm used from Spark 2.0 onwards.
+ * Use hashUnsafeBytes2 to match the original algorithm with the value.
+ * See SPARK-23381.
+ */
+ @Since("2.3.0")
+ def murmur3Hash(term: Any): Int = {
--- End diff --
I would also address this comment.
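
For context on the doc comment quoted above, here is a minimal, self-contained sketch of MurmurHash3_x86_32 over a byte array, in which the trailing 1 to 3 bytes are folded into a single word as in the reference algorithm. It is not Spark's code: `Murmur3Sketch` and `hashBytes` are hypothetical names, and it only assumes that `hashUnsafeBytes2` follows this reference tail handling, as the comment and SPARK-23381 suggest.

```scala
import java.nio.charset.StandardCharsets

// Hypothetical helper, not part of Spark: a plain MurmurHash3_x86_32 sketch.
object Murmur3Sketch {
  // Scramble one 32-bit block before it is mixed into the running hash.
  private def mixK1(k: Int): Int = {
    var k1 = k * 0xcc9e2d51
    k1 = Integer.rotateLeft(k1, 15)
    k1 * 0x1b873593
  }

  // Mix a scrambled block into the running hash.
  private def mixH1(h: Int, k1: Int): Int = {
    var h1 = h ^ k1
    h1 = Integer.rotateLeft(h1, 13)
    h1 * 5 + 0xe6546b64
  }

  // Final avalanche step; the input length in bytes is folded in first.
  private def fmix(h: Int, length: Int): Int = {
    var h1 = h ^ length
    h1 ^= h1 >>> 16
    h1 *= 0x85ebca6b
    h1 ^= h1 >>> 13
    h1 *= 0xc2b2ae35
    h1 ^= h1 >>> 16
    h1
  }

  def hashBytes(data: Array[Byte], seed: Int): Int = {
    val aligned = data.length - data.length % 4
    var h1 = seed
    var i = 0
    // Body: consume 4 bytes at a time as little-endian 32-bit words.
    while (i < aligned) {
      val k = (data(i) & 0xff) |
        ((data(i + 1) & 0xff) << 8) |
        ((data(i + 2) & 0xff) << 16) |
        ((data(i + 3) & 0xff) << 24)
      h1 = mixH1(h1, mixK1(k))
      i += 4
    }
    // Tail: pack the remaining 0 to 3 bytes into one word and mix it once.
    var k1 = 0
    var shift = 0
    while (i < data.length) {
      k1 ^= (data(i) & 0xff) << shift
      i += 1
      shift += 8
    }
    h1 ^= mixK1(k1)
    fmix(h1, data.length)
  }
}
```

A term could then be hashed with something like `Murmur3Sketch.hashBytes("hello".getBytes(StandardCharsets.UTF_8), 42)`. The seed 42 is only an illustrative assumption here; the diff takes its seed from `OldHashingTF.seed`.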
---