[GitHub] spark pull request #17092: [SPARK-18450][ML] Scala API Change for LSH AND-am...

2018-07-18 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/17092


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



2018-04-12 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request:

https://github.com/apache/spark/pull/17092#discussion_r180999421
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/MinHashLSH.scala ---
@@ -119,6 +118,9 @@ class MinHashLSH(override val uid: String) extends LSH[MinHashLSHModel] with Has
   @Since("2.1.0")
   override def setNumHashTables(value: Int): this.type = super.setNumHashTables(value)
 
+  @Since("2.2.0")
--- End diff --

Ditto.





2018-04-12 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request:

https://github.com/apache/spark/pull/17092#discussion_r180998595
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/BucketedRandomProjectionLSH.scala ---
@@ -137,6 +136,9 @@ class BucketedRandomProjectionLSH(override val uid: String)
   @Since("2.1.0")
   override def setNumHashTables(value: Int): this.type = super.setNumHashTables(value)
 
+  @Since("2.2.0")
--- End diff --

`@Since("2.4.0")`





2017-02-27 Thread Yunni
GitHub user Yunni opened a pull request:

https://github.com/apache/spark/pull/17092

[SPARK-18450][ML] Scala API Change for LSH AND-amplification

## What changes were proposed in this pull request?
Adds a new Param, numHashFunctions, which sets the dimension of
AND-amplification for Locality Sensitive Hashing (LSH). The hash of each
feature is now an array of size numHashTables, and each element of that array
is a vector of size numHashFunctions.

Two features are in the same hash bucket iff ANY pair of their hash vectors
is equal (OR-amplification). Two hash vectors are equal iff ALL pairs of their
entries are equal (AND-amplification).
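
The bucket rule above can be sketched in plain Scala, without Spark. This is an illustration only, not the code in this PR: the object name `LshAmplificationSketch`, the `Signature` type alias, and both helper functions are hypothetical, and the sketch assumes that "ANY pair" means hash vectors at the same hash-table index are compared.

```scala
object LshAmplificationSketch {
  // Hypothetical representation of one feature's hash:
  // outer length = numHashTables, inner length = numHashFunctions.
  type Signature = Seq[Seq[Double]]

  // AND-amplification: two hash vectors are equal only if every entry matches.
  def vectorsEqual(a: Seq[Double], b: Seq[Double]): Boolean =
    a.length == b.length && a.zip(b).forall { case (x, y) => x == y }

  // OR-amplification: two features share a bucket if the hash vectors of
  // at least one hash table are equal (assumption: same-index comparison).
  def sameBucket(x: Signature, y: Signature): Boolean =
    x.zip(y).exists { case (a, b) => vectorsEqual(a, b) }
}
```

For example, with numHashTables = 2 and numHashFunctions = 2, `Seq(Seq(1.0, 2.0), Seq(3.0, 4.0))` and `Seq(Seq(1.0, 9.0), Seq(3.0, 4.0))` share a bucket because the second hash vectors are equal, even though the first ones differ in one entry.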

Will create follow-up PRs for Python API and Doc/Examples.

## How was this patch tested?
By running unit tests MinHashLSHSuite and BucketedRandomProjectionLSHSuite.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Yunni/spark SPARK-18450

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17092.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17092


commit e6f9f9541f0b00c14b7c5a201b22aeb400eb9f19
Author: Yun Ni 
Date:   2017-02-16T20:54:22Z

Scala API Change for AND-amplification

commit 010acb2caf69ca0822db6aeb866cce21cdfcce4b
Author: Yunni 
Date:   2017-02-27T03:43:21Z

Merge branch 'SPARK-18450' of https://github.com/Yunni/spark into SPARK-18450

commit 83a155699df4b15f1ab1fc427730613b63f7d1d6
Author: Yunni 
Date:   2017-02-27T04:04:37Z

Fix typos in unit tests

commit 9dd87ba21a025939df7020ff1491a2c6c29f2d93
Author: Yunni 
Date:   2017-02-28T02:04:10Z

Merge branch 'master' of https://github.com/apache/spark into SPARK-18450



