[GitHub] [spark] holdenk commented on a change in pull request #26197: Implement p-value simulation and unit tests for chi2 test

2019-10-23 Thread GitBox
holdenk commented on a change in pull request #26197: Implement p-value 
simulation and unit tests for chi2 test
URL: https://github.com/apache/spark/pull/26197#discussion_r338180024
 
 

 ##
 File path: mllib/pom.xml
 ##
 @@ -130,6 +130,11 @@
   org.apache.spark
   spark-tags_${scala.binary.version}
 
+
+  com.tdunning
 
 Review comment:
   Generally speaking, we try not to pick up new dependencies for small 
features, especially ones that aren't maintained by a community. Looking at 
tdunning's tdigest package, there have been ~5 distinct contributors in 2019, 
and if we expand the window back to 2017 that goes up to ~7.
   
   Do we know how actively maintained this is going to be, and whether he is 
going to maintain the 3.X line once the 4.X release is out?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] holdenk commented on a change in pull request #26197: Implement p-value simulation and unit tests for chi2 test

2019-10-23 Thread GitBox
holdenk commented on a change in pull request #26197: Implement p-value 
simulation and unit tests for chi2 test
URL: https://github.com/apache/spark/pull/26197#discussion_r338180477
 
 

 ##
 File path: mllib/src/main/scala/org/apache/spark/mllib/stat/test/ChiSqTest.scala
 ##
 @@ -151,6 +155,8 @@ private[spark] object ChiSqTest extends Logging {
*/
   def chiSquared(observed: Vector,
   expected: Vector = Vectors.dense(Array.empty[Double]),
+  simulatePValue: Boolean = false,
 
 Review comment:
   Spark MLlib (the RDD-based spark.mllib API) is in maintenance mode (see 
https://spark.apache.org/docs/latest/ml-guide.html). If we do want this, I 
think we need to make sure it is exposed in Spark ML.
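
   For context, here is a minimal sketch of the existing DataFrame-based entry 
point in Spark ML, `org.apache.spark.ml.stat.ChiSquareTest.test`. The object 
name `ChiSquareMlSketch`, the helper `run`, and the toy data are assumptions 
made for illustration. Nothing in it models the simulated p-value proposed in 
this PR; it only shows the API surface that such an option would presumably 
need to be reachable from.

```scala
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.stat.ChiSquareTest
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical wrapper object, for illustration only.
object ChiSquareMlSketch {

  // Runs Pearson's chi-squared independence test of each feature
  // against the label, using the DataFrame-based spark.ml API.
  def run(spark: SparkSession): DataFrame = {
    import spark.implicits._

    // Toy data (assumed): a label plus two categorical features per row.
    val df = Seq(
      (0.0, Vectors.dense(0.5, 10.0)),
      (0.0, Vectors.dense(1.5, 20.0)),
      (1.0, Vectors.dense(1.5, 30.0)),
      (0.0, Vectors.dense(3.5, 30.0)),
      (0.0, Vectors.dense(3.5, 40.0)),
      (1.0, Vectors.dense(3.5, 40.0))
    ).toDF("label", "features")

    // Returns a single row with columns pValues, degreesOfFreedom, statistics.
    ChiSquareTest.test(df, "features", "label")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("ChiSquareMlSketch")
      .getOrCreate()
    try {
      run(spark).show(truncate = false)
    } finally {
      spark.stop()
    }
  }
}
```

   The p-values in that result come from the chi-squared distribution, so a 
simulated variant would be a new option on this API rather than something it 
already offers in this sketch.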

