[jira] [Created] (SPARK-32060) Huber loss Convergence

2020-06-22 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-32060: Summary: Huber loss Convergence Key: SPARK-32060 URL: https://issues.apache.org/jira/browse/SPARK-32060 Project: Spark Issue Type: Bug Components:

[jira] [Created] (SPARK-31976) use MemoryUsage to control the size of block

2020-06-12 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-31976: Summary: use MemoryUsage to control the size of block Key: SPARK-31976 URL: https://issues.apache.org/jira/browse/SPARK-31976 Project: Spark Issue Type:

[jira] [Updated] (SPARK-31976) use MemoryUsage to control the size of block

2020-06-12 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-31976: - Description: According to the performance test in

[jira] [Commented] (SPARK-31948) expose mapSideCombine in aggByKey/reduceByKey/foldByKey

2020-06-10 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17132930#comment-17132930 ] zhengruifeng commented on SPARK-31948: -- [~viirya] [~srowen]   I will test whether there is

[jira] [Updated] (SPARK-31948) expose mapSideCombine in aggByKey/reduceByKey/foldByKey

2020-06-09 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-31948: - Description: {{1, aggregateByKey,}} {{reduceByKey}} and  {{foldByKey}} will always perform

[jira] [Updated] (SPARK-31948) expose mapSideCombine in aggByKey/reduceByKey/foldByKey

2020-06-09 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-31948: - Description: {{1, aggregateByKey,}} {{reduceByKey}} and  {{foldByKey}} will always perform

[jira] [Created] (SPARK-31948) expose mapSideCombine in aggByKey/reduceByKey/foldByKey

2020-06-09 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-31948: Summary: expose mapSideCombine in aggByKey/reduceByKey/foldByKey Key: SPARK-31948 URL: https://issues.apache.org/jira/browse/SPARK-31948 Project: Spark

[jira] [Commented] (SPARK-31925) Summary.totalIterations greater than maxIters

2020-06-09 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130002#comment-17130002 ] zhengruifeng commented on SPARK-31925: -- also ping [~srowen]  [~weichenxu123] >

[jira] [Created] (SPARK-31925) Summary.totalIterations greater than maxIters

2020-06-08 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-31925: Summary: Summary.totalIterations greater than maxIters Key: SPARK-31925 URL: https://issues.apache.org/jira/browse/SPARK-31925 Project: Spark Issue Type:

[jira] [Commented] (SPARK-31783) Performance test on dense and sparse datasets

2020-05-21 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17112850#comment-17112850 ] zhengruifeng commented on SPARK-31783: -- if blockSize==1, original code path dealing with each

[jira] [Commented] (SPARK-31783) Performance test on dense and sparse datasets

2020-05-21 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17112823#comment-17112823 ] zhengruifeng commented on SPARK-31783: -- ping [~srowen] [~mengxr] [~weichenxu123] [~huaxingao] I am

[jira] [Commented] (SPARK-31783) Performance test on dense and sparse datasets

2020-05-20 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17112818#comment-17112818 ] zhengruifeng commented on SPARK-31783: -- The test code is in 'blockify_total', and current result is

[jira] [Updated] (SPARK-31783) Performance test on dense and sparse datasets

2020-05-20 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-31783: - Attachment: blockify_perf_20200521.xlsx > Performance test on dense and sparse datasets >

[jira] [Updated] (SPARK-31783) Performance test on dense and sparse datasets

2020-05-20 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-31783: - Attachment: blockify_total > Performance test on dense and sparse datasets >

[jira] [Assigned] (SPARK-31783) Performance test on dense and sparse datasets

2020-05-20 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-31783: Assignee: zhengruifeng > Performance test on dense and sparse datasets >

[jira] [Created] (SPARK-31783) Performance test on dense and sparse datasets

2020-05-20 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-31783: Summary: Performance test on dense and sparse datasets Key: SPARK-31783 URL: https://issues.apache.org/jira/browse/SPARK-31783 Project: Spark Issue Type:

[jira] [Created] (SPARK-31782) Performance test on dense and sparse datasets

2020-05-20 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-31782: Summary: Performance test on dense and sparse datasets Key: SPARK-31782 URL: https://issues.apache.org/jira/browse/SPARK-31782 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-31782) Performance test on dense and sparse datasets

2020-05-20 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-31782. -- Resolution: Duplicate > Performance test on dense and sparse datasets >

[jira] [Comment Edited] (SPARK-31714) Performance test on java vectorization vs dot vs gemv vs gemm

2020-05-15 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108018#comment-17108018 ] zhengruifeng edited comment on SPARK-31714 at 5/15/20, 7:33 AM:

[jira] [Commented] (SPARK-31714) Performance test on java vectorization vs dot vs gemv vs gemm

2020-05-15 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108018#comment-17108018 ] zhengruifeng commented on SPARK-31714: -- additionally test on impl of gemv: {code:java}

[jira] [Commented] (SPARK-31714) Performance test on java vectorization vs dot vs gemv vs gemm

2020-05-14 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17107905#comment-17107905 ] zhengruifeng commented on SPARK-31714: -- test code: {code:java} test("performance: gemv vs dot") {

[jira] [Updated] (SPARK-31714) Performance test on java vectorization vs dot vs gemv vs gemm

2020-05-14 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-31714: - Attachment: blas-perf > Performance test on java vectorization vs dot vs gemv vs gemm >

[jira] [Updated] (SPARK-31714) Performance test on java vectorization vs dot vs gemv vs gemm

2020-05-14 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-31714: - Attachment: BLASSuite.scala > Performance test on java vectorization vs dot vs gemv vs gemm >

[jira] [Assigned] (SPARK-31714) Performance test on java vectorization vs dot vs gemv vs gemm

2020-05-14 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-31714: Assignee: zhengruifeng > Performance test on java vectorization vs dot vs gemv vs gemm >

[jira] [Created] (SPARK-31714) Performance test on java vectorization vs dot vs gemv vs gemm

2020-05-14 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-31714: Summary: Performance test on java vectorization vs dot vs gemv vs gemm Key: SPARK-31714 URL: https://issues.apache.org/jira/browse/SPARK-31714 Project: Spark

[jira] [Resolved] (SPARK-30699) GMM blockify input vectors

2020-05-11 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-30699. -- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 27473

[jira] [Resolved] (SPARK-31656) AFT blockify input vectors

2020-05-08 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-31656. -- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 28473

[jira] [Created] (SPARK-31661) Document usage of blockSize

2020-05-07 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-31661: Summary: Document usage of blockSize Key: SPARK-31661 URL: https://issues.apache.org/jira/browse/SPARK-31661 Project: Spark Issue Type: Sub-task

[jira] [Assigned] (SPARK-31652) Add ANOVASelector and FValueSelector to PySpark

2020-05-07 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-31652: Assignee: Huaxin Gao > Add ANOVASelector and FValueSelector to PySpark >

[jira] [Resolved] (SPARK-31652) Add ANOVASelector and FValueSelector to PySpark

2020-05-07 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-31652. -- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 28464

[jira] [Assigned] (SPARK-31659) Add VarianceThresholdSelector examples and doc

2020-05-07 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-31659: Assignee: Huaxin Gao > Add VarianceThresholdSelector examples and doc >

[jira] [Resolved] (SPARK-31659) Add VarianceThresholdSelector examples and doc

2020-05-07 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-31659. -- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 28478

[jira] [Resolved] (SPARK-30660) LinearRegression blockify input vectors

2020-05-07 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-30660. -- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 28471

[jira] [Assigned] (SPARK-31656) AFT blockify input vectors

2020-05-06 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-31656: Assignee: zhengruifeng > AFT blockify input vectors > -- > >

[jira] [Created] (SPARK-31656) AFT blockify input vectors

2020-05-06 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-31656: Summary: AFT blockify input vectors Key: SPARK-31656 URL: https://issues.apache.org/jira/browse/SPARK-31656 Project: Spark Issue Type: Sub-task

[jira] [Resolved] (SPARK-30659) LogisticRegression blockify input vectors

2020-05-06 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-30659. -- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 28458

[jira] [Assigned] (SPARK-31127) Add abstract Selector

2020-05-06 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-31127: Assignee: Huaxin Gao > Add abstract Selector > - > >

[jira] [Resolved] (SPARK-31127) Add abstract Selector

2020-05-06 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-31127. -- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 27978

[jira] [Reopened] (SPARK-30661) KMeans blockify input vectors

2020-05-05 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reopened SPARK-30661: -- > KMeans blockify input vectors > - > > Key:

[jira] [Resolved] (SPARK-30642) LinearSVC blockify input vectors

2020-05-05 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-30642. -- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 28349

[jira] [Updated] (SPARK-31603) AFT uses common functions in RDDLossFunction

2020-04-29 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-31603: - Summary: AFT uses common functions in RDDLossFunction (was: AFT uses common RDDLossFunction)

[jira] [Created] (SPARK-31603) AFT uses common RDDLossFunction

2020-04-29 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-31603: Summary: AFT uses common RDDLossFunction Key: SPARK-31603 URL: https://issues.apache.org/jira/browse/SPARK-31603 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-31182) PairRDD support aggregateByKeyWithinPartitions

2020-04-28 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-31182. -- Resolution: Not A Problem > PairRDD support aggregateByKeyWithinPartitions >

[jira] [Assigned] (SPARK-31494) flatten the result dataframe of ANOVATest

2020-04-20 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-31494: Assignee: zhengruifeng > flatten the result dataframe of ANOVATest >

[jira] [Resolved] (SPARK-31494) flatten the result dataframe of ANOVATest

2020-04-20 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-31494. -- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 28270

[jira] [Assigned] (SPARK-31492) flatten the result dataframe of FValueTest

2020-04-20 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-31492: Assignee: zhengruifeng > flatten the result dataframe of FValueTest >

[jira] [Resolved] (SPARK-31492) flatten the result dataframe of FValueTest

2020-04-20 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-31492. -- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 28268

[jira] [Resolved] (SPARK-30661) KMeans blockify input vectors

2020-04-20 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-30661. -- Resolution: Not A Problem > KMeans blockify input vectors > - > >

[jira] [Resolved] (SPARK-30202) impl QuantileTransform

2020-04-19 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-30202. -- Resolution: Not A Problem > impl QuantileTransform > -- > >

[jira] [Created] (SPARK-31494) flatten the result dataframe of ANOVATest

2020-04-19 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-31494: Summary: flatten the result dataframe of ANOVATest Key: SPARK-31494 URL: https://issues.apache.org/jira/browse/SPARK-31494 Project: Spark Issue Type:

[jira] [Created] (SPARK-31492) flatten the result dataframe of FValueTest

2020-04-19 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-31492: Summary: flatten the result dataframe of FValueTest Key: SPARK-31492 URL: https://issues.apache.org/jira/browse/SPARK-31492 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-31433) Summarizer supports string arguments

2020-04-19 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-31433. -- Resolution: Not A Problem > Summarizer supports string arguments >

[jira] [Resolved] (SPARK-31301) flatten the result dataframe of tests in stat

2020-04-14 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-31301. -- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 28176

[jira] [Assigned] (SPARK-31301) flatten the result dataframe of tests in stat

2020-04-14 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-31301: Assignee: zhengruifeng > flatten the result dataframe of tests in stat >

[jira] [Created] (SPARK-31436) MinHash keyDistance optimization

2020-04-13 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-31436: Summary: MinHash keyDistance optimization Key: SPARK-31436 URL: https://issues.apache.org/jira/browse/SPARK-31436 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-31433) Summarizer supports string arguments

2020-04-13 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-31433: Summary: Summarizer supports string arguments Key: SPARK-31433 URL: https://issues.apache.org/jira/browse/SPARK-31433 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-31301) flatten the result dataframe of tests in stat

2020-04-09 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080175#comment-17080175 ] zhengruifeng commented on SPARK-31301: -- (One small question: the new method takes a Dataset[_]

[jira] [Commented] (SPARK-31301) flatten the result dataframe of tests in stat

2020-04-09 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17078958#comment-17078958 ] zhengruifeng commented on SPARK-31301: -- [~srowen] There are two methods now: {code:java}

[jira] [Commented] (SPARK-31301) flatten the result dataframe of tests in stat

2020-04-08 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17078898#comment-17078898 ] zhengruifeng commented on SPARK-31301: -- [~srowen] What do you think about changing the return type

[jira] [Resolved] (SPARK-31309) Migrate the ChiSquareTest from MLlib to ML

2020-04-08 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-31309. -- Resolution: Not A Problem > Migrate the ChiSquareTest from MLlib to ML >

[jira] [Commented] (SPARK-31301) flatten the result dataframe of tests in stat

2020-04-01 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072408#comment-17072408 ] zhengruifeng commented on SPARK-31301: -- Another advantage of returning rows is that: high-dim

[jira] [Reopened] (SPARK-31309) Migrate the ChiSquareTest from MLlib to ML

2020-04-01 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reopened SPARK-31309: -- > Migrate the ChiSquareTest from MLlib to ML > -- > >

[jira] [Commented] (SPARK-31301) flatten the result dataframe of tests in stat

2020-03-31 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072319#comment-17072319 ] zhengruifeng commented on SPARK-31301: -- {code:java} @Since("3.1.0") def testChiSquare( dataset:

[jira] [Resolved] (SPARK-31300) Migrate the implementation of algorithms from MLlib to ML

2020-03-31 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-31300. -- Resolution: Not A Problem > Migrate the implementation of algorithms from MLlib to ML >

[jira] [Resolved] (SPARK-31309) Migrate the ChiSquareTest from MLlib to ML

2020-03-31 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-31309. -- Resolution: Not A Problem > Migrate the ChiSquareTest from MLlib to ML >

[jira] [Created] (SPARK-31309) Migrate the ChiSquareTest from MLlib to ML

2020-03-30 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-31309: Summary: Migrate the ChiSquareTest from MLlib to ML Key: SPARK-31309 URL: https://issues.apache.org/jira/browse/SPARK-31309 Project: Spark Issue Type:

[jira] [Updated] (SPARK-31309) Migrate the ChiSquareTest from MLlib to ML

2020-03-30 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-31309: - Priority: Minor (was: Major) > Migrate the ChiSquareTest from MLlib to ML >

[jira] [Commented] (SPARK-31301) flatten the result dataframe of tests in stat

2020-03-30 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071452#comment-17071452 ] zhengruifeng commented on SPARK-31301: -- [~srowen] It is not targeted at 3.0.0. I agree that we must

[jira] [Assigned] (SPARK-31222) Make ANOVATest Sparsity-Aware

2020-03-30 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-31222: Assignee: zhengruifeng > Make ANOVATest Sparsity-Aware > - >

[jira] [Resolved] (SPARK-31222) Make ANOVATest Sparsity-Aware

2020-03-30 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-31222. -- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 27982

[jira] [Updated] (SPARK-31301) flatten the result dataframe of tests in stat

2020-03-30 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-31301: - Description: {code:java} scala> import org.apache.spark.ml.linalg.{Vector, Vectors} import

[jira] [Updated] (SPARK-31301) flatten the result dataframe of tests in stat

2020-03-30 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-31301: - Description: {code:java} scala> import org.apache.spark.ml.linalg.{Vector, Vectors} import

[jira] [Updated] (SPARK-31301) flatten the result dataframe of tests in stat

2020-03-30 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-31301: - Description: {code:java} scala> import org.apache.spark.ml.linalg.{Vector, Vectors} import

[jira] [Commented] (SPARK-31301) flatten the result dataframe of tests in stat

2020-03-30 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17070827#comment-17070827 ] zhengruifeng commented on SPARK-31301: -- friendly ping [~srowen] [~huaxingao] > flatten the result

[jira] [Created] (SPARK-31301) flatten the result dataframe of tests in stat

2020-03-30 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-31301: Summary: flatten the result dataframe of tests in stat Key: SPARK-31301 URL: https://issues.apache.org/jira/browse/SPARK-31301 Project: Spark Issue Type:

[jira] [Created] (SPARK-31300) Migrate the implementation of algorithms from MLlib to ML

2020-03-30 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-31300: Summary: Migrate the implementation of algorithms from MLlib to ML Key: SPARK-31300 URL: https://issues.apache.org/jira/browse/SPARK-31300 Project: Spark

[jira] [Assigned] (SPARK-31283) Simplify ChiSq by adding a common method

2020-03-29 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-31283: Assignee: zhengruifeng > Simplify ChiSq by adding a common method >

[jira] [Resolved] (SPARK-31283) Simplify ChiSq by adding a common method

2020-03-29 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-31283. -- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 28045

[jira] [Created] (SPARK-31283) Simplify ChiSq by adding a common method

2020-03-27 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-31283: Summary: Simplify ChiSq by adding a common method Key: SPARK-31283 URL: https://issues.apache.org/jira/browse/SPARK-31283 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-31243) add ANOVATest and FValueTest to PySpark

2020-03-27 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-31243. -- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 28012

[jira] [Assigned] (SPARK-31243) add ANOVATest and FValueTest to PySpark

2020-03-27 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-31243: Assignee: Huaxin Gao > add ANOVATest and FValueTest to PySpark >

[jira] [Resolved] (SPARK-31223) Update py code to generate data in testsuites

2020-03-25 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-31223. -- Resolution: Fixed > Update py code to generate data in testsuites >

[jira] [Assigned] (SPARK-31223) Update py code to generate data in testsuites

2020-03-25 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-31223: Assignee: Huaxin Gao (was: zhengruifeng) > Update py code to generate data in

[jira] [Assigned] (SPARK-31223) Update py code to generate data in testsuites

2020-03-25 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-31223: Assignee: zhengruifeng > Update py code to generate data in testsuites >

[jira] [Resolved] (SPARK-30923) Spark MLlib, GraphX 3.0 QA umbrella

2020-03-25 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-30923. -- Resolution: Fixed > Spark MLlib, GraphX 3.0 QA umbrella > ---

[jira] [Commented] (SPARK-31223) Update py code to generate dates in testsuites

2020-03-22 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17064554#comment-17064554 ] zhengruifeng commented on SPARK-31223: -- ping [~huaxingao] > Update py code to generate dates in

[jira] [Created] (SPARK-31223) Update py code to generate dates in testsuites

2020-03-22 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-31223: Summary: Update py code to generate dates in testsuites Key: SPARK-31223 URL: https://issues.apache.org/jira/browse/SPARK-31223 Project: Spark Issue Type:

[jira] [Created] (SPARK-31222) Make ANOVATest Sparsity-Aware

2020-03-22 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-31222: Summary: Make ANOVATest Sparsity-Aware Key: SPARK-31222 URL: https://issues.apache.org/jira/browse/SPARK-31222 Project: Spark Issue Type: Improvement

[jira] [Assigned] (SPARK-31185) implement VarianceThresholdSelector

2020-03-21 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-31185: Assignee: Huaxin Gao > implement VarianceThresholdSelector >

[jira] [Resolved] (SPARK-31185) implement VarianceThresholdSelector

2020-03-21 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-31185. -- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 27954

[jira] [Assigned] (SPARK-30932) ML 3.0 QA: API: Java compatibility, docs

2020-03-19 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-30932: Assignee: zhengruifeng > ML 3.0 QA: API: Java compatibility, docs >

[jira] [Resolved] (SPARK-30932) ML 3.0 QA: API: Java compatibility, docs

2020-03-19 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-30932. -- Resolution: Fixed > ML 3.0 QA: API: Java compatibility, docs >

[jira] [Assigned] (SPARK-30935) Update MLlib, GraphX websites for 3.0

2020-03-19 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-30935: Assignee: Huaxin Gao > Update MLlib, GraphX websites for 3.0 >

[jira] [Assigned] (SPARK-30931) ML 3.0 QA: API: Python API coverage

2020-03-19 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-30931: Assignee: Huaxin Gao > ML 3.0 QA: API: Python API coverage >

[jira] [Created] (SPARK-31182) PairRDD support aggregateByKeyWithinPartitions

2020-03-18 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-31182: Summary: PairRDD support aggregateByKeyWithinPartitions Key: SPARK-31182 URL: https://issues.apache.org/jira/browse/SPARK-31182 Project: Spark Issue Type:

[jira] [Created] (SPARK-31180) Implement PowerTransform

2020-03-18 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-31180: Summary: Implement PowerTransform Key: SPARK-31180 URL: https://issues.apache.org/jira/browse/SPARK-31180 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-30931) ML 3.0 QA: API: Python API coverage

2020-03-11 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17057592#comment-17057592 ] zhengruifeng commented on SPARK-30931: -- Thanks [~huaxingao] for your work > ML 3.0 QA: API: Python

[jira] [Resolved] (SPARK-30931) ML 3.0 QA: API: Python API coverage

2020-03-11 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-30931. -- Resolution: Fixed > ML 3.0 QA: API: Python API coverage > ---

[jira] [Commented] (SPARK-30935) Update MLlib, GraphX websites for 3.0

2020-03-11 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17057587#comment-17057587 ] zhengruifeng commented on SPARK-30935: -- [~huaxingao] Thanks! > Update MLlib, GraphX websites for

[jira] [Resolved] (SPARK-30935) Update MLlib, GraphX websites for 3.0

2020-03-11 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-30935. -- Resolution: Fixed > Update MLlib, GraphX websites for 3.0 >
