[jira] [Updated] (SPARK-18812) Clarify "Spark ML"

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18812: -- Target Version/s: 2.1.0 (was: 2.1.1, 2.2.0) > Clarify "Spark ML" > --

[jira] [Updated] (SPARK-17822) JVMObjectTracker.objMap may leak JVM objects

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-17822: -- Target Version/s: 2.1.0, 2.0.3 (was: 2.0.3, 2.1.1, 2.2.0) > JVMObjectTracker.objMap

[jira] [Updated] (SPARK-18924) Improve collect/createDataFrame performance in SparkR

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18924: -- Target Version/s: (was: 2.2.0) > Improve collect/createDataFrame performance in

[jira] [Commented] (SPARK-19337) Documentation and examples for LinearSVC

2017-02-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870512#comment-15870512 ] Joseph K. Bradley commented on SPARK-19337: --- @yuhao yang Do you have time to work on this?

[jira] [Commented] (SPARK-14523) Feature parity for Statistics ML with MLlib

2017-02-10 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15861789#comment-15861789 ] Joseph K. Bradley commented on SPARK-14523: --- I'd like to keep this open until we have linked

[jira] [Reopened] (SPARK-14523) Feature parity for Statistics ML with MLlib

2017-02-10 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reopened SPARK-14523: --- > Feature parity for Statistics ML with MLlib >

[jira] [Resolved] (SPARK-18613) spark.ml LDA classes should not expose spark.mllib in APIs

2017-02-10 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-18613. --- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16860

[jira] [Commented] (SPARK-9478) Add sample weights to Random Forest

2017-02-10 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15861714#comment-15861714 ] Joseph K. Bradley commented on SPARK-9478: -- [~sethah] Thanks for researching this! +1 for not

[jira] [Commented] (SPARK-10802) Let ALS recommend for subset of data

2017-02-09 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15860110#comment-15860110 ] Joseph K. Bradley commented on SPARK-10802: --- Linking related issue for feature parity in

[jira] [Created] (SPARK-19535) ALSModel recommendAll analogs

2017-02-09 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-19535: - Summary: ALSModel recommendAll analogs Key: SPARK-19535 URL: https://issues.apache.org/jira/browse/SPARK-19535 Project: Spark Issue Type: New

[jira] [Commented] (SPARK-13857) Feature parity for ALS ML with MLLIB

2017-02-09 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15860108#comment-15860108 ] Joseph K. Bradley commented on SPARK-13857: --- Hi all, catching up these many ALS discussions

[jira] [Closed] (SPARK-10141) Number of tasks on executors still become negative after failures

2017-02-09 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley closed SPARK-10141. - Resolution: Done > Number of tasks on executors still become negative after failures >

[jira] [Assigned] (SPARK-17975) EMLDAOptimizer fails with ClassCastException on YARN

2017-02-09 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-17975: - Assignee: Tathagata Das > EMLDAOptimizer fails with ClassCastException on YARN

[jira] [Resolved] (SPARK-17975) EMLDAOptimizer fails with ClassCastException on YARN

2017-02-09 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-17975. --- Resolution: Fixed Fix Version/s: 2.2.0 2.1.1

[jira] [Updated] (SPARK-14804) Graph vertexRDD/EdgeRDD checkpoint results ClassCastException:

2017-02-09 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14804: -- Fix Version/s: (was: 3.0.0) 2.2.0 > Graph vertexRDD/EdgeRDD

[jira] [Commented] (SPARK-17975) EMLDAOptimizer fails with ClassCastException on YARN

2017-02-09 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15859984#comment-15859984 ] Joseph K. Bradley commented on SPARK-17975: --- Will do, thanks! > EMLDAOptimizer fails with

[jira] [Commented] (SPARK-10141) Number of tasks on executors still become negative after failures

2017-02-09 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15859982#comment-15859982 ] Joseph K. Bradley commented on SPARK-10141: --- I'll close this if no one has seen it in Spark 2.0

[jira] [Assigned] (SPARK-18613) spark.ml LDA classes should not expose spark.mllib in APIs

2017-02-08 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-18613: - Assignee: Sue Ann Hong > spark.ml LDA classes should not expose spark.mllib in

[jira] [Commented] (SPARK-17139) Add model summary for MultinomialLogisticRegression

2017-02-08 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15858373#comment-15858373 ] Joseph K. Bradley commented on SPARK-17139: --- @sethah Yep, that looks like what I had in mind.

[jira] [Resolved] (SPARK-19400) GLM fails for intercept only model

2017-02-08 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-19400. --- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16740

[jira] [Assigned] (SPARK-19400) GLM fails for intercept only model

2017-02-08 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-19400: - Assignee: Wayne Zhang > GLM fails for intercept only model >

[jira] [Assigned] (SPARK-17629) Add local version of Word2Vec findSynonyms for spark.ml

2017-02-07 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-17629: - Shepherd: Joseph K. Bradley Assignee: Asher Krim

[jira] [Commented] (SPARK-17498) StringIndexer.setHandleInvalid should have another option 'new'

2017-02-07 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856918#comment-15856918 ] Joseph K. Bradley commented on SPARK-17498: --- Linking related issue for QuantileDiscretizer

[jira] [Updated] (SPARK-17498) StringIndexer.setHandleInvalid should have another option 'new'

2017-02-07 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-17498: -- Summary: StringIndexer.setHandleInvalid should have another option 'new' (was:

[jira] [Comment Edited] (SPARK-17139) Add model summary for MultinomialLogisticRegression

2017-02-07 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856612#comment-15856612 ] Joseph K. Bradley edited comment on SPARK-17139 at 2/7/17 7:25 PM: ---

[jira] [Commented] (SPARK-17139) Add model summary for MultinomialLogisticRegression

2017-02-07 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856612#comment-15856612 ] Joseph K. Bradley commented on SPARK-17139: --- I'll offer a few thoughts first: * A

[jira] [Updated] (SPARK-10817) ML abstraction umbrella

2017-02-07 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-10817: -- Priority: Major (was: Critical) > ML abstraction umbrella > --- >

[jira] [Updated] (SPARK-19498) Discussion: Making MLlib APIs extensible for 3rd party libraries

2017-02-07 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-19498: -- Description: Per the recent discussion on the dev list, this JIRA is for discussing

[jira] [Created] (SPARK-19498) Discussion: Making MLlib APIs extensible for 3rd party libraries

2017-02-07 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-19498: - Summary: Discussion: Making MLlib APIs extensible for 3rd party libraries Key: SPARK-19498 URL: https://issues.apache.org/jira/browse/SPARK-19498 Project:

[jira] [Closed] (SPARK-7258) spark.ml API taking Graph instead of DataFrame

2017-02-07 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley closed SPARK-7258. Resolution: Won't Fix > spark.ml API taking Graph instead of DataFrame >

[jira] [Commented] (SPARK-7258) spark.ml API taking Graph instead of DataFrame

2017-02-07 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856318#comment-15856318 ] Joseph K. Bradley commented on SPARK-7258: -- This made more sense in the past...will close for

[jira] [Assigned] (SPARK-19467) PySpark ML shouldn't use circular imports

2017-02-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-19467: - Assignee: Maciej Szymkiewicz > PySpark ML shouldn't use circular imports >

[jira] [Resolved] (SPARK-19467) PySpark ML shouldn't use circular imports

2017-02-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-19467. --- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16814

[jira] [Commented] (SPARK-15573) Backwards-compatible persistence for spark.ml

2017-02-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15855097#comment-15855097 ] Joseph K. Bradley commented on SPARK-15573: --- It's a good point that we can't make updates to

[jira] [Commented] (SPARK-19208) MultivariateOnlineSummarizer performance optimization

2017-02-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15855090#comment-15855090 ] Joseph K. Bradley commented on SPARK-19208: --- You're right that sharing intermediate results

[jira] [Commented] (SPARK-12157) Support numpy types as return values of Python UDFs

2017-02-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15855060#comment-15855060 ] Joseph K. Bradley commented on SPARK-12157: --- I don't know of any Python UDF perf tests. Ad hoc

[jira] [Commented] (SPARK-16824) Add API docs for VectorUDT

2017-02-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15855057#comment-15855057 ] Joseph K. Bradley commented on SPARK-16824: --- I think we didn't document it since the future of

[jira] [Updated] (SPARK-16824) Add API docs for VectorUDT

2017-02-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-16824: -- Issue Type: Documentation (was: Improvement) > Add API docs for VectorUDT >

[jira] [Updated] (SPARK-16824) Add API docs for VectorUDT

2017-02-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-16824: -- Component/s: MLlib > Add API docs for VectorUDT > -- > >

[jira] [Resolved] (SPARK-19247) Improve ml word2vec save/load scalability

2017-02-05 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-19247. --- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16607

[jira] [Assigned] (SPARK-19247) Improve ml word2vec save/load scalability

2017-02-05 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-19247: - Assignee: Asher Krim > Improve ml word2vec save/load scalability >

[jira] [Updated] (SPARK-19247) Improve ml word2vec save/load scalability

2017-02-02 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-19247: -- Shepherd: Joseph K. Bradley Affects Version/s: 2.2.0 Target

[jira] [Commented] (SPARK-14503) spark.ml Scala API for FPGrowth

2017-02-02 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15850707#comment-15850707 ] Joseph K. Bradley commented on SPARK-14503: --- Sounds good, thank you! > spark.ml Scala API for

[jira] [Resolved] (SPARK-19389) Minor doc fixes, including Since tags in Python Params

2017-02-02 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-19389. --- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16723

[jira] [Commented] (SPARK-14503) spark.ml Scala API for FPGrowth

2017-02-01 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15848747#comment-15848747 ] Joseph K. Bradley commented on SPARK-14503: --- There are a couple of design issues which have

[jira] [Comment Edited] (SPARK-12965) Indexer setInputCol() doesn't resolve column names like DataFrame.col()

2017-01-31 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847786#comment-15847786 ] Joseph K. Bradley edited comment on SPARK-12965 at 2/1/17 12:43 AM:

[jira] [Commented] (SPARK-12965) Indexer setInputCol() doesn't resolve column names like DataFrame.col()

2017-01-31 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847786#comment-15847786 ] Joseph K. Bradley commented on SPARK-12965: --- I'd say this is both a SQL and MLlib issue. I'm

[jira] [Updated] (SPARK-12965) Indexer setInputCol() doesn't resolve column names like DataFrame.col()

2017-01-31 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-12965: -- Affects Version/s: (was: 1.6.0) 2.2.0

[jira] [Updated] (SPARK-12965) Indexer setInputCol() doesn't resolve column names like DataFrame.col()

2017-01-31 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-12965: -- Component/s: (was: Spark Core) > Indexer setInputCol() doesn't resolve column

[jira] [Created] (SPARK-19416) Dataset.schema is inconsistent with Dataset in handling columns with periods

2017-01-31 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-19416: - Summary: Dataset.schema is inconsistent with Dataset in handling columns with periods Key: SPARK-19416 URL: https://issues.apache.org/jira/browse/SPARK-19416

[jira] [Updated] (SPARK-19247) Improve ml word2vec save/load scalability

2017-01-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-19247: -- Component/s: ML > Improve ml word2vec save/load scalability >

[jira] [Updated] (SPARK-19294) improve LocalLDAModel save/load scaling for large models

2017-01-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-19294: -- Component/s: ML > improve LocalLDAModel save/load scaling for large models >

[jira] [Updated] (SPARK-19247) Improve ml word2vec save/load scalability

2017-01-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-19247: -- Summary: Improve ml word2vec save/load scalability (was: improve ml word2vec

[jira] [Updated] (SPARK-19247) Improve ml word2vec save/load scalability

2017-01-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-19247: -- Issue Type: Improvement (was: Bug) > Improve ml word2vec save/load scalability >

[jira] [Updated] (SPARK-19294) improve LocalLDAModel save/load scaling for large models

2017-01-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-19294: -- Summary: improve LocalLDAModel save/load scaling for large models (was: improve ml

[jira] [Updated] (SPARK-19294) improve LocalLDAModel save/load scaling for large models

2017-01-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-19294: -- Issue Type: Improvement (was: Bug) > improve LocalLDAModel save/load scaling for

[jira] [Created] (SPARK-19389) Minor doc fixes, including Since tags in Python Params

2017-01-27 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-19389: - Summary: Minor doc fixes, including Since tags in Python Params Key: SPARK-19389 URL: https://issues.apache.org/jira/browse/SPARK-19389 Project: Spark

[jira] [Resolved] (SPARK-19336) LinearSVC Python API

2017-01-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-19336. --- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16694

[jira] [Commented] (SPARK-19208) MultivariateOnlineSummarizer performance optimization

2017-01-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15843670#comment-15843670 ] Joseph K. Bradley commented on SPARK-19208: --- Thanks for writing out your ideas. Here are my

[jira] [Updated] (SPARK-19208) MultivariateOnlineSummarizer performance optimization

2017-01-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-19208: -- Summary: MultivariateOnlineSummarizer performance optimization (was:

[jira] [Created] (SPARK-19382) Test sparse vectors in LinearSVCSuite

2017-01-26 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-19382: - Summary: Test sparse vectors in LinearSVCSuite Key: SPARK-19382 URL: https://issues.apache.org/jira/browse/SPARK-19382 Project: Spark Issue Type:

[jira] [Updated] (SPARK-18218) Optimize BlockMatrix multiplication, which may cause OOM and low parallelism usage problem in several cases

2017-01-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18218: -- Shepherd: Burak Yavuz (was: Yanbo Liang) > Optimize BlockMatrix multiplication, which

[jira] [Commented] (SPARK-4638) Spark's MLlib SVM classification to include Kernels like Gaussian / (RBF) to find non linear boundaries

2017-01-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15840711#comment-15840711 ] Joseph K. Bradley commented on SPARK-4638: -- Commenting here b/c of the recent dev list thread:

[jira] [Updated] (SPARK-18080) Locality Sensitive Hashing (LSH) Python API

2017-01-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18080: -- Assignee: Yun Ni (was: Yanbo Liang) > Locality Sensitive Hashing (LSH) Python API >

[jira] [Updated] (SPARK-14804) Graph vertexRDD/EdgeRDD checkpoint results ClassCastException:

2017-01-25 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14804: -- Assignee: Tathagata Das > Graph vertexRDD/EdgeRDD checkpoint results

[jira] [Commented] (SPARK-17975) EMLDAOptimizer fails with ClassCastException on YARN

2017-01-25 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838995#comment-15838995 ] Joseph K. Bradley commented on SPARK-17975: --- [SPARK-14804] was just fixed. [~jvstein], do you

[jira] [Commented] (SPARK-17265) EdgeRDD Difference throws an exception

2017-01-25 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838990#comment-15838990 ] Joseph K. Bradley commented on SPARK-17265: --- [SPARK-14804] was just fixed. [~shishir167], are

[jira] [Comment Edited] (SPARK-17265) EdgeRDD Difference throws an exception

2017-01-25 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838990#comment-15838990 ] Joseph K. Bradley edited comment on SPARK-17265 at 1/26/17 1:41 AM:

[jira] [Commented] (SPARK-17877) Can not checkpoint connectedComponents resulting graph

2017-01-25 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838993#comment-15838993 ] Joseph K. Bradley commented on SPARK-17877: --- [SPARK-14804] was just fixed. [~apivovarov], are

[jira] [Resolved] (SPARK-18036) Decision Trees do not handle edge cases

2017-01-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-18036. --- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16377

[jira] [Updated] (SPARK-18036) Decision Trees do not handle edge cases

2017-01-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18036: -- Assignee: Ilya Matiach > Decision Trees do not handle edge cases >

[jira] [Commented] (SPARK-19208) MaxAbsScaler and MinMaxScaler are very inefficient

2017-01-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829014#comment-15829014 ] Joseph K. Bradley commented on SPARK-19208: --- +1 for [~mlnick]'s suggestion. If we're

[jira] [Updated] (SPARK-19208) MaxAbsScaler and MinMaxScaler are very inefficient

2017-01-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-19208: -- Assignee: (was: Apache Spark) > MaxAbsScaler and MinMaxScaler are very inefficient

[jira] [Updated] (SPARK-14975) Predicted Probability per training instance for Gradient Boosted Trees

2017-01-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14975: -- Summary: Predicted Probability per training instance for Gradient Boosted Trees (was:

[jira] [Resolved] (SPARK-14975) Predicted Probability per training instance for Gradient Boosted Trees

2017-01-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-14975. --- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16441

[jira] [Closed] (SPARK-8855) Python API for Association Rules

2017-01-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley closed SPARK-8855. Resolution: Won't Fix > Python API for Association Rules >

[jira] [Created] (SPARK-19281) spark.ml Python API for FPGrowth

2017-01-18 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-19281: - Summary: spark.ml Python API for FPGrowth Key: SPARK-19281 URL: https://issues.apache.org/jira/browse/SPARK-19281 Project: Spark Issue Type:

[jira] [Commented] (SPARK-8855) Python API for Association Rules

2017-01-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828808#comment-15828808 ] Joseph K. Bradley commented on SPARK-8855: -- I'm going to close this issue in favor of the

[jira] [Updated] (SPARK-14503) spark.ml Scala API for FPGrowth

2017-01-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14503: -- Summary: spark.ml Scala API for FPGrowth (was: spark.ml API for FPGrowth) > spark.ml

[jira] [Commented] (SPARK-17136) Design optimizer interface for ML algorithms

2017-01-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828803#comment-15828803 ] Joseph K. Bradley commented on SPARK-17136: --- CC [~avulanov], who has thought a lot about these

[jira] [Updated] (SPARK-5256) Improving MLlib optimization APIs

2017-01-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5256: - Description: *Goal*: Improve APIs for optimization *Motivation*: There have been several

[jira] [Commented] (SPARK-13610) Create a Transformer to disassemble vectors in DataFrames

2017-01-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828756#comment-15828756 ] Joseph K. Bradley commented on SPARK-13610: --- One more: Would these selected subsets of elements

[jira] [Commented] (SPARK-19053) Supporting multiple evaluation metrics in DataFrame-based API: discussion

2017-01-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828755#comment-15828755 ] Joseph K. Bradley commented on SPARK-19053: --- After thinking about this more and hearing your

[jira] [Updated] (SPARK-14975) Predicted Probability per training instance for Gradient Boosted Trees in mllib.

2017-01-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14975: -- Shepherd: Joseph K. Bradley > Predicted Probability per training instance for Gradient

[jira] [Updated] (SPARK-14975) Predicted Probability per training instance for Gradient Boosted Trees in mllib.

2017-01-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14975: -- Assignee: Ilya Matiach > Predicted Probability per training instance for Gradient

[jira] [Updated] (SPARK-17747) WeightCol support non-double datatypes

2017-01-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-17747: -- Shepherd: Joseph K. Bradley Assignee: zhengruifeng Target

[jira] [Resolved] (SPARK-14567) Add instrumentation logs to MLlib training algorithms

2017-01-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-14567. --- Resolution: Fixed Fix Version/s: 2.2.0 > Add instrumentation logs to MLlib

[jira] [Resolved] (SPARK-18206) Log instrumentation in MPC, NB, LDA, AFT, GLR, Isotonic, LinReg

2017-01-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-18206. --- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 15671

[jira] [Updated] (SPARK-7146) Should ML sharedParams be a public API?

2017-01-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7146: - Target Version/s: (was: 2.2.0) > Should ML sharedParams be a public API? >

[jira] [Updated] (SPARK-10931) PySpark ML Models should contain Param values

2017-01-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-10931: -- Shepherd: (was: Joseph K. Bradley) > PySpark ML Models should contain Param values >

[jira] [Updated] (SPARK-7424) spark.ml classification, regression abstractions should add metadata to output column

2017-01-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7424: - Target Version/s: (was: 2.2.0) > spark.ml classification, regression abstractions

[jira] [Commented] (SPARK-14501) spark.ml parity for fpm - frequent items

2017-01-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827052#comment-15827052 ] Joseph K. Bradley commented on SPARK-14501: --- I'm doing a general pass to encourage the Shepherd

[jira] [Updated] (SPARK-10931) PySpark ML Models should contain Param values

2017-01-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-10931: -- Target Version/s: (was: 2.2.0) > PySpark ML Models should contain Param values >

[jira] [Updated] (SPARK-15571) Pipeline unit test improvements for 2.2

2017-01-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-15571: -- Shepherd: Joseph K. Bradley > Pipeline unit test improvements for 2.2 >

[jira] [Commented] (SPARK-14659) OneHotEncoder support drop first category alphabetically in the encoded vector

2017-01-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827041#comment-15827041 ] Joseph K. Bradley commented on SPARK-14659: --- I'm doing a general pass to encourage the Shepherd

[jira] [Updated] (SPARK-14706) Python ML persistence integration test

2017-01-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14706: -- Target Version/s: (was: 2.2.0) > Python ML persistence integration test >

[jira] [Commented] (SPARK-15799) Release SparkR on CRAN

2017-01-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827038#comment-15827038 ] Joseph K. Bradley commented on SPARK-15799: --- I'm doing a general pass to encourage the Shepherd

[jira] [Commented] (SPARK-16578) Configurable hostname for RBackend

2017-01-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827036#comment-15827036 ] Joseph K. Bradley commented on SPARK-16578: --- I'm doing a general pass to encourage the Shepherd

[jira] [Updated] (SPARK-17455) IsotonicRegression takes non-polynomial time for some inputs

2017-01-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-17455: -- Shepherd: Joseph K. Bradley > IsotonicRegression takes non-polynomial time for some

[jira] [Commented] (SPARK-18348) Improve tree ensemble model summary

2017-01-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827023#comment-15827023 ] Joseph K. Bradley commented on SPARK-18348: --- I'm doing a general pass to enforce the Shepherd
