[jira] [Updated] (SPARK-21088) CrossValidator, TrainValidationSplit should collect all models when fitting: Python API

2017-09-14 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-21088: --- Summary: CrossValidator, TrainValidationSplit should collect all models when fitting: Python API

[jira] [Updated] (SPARK-21088) CrossValidator, TrainValidationSplit should collect all models when fitting: Python API

2017-09-14 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-21088: --- Description: In pyspark: We add a parameter whether to collect the full model list when

[jira] [Updated] (SPARK-21087) CrossValidator, TrainValidationSplit should collect all models when fitting: Scala API

2017-09-14 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-21087: --- Summary: CrossValidator, TrainValidationSplit should collect all models when fitting: Scala API

[jira] [Updated] (SPARK-21087) CrossValidator, TrainValidationSplit should preserve all models after fitting: Scala API

2017-09-14 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-21087: --- Summary: CrossValidator, TrainValidationSplit should preserve all models after fitting: Scala API

[jira] [Updated] (SPARK-21087) CrossValidator, TrainValidationSplit should preserve all models after fitting: Scala

2017-09-14 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-21087: --- Description: We add a parameter whether to collect the full model list when

[jira] [Commented] (SPARK-22004) CrossValidator, TrainValidationSplit dump sub models to disk when fitting: Scala API

2017-09-14 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165853#comment-16165853 ] Weichen Xu commented on SPARK-22004: I will create PR once SPARK-21086 merged. > CrossValidator,

[jira] [Updated] (SPARK-22004) CrossValidator, TrainValidationSplit dump sub models to disk when fitting: Scala API

2017-09-14 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-22004: --- Description: We add a parameter indicating whether to persist models to disk during training

[jira] [Updated] (SPARK-22004) CrossValidator, TrainValidationSplit dump sub models to disk when fitting: Scala API

2017-09-14 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-22004: --- Issue Type: Sub-task (was: New Feature) Parent: SPARK-21086 > CrossValidator,

[jira] [Created] (SPARK-22004) CrossValidator, TrainValidationSplit dump sub models to disk when fitting: Scala API

2017-09-14 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-22004: -- Summary: CrossValidator, TrainValidationSplit dump sub models to disk when fitting: Scala API Key: SPARK-22004 URL: https://issues.apache.org/jira/browse/SPARK-22004

[jira] [Closed] (SPARK-21802) Make sparkR MLP summary() expose probability column

2017-09-09 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu closed SPARK-21802. -- Resolution: Not A Problem > Make sparkR MLP summary() expose probability column >

[jira] [Updated] (SPARK-21911) Parallel Model Evaluation for ML Tuning: PySpark

2017-09-04 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-21911: --- Summary: Parallel Model Evaluation for ML Tuning: PySpark (was: Parallel Model Evaluation for ML

[jira] [Updated] (SPARK-19357) Parallel Model Evaluation for ML Tuning: Scala

2017-09-04 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-19357: --- Summary: Parallel Model Evaluation for ML Tuning: Scala (was: Parallel Model Evaluation for ML

[jira] [Created] (SPARK-21911) Parallel Model Evaluation for ML Tuning: Python

2017-09-04 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-21911: -- Summary: Parallel Model Evaluation for ML Tuning: Python Key: SPARK-21911 URL: https://issues.apache.org/jira/browse/SPARK-21911 Project: Spark Issue Type: New

[jira] [Updated] (SPARK-21898) Feature parity for KolmogorovSmirnovTest in MLlib

2017-09-02 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-21898: --- Issue Type: Sub-task (was: Bug) Parent: SPARK-4591 > Feature parity for

[jira] [Created] (SPARK-21898) Feature parity for KolmogorovSmirnovTest in MLlib

2017-09-02 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-21898: -- Summary: Feature parity for KolmogorovSmirnovTest in MLlib Key: SPARK-21898 URL: https://issues.apache.org/jira/browse/SPARK-21898 Project: Spark Issue Type:

[jira] [Comment Edited] (SPARK-21802) Make sparkR MLP summary() expose probability column

2017-08-31 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136234#comment-16136234 ] Weichen Xu edited comment on SPARK-21802 at 8/31/17 1:48 PM: - cc

[jira] [Created] (SPARK-21862) Add overflow check in PCA

2017-08-29 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-21862: -- Summary: Add overflow check in PCA Key: SPARK-21862 URL: https://issues.apache.org/jira/browse/SPARK-21862 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-21856) Update Python API for MultilayerPerceptronClassifierModel

2017-08-28 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-21856: -- Summary: Update Python API for MultilayerPerceptronClassifierModel Key: SPARK-21856 URL: https://issues.apache.org/jira/browse/SPARK-21856 Project: Spark Issue

[jira] [Created] (SPARK-21854) Python interface for MLOR summary

2017-08-28 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-21854: -- Summary: Python interface for MLOR summary Key: SPARK-21854 URL: https://issues.apache.org/jira/browse/SPARK-21854 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-21799) KMeans performance regression (5-6x slowdown) in Spark 2.2

2017-08-25 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16141319#comment-16141319 ] Weichen Xu commented on SPARK-21799: [~zahili] hmm..You're right. We are hard to get the precise

[jira] [Issue Comment Deleted] (SPARK-21799) KMeans performance regression (5-6x slowdown) in Spark 2.2

2017-08-25 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-21799: --- Comment: was deleted (was: I suggest check both `df.storageLevel` and `df.rdd.getStorageLevel` for

[jira] [Commented] (SPARK-21799) KMeans performance regression (5-6x slowdown) in Spark 2.2

2017-08-24 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16140162#comment-16140162 ] Weichen Xu commented on SPARK-21799: I suggest check both `df.storageLevel` and

[jira] [Commented] (SPARK-21799) KMeans performance regression (5-6x slowdown) in Spark 2.2

2017-08-24 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139802#comment-16139802 ] Weichen Xu commented on SPARK-21799: [~Siddharth Murching] Already have another jira & PR, take a

[jira] [Commented] (SPARK-21799) KMeans performance regression (5-6x slowdown) in Spark 2.2

2017-08-24 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139799#comment-16139799 ] Weichen Xu commented on SPARK-21799: [~Siddharth Murching] +1 This will cause double cache. > KMeans

[jira] [Commented] (SPARK-21770) ProbabilisticClassificationModel: Improve normalization of all-zero raw predictions

2017-08-23 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139364#comment-16139364 ] Weichen Xu commented on SPARK-21770: Hmm... `normalizeToProbabilitiesInPlace` is only effective in

[jira] [Created] (SPARK-21818) MultivariateOnlineSummarizer.variance generate negative result

2017-08-23 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-21818: -- Summary: MultivariateOnlineSummarizer.variance generate negative result Key: SPARK-21818 URL: https://issues.apache.org/jira/browse/SPARK-21818 Project: Spark

[jira] [Commented] (SPARK-21729) Generic test for ProbabilisticClassifier to ensure consistent output columns

2017-08-23 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137946#comment-16137946 ] Weichen Xu commented on SPARK-21729: I will work on this, thanks! > Generic test for

[jira] [Comment Edited] (SPARK-21802) Make sparkR MLP summary() expose probability column

2017-08-21 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136234#comment-16136234 ] Weichen Xu edited comment on SPARK-21802 at 8/22/17 4:25 AM: - cc

[jira] [Commented] (SPARK-21802) Make sparkR MLP summary() expose probability column

2017-08-21 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136234#comment-16136234 ] Weichen Xu commented on SPARK-21802: cc [~felixcheung] > Make sparkR MLP summary() expose

[jira] [Created] (SPARK-21802) Make sparkR MLP summary() expose probability column

2017-08-21 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-21802: -- Summary: Make sparkR MLP summary() expose probability column Key: SPARK-21802 URL: https://issues.apache.org/jira/browse/SPARK-21802 Project: Spark Issue Type:

[jira] [Created] (SPARK-21801) SparkR unit test randomly fail on trees

2017-08-21 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-21801: -- Summary: SparkR unit test randomly fail on trees Key: SPARK-21801 URL: https://issues.apache.org/jira/browse/SPARK-21801 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-21801) SparkR unit test randomly fail on trees

2017-08-21 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136229#comment-16136229 ] Weichen Xu commented on SPARK-21801: cc [~felixcheung] Can you help fix this ? > SparkR unit test

[jira] [Commented] (SPARK-21741) Python API for DataFrame-based multivariate summarizer

2017-08-15 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128226#comment-16128226 ] Weichen Xu commented on SPARK-21741: OK I will work on this. I will post a design doc first. >

[jira] [Updated] (SPARK-21681) MLOR do not work correctly when featureStd contains zero

2017-08-15 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-21681: --- Description: MLOR do not work correctly when featureStd contains zero. We can reproduce the bug

[jira] [Created] (SPARK-21681) MLOR do not work correctly when featureStd contains zero

2017-08-09 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-21681: -- Summary: MLOR do not work correctly when featureStd contains zero Key: SPARK-21681 URL: https://issues.apache.org/jira/browse/SPARK-21681 Project: Spark Issue

[jira] [Commented] (SPARK-20418) multi-label classification support

2017-07-26 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16102280#comment-16102280 ] Weichen Xu commented on SPARK-20418: I will work on this. > multi-label classification support >

[jira] [Commented] (SPARK-11215) Add multiple columns support to StringIndexer

2017-07-26 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16102274#comment-16102274 ] Weichen Xu commented on SPARK-11215: I will take over this feature and create a PR soon. > Add

[jira] [Commented] (SPARK-21087) CrossValidator, TrainValidationSplit should preserve all models after fitting: Scala

2017-07-26 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16102048#comment-16102048 ] Weichen Xu commented on SPARK-21087: I will work on it. > CrossValidator, TrainValidationSplit

[jira] [Commented] (SPARK-17025) Cannot persist PySpark ML Pipeline model that includes custom Transformer

2017-07-26 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16101994#comment-16101994 ] Weichen Xu commented on SPARK-17025: Because currently, scala calling python will be difficult and

[jira] [Commented] (SPARK-21523) Fix bug of strong wolfe linesearch `init` parameter lose effectiveness

2017-07-24 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16099225#comment-16099225 ] Weichen Xu commented on SPARK-21523: I will work on this once the breeze cut a new version for this

[jira] [Updated] (SPARK-21523) Fix bug of strong wolfe linesearch `init` parameter lose effectiveness

2017-07-24 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-21523: --- Priority: Minor (was: Major) > Fix bug of strong wolfe linesearch `init` parameter lose

[jira] [Created] (SPARK-21523) Fix bug of strong wolfe linesearch `init` parameter lose effectiveness

2017-07-24 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-21523: -- Summary: Fix bug of strong wolfe linesearch `init` parameter lose effectiveness Key: SPARK-21523 URL: https://issues.apache.org/jira/browse/SPARK-21523 Project: Spark

[jira] [Updated] (SPARK-20504) ML 2.2 QA: API: Java compatibility, docs

2017-05-17 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-20504: --- Attachment: (updated)signature.diff (updated)process_script2.sh

[jira] [Issue Comment Deleted] (SPARK-20504) ML 2.2 QA: API: Java compatibility, docs

2017-05-15 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-20504: --- Comment: was deleted (was: You’re right this is really a headache. Java tools cannot extract several

[jira] [Updated] (SPARK-20504) ML 2.2 QA: API: Java compatibility, docs

2017-05-15 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-20504: --- You’re right this is really a headache. Java tools cannot extract several information `scalac`

[jira] [Comment Edited] (SPARK-20504) ML 2.2 QA: API: Java compatibility, docs

2017-05-12 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008898#comment-16008898 ] Weichen Xu edited comment on SPARK-20504 at 5/12/17 11:30 PM: -- I have

[jira] [Updated] (SPARK-20504) ML 2.2 QA: API: Java compatibility, docs

2017-05-12 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-20504: --- Attachment: 5_added_ml_class 4_common_ml_class

[jira] [Commented] (SPARK-20504) ML 2.2 QA: API: Java compatibility, docs

2017-05-12 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008898#comment-16008898 ] Weichen Xu commented on SPARK-20504: I have already taken the following steps to check this QA issue,

[jira] [Created] (SPARK-20423) fix MLOR coeffs centering when reg == 0

2017-04-21 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-20423: -- Summary: fix MLOR coeffs centering when reg == 0 Key: SPARK-20423 URL: https://issues.apache.org/jira/browse/SPARK-20423 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-19215) Add necessary check for `RDD.checkpoint` to avoid potential mistakes

2017-01-13 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-19215: -- Summary: Add necessary check for `RDD.checkpoint` to avoid potential mistakes Key: SPARK-19215 URL: https://issues.apache.org/jira/browse/SPARK-19215 Project: Spark

[jira] [Updated] (SPARK-19189) Optimize CartesianRDD to avoid parent RDD partition re-computation and re-serialization

2017-01-13 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-19189: --- Summary: Optimize CartesianRDD to avoid parent RDD partition re-computation and re-serialization

[jira] [Updated] (SPARK-19189) Optimize CartesianRDD to avoid partition re-computation and re-serialization

2017-01-13 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-19189: --- Priority: Minor (was: Major) > Optimize CartesianRDD to avoid partition re-computation and

[jira] [Updated] (SPARK-19190) Optimize CartesianRDD to avoid partition re-computation and re-serialization

2017-01-13 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-19190: --- Issue Type: Improvement (was: Bug) > Optimize CartesianRDD to avoid partition re-computation and

[jira] [Updated] (SPARK-19190) Optimize CartesianRDD to avoid partition re-computation and re-serialization

2017-01-13 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-19190: --- Priority: Minor (was: Major) > Optimize CartesianRDD to avoid partition re-computation and

[jira] [Updated] (SPARK-19203) Optimize CartesianRDD to avoid partition re-computation and re-serialization

2017-01-13 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-19203: --- Priority: Minor (was: Major) > Optimize CartesianRDD to avoid partition re-computation and

[jira] [Updated] (SPARK-19189) Optimize CartesianRDD to avoid partition re-computation and re-serialization

2017-01-13 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-19189: --- Issue Type: Improvement (was: Bug) > Optimize CartesianRDD to avoid partition re-computation and

[jira] [Comment Edited] (SPARK-10078) Vector-free L-BFGS

2017-01-11 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819964#comment-15819964 ] Weichen Xu edited comment on SPARK-10078 at 1/12/17 4:59 AM: - [~debasish83]

[jira] [Comment Edited] (SPARK-10078) Vector-free L-BFGS

2017-01-11 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819964#comment-15819964 ] Weichen Xu edited comment on SPARK-10078 at 1/12/17 4:58 AM: - [~debasish83]

[jira] [Comment Edited] (SPARK-10078) Vector-free L-BFGS

2017-01-11 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15820180#comment-15820180 ] Weichen Xu edited comment on SPARK-10078 at 1/12/17 4:54 AM: - As the detail

[jira] [Commented] (SPARK-10078) Vector-free L-BFGS

2017-01-11 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15820180#comment-15820180 ] Weichen Xu commented on SPARK-10078: As the detail problems I list above(I only list a small part

[jira] [Comment Edited] (SPARK-10078) Vector-free L-BFGS

2017-01-11 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819851#comment-15819851 ] Weichen Xu edited comment on SPARK-10078 at 1/12/17 4:43 AM: - [~debasish83]

[jira] [Comment Edited] (SPARK-10078) Vector-free L-BFGS

2017-01-11 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819964#comment-15819964 ] Weichen Xu edited comment on SPARK-10078 at 1/12/17 3:02 AM: - [~debasish83]

[jira] [Comment Edited] (SPARK-10078) Vector-free L-BFGS

2017-01-11 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819964#comment-15819964 ] Weichen Xu edited comment on SPARK-10078 at 1/12/17 2:55 AM: - [~debasish83]

[jira] [Comment Edited] (SPARK-10078) Vector-free L-BFGS

2017-01-11 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819964#comment-15819964 ] Weichen Xu edited comment on SPARK-10078 at 1/12/17 2:48 AM: - [~debasish83]

[jira] [Comment Edited] (SPARK-10078) Vector-free L-BFGS

2017-01-11 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819964#comment-15819964 ] Weichen Xu edited comment on SPARK-10078 at 1/12/17 2:45 AM: - [~debasish83]

[jira] [Commented] (SPARK-10078) Vector-free L-BFGS

2017-01-11 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819964#comment-15819964 ] Weichen Xu commented on SPARK-10078: [~debasish83] But when we implement VF-LBFGS/VF-OWLQN base on

[jira] [Commented] (SPARK-10078) Vector-free L-BFGS

2017-01-11 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819851#comment-15819851 ] Weichen Xu commented on SPARK-10078: [~debasish83] Can L-BFGS-B be distributed computed when scaled

[jira] [Commented] (SPARK-18036) Decision Trees do not handle edge cases

2016-12-20 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15766243#comment-15766243 ] Weichen Xu commented on SPARK-18036: Oh, I'm too busy recently to work on it, it would be great if

[jira] [Issue Comment Deleted] (SPARK-18036) Decision Trees do not handle edge cases

2016-12-20 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-18036: --- Comment: was deleted (was: i am working on this... ) > Decision Trees do not handle edge cases >

[jira] [Commented] (SPARK-18286) Add Scala/Java/Python examples for MinHash and RandomProjection

2016-11-05 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15638806#comment-15638806 ] Weichen Xu commented on SPARK-18286: I will work on it, thanks~ > Add Scala/Java/Python examples for

[jira] [Updated] (SPARK-18218) Optimize BlockMatrix multiplication, which may cause OOM and low parallelism usage problem in several cases

2016-11-02 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-18218: --- Issue Type: Improvement (was: Bug) > Optimize BlockMatrix multiplication, which may cause OOM and

[jira] [Created] (SPARK-18218) Optimize BlockMatrix multiplication, which may cause OOM and low parallelism usage problem in several cases

2016-11-02 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-18218: -- Summary: Optimize BlockMatrix multiplication, which may cause OOM and low parallelism usage problem in several cases Key: SPARK-18218 URL:

[jira] [Closed] (SPARK-18201) add toDense and toSparse into Matrix trait, like Vector design

2016-11-01 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu closed SPARK-18201. -- Resolution: Duplicate It will fix in this PR https://github.com/apache/spark/pull/15628 > add toDense

[jira] [Created] (SPARK-18201) add toDense and toSparse into Matrix trait, like Vector design

2016-11-01 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-18201: -- Summary: add toDense and toSparse into Matrix trait, like Vector design Key: SPARK-18201 URL: https://issues.apache.org/jira/browse/SPARK-18201 Project: Spark

[jira] [Commented] (SPARK-18036) Decision Trees do not handle edge cases

2016-10-25 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15607357#comment-15607357 ] Weichen Xu commented on SPARK-18036: i am working on this... > Decision Trees do not handle edge

[jira] [Issue Comment Deleted] (SPARK-18095) There is a display problem in spark UI storage tab when rdd was persisted in multiple replicas

2016-10-25 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-18095: --- Comment: was deleted (was: I am working on it...) > There is a display problem in spark UI storage

[jira] [Commented] (SPARK-18095) There is a display problem in spark UI storage tab when rdd was persisted in multiple replicas

2016-10-25 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15605712#comment-15605712 ] Weichen Xu commented on SPARK-18095: I am working on it... > There is a display problem in spark UI

[jira] [Updated] (SPARK-18095) There is a display problem in spark UI storage tab when rdd was persisted in multiple replicas

2016-10-25 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-18095: --- Description: There is a display problem in spark UI storage tab when rdd was persisted in multiple

[jira] [Created] (SPARK-18095) There is a display problem in spark UI storage tab when rdd was persisted in multiple replicas

2016-10-25 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-18095: -- Summary: There is a display problem in spark UI storage tab when rdd was persisted in multiple replicas Key: SPARK-18095 URL: https://issues.apache.org/jira/browse/SPARK-18095

[jira] [Updated] (SPARK-18078) Add option for customize zipPartition task preferred locations

2016-10-24 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-18078: --- Priority: Minor (was: Major) > Add option for customize zipPartition task preferred locations >

[jira] [Updated] (SPARK-18078) Add option for customize zipPartition task preferred locations

2016-10-24 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-18078: --- Description: `RDD.zipPartitions` task preferred locations strategy will use the intersection of

[jira] [Created] (SPARK-18078) Add option for customize zipPartition task preferred locations

2016-10-24 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-18078: -- Summary: Add option for customize zipPartition task preferred locations Key: SPARK-18078 URL: https://issues.apache.org/jira/browse/SPARK-18078 Project: Spark

[jira] [Created] (SPARK-18051) Custom PartitionCoalescer cause serialization exception

2016-10-21 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-18051: -- Summary: Custom PartitionCoalescer cause serialization exception Key: SPARK-18051 URL: https://issues.apache.org/jira/browse/SPARK-18051 Project: Spark Issue

[jira] [Created] (SPARK-18007) update SparkR MLP - add initalWeights parameter

2016-10-19 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-18007: -- Summary: update SparkR MLP - add initalWeights parameter Key: SPARK-18007 URL: https://issues.apache.org/jira/browse/SPARK-18007 Project: Spark Issue Type:

[jira] [Updated] (SPARK-18003) RDD zipWithIndex generate wrong result when one partition contains more than 2147483647 records.

2016-10-18 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-18003: --- Description: RDD zipWithIndex generate wrong result when one partition contains more than

[jira] [Updated] (SPARK-18003) RDD zipWithIndex generate wrong result when one partition contains more than 2147483647 records.

2016-10-18 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-18003: --- Component/s: Spark Core > RDD zipWithIndex generate wrong result when one partition contains more

[jira] [Created] (SPARK-18003) RDD zipWithIndex generate wrong result when one partition contains more than 2147483647 records.

2016-10-18 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-18003: -- Summary: RDD zipWithIndex generate wrong result when one partition contains more than 2147483647 records. Key: SPARK-18003 URL: https://issues.apache.org/jira/browse/SPARK-18003

[jira] [Updated] (SPARK-17961) Add storageLevel to Dataset for SparkR

2016-10-16 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-17961: --- Component/s: SQL SparkR > Add storageLevel to Dataset for SparkR >

[jira] [Updated] (SPARK-17961) Add storageLevel to Dataset for SparkR

2016-10-16 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-17961: --- Issue Type: Improvement (was: Bug) > Add storageLevel to Dataset for SparkR >

[jira] [Commented] (SPARK-17961) Add storageLevel to Dataset for SparkR

2016-10-16 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15580131#comment-15580131 ] Weichen Xu commented on SPARK-17961: I am working on it and will create PR soon. > Add storageLevel

[jira] [Created] (SPARK-17961) Add storageLevel to Dataset for SparkR

2016-10-16 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-17961: -- Summary: Add storageLevel to Dataset for SparkR Key: SPARK-17961 URL: https://issues.apache.org/jira/browse/SPARK-17961 Project: Spark Issue Type: Bug

[jira] [Comment Edited] (SPARK-17139) Add model summary for MultinomialLogisticRegression

2016-10-10 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564097#comment-15564097 ] Weichen Xu edited comment on SPARK-17139 at 10/11/16 1:25 AM: -- I'm working

[jira] [Commented] (SPARK-17139) Add model summary for MultinomialLogisticRegression

2016-10-10 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564097#comment-15564097 ] Weichen Xu commented on SPARK-17139: I'm working on it hardly and will create PR this week, thanks!

[jira] [Updated] (SPARK-17540) SparkR array serde cannot work correctly when array length == 0

2016-10-10 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-17540: --- Description: SparkR cannot handle array serde when array length == 0 when length = 0 R side set the

[jira] [Updated] (SPARK-17540) SparkR array serde cannot work correctly when array length == 0

2016-10-10 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-17540: --- Description: SparkR cannot handle array serde when array length == 0 when length = 0 R side set the

[jira] [Closed] (SPARK-17540) SparkR array serde cannot work correctly when array length == 0

2016-10-08 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu closed SPARK-17540. -- Resolution: Won't Fix > SparkR array serde cannot work correctly when array length == 0 >

[jira] [Updated] (SPARK-17745) Update Python API for NB to support weighted instances

2016-09-30 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-17745: --- Component/s: PySpark > Update Python API for NB to support weighted instances >

[jira] [Commented] (SPARK-17745) Update Python API for NB to support weighted instances

2016-09-30 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15535750#comment-15535750 ] Weichen Xu commented on SPARK-17745: I will work on it and create PR ASAP, thanks! > Update Python
