[jira] [Commented] (SPARK-21799) KMeans performance regression (5-6x slowdown) in Spark 2.2

2017-09-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16161992#comment-16161992 ] Joseph K. Bradley commented on SPARK-21799: Now that I've caught up on these, this is just a

[jira] [Commented] (SPARK-21799) KMeans performance regression (5-6x slowdown) in Spark 2.2

2017-09-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151407#comment-16151407 ] Apache Spark commented on SPARK-21799: User 'WeichenXu123' has created a pull request for this

[jira] [Commented] (SPARK-21799) KMeans performance regression (5-6x slowdown) in Spark 2.2

2017-08-25 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16141319#comment-16141319 ] Weichen Xu commented on SPARK-21799: [~zahili] Hmm, you're right. It is hard for us to get the precise

[jira] [Commented] (SPARK-21799) KMeans performance regression (5-6x slowdown) in Spark 2.2

2017-08-24 Thread zakaria hili (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16140183#comment-16140183 ] zakaria hili commented on SPARK-21799: [~WeichenXu123], df.rdd.getStorageLevel returns none even if

[jira] [Commented] (SPARK-21799) KMeans performance regression (5-6x slowdown) in Spark 2.2

2017-08-24 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16140162#comment-16140162 ] Weichen Xu commented on SPARK-21799: I suggest checking both `df.storageLevel` and

[jira] [Commented] (SPARK-21799) KMeans performance regression (5-6x slowdown) in Spark 2.2

2017-08-24 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139802#comment-16139802 ] Weichen Xu commented on SPARK-21799: [~Siddharth Murching] There is already another JIRA & PR, take a

[jira] [Commented] (SPARK-21799) KMeans performance regression (5-6x slowdown) in Spark 2.2

2017-08-24 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139799#comment-16139799 ] Weichen Xu commented on SPARK-21799: [~Siddharth Murching] +1. This will cause double caching. > KMeans

[jira] [Commented] (SPARK-21799) KMeans performance regression (5-6x slowdown) in Spark 2.2

2017-08-23 Thread zakaria hili (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138408#comment-16138408 ] zakaria hili commented on SPARK-21799: [~Siddharth Murching], sorry about that, I think that the

[jira] [Commented] (SPARK-21799) KMeans performance regression (5-6x slowdown) in Spark 2.2

2017-08-22 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136630#comment-16136630 ] Nick Pentreath commented on SPARK-21799: Refer to SPARK-18608 and SPARK-19422. There is some work

[jira] [Commented] (SPARK-21799) KMeans performance regression (5-6x slowdown) in Spark 2.2

2017-08-21 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136175#comment-16136175 ] Liang-Chi Hsieh commented on SPARK-21799: So I think the problem is you shouldn't do

[jira] [Commented] (SPARK-21799) KMeans performance regression (5-6x slowdown) in Spark 2.2

2017-08-21 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136164#comment-16136164 ] Liang-Chi Hsieh commented on SPARK-21799: Hmm, I checked the ML KMeans code, where I don't find

[jira] [Commented] (SPARK-21799) KMeans performance regression (5-6x slowdown) in Spark 2.2

2017-08-21 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136152#comment-16136152 ] Liang-Chi Hsieh commented on SPARK-21799: Yeah, that looks like the right direction. {{df.storageLevel}}