[jira] [Commented] (SPARK-20268) Arbitrary RDD element (Fast return) instead of using first

2017-04-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15962037#comment-15962037 ] Sean Owen commented on SPARK-20268: --- If any element will do, why not the first? I get that it might

[jira] [Commented] (SPARK-20266) ExecutorBackend blocked at "UserGroupInformation.doAs"

2017-04-08 Thread Jerry.X.He (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15962029#comment-15962029 ] Jerry.X.He commented on SPARK-20266: Oh, additional env: master, slave01, slave02, and client are all in

[jira] [Commented] (SPARK-19870) Repeatable deadlock on BlockInfoManager and TorrentBroadcast

2017-04-08 Thread Eyal Farago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15961954#comment-15961954 ] Eyal Farago commented on SPARK-19870: - [~stevenruppert] logs from the hung executor might shed some

[jira] [Created] (SPARK-20268) Arbitrary RDD element (Fast return) instead of using first

2017-04-08 Thread Hayri Volkan Agun (JIRA)
Hayri Volkan Agun created SPARK-20268: - Summary: Arbitrary RDD element (Fast return) instead of using first Key: SPARK-20268 URL: https://issues.apache.org/jira/browse/SPARK-20268 Project: Spark

[jira] [Commented] (SPARK-18813) MLlib 2.2 Roadmap

2017-04-08 Thread Hayri Volkan Agun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15961943#comment-15961943 ] Hayri Volkan Agun commented on SPARK-18813: --- It is a possibility that I think for the future

[jira] [Resolved] (SPARK-20267) Dataset should be camel-cased to match DatFrame

2017-04-08 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-20267. Resolution: Won't Fix Regardless of the merits, it's a public API and now it's too late to

[jira] [Updated] (SPARK-20267) Dataset should be camel-cased to match DatFrame

2017-04-08 Thread Kevin Mc Inerney (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Mc Inerney updated SPARK-20267: - Shepherd: Reynold Xin > Dataset should be camel-cased to match DatFrame >

[jira] [Created] (SPARK-20267) Dataset should be camel-cased to match DatFrame

2017-04-08 Thread Kevin Mc Inerney (JIRA)
Kevin Mc Inerney created SPARK-20267: Summary: Dataset should be camel-cased to match DatFrame Key: SPARK-20267 URL: https://issues.apache.org/jira/browse/SPARK-20267 Project: Spark

[jira] [Updated] (SPARK-20266) ExecutorBackend blocked at "UserGroupInformation.doAs"

2017-04-08 Thread Jerry.X.He (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jerry.X.He updated SPARK-20266: --- Attachment: logsSubmitBySparkSubmitAtSlave02.zip logsSubmitByIdeaAtClient.zip these

[jira] [Updated] (SPARK-20266) ExecutorBackend blocked at "UserGroupInformation.doAs"

2017-04-08 Thread Jerry.X.He (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jerry.X.He updated SPARK-20266: --- Docs Text: Hi friends, I set up a Spark cluster yesterday, and today I want to run

[jira] [Updated] (SPARK-20266) ExecutorBackend blocked at "UserGroupInformation.doAs"

2017-04-08 Thread Jerry.X.He (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jerry.X.He updated SPARK-20266: --- Docs Text: I set up a Spark cluster yesterday, and today I want to run a WordCount

[jira] [Created] (SPARK-20266) ExecutorBackend blocked at "UserGroupInformation.doAs"

2017-04-08 Thread Jerry.X.He (JIRA)
Jerry.X.He created SPARK-20266: -- Summary: ExecutorBackend blocked at "UserGroupInformation.doAs" Key: SPARK-20266 URL: https://issues.apache.org/jira/browse/SPARK-20266 Project: Spark Issue

[jira] [Resolved] (SPARK-20251) Spark streaming skips batches in a case of failure

2017-04-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-20251. --- Resolution: Invalid Start with

[jira] [Commented] (SPARK-19067) mapGroupsWithState - arbitrary stateful operations with Structured Streaming (similar to DStream.mapWithState)

2017-04-08 Thread Yuval Itzchakov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15961844#comment-15961844 ] Yuval Itzchakov commented on SPARK-19067: - Is this going to make it into 2.1.1? >

[jira] [Commented] (SPARK-12837) Spark driver requires large memory space for serialized results even there are no data collected to the driver

2017-04-08 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15961842#comment-15961842 ] Wenchen Fan commented on SPARK-12837: - BTW, if you have problems with Spark 1.6, please open another

[jira] [Reopened] (SPARK-12837) Spark driver requires large memory space for serialized results even there are no data collected to the driver

2017-04-08 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reopened SPARK-12837: - I reopened this ticket; I'm looking into it > Spark driver requires large memory space for

[jira] [Commented] (SPARK-12837) Spark driver requires large memory space for serialized results even there are no data collected to the driver

2017-04-08 Thread balaji krishnan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15961834#comment-15961834 ] balaji krishnan commented on SPARK-12837: - Sorry, I wanted to type two more lines, but sent before

[jira] [Commented] (SPARK-12837) Spark driver requires large memory space for serialized results even there are no data collected to the driver

2017-04-08 Thread balaji krishnan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15961829#comment-15961829 ] balaji krishnan commented on SPARK-12837: - Thanks @teobar, I did what you suggested, but I am hitting

[jira] [Commented] (SPARK-20265) Improve Prefix'span pre-processing efficiency

2017-04-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15961771#comment-15961771 ] Apache Spark commented on SPARK-20265: -- User 'Syrux' has created a pull request for this issue:

[jira] [Assigned] (SPARK-20265) Improve Prefix'span pre-processing efficiency

2017-04-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20265: Assignee: (was: Apache Spark) > Improve Prefix'span pre-processing efficiency >

[jira] [Assigned] (SPARK-20265) Improve Prefix'span pre-processing efficiency

2017-04-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20265: Assignee: Apache Spark > Improve Prefix'span pre-processing efficiency >

[jira] [Commented] (SPARK-7856) Scalable PCA implementation for tall and fat matrices

2017-04-08 Thread Hayri Volkan Agun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15961768#comment-15961768 ] Hayri Volkan Agun commented on SPARK-7856: -- Hi Tarek, Still on the issue of Probabilistic PCA.

[jira] [Commented] (SPARK-12837) Spark driver requires large memory space for serialized results even there are no data collected to the driver

2017-04-08 Thread teobar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15961762#comment-15961762 ] teobar commented on SPARK-12837: Sorry for not posting this earlier; I forgot my password and didn't

[jira] [Updated] (SPARK-20265) Improve Prefix'span pre-processing efficiency

2017-04-08 Thread Cyril de Vogelaere (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cyril de Vogelaere updated SPARK-20265: --- Description: Given the sequence: 0 4 0 1 0 2 0 5 0 2 0 4 0 5, and supposing only

[jira] [Resolved] (SPARK-12210) Small example that shows how to integrate spark.mllib with spark.ml

2017-04-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-12210. --- Resolution: Won't Fix > Small example that shows how to integrate spark.mllib with spark.ml >

[jira] [Resolved] (SPARK-3903) Create general data loading method for LabeledPoints

2017-04-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-3903. -- Resolution: Won't Fix > Create general data loading method for LabeledPoints >

[jira] [Resolved] (SPARK-4038) Outlier Detection Algorithm for MLlib

2017-04-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-4038. -- Resolution: Won't Fix > Outlier Detection Algorithm for MLlib > - >

[jira] [Created] (SPARK-20265) Improve Prefix'span pre-processing efficiency

2017-04-08 Thread Cyril de Vogelaere (JIRA)
Cyril de Vogelaere created SPARK-20265: -- Summary: Improve Prefix'span pre-processing efficiency Key: SPARK-20265 URL: https://issues.apache.org/jira/browse/SPARK-20265 Project: Spark

[jira] [Updated] (SPARK-20260) MLUtils parseLibSVMRecord has incorrect string interpolation for error message

2017-04-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-20260: -- Priority: Trivial (was: Minor) OK, this is the kind of thing that doesn't even need a JIRA > MLUtils

[jira] [Commented] (SPARK-20263) create empty dataframes in sparkR

2017-04-08 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15961724#comment-15961724 ] Felix Cheung commented on SPARK-20263: -- May I ask how would you use this empty dataframe? > create

[jira] [Commented] (SPARK-12837) Spark driver requires large memory space for serialized results even there are no data collected to the driver

2017-04-08 Thread balaji krishnan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15961703#comment-15961703 ] balaji krishnan commented on SPARK-12837: - Is there any workaround available to fix this in 1.6.1, please?