[jira] [Resolved] (SPARK-5892) Clean up ML, MLlib docs for 1.3 release

2015-02-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-5892. -- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4675

[jira] [Commented] (SPARK-5921) Task/Stage always in pending on foreachPartition in UI

2015-02-20 Thread Michael Nitschinger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328833#comment-14328833 ] Michael Nitschinger commented on SPARK-5921: Hm, what makes me wonder is that

[jira] [Resolved] (SPARK-5867) Update spark.ml docs with DataFrame, Python examples

2015-02-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-5867. -- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4675

[jira] [Commented] (SPARK-5921) Task/Stage always in pending on foreachPartition in UI

2015-02-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328816#comment-14328816 ] Sean Owen commented on SPARK-5921: -- I can't reproduce this. The stage shows as completed

[jira] [Updated] (SPARK-5917) Distinct is broken

2015-02-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-5917: - Component/s: (was: MLlib) Spark Core Priority: Major (was: Critical) I can't

[jira] [Updated] (SPARK-5918) Spark Thrift server reports metadata for VARCHAR column as STRING in result set schema

2015-02-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-5918: - Component/s: SQL Spark Thrift server reports metadata for VARCHAR column as STRING in result set

[jira] [Created] (SPARK-5922) Add diff(other: RDD[VertexId, VD]) in VertexRDD

2015-02-20 Thread Takeshi Yamamuro (JIRA)
Takeshi Yamamuro created SPARK-5922: --- Summary: Add diff(other: RDD[VertexId, VD]) in VertexRDD Key: SPARK-5922 URL: https://issues.apache.org/jira/browse/SPARK-5922 Project: Spark Issue

[jira] [Commented] (SPARK-4144) Support incremental model training of Naive Bayes classifier

2015-02-20 Thread Jeremy Freeman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328726#comment-14328726 ] Jeremy Freeman commented on SPARK-4144: --- Hi all, I'd be happy to pick this up and

[jira] [Created] (SPARK-5921) Task/Stage always in pending with 1 partition on foreachPartition

2015-02-20 Thread Michael Nitschinger (JIRA)
Michael Nitschinger created SPARK-5921: -- Summary: Task/Stage always in pending with 1 partition on foreachPartition Key: SPARK-5921 URL: https://issues.apache.org/jira/browse/SPARK-5921 Project:

[jira] [Updated] (SPARK-5921) Task/Stage always in pending on foreachPartition

2015-02-20 Thread Michael Nitschinger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Nitschinger updated SPARK-5921: --- Attachment: Screen Shot 2015-02-20 at 11.00.34.png Task/Stage always in pending on

[jira] [Updated] (SPARK-5921) Task/Stage always in pending on foreachPartition

2015-02-20 Thread Michael Nitschinger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Nitschinger updated SPARK-5921: --- Attachment: Screen Shot 2015-02-20 at 10.59.26.png Task/Stage always in pending on

[jira] [Updated] (SPARK-5916) $SPARK_HOME/bin/beeline conflicts with $HIVE_HOME/bin/beeline

2015-02-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-5916: - Priority: Minor (was: Major) Issue Type: Improvement (was: Bug) I get you, although I suppose it

[jira] [Resolved] (SPARK-5909) Add a clearCache command to Spark SQL's cache manager

2015-02-20 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-5909. --- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 4694

[jira] [Commented] (SPARK-5921) Task/Stage always in pending on foreachPartition

2015-02-20 Thread Michael Nitschinger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328764#comment-14328764 ] Michael Nitschinger commented on SPARK-5921: Interestingly the logs indicate

[jira] [Resolved] (SPARK-5744) RDD.isEmpty / take fails for (empty) RDD of Nothing

2015-02-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-5744. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 4698

[jira] [Created] (SPARK-5920) Use a BufferedInputStream to read local shuffle data

2015-02-20 Thread Kay Ousterhout (JIRA)
Kay Ousterhout created SPARK-5920: - Summary: Use a BufferedInputStream to read local shuffle data Key: SPARK-5920 URL: https://issues.apache.org/jira/browse/SPARK-5920 Project: Spark Issue

[jira] [Commented] (SPARK-5917) Distinct is broken

2015-02-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328769#comment-14328769 ] Sean Owen commented on SPARK-5917: -- Isn't this just another symptom of the known issues

[jira] [Commented] (SPARK-5790) VertexRDD's won't zip properly for `diff` capability

2015-02-20 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328778#comment-14328778 ] Takeshi Yamamuro commented on SPARK-5790: - Hi, What's the status of your work? I

[jira] [Updated] (SPARK-5921) Task/Stage always in pending on foreachPartition

2015-02-20 Thread Michael Nitschinger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Nitschinger updated SPARK-5921: --- Summary: Task/Stage always in pending on foreachPartition (was: Task/Stage always in

[jira] [Commented] (SPARK-5016) GaussianMixtureEM should distribute matrix inverse for large numFeatures, k

2015-02-20 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328760#comment-14328760 ] Manoj Kumar commented on SPARK-5016: Could you please explain how you got up to this?

[jira] [Commented] (SPARK-5921) Task/Stage always in pending on foreachPartition

2015-02-20 Thread Michael Nitschinger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328766#comment-14328766 ] Michael Nitschinger commented on SPARK-5921: I've also attached some

[jira] [Updated] (SPARK-5921) Task/Stage always in pending on foreachPartition in UI

2015-02-20 Thread Michael Nitschinger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Nitschinger updated SPARK-5921: --- Summary: Task/Stage always in pending on foreachPartition in UI (was: Task/Stage

[jira] [Updated] (SPARK-5744) RDD.isEmpty / take fails for (empty) RDD of Nothing

2015-02-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-5744: - Assignee: Sean Owen (was: Tobias Bertelsen) There was ultimately no good fix; partial fixes only raised

[jira] [Commented] (SPARK-5921) Task/Stage always in pending on foreachPartition in UI

2015-02-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328856#comment-14328856 ] Sean Owen commented on SPARK-5921: -- I wonder if it's related to

[jira] [Created] (SPARK-5923) Very slow query when using Oracle hive metastore and table has lots of partitions

2015-02-20 Thread Matthew Taylor (JIRA)
Matthew Taylor created SPARK-5923: - Summary: Very slow query when using Oracle hive metastore and table has lots of partitions Key: SPARK-5923 URL: https://issues.apache.org/jira/browse/SPARK-5923

[jira] [Commented] (SPARK-5016) GaussianMixtureEM should distribute matrix inverse for large numFeatures, k

2015-02-20 Thread Travis Galoppo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328868#comment-14328868 ] Travis Galoppo commented on SPARK-5016: --- @mechcoder This falls directly from the

[jira] [Assigned] (SPARK-5918) Spark Thrift server reports metadata for VARCHAR column as STRING in result set schema

2015-02-20 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian reassigned SPARK-5918: - Assignee: Cheng Lian Spark Thrift server reports metadata for VARCHAR column as STRING in

[jira] [Commented] (SPARK-5918) Spark Thrift server reports metadata for VARCHAR column as STRING in result set schema

2015-02-20 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328944#comment-14328944 ] Cheng Lian commented on SPARK-5918: --- When reading a Hive VARCHAR column, Spark SQL

[jira] [Commented] (SPARK-1476) 2GB limit in spark for blocks

2015-02-20 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329554#comment-14329554 ] Imran Rashid commented on SPARK-1476: - I spent a little time with [~sandyr] on this

[jira] [Commented] (SPARK-5928) Remote Shuffle Blocks cannot be more than 2 GB

2015-02-20 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329568#comment-14329568 ] Imran Rashid commented on SPARK-5928: - (just edited the description -- I mistakenly

[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-20 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329664#comment-14329664 ] Zhan Zhang commented on SPARK-1537: --- [~vanzin] If you don't have bandwidth, or don't

[jira] [Updated] (SPARK-5933) Centralize deprecated configs in SparkConf

2015-02-20 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5933: - Description: Deprecated configs are currently all strewn across the code base. It would be good to

[jira] [Commented] (SPARK-5928) Remote Shuffle Blocks cannot be more than 2 GB

2015-02-20 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329720#comment-14329720 ] Imran Rashid commented on SPARK-5928: - Actually, there *is* some weirdness in how

[jira] [Commented] (SPARK-5516) ActorSystemImpl: Uncaught fatal error from thread [sparkDriver-akka.actor.default-dispatcher-22] shutting down ActorSystem [sparkDriver] java.lang.OutOfMemoryError: Jav

2015-02-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329718#comment-14329718 ] Xiangrui Meng commented on SPARK-5516: -- [~wuyukai] Could you provide all the

[jira] [Created] (SPARK-5928) Remote Shuffle Blocks cannot be more than 2 GB

2015-02-20 Thread Imran Rashid (JIRA)
Imran Rashid created SPARK-5928: --- Summary: Remote Shuffle Blocks cannot be more than 2 GB Key: SPARK-5928 URL: https://issues.apache.org/jira/browse/SPARK-5928 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-5281) Registering table on RDD is giving MissingRequirementError

2015-02-20 Thread Sebastian YEPES FERNANDEZ (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329628#comment-14329628 ] Sebastian YEPES FERNANDEZ commented on SPARK-5281: -- Also having this

[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-20 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329649#comment-14329649 ] Zhan Zhang commented on SPARK-1537: --- [~vanzin] Thanks for the comments. I don't

[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-20 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329660#comment-14329660 ] Marcelo Vanzin commented on SPARK-1537: --- Hi [~zzhan], I already posted the link to

[jira] [Created] (SPARK-5931) Use consistent naming for time properties

2015-02-20 Thread Andrew Or (JIRA)
Andrew Or created SPARK-5931: Summary: Use consistent naming for time properties Key: SPARK-5931 URL: https://issues.apache.org/jira/browse/SPARK-5931 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-20 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329678#comment-14329678 ] Zhan Zhang commented on SPARK-1537: --- [~vanzin] I declare integrate your code from the

[jira] [Commented] (SPARK-5912) Programming guide for feature selection

2015-02-20 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329692#comment-14329692 ] Joseph K. Bradley commented on SPARK-5912: -- Great, thanks! I build and view them

[jira] [Updated] (SPARK-5516) ActorSystemImpl: Uncaught fatal error from thread [sparkDriver-akka.actor.default-dispatcher-22] shutting down ActorSystem [sparkDriver] java.lang.OutOfMemoryError: Java

2015-02-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5516: - Target Version/s: 1.4.0 (was: 1.2.0) ActorSystemImpl: Uncaught fatal error from thread

[jira] [Updated] (SPARK-5932) Use consistent naming for byte properties

2015-02-20 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5932: - Description: This is SPARK-5931's sister issue. The naming of existing byte configs is inconsistent. We

[jira] [Updated] (SPARK-5931) Use consistent naming for time properties

2015-02-20 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5931: - Description: This is SPARK-5932's sister issue. The naming of existing time configs is inconsistent. We

[jira] [Updated] (SPARK-3249) Fix links in ScalaDoc that cause warning messages in `sbt/sbt unidoc`

2015-02-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3249: - Target Version/s: (was: 1.2.0) Fix links in ScalaDoc that cause warning messages in `sbt/sbt

[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329739#comment-14329739 ] Sean Owen commented on SPARK-1537: -- [~zzhan] You have provided a patch as a PR right?

[jira] [Commented] (SPARK-5912) Programming guide for feature selection

2015-02-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329773#comment-14329773 ] Apache Spark commented on SPARK-5912: - User 'avulanov' has created a pull request for

[jira] [Updated] (SPARK-5928) Remote Shuffle Blocks cannot be more than 2 GB

2015-02-20 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid updated SPARK-5928: Description: If a shuffle block is over 2GB, the shuffle fails, with an uninformative exception.

[jira] [Updated] (SPARK-4705) Driver retries in cluster mode always fail if event logging is enabled

2015-02-20 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4705: - Summary: Driver retries in cluster mode always fail if event logging is enabled (was: Driver retries in

[jira] [Commented] (SPARK-1391) BlockManager cannot transfer blocks larger than 2G in size

2015-02-20 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329605#comment-14329605 ] Imran Rashid commented on SPARK-1391: - [~coderplay], I assume you are no longer

[jira] [Resolved] (SPARK-5896) toDF in python doesn't work with tuple/list w/o names

2015-02-20 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-5896. - Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4679

[jira] [Created] (SPARK-5929) Pyspark: Register a pip requirements file with spark_context

2015-02-20 Thread Buck (JIRA)
Buck created SPARK-5929: --- Summary: Pyspark: Register a pip requirements file with spark_context Key: SPARK-5929 URL: https://issues.apache.org/jira/browse/SPARK-5929 Project: Spark Issue Type:

[jira] [Commented] (SPARK-5928) Remote Shuffle Blocks cannot be more than 2 GB

2015-02-20 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329584#comment-14329584 ] Imran Rashid commented on SPARK-5928: - Here are some thoughts on how we *might* fix this.

[jira] [Commented] (SPARK-5928) Remote Shuffle Blocks cannot be more than 2 GB

2015-02-20 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329623#comment-14329623 ] Imran Rashid commented on SPARK-5928: - sometimes this also results in exceptions like

[jira] [Commented] (SPARK-3368) Spark cannot be used with Avro and Parquet

2015-02-20 Thread Daniel Fry (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329645#comment-14329645 ] Daniel Fry commented on SPARK-3368: --- Hey fwiw I encountered this recently with spark

[jira] [Resolved] (SPARK-5898) Can't create DataFrame from Pandas data frame

2015-02-20 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-5898. - Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4679

[jira] [Comment Edited] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-20 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329649#comment-14329649 ] Zhan Zhang edited comment on SPARK-1537 at 2/20/15 10:14 PM: -

[jira] [Updated] (SPARK-5932) Use consistent naming for byte properties

2015-02-20 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5932: - Description: This is SPARK-5931's sister issue. The naming of existing byte configs is inconsistent. We

[jira] [Updated] (SPARK-5931) Use consistent naming for time properties

2015-02-20 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5931: - Description: This is SPARK-5932's sister issue. The naming of existing time configs is inconsistent. We

[jira] [Created] (SPARK-5932) Use consistent naming for byte properties

2015-02-20 Thread Andrew Or (JIRA)
Andrew Or created SPARK-5932: Summary: Use consistent naming for byte properties Key: SPARK-5932 URL: https://issues.apache.org/jira/browse/SPARK-5932 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329691#comment-14329691 ] Sean Owen commented on SPARK-1537: -- [~zzhan] I also can't figure out what you are

[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-20 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329700#comment-14329700 ] Zhan Zhang commented on SPARK-1537: --- [~sowen] From the whole context, I believe you

[jira] [Updated] (SPARK-4406) SVD should check for k < 1

2015-02-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4406: - Target Version/s: 1.3.0 (was: 1.2.0) SVD should check for k < 1 --

[jira] [Issue Comment Deleted] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-20 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhan Zhang updated SPARK-1537: -- Comment: was deleted (was: [~sowen] By the way, I am not waiting for someone to give me the patch. It

[jira] [Updated] (SPARK-5930) Documented default of spark.shuffle.io.retryWait is confusing

2015-02-20 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5930: - Description: The description makes it sound like the retryWait itself defaults to 15 seconds, when it's

[jira] [Updated] (SPARK-5930) Documented default of spark.shuffle.io.retryWait is confusing

2015-02-20 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5930: - Summary: Documented default of spark.shuffle.io.retryWait is confusing (was: Documented default of

[jira] [Updated] (SPARK-5930) Documented default of spark.shuffle.io.retryWait is confusing

2015-02-20 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5930: - Priority: Trivial (was: Minor) Documented default of spark.shuffle.io.retryWait is confusing

[jira] [Updated] (SPARK-5930) Documented default of spark.shuffle.io.retryWait is confusing

2015-02-20 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5930: - Affects Version/s: 1.2.0 Documented default of spark.shuffle.io.retryWait is confusing

[jira] [Updated] (SPARK-5930) Documented default of spark.shuffle.io.retryWait is not consistent

2015-02-20 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5930: - Priority: Minor (was: Major) Documented default of spark.shuffle.io.retryWait is not consistent

[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-20 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329704#comment-14329704 ] Zhan Zhang commented on SPARK-1537: --- [~sowen] By the way, I am not waiting for someone

[jira] [Updated] (SPARK-4081) Categorical feature indexing

2015-02-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4081: - Target Version/s: 1.4.0 (was: 1.2.0) Categorical feature indexing

[jira] [Commented] (SPARK-1391) BlockManager cannot transfer blocks larger than 2G in size

2015-02-20 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329613#comment-14329613 ] Imran Rashid commented on SPARK-1391: - Here is a minimal program to demonstrate the

[jira] [Updated] (SPARK-5930) Documented default of spark.shuffle.io.retryWait is confusing

2015-02-20 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5930: - Description: The description makes it sound like the retryWait itself defaults to 15 seconds, when it's

[jira] [Commented] (SPARK-4655) Split Stage into ShuffleMapStage and ResultStage subclasses

2015-02-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329641#comment-14329641 ] Apache Spark commented on SPARK-4655: - User 'ilganeli' has created a pull request for

[jira] [Updated] (SPARK-5932) Use consistent naming for byte properties

2015-02-20 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5932: - Description: This is SPARK-5931's sister issue. The naming of existing byte configs is inconsistent. We

[jira] [Updated] (SPARK-5931) Use consistent naming for time properties

2015-02-20 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5931: - Description: This is SPARK-5932's sister issue. The naming of existing time configs is inconsistent. We

[jira] [Created] (SPARK-5933) Centralize deprecated configs in SparkConf

2015-02-20 Thread Andrew Or (JIRA)
Andrew Or created SPARK-5933: Summary: Centralize deprecated configs in SparkConf Key: SPARK-5933 URL: https://issues.apache.org/jira/browse/SPARK-5933 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-5629) Add spark-ec2 action to return info about an existing cluster

2015-02-20 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329771#comment-14329771 ] Nicholas Chammas commented on SPARK-5629: - [~florianverhein] - Hmm... Thinking

[jira] [Closed] (SPARK-1673) GLMNET implementation in Spark

2015-02-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng closed SPARK-1673. Resolution: Duplicate GLMNET implementation in Spark --

[jira] [Updated] (SPARK-2335) k-Nearest Neighbor classification and regression for MLLib

2015-02-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2335: - Target Version/s: 1.4.0 k-Nearest Neighbor classification and regression for MLLib

[jira] [Closed] (SPARK-1418) Python MLlib's _get_unmangled_rdd should uncache RDDs when training is done

2015-02-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng closed SPARK-1418. Resolution: Implemented Fix Version/s: 1.2.0 Python MLlib's _get_unmangled_rdd should

[jira] [Closed] (SPARK-1892) Add an OWL-QN optimizer for L1 regularized optimizations.

2015-02-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng closed SPARK-1892. Resolution: Duplicate Add an OWL-QN optimizer for L1 regularized optimizations.

[jira] [Updated] (SPARK-2138) The KMeans algorithm in the MLlib can lead to the Serialized Task size become bigger and bigger

2015-02-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2138: - Target Version/s: 1.4.0 The KMeans algorithm in the MLlib can lead to the Serialized Task size

[jira] [Updated] (SPARK-1473) Feature selection for high dimensional datasets

2015-02-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1473: - Issue Type: Umbrella (was: New Feature) Feature selection for high dimensional datasets

[jira] [Updated] (SPARK-5888) Add OneHotEncoder as a Transformer

2015-02-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5888: - Summary: Add OneHotEncoder as a Transformer (was: Add OneHotEncoder) Add OneHotEncoder as a

[jira] [Commented] (SPARK-5937) [YARN] ClientSuite must set YARN mode to true to ensure correct SparkHadoopUtil implementation is used.

2015-02-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329814#comment-14329814 ] Apache Spark commented on SPARK-5937: - User 'harishreedharan' has created a pull

[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-20 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329828#comment-14329828 ] Zhan Zhang commented on SPARK-1537: --- [~sowen] In JIRA, we share the code so that other

[jira] [Updated] (SPARK-1359) SGD implementation is not efficient

2015-02-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1359: - Target Version/s: 1.4.0 SGD implementation is not efficient ---

[jira] [Closed] (SPARK-1794) Generic ADMM implementation for SVM, lasso, and L1-regularized logistic regression

2015-02-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng closed SPARK-1794. Resolution: Duplicate Generic ADMM implementation for SVM, lasso, and L1-regularized logistic

[jira] [Updated] (SPARK-1856) Standardize MLlib interfaces

2015-02-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1856: - Issue Type: Umbrella (was: New Feature) Standardize MLlib interfaces

[jira] [Updated] (SPARK-1655) In naive Bayes, store conditional probabilities distributively.

2015-02-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1655: - Target Version/s: 1.4.0 In naive Bayes, store conditional probabilities distributively.

[jira] [Commented] (SPARK-5928) Remote Shuffle Blocks cannot be more than 2 GB

2015-02-20 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329851#comment-14329851 ] Kay Ousterhout commented on SPARK-5928: --- Is it possible this is caused because the

[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-20 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329852#comment-14329852 ] Marcelo Vanzin commented on SPARK-1537: --- Hi [~zhzhan], bq. But It is hard to

[jira] [Closed] (SPARK-1014) MultilogisticRegressionWithSGD

2015-02-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng closed SPARK-1014. Resolution: Duplicate We support multinomial logistic regression with LBFGS in 1.3. I marked this

[jira] [Created] (SPARK-5935) Accept MapType in the schema provided to a JSON dataset.

2015-02-20 Thread Yin Huai (JIRA)
Yin Huai created SPARK-5935: --- Summary: Accept MapType in the schema provided to a JSON dataset. Key: SPARK-5935 URL: https://issues.apache.org/jira/browse/SPARK-5935 Project: Spark Issue Type:

[jira] [Commented] (SPARK-5935) Accept MapType in the schema provided to a JSON dataset.

2015-02-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329803#comment-14329803 ] Apache Spark commented on SPARK-5935: - User 'yhuai' has created a pull request for

[jira] [Updated] (SPARK-5925) YARN - Spark progress bar stucks at 10% but after finishing shows 100%

2015-02-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-5925: - Target Version/s: (was: 1.2.1) Issue Type: Improvement (was: Bug) I don't know that it's as

[jira] [Commented] (SPARK-1955) VertexRDD can incorrectly assume index sharing

2015-02-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329246#comment-14329246 ] Apache Spark commented on SPARK-1955: - User 'brennonyork' has created a pull request

[jira] [Created] (SPARK-5925) YARN - Spark progress bar stucks at 10% but after finishing shows 100%

2015-02-20 Thread Laszlo Fesus (JIRA)
Laszlo Fesus created SPARK-5925: --- Summary: YARN - Spark progress bar stucks at 10% but after finishing shows 100% Key: SPARK-5925 URL: https://issues.apache.org/jira/browse/SPARK-5925 Project: Spark

[jira] [Created] (SPARK-5926) [SQL] DataFrame.explain() return false result for DDL command

2015-02-20 Thread Yanbo Liang (JIRA)
Yanbo Liang created SPARK-5926: -- Summary: [SQL] DataFrame.explain() return false result for DDL command Key: SPARK-5926 URL: https://issues.apache.org/jira/browse/SPARK-5926 Project: Spark
