[jira] [Assigned] (SPARK-15784) Add Power Iteration Clustering to spark.ml

2016-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15784: Assignee: Apache Spark > Add Power Iteration Clustering to spark.ml >

[jira] [Commented] (SPARK-15899) file scheme should be used correctly

2016-06-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328056#comment-15328056 ] Sean Owen commented on SPARK-15899: --- OK, I had assumed that absolute paths on Windows would have to be

[jira] [Assigned] (SPARK-15927) Eliminate redundant code in DAGScheduler's getParentStages and getAncestorShuffleDependencies methods.

2016-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15927: Assignee: Apache Spark (was: Kay Ousterhout) > Eliminate redundant code in

[jira] [Commented] (SPARK-15927) Eliminate redundant code in DAGScheduler's getParentStages and getAncestorShuffleDependencies methods.

2016-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328045#comment-15328045 ] Apache Spark commented on SPARK-15927: -- User 'kayousterhout' has created a pull request for this

[jira] [Assigned] (SPARK-15927) Eliminate redundant code in DAGScheduler's getParentStages and getAncestorShuffleDependencies methods.

2016-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15927: Assignee: Kay Ousterhout (was: Apache Spark) > Eliminate redundant code in

[jira] [Closed] (SPARK-15928) Eliminate redundant code in DAGScheduler's getParentStages and getAncestorShuffleDependencies methods.

2016-06-13 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout closed SPARK-15928. -- Resolution: Duplicate > Eliminate redundant code in DAGScheduler's getParentStages and >

[jira] [Created] (SPARK-15928) Eliminate redundant code in DAGScheduler's getParentStages and getAncestorShuffleDependencies methods.

2016-06-13 Thread Kay Ousterhout (JIRA)
Kay Ousterhout created SPARK-15928: -- Summary: Eliminate redundant code in DAGScheduler's getParentStages and getAncestorShuffleDependencies methods. Key: SPARK-15928 URL:

[jira] [Created] (SPARK-15927) Eliminate redundant code in DAGScheduler's getParentStages and getAncestorShuffleDependencies methods.

2016-06-13 Thread Kay Ousterhout (JIRA)
Kay Ousterhout created SPARK-15927: -- Summary: Eliminate redundant code in DAGScheduler's getParentStages and getAncestorShuffleDependencies methods. Key: SPARK-15927 URL:

[jira] [Created] (SPARK-15926) Improve readability of DAGScheduler stage creation methods

2016-06-13 Thread Kay Ousterhout (JIRA)
Kay Ousterhout created SPARK-15926: -- Summary: Improve readability of DAGScheduler stage creation methods Key: SPARK-15926 URL: https://issues.apache.org/jira/browse/SPARK-15926 Project: Spark

[jira] [Updated] (SPARK-15655) Wrong Result when Fetching Partitioned Tables

2016-06-13 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-15655: - Assignee: Xiao Li > Wrong Result when Fetching Partitioned Tables >

[jira] [Updated] (SPARK-15655) Wrong Result when Fetching Partitioned Tables

2016-06-13 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-15655: - Target Version/s: 2.0.0 > Wrong Result when Fetching Partitioned Tables >

[jira] [Updated] (SPARK-15655) Wrong Result when Fetching Partitioned Tables

2016-06-13 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-15655: - Priority: Blocker (was: Critical) > Wrong Result when Fetching Partitioned Tables >

[jira] [Commented] (SPARK-15822) segmentation violation in o.a.s.unsafe.types.UTF8String

2016-06-13 Thread Pete Robbins (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327992#comment-15327992 ] Pete Robbins commented on SPARK-15822: -- So this does seem to cause the NPE or SEGV intermittently,

[jira] [Assigned] (SPARK-15925) Replaces registerTempTable with createOrReplaceTempView in SparkR

2016-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15925: Assignee: Apache Spark (was: Cheng Lian) > Replaces registerTempTable with

[jira] [Assigned] (SPARK-15925) Replaces registerTempTable with createOrReplaceTempView in SparkR

2016-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15925: Assignee: Cheng Lian (was: Apache Spark) > Replaces registerTempTable with

[jira] [Closed] (SPARK-15924) SparkR parser bug with backslash in comments

2016-06-13 Thread Xuan Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Wang closed SPARK-15924. - Resolution: Not A Problem Not a problem of open source Spark > SparkR parser bug with backslash in

[jira] [Commented] (SPARK-15666) Join on two tables generated from a same table throwing query analyzer issue

2016-06-13 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327919#comment-15327919 ] Herman van Hovell commented on SPARK-15666: --- Looking at the exception {{... in operator

[jira] [Commented] (SPARK-15925) Replaces registerTempTable with createOrReplaceTempView in SparkR

2016-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327918#comment-15327918 ] Apache Spark commented on SPARK-15925: -- User 'liancheng' has created a pull request for this issue:

[jira] [Resolved] (SPARK-15697) [SPARK REPL] unblock some of the useful repl commands.

2016-06-13 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-15697. -- Resolution: Fixed Assignee: Prashant Sharma Fix Version/s: 2.0.0 > [SPARK

[jira] [Created] (SPARK-15925) Replaces registerTempTable with createOrReplaceTempView in SparkR

2016-06-13 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-15925: -- Summary: Replaces registerTempTable with createOrReplaceTempView in SparkR Key: SPARK-15925 URL: https://issues.apache.org/jira/browse/SPARK-15925 Project: Spark

[jira] [Commented] (SPARK-15922) BlockMatrix to IndexedRowMatrix throws an error

2016-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327887#comment-15327887 ] Apache Spark commented on SPARK-15922: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Assigned] (SPARK-15922) BlockMatrix to IndexedRowMatrix throws an error

2016-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15922: Assignee: Apache Spark > BlockMatrix to IndexedRowMatrix throws an error >

[jira] [Assigned] (SPARK-15922) BlockMatrix to IndexedRowMatrix throws an error

2016-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15922: Assignee: (was: Apache Spark) > BlockMatrix to IndexedRowMatrix throws an error >

[jira] [Commented] (SPARK-15922) BlockMatrix to IndexedRowMatrix throws an error

2016-06-13 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327884#comment-15327884 ] Dongjoon Hyun commented on SPARK-15922: --- Hi, [~chaz2505]. This is due to `toIndexedRowMatrix` bug.

[jira] [Commented] (SPARK-15924) SparkR parser bug with backslash in comments

2016-06-13 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327851#comment-15327851 ] Shivaram Venkataraman commented on SPARK-15924: --- I'm not sure what part of this code

[jira] [Issue Comment Deleted] (SPARK-15899) file scheme should be used correctly

2016-06-13 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-15899: - Comment: was deleted (was: When I added the two extra slashes, it works on Linux. But,

[jira] [Comment Edited] (SPARK-15869) HTTP 500 and NPE on streaming batch details page

2016-06-13 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327822#comment-15327822 ] Shixiong Zhu edited comment on SPARK-15869 at 6/13/16 5:47 PM: --- Do you have

[jira] [Commented] (SPARK-15899) file scheme should be used correctly

2016-06-13 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327825#comment-15327825 ] Kazuaki Ishizaki commented on SPARK-15899: -- When I added the two extra slashes, it works on

[jira] [Assigned] (SPARK-15613) Incorrect days to millis conversion

2016-06-13 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-15613: -- Assignee: Davies Liu > Incorrect days to millis conversion >

[jira] [Commented] (SPARK-15869) HTTP 500 and NPE on streaming batch details page

2016-06-13 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327822#comment-15327822 ] Shixiong Zhu commented on SPARK-15869: -- Do you have a reproducer? > HTTP 500 and NPE on streaming

[jira] [Commented] (SPARK-15899) file scheme should be used correctly

2016-06-13 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327824#comment-15327824 ] Kazuaki Ishizaki commented on SPARK-15899: -- When I added the two extra slashes, it works on

[jira] [Comment Edited] (SPARK-15869) HTTP 500 and NPE on streaming batch details page

2016-06-13 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327822#comment-15327822 ] Shixiong Zhu edited comment on SPARK-15869 at 6/13/16 5:47 PM: --- Do you have

[jira] [Commented] (SPARK-15666) Join on two tables generated from a same table throwing query analyzer issue

2016-06-13 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327818#comment-15327818 ] Herman van Hovell commented on SPARK-15666: --- [~mkbond777] Is this also a problem on 2.0? Any

[jira] [Commented] (SPARK-15345) SparkSession's conf doesn't take effect when there's already an existing SparkContext

2016-06-13 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327803#comment-15327803 ] Herman van Hovell commented on SPARK-15345: --- [~m1lan] Just to be sure, is this the actual code

[jira] [Updated] (SPARK-15826) PipedRDD to allow configurable char encoding (default: UTF-8)

2016-06-13 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil updated SPARK-15826: Summary: PipedRDD to allow configurable char encoding (default: UTF-8) (was: PipedRDD to strictly

[jira] [Resolved] (SPARK-15913) Dispatcher.stopped should be enclosed by synchronized block.

2016-06-13 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-15913. Resolution: Fixed Assignee: Dongjoon Hyun Fix Version/s: 2.0.0 >

[jira] [Updated] (SPARK-15924) SparkR parser bug with backslash in comments

2016-06-13 Thread Xuan Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Wang updated SPARK-15924: -- Description: When I run an R cell with the following comments: {code} # p <- p +

[jira] [Updated] (SPARK-15924) SparkR parser bug with backslash in comments

2016-06-13 Thread Xuan Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Wang updated SPARK-15924: -- Description: When I run an R cell with the following comments: {code} # p <- p +

[jira] [Updated] (SPARK-15924) SparkR parser bug with backslash in comments

2016-06-13 Thread Xuan Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Wang updated SPARK-15924: -- Description: When I run an R cell with the following comments: {code} # p <- p +

[jira] [Created] (SPARK-15924) SparkR parser bug with backslash in comments

2016-06-13 Thread Xuan Wang (JIRA)
Xuan Wang created SPARK-15924: - Summary: SparkR parser bug with backslash in comments Key: SPARK-15924 URL: https://issues.apache.org/jira/browse/SPARK-15924 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-15163) Mark experimental algorithms experimental in PySpark

2016-06-13 Thread Krishna Kalyan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327748#comment-15327748 ] Krishna Kalyan commented on SPARK-15163: Hi [~holdenk], Is this task still up for grabs?

[jira] [Resolved] (SPARK-15814) Aggregator can return null result

2016-06-13 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-15814. --- Resolution: Resolved > Aggregator can return null result >

[jira] [Commented] (SPARK-15923) Spark Application rest api returns "no such app: "

2016-06-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327750#comment-15327750 ] Sean Owen commented on SPARK-15923: --- [~tgraves] or [~ste...@apache.org] will probably know better, but

[jira] [Created] (SPARK-15923) Spark Application rest api returns "no such app: "

2016-06-13 Thread Yesha Vora (JIRA)
Yesha Vora created SPARK-15923: -- Summary: Spark Application rest api returns "no such app: " Key: SPARK-15923 URL: https://issues.apache.org/jira/browse/SPARK-15923 Project: Spark Issue Type:

[jira] [Commented] (SPARK-15902) Add a deprecation warning for Python 2.6

2016-06-13 Thread Krishna Kalyan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327723#comment-15327723 ] Krishna Kalyan commented on SPARK-15902: Hi [~holdenk], I have some questions, where do I add

[jira] [Commented] (SPARK-15822) segmentation violation in o.a.s.unsafe.types.UTF8String

2016-06-13 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327703#comment-15327703 ] Herman van Hovell commented on SPARK-15822: --- [~robbinspg] You can dump the plan to the console

[jira] [Commented] (SPARK-15370) Some correlated subqueries return incorrect answers

2016-06-13 Thread Luciano Resende (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327683#comment-15327683 ] Luciano Resende commented on SPARK-15370: - [~hvanhovell] You might need to add [~freiss] to

[jira] [Commented] (SPARK-15118) spark couldn't get hive properyties in hive-site.xml

2016-06-13 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327674#comment-15327674 ] Herman van Hovell commented on SPARK-15118: --- [~eksmile] any update on this? > spark couldn't

[jira] [Commented] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Alessio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327542#comment-15327542 ] Alessio commented on SPARK-15904: - With the --driver-memory 4G switch I've tried both. With no luck. At

[jira] [Comment Edited] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Alessio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327476#comment-15327476 ] Alessio edited comment on SPARK-15904 at 6/13/16 2:48 PM: -- If anyone's

[jira] [Commented] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327510#comment-15327510 ] Sean Owen commented on SPARK-15904: --- Yes, that just means "out of memory". The question is whether this

[jira] [Commented] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Alessio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327476#comment-15327476 ] Alessio commented on SPARK-15904: - If anyone's interested, the dataset I'm working on is freely available

[jira] [Updated] (SPARK-15918) unionAll returns wrong result when two dataframes has schema in different order

2016-06-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-15918: -- Fix Version/s: (was: 1.6.1) Don't set fix version; 1.6.1 wouldn't make sense anyway. > unionAll

[jira] [Updated] (SPARK-15922) BlockMatrix to IndexedRowMatrix throws an error

2016-06-13 Thread Charlie Evans (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Evans updated SPARK-15922: -- Description: {code} import org.apache.spark.mllib.linalg.distributed._ import

[jira] [Updated] (SPARK-15922) BlockMatrix to IndexedRowMatrix throws an error

2016-06-13 Thread Charlie Evans (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Evans updated SPARK-15922: -- Description: {code} import org.apache.spark.mllib.linalg.distributed._ import

[jira] [Created] (SPARK-15922) BlockMatrix to IndexedRowMatrix throws an error

2016-06-13 Thread Charlie Evans (JIRA)
Charlie Evans created SPARK-15922: - Summary: BlockMatrix to IndexedRowMatrix throws an error Key: SPARK-15922 URL: https://issues.apache.org/jira/browse/SPARK-15922 Project: Spark Issue

[jira] [Commented] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327449#comment-15327449 ] Sean Owen commented on SPARK-15904: --- It's not your 400MB data set that is the only thing in memory or

[jira] [Commented] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Alessio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327443#comment-15327443 ] Alessio commented on SPARK-15904: - Correct. Memory and Disk gives priority to Memory...but my dataset is

[jira] [Resolved] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-15904. --- Resolution: Not A Problem Memory and disk still means it's also persisting in memory. I think you'll

[jira] [Commented] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Alessio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327438#comment-15327438 ] Alessio commented on SPARK-15904: - My machine has 16GB of RAM. I also tried closing all the other apps,

[jira] [Commented] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327430#comment-15327430 ] Sean Owen commented on SPARK-15904: --- How much RAM does your machine have? 10GB heap means much more

[jira] [Comment Edited] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Alessio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327411#comment-15327411 ] Alessio edited comment on SPARK-15904 at 6/13/16 1:55 PM: -- This is absolutely

[jira] [Commented] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Alessio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327411#comment-15327411 ] Alessio commented on SPARK-15904: - This is absolutely weird to me. I gave Spark 9GB and during the

[jira] [Comment Edited] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Alessio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327397#comment-15327397 ] Alessio edited comment on SPARK-15904 at 6/13/16 1:48 PM: -- Dear [~srowen], at

[jira] [Commented] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327405#comment-15327405 ] Sean Owen commented on SPARK-15904: --- Hm, but that only means Spark used a lot of memory, and you gave

[jira] [Comment Edited] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Alessio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327397#comment-15327397 ] Alessio edited comment on SPARK-15904 at 6/13/16 1:49 PM: -- Dear [~srowen], at

[jira] [Commented] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Alessio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327397#comment-15327397 ] Alessio commented on SPARK-15904: - Dear [~srowen], at the beginning I noticed that "Cleaning RDD" phase

[jira] [Commented] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327369#comment-15327369 ] Sean Owen commented on SPARK-15904: --- -verbose:gc is a JVM option and should write to stderr. You'd

[jira] [Comment Edited] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Alessio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327272#comment-15327272 ] Alessio edited comment on SPARK-15904 at 6/13/16 12:45 PM: --- Dear Sean, I must

[jira] [Comment Edited] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Alessio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327272#comment-15327272 ] Alessio edited comment on SPARK-15904 at 6/13/16 12:44 PM: --- Dear Sean, I must

[jira] [Comment Edited] (SPARK-8546) PMML export for Naive Bayes

2016-06-13 Thread Radoslaw Gasiorek (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327167#comment-15327167 ] Radoslaw Gasiorek edited comment on SPARK-8546 at 6/13/16 12:43 PM: hi

[jira] [Comment Edited] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Alessio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327272#comment-15327272 ] Alessio edited comment on SPARK-15904 at 6/13/16 12:41 PM: --- Dear Sean, I must

[jira] [Commented] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Alessio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327272#comment-15327272 ] Alessio commented on SPARK-15904: - Dear Sean, I must certainly agree with you on k< High Memory Pressure

[jira] [Closed] (SPARK-15921) Spark unable to read partitioned table in avro format and column name in upper case

2016-06-13 Thread Rajkumar Singh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajkumar Singh closed SPARK-15921. -- Resolution: Fixed > Spark unable to read partitioned table in avro format and column name in

[jira] [Updated] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Alessio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessio updated SPARK-15904: Description: Running MLlib K-Means on a ~400MB dataset (12 partitions), persisted on Memory and Disk.

[jira] [Commented] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327255#comment-15327255 ] Sean Owen commented on SPARK-15904: --- Yeah it's coherent, though typically k << number of points. It

[jira] [Resolved] (SPARK-15919) DStream "saveAsTextFile" doesn't update the prefix after each checkpoint

2016-06-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-15919. --- Resolution: Not A Problem Look at the implementation of DStream.saveAsTextFiles -- about all it does

[jira] [Closed] (SPARK-15919) DStream "saveAsTextFile" doesn't update the prefix after each checkpoint

2016-06-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen closed SPARK-15919. - > DStream "saveAsTextFile" doesn't update the prefix after each checkpoint >

[jira] [Commented] (SPARK-12623) map key_values to values

2016-06-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327248#comment-15327248 ] Sean Owen commented on SPARK-12623: --- The Status can only be "Resolved". You're referring to the

[jira] [Reopened] (SPARK-15919) DStream "saveAsTextFile" doesn't update the prefix after each checkpoint

2016-06-13 Thread Aamir Abbas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aamir Abbas reopened SPARK-15919: - This is an issue, as I do not actually need the current timestamp to use in output path. I need the

[jira] [Commented] (SPARK-15746) SchemaUtils.checkColumnType with VectorUDT prints instance details in error message

2016-06-13 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327237#comment-15327237 ] Nick Pentreath commented on SPARK-15746: I think you can go ahead now - I also vote for the

[jira] [Commented] (SPARK-12623) map key_values to values

2016-06-13 Thread Elazar Gershuni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327236#comment-15327236 ] Elazar Gershuni commented on SPARK-12623: - At the very least, it should have a "won't fix"

[jira] [Commented] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Alessio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327234#comment-15327234 ] Alessio commented on SPARK-15904: - My dataset has 9000+ patterns, each of which has 2000+ attributes.

[jira] [Commented] (SPARK-15919) DStream "saveAsTextFile" doesn't update the prefix after each checkpoint

2016-06-13 Thread Aamir Abbas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327229#comment-15327229 ] Aamir Abbas commented on SPARK-15919: - ForeachRDD is fine in case you want to save individual RDDs

[jira] [Resolved] (SPARK-15919) DStream "saveAsTextFile" doesn't update the prefix after each checkpoint

2016-06-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-15919. --- Resolution: Not A Problem No, this is simple to accomplish in Spark already. You need to use

[jira] [Commented] (SPARK-6628) ClassCastException occurs when executing sql statement "insert into" on hbase table

2016-06-13 Thread Murshid Chalaev (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327224#comment-15327224 ] Murshid Chalaev commented on SPARK-6628: Thank you > ClassCastException occurs when executing sql

[jira] [Commented] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327220#comment-15327220 ] Nick Pentreath commented on SPARK-15904: Could you explain why you're using K>3000 when your

[jira] [Commented] (SPARK-15919) DStream "saveAsTextFile" doesn't update the prefix after each checkpoint

2016-06-13 Thread Aamir Abbas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327212#comment-15327212 ] Aamir Abbas commented on SPARK-15919: - I need to save the output of each batch in a different place.

[jira] [Commented] (SPARK-15919) DStream "saveAsTextFile" doesn't update the prefix after each checkpoint

2016-06-13 Thread binde (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327209#comment-15327209 ] binde commented on SPARK-15919: --- This is not a bug; getOutputPath() is invoked when the job starts running.

[jira] [Commented] (SPARK-8546) PMML export for Naive Bayes

2016-06-13 Thread Villu Ruusmann (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327205#comment-15327205 ] Villu Ruusmann commented on SPARK-8546: --- Hi [~rgasiorek] - would it be an option to re-build your

[jira] [Resolved] (SPARK-15920) Using map on DataFrame

2016-06-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-15920. --- Resolution: Not A Problem Target Version/s: (was: 2.0.0) Don't set Target please, and

[jira] [Commented] (SPARK-6628) ClassCastException occurs when executing sql statement "insert into" on hbase table

2016-06-13 Thread Teng Qiu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327201#comment-15327201 ] Teng Qiu commented on SPARK-6628: - This is caused by a missing interface implementation in

[jira] [Commented] (SPARK-10258) Add @Since annotation to ml.feature

2016-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327197#comment-15327197 ] Apache Spark commented on SPARK-10258: -- User 'MLnick' has created a pull request for this issue:

[jira] [Commented] (SPARK-15790) Audit @Since annotations in ML

2016-06-13 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327193#comment-15327193 ] Nick Pentreath commented on SPARK-15790: Yes, I've just looked at things in the concrete classes

[jira] [Updated] (SPARK-15921) Spark unable to read partitioned table in avro format and column name in upper case

2016-06-13 Thread Rajkumar Singh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajkumar Singh updated SPARK-15921: --- Description: Spark returns a null value if the field name is uppercase in hive avro partitioned

[jira] [Created] (SPARK-15921) Spark unable to read partitioned table in avro format and column name in upper case

2016-06-13 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created SPARK-15921: -- Summary: Spark unable to read partitioned table in avro format and column name in upper case Key: SPARK-15921 URL: https://issues.apache.org/jira/browse/SPARK-15921

[jira] [Closed] (SPARK-15293) 'collect_list' function undefined

2016-06-13 Thread Piotr Milanowski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Milanowski closed SPARK-15293. Works fine, thanks. > 'collect_list' function undefined > - >

[jira] [Created] (SPARK-15920) Using map on DataFrame

2016-06-13 Thread Piotr Milanowski (JIRA)
Piotr Milanowski created SPARK-15920: Summary: Using map on DataFrame Key: SPARK-15920 URL: https://issues.apache.org/jira/browse/SPARK-15920 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-8546) PMML export for Naive Bayes

2016-06-13 Thread Radoslaw Gasiorek (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327167#comment-15327167 ] Radoslaw Gasiorek commented on SPARK-8546: -- Hi there, [~josephkb]. We would like to use MLlib

[jira] [Created] (SPARK-15919) DStream "saveAsTextFile" doesn't update the prefix after each checkpoint

2016-06-13 Thread Aamir Abbas (JIRA)
Aamir Abbas created SPARK-15919: --- Summary: DStream "saveAsTextFile" doesn't update the prefix after each checkpoint Key: SPARK-15919 URL: https://issues.apache.org/jira/browse/SPARK-15919 Project:
