[jira] [Commented] (SPARK-2812) convert maven to archetype based build

2014-08-04 Thread Prashant Sharma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084347#comment-14084347 ] Prashant Sharma commented on SPARK-2812: What do you mean by archetype based

[jira] [Comment Edited] (SPARK-2812) convert maven to archetype based build

2014-08-04 Thread Prashant Sharma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084347#comment-14084347 ] Prashant Sharma edited comment on SPARK-2812 at 8/4/14 6:17 AM:

[jira] [Created] (SPARK-2820) Group by query not returning random values

2014-08-04 Thread Athira Das (JIRA)
Athira Das created SPARK-2820: - Summary: Group by query not returning random values Key: SPARK-2820 URL: https://issues.apache.org/jira/browse/SPARK-2820 Project: Spark Issue Type: Question

[jira] [Created] (SPARK-2821) Group by returning random values in Spark SQL. While running the query sqlContext.sql(SELECT id, month, AVG(marks) FROM data WHERE marks25 GROUP BY id, month)

2014-08-04 Thread Athira Das (JIRA)
Athira Das created SPARK-2821: - Summary: Group by returning random values in Spark SQL. While running the query sqlContext.sql(SELECT id, month, AVG(marks) FROM data WHERE marks25 GROUP BY id, month) Key: SPARK-2821

[jira] [Created] (SPARK-2822) Group by returning random values in SparkSQL

2014-08-04 Thread Athira Das (JIRA)
Athira Das created SPARK-2822: - Summary: Group by returning random values in SparkSQL Key: SPARK-2822 URL: https://issues.apache.org/jira/browse/SPARK-2822 Project: Spark Issue Type: Question

[jira] [Issue Comment Deleted] (SPARK-2820) Group by query not returning random values

2014-08-04 Thread Athira Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Athira Das updated SPARK-2820: -- Comment: was deleted (was: sqlContext.sql(SELECT id, month, AVG(marks) FROM data WHERE marks25 GROUP

[jira] [Updated] (SPARK-2823) GraphX jobs throw IllegalArgumentException

2014-08-04 Thread Lu Lu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lu Lu updated SPARK-2823: - Description: If the users set “spark.default.parallelism” and the value is different from the EdgeRDD partition

[jira] [Updated] (SPARK-2803) add Kafka stream feature for fetch messages from specified starting offset position

2014-08-04 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2803: --- Labels: (was: patch) add Kafka stream feature for fetch messages from specified starting

[jira] [Updated] (SPARK-2803) add Kafka stream feature for fetch messages from specified starting offset position

2014-08-04 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2803: --- Component/s: (was: Input/Output) Streaming add Kafka stream feature

[jira] [Updated] (SPARK-2787) Make sort-based shuffle write files directly when there is no sorting / aggregation and # of partitions is small

2014-08-04 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2787: - Target Version/s: 1.1.0 Make sort-based shuffle write files directly when there is no sorting /

[jira] [Assigned] (SPARK-2787) Make sort-based shuffle write files directly when there is no sorting / aggregation and # of partitions is small

2014-08-04 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia reassigned SPARK-2787: Assignee: Matei Zaharia Make sort-based shuffle write files directly when there is no

[jira] [Commented] (SPARK-2650) Wrong initial sizes for in-memory column buffers

2014-08-04 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084397#comment-14084397 ] Cheng Lian commented on SPARK-2650: --- Did some experiments and came to some conclusions:

[jira] [Commented] (SPARK-2016) rdd in-memory storage UI becomes unresponsive when the number of RDD partitions is large

2014-08-04 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084399#comment-14084399 ] Patrick Wendell commented on SPARK-2016: I think a major part of this was on the

[jira] [Commented] (SPARK-2824) Allow saving Parquet files to the HiveMetastore

2014-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084402#comment-14084402 ] Apache Spark commented on SPARK-2824: - User 'aarondav' has created a pull request for

[jira] [Updated] (SPARK-2815) Compilation failed upon the hadoop version 2.0.0-cdh4.5.0

2014-08-04 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2815: --- Priority: Major (was: Blocker) Compilation failed upon the hadoop version 2.0.0-cdh4.5.0

[jira] [Commented] (SPARK-2815) Compilation failed upon the hadoop version 2.0.0-cdh4.5.0

2014-08-04 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084407#comment-14084407 ] Patrick Wendell commented on SPARK-2815: This may just be a Won't Fix. I think it's

[jira] [Comment Edited] (SPARK-2815) Compilation failed upon the hadoop version 2.0.0-cdh4.5.0

2014-08-04 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084407#comment-14084407 ] Patrick Wendell edited comment on SPARK-2815 at 8/4/14 7:50 AM:

[jira] [Reopened] (SPARK-2742) The variable inputFormatInfo and inputFormatMap never used

2014-08-04 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell reopened SPARK-2742: Oops - I closed accidentally The variable inputFormatInfo and inputFormatMap never used

[jira] [Commented] (SPARK-1986) lib.Analytics should be in org.apache.spark.examples

2014-08-04 Thread Larry Xiao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084423#comment-14084423 ] Larry Xiao commented on SPARK-1986: --- Hi Ankur Yes I like Analytics to be in examples And

[jira] [Created] (SPARK-2826) Reduce the Memory Copy for HashOuterJoin

2014-08-04 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-2826: Summary: Reduce the Memory Copy for HashOuterJoin Key: SPARK-2826 URL: https://issues.apache.org/jira/browse/SPARK-2826 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-2826) Reduce the Memory Copy for HashOuterJoin

2014-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084429#comment-14084429 ] Apache Spark commented on SPARK-2826: - User 'chenghao-intel' has created a pull

[jira] [Updated] (SPARK-2818) Improve joinning RDDs that transformed from the same cached RDD

2014-08-04 Thread Lu Lu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lu Lu updated SPARK-2818: - Description: if the joinning RDDs are originating from a same cached RDD a, the DAGScheduler will submit

[jira] [Commented] (SPARK-1986) lib.Analytics should be in org.apache.spark.examples

2014-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084449#comment-14084449 ] Apache Spark commented on SPARK-1986: - User 'larryxiao' has created a pull request for

[jira] [Updated] (SPARK-2818) Improve joinning RDDs that transformed from the same cached RDD

2014-08-04 Thread Lu Lu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lu Lu updated SPARK-2818: - Description: if the joinning RDDs are originating from a same cached RDD a, the DAGScheduler will submit

[jira] [Commented] (SPARK-2827) Add DegreeDist function support

2014-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084457#comment-14084457 ] Apache Spark commented on SPARK-2827: - User 'luluorta' has created a pull request for

[jira] [Updated] (SPARK-2818) Improve joinning RDDs that transformed from the same cached RDD

2014-08-04 Thread Lu Lu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lu Lu updated SPARK-2818: - Description: if the joinning RDDs are originating from a same cached RDD, the DAGScheduler will submit

[jira] [Updated] (SPARK-2818) Improve joinning RDDs that transformed from the same cached RDD

2014-08-04 Thread Lu Lu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lu Lu updated SPARK-2818: - Description: if the joinning RDDs are originating from a same cached RDD, the DAGScheduler will submit

[jira] [Comment Edited] (SPARK-2579) Reading from S3 returns an inconsistent number of items with Spark 0.9.1

2014-08-04 Thread Eemil Lagerspetz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083484#comment-14083484 ] Eemil Lagerspetz edited comment on SPARK-2579 at 8/4/14 12:42 PM:

[jira] [Updated] (SPARK-2818) Improve joinning RDDs that transformed from the same parent RDD

2014-08-04 Thread Lu Lu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lu Lu updated SPARK-2818: - Description: if the joinning RDDs are originating from a same cached RDD, the DAGScheduler will submit

[jira] [Created] (SPARK-2830) MLlib v1.1 documentation

2014-08-04 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-2830: Summary: MLlib v1.1 documentation Key: SPARK-2830 URL: https://issues.apache.org/jira/browse/SPARK-2830 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-2838) performance tests for feature transformations

2014-08-04 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-2838: Summary: performance tests for feature transformations Key: SPARK-2838 URL: https://issues.apache.org/jira/browse/SPARK-2838 Project: Spark Issue Type:

[jira] [Created] (SPARK-2841) Documentation for feature transformations

2014-08-04 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-2841: Summary: Documentation for feature transformations Key: SPARK-2841 URL: https://issues.apache.org/jira/browse/SPARK-2841 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-2840) Improve documentation for decision tree

2014-08-04 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-2840: Summary: Improve documentation for decision tree Key: SPARK-2840 URL: https://issues.apache.org/jira/browse/SPARK-2840 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-2842) Documentation for Word2Vec

2014-08-04 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-2842: Summary: Documentation for Word2Vec Key: SPARK-2842 URL: https://issues.apache.org/jira/browse/SPARK-2842 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-2843) Improve documentation for ALS

2014-08-04 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-2843: Summary: Improve documentation for ALS Key: SPARK-2843 URL: https://issues.apache.org/jira/browse/SPARK-2843 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-2812) convert maven to archetype based build

2014-08-04 Thread Anand Avati (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084963#comment-14084963 ] Anand Avati commented on SPARK-2812: I took Mark's suggestion to use

[jira] [Commented] (SPARK-2844) Existing JVM Hive Context not correctly used in Python Hive Context

2014-08-04 Thread Ahir Reddy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085069#comment-14085069 ] Ahir Reddy commented on SPARK-2844: --- https://github.com/apache/spark/pull/1768

[jira] [Created] (SPARK-2844) Existing JVM Hive Context not correctly used in Python Hive Context

2014-08-04 Thread Ahir Reddy (JIRA)
Ahir Reddy created SPARK-2844: - Summary: Existing JVM Hive Context not correctly used in Python Hive Context Key: SPARK-2844 URL: https://issues.apache.org/jira/browse/SPARK-2844 Project: Spark

[jira] [Created] (SPARK-2845) Add timestamp to BlockManager events

2014-08-04 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-2845: - Summary: Add timestamp to BlockManager events Key: SPARK-2845 URL: https://issues.apache.org/jira/browse/SPARK-2845 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-2650) Wrong initial sizes for in-memory column buffers

2014-08-04 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085104#comment-14085104 ] Cheng Lian commented on SPARK-2650: --- Some additional comments after more experiments and

[jira] [Comment Edited] (SPARK-2650) Wrong initial sizes for in-memory column buffers

2014-08-04 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085104#comment-14085104 ] Cheng Lian edited comment on SPARK-2650 at 8/4/14 7:14 PM: --- Some

[jira] [Resolved] (SPARK-1687) Support NamedTuples in RDDs

2014-08-04 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-1687. --- Resolution: Fixed Fix Version/s: 1.1.0 Support NamedTuples in RDDs

[jira] [Updated] (SPARK-2846) Spark SQL hive implementation bypass StorageHandler which breaks any customized StorageHandler

2014-08-04 Thread Alex Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Liu updated SPARK-2846: Component/s: SQL Spark SQL hive implementation bypass StorageHandler which breaks any customized

[jira] [Created] (SPARK-2847) SPARK SQL Hive misses SHOW CREATE TABLE command

2014-08-04 Thread Alex Liu (JIRA)
Alex Liu created SPARK-2847: --- Summary: SPARK SQL Hive misses SHOW CREATE TABLE command Key: SPARK-2847 URL: https://issues.apache.org/jira/browse/SPARK-2847 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-2848) Shade Guava in Spark deliverables

2014-08-04 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-2848: - Summary: Shade Guava in Spark deliverables Key: SPARK-2848 URL: https://issues.apache.org/jira/browse/SPARK-2848 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-2849) Spark submit --driver-* options don't work in client mode

2014-08-04 Thread Andrew Or (JIRA)
Andrew Or created SPARK-2849: Summary: Spark submit --driver-* options don't work in client mode Key: SPARK-2849 URL: https://issues.apache.org/jira/browse/SPARK-2849 Project: Spark Issue Type:

[jira] [Updated] (SPARK-2849) Spark submit --driver-* options don't work in client mode

2014-08-04 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-2849: - Description: We currently ignore all --driver-* options in client mode. Meanwhile, elsewhere we tell

[jira] [Updated] (SPARK-2849) Spark submit --driver-* options don't work in client mode

2014-08-04 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-2849: - Description: We currently ignore all --driver-* options in client mode. Meanwhile, elsewhere we tell

[jira] [Updated] (SPARK-2841) Documentation for feature transformations

2014-08-04 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2841: - Assignee: DB Tsai Documentation for feature transformations

[jira] [Updated] (SPARK-2835) performance tests for statistical functions

2014-08-04 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2835: - Assignee: Doris Xin performance tests for statistical functions

[jira] [Updated] (SPARK-2831) performance tests for linear classification methods

2014-08-04 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2831: - Assignee: DB Tsai performance tests for linear classification methods

[jira] [Updated] (SPARK-2837) performance tests for ALS

2014-08-04 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2837: - Assignee: Burak Yavuz performance tests for ALS

[jira] [Updated] (SPARK-2836) performance tests for k-means

2014-08-04 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2836: - Assignee: Burak Yavuz performance tests for k-means

[jira] [Updated] (SPARK-2834) performance tests for linear algebra functions

2014-08-04 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2834: - Assignee: Burak Yavuz performance tests for linear algebra functions

[jira] [Updated] (SPARK-2833) performance tests for linear regression

2014-08-04 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2833: - Assignee: Burak Yavuz performance tests for linear regression

[jira] [Updated] (SPARK-2831) performance tests for linear classification methods

2014-08-04 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2831: - Assignee: Burak Yavuz (was: DB Tsai) performance tests for linear classification methods

[jira] [Updated] (SPARK-2838) performance tests for feature transformations

2014-08-04 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2838: - Assignee: Xiangrui Meng performance tests for feature transformations

[jira] [Updated] (SPARK-2840) Improve documentation for decision tree

2014-08-04 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2840: - Assignee: Joseph K. Bradley Improve documentation for decision tree

[jira] [Created] (SPARK-2851) Check API consistency for decision tree

2014-08-04 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-2851: Summary: Check API consistency for decision tree Key: SPARK-2851 URL: https://issues.apache.org/jira/browse/SPARK-2851 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-2852) Check API consistency for feature transformations

2014-08-04 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-2852: Summary: Check API consistency for feature transformations Key: SPARK-2852 URL: https://issues.apache.org/jira/browse/SPARK-2852 Project: Spark Issue Type:

[jira] [Commented] (SPARK-2825) Allow creating external tables in metastore

2014-08-04 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085311#comment-14085311 ] Michael Armbrust commented on SPARK-2825: - CREATE EXTERNAL TABLE already works as

[jira] [Updated] (SPARK-2419) Misc updates to streaming programming guide

2014-08-04 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-2419: - Priority: Critical (was: Major) Misc updates to streaming programming guide

[jira] [Updated] (SPARK-2243) Support multiple SparkContexts in the same JVM

2014-08-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2243: --- Affects Version/s: 1.1.0 Support multiple SparkContexts in the same JVM

[jira] [Updated] (SPARK-2243) Support multiple SparkContexts in the same JVM

2014-08-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2243: --- Target Version/s: 1.2.0 Assignee: Reynold Xin Support multiple SparkContexts in the

[jira] [Updated] (SPARK-2546) Configuration object thread safety issue

2014-08-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2546: --- Assignee: Josh Rosen Configuration object thread safety issue

[jira] [Updated] (SPARK-2585) Remove special handling of Hadoop JobConf

2014-08-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2585: --- Assignee: Josh Rosen (was: Reynold Xin) Remove special handling of Hadoop JobConf

[jira] [Updated] (SPARK-2774) Set preferred locations for reduce tasks

2014-08-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2774: --- Target Version/s: 1.2.0 Set preferred locations for reduce tasks

[jira] [Updated] (SPARK-2633) support register spark listener to listener bus with Java API

2014-08-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2633: --- Priority: Critical (was: Major) Target Version/s: 1.2.0 support register spark

[jira] [Commented] (SPARK-2774) Set preferred locations for reduce tasks

2014-08-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085418#comment-14085418 ] Reynold Xin commented on SPARK-2774: I assigned the ticket to [~shivaram] and scoped

[jira] [Updated] (SPARK-2321) Design a proper progress reporting event listener API

2014-08-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2321: --- Target Version/s: 1.2.0 Assignee: Reynold Xin Design a proper progress reporting event

[jira] [Updated] (SPARK-2774) Set preferred locations for reduce tasks

2014-08-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2774: --- Assignee: Shivaram Venkataraman Set preferred locations for reduce tasks

[jira] [Updated] (SPARK-2323) Exception in accumulator update should not crash DAGScheduler SparkContext

2014-08-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2323: --- Priority: Critical (was: Major) Exception in accumulator update should not crash DAGScheduler

[jira] [Comment Edited] (SPARK-2854) Finalize _acceptable_types in pyspark.sql

2014-08-04 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085472#comment-14085472 ] Yin Huai edited comment on SPARK-2854 at 8/4/14 11:15 PM: -- For

[jira] [Commented] (SPARK-2854) Finalize _acceptable_types in pyspark.sql

2014-08-04 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085472#comment-14085472 ] Yin Huai commented on SPARK-2854: - For ByteType, ShortType and IntegerType, I am not sure

[jira] [Created] (SPARK-2855) pyspark test cases crashed for no reason

2014-08-04 Thread Nan Zhu (JIRA)
Nan Zhu created SPARK-2855: -- Summary: pyspark test cases crashed for no reason Key: SPARK-2855 URL: https://issues.apache.org/jira/browse/SPARK-2855 Project: Spark Issue Type: Bug

[jira] [Assigned] (SPARK-2685) Update ExternalAppendOnlyMap to avoid buffer.remove()

2014-08-04 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia reassigned SPARK-2685: Assignee: Matei Zaharia Update ExternalAppendOnlyMap to avoid buffer.remove()

[jira] [Updated] (SPARK-2685) Update ExternalAppendOnlyMap to avoid buffer.remove()

2014-08-04 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2685: - Target Version/s: 1.1.0 Update ExternalAppendOnlyMap to avoid buffer.remove()

[jira] [Commented] (SPARK-2685) Update ExternalAppendOnlyMap to avoid buffer.remove()

2014-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085493#comment-14085493 ] Apache Spark commented on SPARK-2685: - User 'mateiz' has created a pull request for

[jira] [Commented] (SPARK-2179) Public API for DataTypes and Schema

2014-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085565#comment-14085565 ] Apache Spark commented on SPARK-2179: - User 'yhuai' has created a pull request for

[jira] [Commented] (SPARK-1834) NoSuchMethodError when invoking JavaPairRDD.reduce() in Java

2014-08-04 Thread Franklyn Dsouza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085567#comment-14085567 ] Franklyn Dsouza commented on SPARK-1834: There is no reduce function in

[jira] [Updated] (SPARK-1021) sortByKey() launches a cluster job when it shouldn't

2014-08-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-1021: --- Target Version/s: 1.2.0 Affects Version/s: 1.1.0 1.0.0 sortByKey()

[jira] [Commented] (SPARK-2550) Support regularization and intercept in pyspark's linear methods

2014-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085589#comment-14085589 ] Apache Spark commented on SPARK-2550: - User 'miccagiann' has created a pull request

[jira] [Commented] (SPARK-2550) Support regularization and intercept in pyspark's linear methods

2014-08-04 Thread Michael Yannakopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085590#comment-14085590 ] Michael Yannakopoulos commented on SPARK-2550: -- New pull request for

[jira] [Commented] (SPARK-2655) Change the default logging level to WARN

2014-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085639#comment-14085639 ] Apache Spark commented on SPARK-2655: - User 'davies' has created a pull request for

[jira] [Commented] (SPARK-1153) Generalize VertexId in GraphX so that UUIDs can be used as vertex IDs.

2014-08-04 Thread Larry Xiao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085676#comment-14085676 ] Larry Xiao commented on SPARK-1153: --- I like npanj's approach. It's universal. You treat

[jira] [Resolved] (SPARK-1811) Support resizable output buffer for kryo serializer

2014-08-04 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-1811. -- Resolution: Duplicate Fix Version/s: 1.1.0 Support resizable output buffer for kryo

[jira] [Commented] (SPARK-2010) Support for nested data in PySpark SQL

2014-08-04 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085743#comment-14085743 ] Nicholas Chammas commented on SPARK-2010: - I just tried this on {{master}}:

[jira] [Updated] (SPARK-2857) master.ui.port and worker.ui.port are ignored

2014-08-04 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-2857: - Fix Version/s: 1.1.0 master.ui.port and worker.ui.port are ignored

[jira] [Updated] (SPARK-2857) master.ui.port and worker.ui.port are ignored

2014-08-04 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-2857: - Component/s: Deploy master.ui.port and worker.ui.port are ignored

[jira] [Created] (SPARK-2857) master.ui.port and worker.ui.port are ignored

2014-08-04 Thread Andrew Or (JIRA)
Andrew Or created SPARK-2857: Summary: master.ui.port and worker.ui.port are ignored Key: SPARK-2857 URL: https://issues.apache.org/jira/browse/SPARK-2857 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-2854) Finalize _acceptable_types in pyspark.sql

2014-08-04 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-2854: Component/s: SQL Finalize _acceptable_types in pyspark.sql

[jira] [Commented] (SPARK-2010) Support for nested data in PySpark SQL

2014-08-04 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085807#comment-14085807 ] Nicholas Chammas commented on SPARK-2010: - Hmm, [it doesn't look like it's in the

[jira] [Comment Edited] (SPARK-2010) Support for nested data in PySpark SQL

2014-08-04 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085807#comment-14085807 ] Nicholas Chammas edited comment on SPARK-2010 at 8/5/14 5:06 AM:

[jira] [Commented] (SPARK-2157) Can't write tight firewall rules for Spark

2014-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085834#comment-14085834 ] Apache Spark commented on SPARK-2157: - User 'andrewor14' has created a pull request

[jira] [Commented] (SPARK-2856) Decrease initial buffer size for Kryo to 64KB

2014-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085846#comment-14085846 ] Apache Spark commented on SPARK-2856: - User 'rxin' has created a pull request for this