[jira] [Created] (SPARK-2613) CLONE - word2vec: Distributed Representation of Words

2014-07-22 Thread Yifan Yang (JIRA)
Yifan Yang created SPARK-2613: - Summary: CLONE - word2vec: Distributed Representation of Words Key: SPARK-2613 URL: https://issues.apache.org/jira/browse/SPARK-2613 Project: Spark Issue Type:

[jira] [Commented] (SPARK-2421) Spark should treat writable as serializable for keys

2014-07-22 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069880#comment-14069880 ] Sandy Ryza commented on SPARK-2421: --- It should be relatively straightforward to add a

[jira] [Commented] (SPARK-2612) ALS has data skew for popular product

2014-07-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069883#comment-14069883 ] Apache Spark commented on SPARK-2612: - User 'renozhang' has created a pull request for

[jira] [Created] (SPARK-2614) Add the spark-examples-xxx-.jar to the Debian package created by assembly/pom.xml (e.g. -PDeb)

2014-07-22 Thread Christian Tzolov (JIRA)
Christian Tzolov created SPARK-2614: --- Summary: Add the spark-examples-xxx-.jar to the Debian package created by assembly/pom.xml (e.g. -PDeb) Key: SPARK-2614 URL:

[jira] [Created] (SPARK-2615) Add == support for HiveQl

2014-07-22 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-2615: Summary: Add == support for HiveQl Key: SPARK-2615 URL: https://issues.apache.org/jira/browse/SPARK-2615 Project: Spark Issue Type: Bug Components: SQL

[jira] [Commented] (SPARK-2615) Add == support for HiveQl

2014-07-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069934#comment-14069934 ] Apache Spark commented on SPARK-2615: - User 'chenghao-intel' has created a pull

[jira] [Created] (SPARK-2616) Update Mesos to 0.19.1

2014-07-22 Thread Timothy Chen (JIRA)
Timothy Chen created SPARK-2616: --- Summary: Update Mesos to 0.19.1 Key: SPARK-2616 URL: https://issues.apache.org/jira/browse/SPARK-2616 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-2452) Multi-statement input to spark repl does not work

2014-07-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2452. Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1441

[jira] [Commented] (SPARK-2615) Add == support for HiveQl

2014-07-22 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069948#comment-14069948 ] Cheng Hao commented on SPARK-2615: -- https://github.com/apache/spark/pull/1522 Add ==

[jira] [Issue Comment Deleted] (SPARK-2615) Add == support for HiveQl

2014-07-22 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-2615: - Comment: was deleted (was: https://github.com/apache/spark/pull/1522) Add == support for HiveQl

[jira] [Commented] (SPARK-2599) almostEquals mllib.util.TestingUtils does not behave as expected when comparing against 0.0

2014-07-22 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069979#comment-14069979 ] Sean Owen commented on SPARK-2599: -- Yeah they're tracking roughly the same issue but it

[jira] [Created] (SPARK-2617) Correct doc and usage of preservesPartitioning

2014-07-22 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-2617: Summary: Correct doc and usage of preservesPartitioning Key: SPARK-2617 URL: https://issues.apache.org/jira/browse/SPARK-2617 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-2612) ALS has data skew for popular product

2014-07-22 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-2612. -- Resolution: Fixed Fix Version/s: 1.1.0 ALS has data skew for popular product

[jira] [Updated] (SPARK-2612) ALS has data skew for popular product

2014-07-22 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2612: - Assignee: Peng Zhang ALS has data skew for popular product

[jira] [Commented] (SPARK-2614) Add the spark-examples-xxx-.jar to the Debian package created by assembly/pom.xml (e.g. -PDeb)

2014-07-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070197#comment-14070197 ] Apache Spark commented on SPARK-2614: - User 'tzolov' has created a pull request for

[jira] [Updated] (SPARK-2614) Add the spark-examples-xxx-.jar to the Debian package created by assembly/pom.xml (e.g. -Pdeb)

2014-07-22 Thread Christian Tzolov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Tzolov updated SPARK-2614: Summary: Add the spark-examples-xxx-.jar to the Debian package created by assembly/pom.xml

[jira] [Created] (SPARK-2618) use config spark.scheduler.priority for specifying TaskSet's priority on DAGScheduler

2014-07-22 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-2618: --- Summary: use config spark.scheduler.priority for specifying TaskSet's priority on DAGScheduler Key: SPARK-2618 URL: https://issues.apache.org/jira/browse/SPARK-2618

[jira] [Commented] (SPARK-2604) Spark Application hangs on yarn in edge case scenario of executor memory requirement

2014-07-22 Thread Twinkle Sachdeva (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070227#comment-14070227 ] Twinkle Sachdeva commented on SPARK-2604: - I tried running in yarn-cluster mode.

[jira] [Created] (SPARK-2619) Configurable file-mode for spark/bin folder in the .deb package.

2014-07-22 Thread Christian Tzolov (JIRA)
Christian Tzolov created SPARK-2619: --- Summary: Configurable file-mode for spark/bin folder in the .deb package. Key: SPARK-2619 URL: https://issues.apache.org/jira/browse/SPARK-2619 Project: Spark

[jira] [Commented] (SPARK-2446) Add BinaryType support to Parquet I/O.

2014-07-22 Thread Teng Qiu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070310#comment-14070310 ] Teng Qiu commented on SPARK-2446: - hi [~marmbrus] impala creating parquet file also

[jira] [Comment Edited] (SPARK-2446) Add BinaryType support to Parquet I/O.

2014-07-22 Thread Teng Qiu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070310#comment-14070310 ] Teng Qiu edited comment on SPARK-2446 at 7/22/14 2:47 PM: -- hi

[jira] [Commented] (SPARK-2282) PySpark crashes if too many tasks complete quickly

2014-07-22 Thread Ken Carlile (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070367#comment-14070367 ] Ken Carlile commented on SPARK-2282: Hi Aaron, Another question for you. Would it

[jira] [Created] (SPARK-2620) case class cannot be used as key for reduce

2014-07-22 Thread Gerard Maas (JIRA)
Gerard Maas created SPARK-2620: -- Summary: case class cannot be used as key for reduce Key: SPARK-2620 URL: https://issues.apache.org/jira/browse/SPARK-2620 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-2620) case class cannot be used as key for reduce

2014-07-22 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070439#comment-14070439 ] Sean Owen commented on SPARK-2620: -- Duplicate of

[jira] [Created] (SPARK-2621) Update task InputMetrics incrementally

2014-07-22 Thread Sandy Ryza (JIRA)
Sandy Ryza created SPARK-2621: - Summary: Update task InputMetrics incrementally Key: SPARK-2621 URL: https://issues.apache.org/jira/browse/SPARK-2621 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark

2014-07-22 Thread Helena Edelson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Helena Edelson updated SPARK-2593: -- Issue Type: Improvement (was: Brainstorming) Add ability to pass an existing Akka

[jira] [Updated] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark

2014-07-22 Thread Helena Edelson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Helena Edelson updated SPARK-2593: -- Description: As a developer I want to pass an existing ActorSystem into StreamingContext in

[jira] [Commented] (SPARK-2620) case class cannot be used as key for reduce

2014-07-22 Thread Gerard Maas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070443#comment-14070443 ] Gerard Maas commented on SPARK-2620: [~sowen] No, doesn't look like it is. case

[jira] [Commented] (SPARK-2615) Add == support for HiveQl

2014-07-22 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070448#comment-14070448 ] Yin Huai commented on SPARK-2615: - Based on Hive language manual

[jira] [Commented] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark

2014-07-22 Thread Evan Chan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070476#comment-14070476 ] Evan Chan commented on SPARK-2593: -- I would say that the base SparkContext should have

[jira] [Created] (SPARK-2622) Add Jenkins build numbers to SparkQA messages

2014-07-22 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-2622: Summary: Add Jenkins build numbers to SparkQA messages Key: SPARK-2622 URL: https://issues.apache.org/jira/browse/SPARK-2622 Project: Spark Issue Type:

[jira] [Commented] (SPARK-2282) PySpark crashes if too many tasks complete quickly

2014-07-22 Thread Aaron Davidson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070528#comment-14070528 ] Aaron Davidson commented on SPARK-2282: --- Great to hear! These files haven't been

[jira] [Commented] (SPARK-2620) case class cannot be used as key for reduce

2014-07-22 Thread Daniel Siegmann (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070543#comment-14070543 ] Daniel Siegmann commented on SPARK-2620: I have confirmed this on Spark 1.0.1 as

[jira] [Created] (SPARK-2623) Stacked Auto Encoder (Deep Learning )

2014-07-22 Thread Victor Fang (JIRA)
Victor Fang created SPARK-2623: -- Summary: Stacked Auto Encoder (Deep Learning ) Key: SPARK-2623 URL: https://issues.apache.org/jira/browse/SPARK-2623 Project: Spark Issue Type: New Feature

[jira] [Updated] (SPARK-2623) Stacked Auto Encoder (Deep Learning )

2014-07-22 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2623: - Assignee: Victor Fang Stacked Auto Encoder (Deep Learning )

[jira] [Created] (SPARK-2624) Datanucleus jars not accessible in yarn-cluster mode

2014-07-22 Thread Andrew Or (JIRA)
Andrew Or created SPARK-2624: Summary: Datanucleus jars not accessible in yarn-cluster mode Key: SPARK-2624 URL: https://issues.apache.org/jira/browse/SPARK-2624 Project: Spark Issue Type: Bug

[jira] [Comment Edited] (SPARK-2620) case class cannot be used as key for reduce

2014-07-22 Thread Aaron (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070604#comment-14070604 ] Aaron edited comment on SPARK-2620 at 7/22/14 6:05 PM: --- If you look

[jira] [Commented] (SPARK-2620) case class cannot be used as key for reduce

2014-07-22 Thread Aaron (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070604#comment-14070604 ] Aaron commented on SPARK-2620: -- If you look at the diff of distinct from branch-0.9 to master

[jira] [Comment Edited] (SPARK-2620) case class cannot be used as key for reduce

2014-07-22 Thread Aaron (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070604#comment-14070604 ] Aaron edited comment on SPARK-2620 at 7/22/14 6:05 PM: --- If you look

[jira] [Created] (SPARK-2625) Fix ShuffleReadMetrics for NettyBlockFetcherIterator

2014-07-22 Thread Sandy Ryza (JIRA)
Sandy Ryza created SPARK-2625: - Summary: Fix ShuffleReadMetrics for NettyBlockFetcherIterator Key: SPARK-2625 URL: https://issues.apache.org/jira/browse/SPARK-2625 Project: Spark Issue Type:

[jira] [Commented] (SPARK-2618) use config spark.scheduler.priority for specifying TaskSet's priority on DAGScheduler

2014-07-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070691#comment-14070691 ] Patrick Wendell commented on SPARK-2618: Can you explain more what you are trying

[jira] [Resolved] (SPARK-2047) Use less memory in AppendOnlyMap.destructiveSortedIterator

2014-07-22 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2047. -- Resolution: Fixed Fix Version/s: 1.1.0 Use less memory in

[jira] [Updated] (SPARK-2047) Use less memory in AppendOnlyMap.destructiveSortedIterator

2014-07-22 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2047: - Assignee: Aaron Davidson Use less memory in AppendOnlyMap.destructiveSortedIterator

[jira] [Created] (SPARK-2626) Stop SparkContext in all examples

2014-07-22 Thread Andrew Or (JIRA)
Andrew Or created SPARK-2626: Summary: Stop SparkContext in all examples Key: SPARK-2626 URL: https://issues.apache.org/jira/browse/SPARK-2626 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-2627) Check for PEP 8 compliance on all Python code in the Jenkins CI cycle

2014-07-22 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-2627: --- Summary: Check for PEP 8 compliance on all Python code in the Jenkins CI cycle Key: SPARK-2627 URL: https://issues.apache.org/jira/browse/SPARK-2627 Project:

[jira] [Created] (SPARK-2628) Mesos backend throwing unable to find LoginModule

2014-07-22 Thread Timothy Chen (JIRA)
Timothy Chen created SPARK-2628: --- Summary: Mesos backend throwing unable to find LoginModule Key: SPARK-2628 URL: https://issues.apache.org/jira/browse/SPARK-2628 Project: Spark Issue Type:

[jira] [Commented] (SPARK-1166) leftover vpc_id may block the creation of new ec2 cluster

2014-07-22 Thread bruce szalwinski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070823#comment-14070823 ] bruce szalwinski commented on SPARK-1166: - I've been able to reproduce, but not

[jira] [Commented] (SPARK-2628) Mesos backend throwing unable to find LoginModule

2014-07-22 Thread Timothy Chen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070826#comment-14070826 ] Timothy Chen commented on SPARK-2628: - [~pwendell] please assign to me, thanks!

[jira] [Updated] (SPARK-2628) Mesos backend throwing unable to find LoginModule

2014-07-22 Thread Timothy Chen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Chen updated SPARK-2628: Component/s: Mesos Mesos backend throwing unable to find LoginModule

[jira] [Commented] (SPARK-2452) Multi-statement input to spark repl does not work

2014-07-22 Thread Timothy Hunter (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070841#comment-14070841 ] Timothy Hunter commented on SPARK-2452: --- Excellent, thanks Patrick.

[jira] [Commented] (SPARK-1166) leftover vpc_id may block the creation of new ec2 cluster

2014-07-22 Thread bruce szalwinski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070840#comment-14070840 ] bruce szalwinski commented on SPARK-1166: - To resolve, I go to

[jira] [Updated] (SPARK-2628) Mesos backend throwing unable to find LoginModule

2014-07-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2628: --- Assignee: Tim Chen Mesos backend throwing unable to find LoginModule

[jira] [Updated] (SPARK-1642) Upgrade FlumeInputDStream's FlumeReceiver to support FLUME-2083

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-1642: - Fix Version/s: (was: 1.1.0) Upgrade FlumeInputDStream's FlumeReceiver to support FLUME-2083

[jira] [Updated] (SPARK-1645) Improve Spark Streaming compatibility with Flume

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-1645: - Issue Type: Improvement (was: Bug) Improve Spark Streaming compatibility with Flume

[jira] [Updated] (SPARK-1645) Improve Spark Streaming compatibility with Flume

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-1645: - Target Version/s: 1.1.0 Improve Spark Streaming compatibility with Flume

[jira] [Commented] (SPARK-2614) Add the spark-examples-xxx-.jar to the Debian package created by assembly/pom.xml (e.g. -Pdeb)

2014-07-22 Thread Mark Hamstra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070908#comment-14070908 ] Mark Hamstra commented on SPARK-2614: - It's also common for installers/admins to not

[jira] [Updated] (SPARK-1642) Upgrade FlumeInputDStream's FlumeReceiver to support FLUME-2083

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-1642: - Target Version/s: 1.1.0 Upgrade FlumeInputDStream's FlumeReceiver to support FLUME-2083

[jira] [Updated] (SPARK-1853) Show Streaming application code context (file, line number) in Spark Stages UI

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-1853: - Target Version/s: 1.1.0 Show Streaming application code context (file, line number) in Spark

[jira] [Updated] (SPARK-2464) Twitter Receiver does not stop correctly when streamingContext.stop is called

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-2464: - Target Version/s: 1.1.0, 1.0.2 Twitter Receiver does not stop correctly when

[jira] [Updated] (SPARK-2345) ForEachDStream should have an option of running the foreachfunc on Spark

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-2345: - Issue Type: Wish (was: Bug) ForEachDStream should have an option of running the foreachfunc on

[jira] [Updated] (SPARK-1854) Add a version of StreamingContext.fileStream that take hadoop conf object

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-1854: - Fix Version/s: (was: 1.1.0) Add a version of StreamingContext.fileStream that take hadoop

[jira] [Updated] (SPARK-2464) Twitter Receiver does not stop correctly when streamingContext.stop is called

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-2464: - Fix Version/s: (was: 1.0.2) (was: 1.1.0) Twitter Receiver does not

[jira] [Commented] (SPARK-2379) stopReceive in dead loop, cause stackoverflow exception

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070915#comment-14070915 ] Tathagata Das commented on SPARK-2379: -- Any information on this? If we have no way to

[jira] [Updated] (SPARK-2447) Add common solution for sending upsert actions to HBase (put, deletes, and increment)

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-2447: - Target Version/s: 1.1.0 Add common solution for sending upsert actions to HBase (put, deletes,

[jira] [Updated] (SPARK-2377) Create a Python API for Spark Streaming

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-2377: - Target Version/s: 1.1.0 Create a Python API for Spark Streaming

[jira] [Updated] (SPARK-2377) Create a Python API for Spark Streaming

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-2377: - Fix Version/s: (was: 1.1.0) Create a Python API for Spark Streaming

[jira] [Assigned] (SPARK-2377) Create a Python API for Spark Streaming

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das reassigned SPARK-2377: Assignee: Tathagata Das Create a Python API for Spark Streaming

[jira] [Updated] (SPARK-2447) Add common solution for sending upsert actions to HBase (put, deletes, and increment)

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-2447: - Component/s: Streaming Spark Core Add common solution for sending upsert

[jira] [Updated] (SPARK-1729) Make Flume pull data from source, rather than the current push model

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-1729: - Assignee: Hari Shreedharan (was: Tathagata Das) Make Flume pull data from source, rather than

[jira] [Commented] (SPARK-2599) almostEquals mllib.util.TestingUtils does not behave as expected when comparing against 0.0

2014-07-22 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070929#comment-14070929 ] DB Tsai commented on SPARK-2599: I'm the original guy implementing `almostEquals` for my

[jira] [Updated] (SPARK-1730) Make receiver store data reliably to avoid data-loss on executor failures

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-1730: - Assignee: Hari Shreedharan Make receiver store data reliably to avoid data-loss on executor

[jira] [Updated] (SPARK-2438) Streaming + MLLib

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-2438: - Target Version/s: 1.1.0 Streaming + MLLib - Key: SPARK-2438

[jira] [Updated] (SPARK-2438) Streaming + MLLib

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-2438: - Component/s: Streaming Streaming + MLLib - Key: SPARK-2438

[jira] [Updated] (SPARK-1645) Improve Spark Streaming compatibility with Flume

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-1645: - Issue Type: Improvement (was: New Feature) Improve Spark Streaming compatibility with Flume

[jira] [Updated] (SPARK-1645) Improve Spark Streaming compatibility with Flume

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-1645: - Issue Type: New Feature (was: Improvement) Improve Spark Streaming compatibility with Flume

[jira] [Updated] (SPARK-2438) Streaming + MLLib

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-2438: - Issue Type: New Feature (was: Improvement) Streaming + MLLib -

[jira] [Updated] (SPARK-1730) Make receiver store data reliably to avoid data-loss on executor failures

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-1730: - Target Version/s: 1.1.0 Fix Version/s: (was: 1.1.0) Make receiver store data

[jira] [Updated] (SPARK-2548) JavaRecoverableWordCount is missing

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-2548: - Target Version/s: 1.1.0, 1.0.2, 0.9.3 (was: 1.1.0, 0.9.3) JavaRecoverableWordCount is missing

[jira] [Created] (SPARK-2629) Improve performance of DStream.updateStateByKey using IndexRDD

2014-07-22 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-2629: Summary: Improve performance of DStream.updateStateByKey using IndexRDD Key: SPARK-2629 URL: https://issues.apache.org/jira/browse/SPARK-2629 Project: Spark

[jira] [Commented] (SPARK-2629) Improve performance of DStream.updateStateByKey using IndexRDD

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070939#comment-14070939 ] Tathagata Das commented on SPARK-2629: -- Index RDD is necessary for this improvement

[jira] [Commented] (SPARK-1642) Upgrade FlumeInputDStream's FlumeReceiver to support FLUME-2083

2014-07-22 Thread Ted Malaska (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070938#comment-14070938 ] Ted Malaska commented on SPARK-1642: Are there any changes needed here? Upgrade

[jira] [Commented] (SPARK-2420) Change Spark build to minimize library conflicts

2014-07-22 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070966#comment-14070966 ] Marcelo Vanzin commented on SPARK-2420: --- I'm all for sanitizing dependencies, but

[jira] [Updated] (SPARK-1853) Show Streaming application code context (file, line number) in Spark Stages UI

2014-07-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-1853: - Assignee: Mubarak Seyed (was: Tathagata Das) Show Streaming application code context (file,

[jira] [Created] (SPARK-2630) Input data size goes overflow when size is large then 4G in one task

2014-07-22 Thread Davies Liu (JIRA)
Davies Liu created SPARK-2630: - Summary: Input data size goes overflow when size is large then 4G in one task Key: SPARK-2630 URL: https://issues.apache.org/jira/browse/SPARK-2630 Project: Spark

[jira] [Updated] (SPARK-2630) Input data size goes overflow when size is large then 4G in one task

2014-07-22 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-2630: -- Attachment: overflow.tiff The input size is showed as 5.8MB, but the real input size is 4.3G. Input

[jira] [Created] (SPARK-2631) In-memory Compression is not configured with SQLConf

2014-07-22 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-2631: --- Summary: In-memory Compression is not configured with SQLConf Key: SPARK-2631 URL: https://issues.apache.org/jira/browse/SPARK-2631 Project: Spark

[jira] [Commented] (SPARK-2282) PySpark crashes if too many tasks complete quickly

2014-07-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071039#comment-14071039 ] Patrick Wendell commented on SPARK-2282: [~carlilek] I'd actually recommend just

[jira] [Updated] (SPARK-2426) Quadratic Minimization for MLlib ALS

2014-07-22 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2426: - Target Version/s: (was: 1.1.0) Quadratic Minimization for MLlib ALS

[jira] [Resolved] (SPARK-2613) CLONE - word2vec: Distributed Representation of Words

2014-07-22 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-2613. -- Resolution: Duplicate CLONE - word2vec: Distributed Representation of Words

[jira] [Updated] (SPARK-1545) Add Random Forest algorithm to MLlib

2014-07-22 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1545: - Target Version/s: (was: 1.1.0) Add Random Forest algorithm to MLlib

[jira] [Updated] (SPARK-1547) Add gradient boosting algorithm to MLlib

2014-07-22 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1547: - Target Version/s: (was: 1.1.0) Add gradient boosting algorithm to MLlib

[jira] [Updated] (SPARK-2010) Support for nested data in PySpark SQL

2014-07-22 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2010: Priority: Blocker (was: Critical) Support for nested data in PySpark SQL

[jira] [Created] (SPARK-2632) Importing a method of class in Spark REPL causes the REPL to pulls in unnecessary stuff.

2014-07-22 Thread Yin Huai (JIRA)
Yin Huai created SPARK-2632: --- Summary: Importing a method of class in Spark REPL causes the REPL to pulls in unnecessary stuff. Key: SPARK-2632 URL: https://issues.apache.org/jira/browse/SPARK-2632

[jira] [Updated] (SPARK-2632) Importing a method of class in Spark REPL causes the REPL to pulls in unnecessary stuff.

2014-07-22 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-2632: Description: Master is affected by this bug. To reproduce the exception, you can start a local cluster

[jira] [Commented] (SPARK-2282) PySpark crashes if too many tasks complete quickly

2014-07-22 Thread Aaron Davidson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071192#comment-14071192 ] Aaron Davidson commented on SPARK-2282: --- [~pwendell] That would in general be the

[jira] [Commented] (SPARK-2282) PySpark crashes if too many tasks complete quickly

2014-07-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071197#comment-14071197 ] Patrick Wendell commented on SPARK-2282: Ah my b. I was confused. PySpark

[jira] [Commented] (SPARK-2615) Add == support for HiveQl

2014-07-22 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071214#comment-14071214 ] Cheng Hao commented on SPARK-2615: -- Yes, that's true. But == is actually used in lots of

[jira] [Commented] (SPARK-2614) Add the spark-examples-xxx-.jar to the Debian package created by assembly/pom.xml (e.g. -Pdeb)

2014-07-22 Thread Christian Tzolov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071220#comment-14071220 ] Christian Tzolov commented on SPARK-2614: - Fair point [~markhamstra]. I agree

[jira] [Commented] (SPARK-975) Spark Replay Debugger

2014-07-22 Thread Phuoc Do (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071225#comment-14071225 ] Phuoc Do commented on SPARK-975: Cheng Lian, some JS libraries that can draw flow diagrams:

[jira] [Commented] (SPARK-975) Spark Replay Debugger

2014-07-22 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071232#comment-14071232 ] Cheng Lian commented on SPARK-975: -- Hey [~phuocd], that image actually shows exactly the
