[jira] [Commented] (SPARK-5685) Show warning when users open text files compressed with non-splittable algorithms like gzip

2015-02-09 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311931#comment-14311931 ] Nicholas Chammas commented on SPARK-5685: - [~joshrosen] - What do you think of

[jira] [Updated] (SPARK-5682) Reuse hadoop encrypted shuffle algorithm to enable spark encrypted shuffle

2015-02-09 Thread liyunzhang_intel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liyunzhang_intel updated SPARK-5682: Attachment: encrypted_shuffle.patch.4 encrypted_shuffle.patch.4 is how to reuse hadoop

[jira] [Updated] (SPARK-5681) Calling graceful stop() immediately after start() on StreamingContext should not get stuck indefinitely

2015-02-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-5681: - Component/s: Streaming Calling graceful stop() immediately after start() on StreamingContext should

[jira] [Commented] (SPARK-5016) GaussianMixtureEM should distribute matrix inverse for large numFeatures, k

2015-02-09 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311956#comment-14311956 ] Manoj Kumar commented on SPARK-5016: [~tgaloppo] I would like your inputs on this as

[jira] [Commented] (SPARK-5684) Key not found exception is thrown in case location of added partition to a parquet table is different than a path containing the partition values

2015-02-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311943#comment-14311943 ] Apache Spark commented on SPARK-5684: - User 'saucam' has created a pull request for

[jira] [Commented] (SPARK-5281) Registering table on RDD is giving MissingRequirementError

2015-02-09 Thread irene rognoni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311948#comment-14311948 ] irene rognoni commented on SPARK-5281: -- same issue here, since last week, no news on

[jira] [Commented] (SPARK-5676) License missing from spark-ec2 repo

2015-02-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311990#comment-14311990 ] Sean Owen commented on SPARK-5676: -- Dumb question, I know, but what is the spark-ec2 repo

[jira] [Commented] (SPARK-5679) Flaky tests in InputOutputMetricsSuite: input metrics with interleaved reads and input metrics with mixed read method

2015-02-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311994#comment-14311994 ] Sean Owen commented on SPARK-5679: -- Same as SPARK-5227? Flaky tests in

[jira] [Resolved] (SPARK-5473) Expose SSH failures after status checks pass

2015-02-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-5473. -- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4262

[jira] [Updated] (SPARK-5473) Expose SSH failures after status checks pass

2015-02-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-5473: - Assignee: Nicholas Chammas Expose SSH failures after status checks pass

[jira] [Commented] (SPARK-5239) JdbcRDD throws java.lang.AbstractMethodError: oracle.jdbc.driver.xxxxxx.isClosed()Z

2015-02-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312040#comment-14312040 ] Apache Spark commented on SPARK-5239: - User 'srowen' has created a pull request for

[jira] [Updated] (SPARK-5688) In Decision Trees, choosing a random subset of categories for each split

2015-02-09 Thread Eric Denovitzer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Denovitzer updated SPARK-5688: --- Labels: categorical decisiontree (was: categorical) In Decision Trees, choosing a random

[jira] [Updated] (SPARK-4423) Improve foreach() documentation to avoid confusion between local- and cluster-mode behavior

2015-02-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-4423: - Assignee: Ilya Ganelin Improve foreach() documentation to avoid confusion between local- and

[jira] [Commented] (SPARK-4655) Split Stage into ShuffleMapStage and ResultStage subclasses

2015-02-09 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312458#comment-14312458 ] Ilya Ganelin commented on SPARK-4655: - Hi [~joshrosen], I'd be happy to work on this.

[jira] [Commented] (SPARK-4423) Improve foreach() documentation to avoid confusion between local- and cluster-mode behavior

2015-02-09 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312424#comment-14312424 ] Ilya Ganelin commented on SPARK-4423: - I'll be happy to update this. Thank you.

[jira] [Commented] (SPARK-5651) Support 'create db.table' in HiveContext

2015-02-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312344#comment-14312344 ] Apache Spark commented on SPARK-5651: - User 'OopsOutOfMemory' has created a pull

[jira] [Commented] (SPARK-5570) No docs stating that `new SparkConf().set(spark.driver.memory, ...)` will not work

2015-02-09 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312416#comment-14312416 ] Ilya Ganelin commented on SPARK-5570: - I'll fix this, can you please assign it to me?

[jira] [Updated] (SPARK-823) spark.default.parallelism's default is inconsistent across scheduler backends

2015-02-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-823: Assignee: Ilya Ganelin spark.default.parallelism's default is inconsistent across scheduler backends

[jira] [Commented] (SPARK-4705) Driver retries in yarn-cluster mode always fail if event logging is enabled

2015-02-09 Thread Twinkle Sachdeva (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312328#comment-14312328 ] Twinkle Sachdeva commented on SPARK-4705: - Hi [~vanzin] Please take a look at the

[jira] [Commented] (SPARK-5687) in TaskResultGetter need to catch OutOfMemoryError.

2015-02-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312365#comment-14312365 ] Apache Spark commented on SPARK-5687: - User 'lianhuiwang' has created a pull request

[jira] [Updated] (SPARK-5688) Splits for Categorical Variables in DecisionTrees

2015-02-09 Thread Eric Denovitzer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Denovitzer updated SPARK-5688: --- Summary: Splits for Categorical Variables in DecisionTrees (was: In Decision Trees, choosing

[jira] [Created] (SPARK-5688) In Decision Trees, choosing a random subset of categories for each split

2015-02-09 Thread Eric Denovitzer (JIRA)
Eric Denovitzer created SPARK-5688: -- Summary: In Decision Trees, choosing a random subset of categories for each split Key: SPARK-5688 URL: https://issues.apache.org/jira/browse/SPARK-5688 Project:

[jira] [Commented] (SPARK-5688) Splits for Categorical Variables in DecisionTrees

2015-02-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312403#comment-14312403 ] Apache Spark commented on SPARK-5688: - User 'edenovit' has created a pull request for

[jira] [Commented] (SPARK-5079) Detect failed jobs / batches in Spark Streaming unit tests

2015-02-09 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312415#comment-14312415 ] Ilya Ganelin commented on SPARK-5079: - I can work on this - can you please assign it

[jira] [Created] (SPARK-5689) Document what can be run in different YARN modes

2015-02-09 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-5689: Summary: Document what can be run in different YARN modes Key: SPARK-5689 URL: https://issues.apache.org/jira/browse/SPARK-5689 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-823) spark.default.parallelism's default is inconsistent across scheduler backends

2015-02-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-823. - Resolution: Fixed spark.default.parallelism's default is inconsistent across scheduler backends

[jira] [Updated] (SPARK-4705) Driver retries in yarn-cluster mode always fail if event logging is enabled

2015-02-09 Thread Twinkle Sachdeva (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Twinkle Sachdeva updated SPARK-4705: Attachment: multi-attempts with no attempt based UI.png Driver retries in yarn-cluster

[jira] [Updated] (SPARK-5703) JobProgressListener throws empty.max error

2015-02-09 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5703: - Description: In JobProgressListener, if you have a JobEnd that does not have a corresponding JobStart,

[jira] [Updated] (SPARK-5703) JobProgressListener throws empty.max error

2015-02-09 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5703: - Summary: JobProgressListener throws empty.max error (was: JobProgressListener throws empty.max error in

[jira] [Created] (SPARK-5703) JobProgressListener throws empty.max error in HS

2015-02-09 Thread Andrew Or (JIRA)
Andrew Or created SPARK-5703: Summary: JobProgressListener throws empty.max error in HS Key: SPARK-5703 URL: https://issues.apache.org/jira/browse/SPARK-5703 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-4105) FAILED_TO_UNCOMPRESS(5) errors when fetching shuffle data with sort-based shuffle

2015-02-09 Thread Mark Khaitman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313652#comment-14313652 ] Mark Khaitman commented on SPARK-4105: -- We're running 1.2.1-rc2 on our cluster and

[jira] [Created] (SPARK-5707) Enabling spark.sql.codegen throws ClassNotFound exception

2015-02-09 Thread Yi Yao (JIRA)
Yi Yao created SPARK-5707: - Summary: Enabling spark.sql.codegen throws ClassNotFound exception Key: SPARK-5707 URL: https://issues.apache.org/jira/browse/SPARK-5707 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-5708) Add Slf4jSink to Spark Metrics Sink

2015-02-09 Thread Judy Nash (JIRA)
Judy Nash created SPARK-5708: Summary: Add Slf4jSink to Spark Metrics Sink Key: SPARK-5708 URL: https://issues.apache.org/jira/browse/SPARK-5708 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-2653) Heap size should be the sum of driver.memory and executor.memory in local mode

2015-02-09 Thread liu chang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313664#comment-14313664 ] liu chang commented on SPARK-2653: -- Hi, Davies, what's wrong with this? Heap size

[jira] [Commented] (SPARK-5703) AllJobsPage throws empty.max error

2015-02-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313332#comment-14313332 ] Apache Spark commented on SPARK-5703: - User 'andrewor14' has created a pull request

[jira] [Commented] (SPARK-5597) Model import/export for DecisionTree and ensembles

2015-02-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313388#comment-14313388 ] Apache Spark commented on SPARK-5597: - User 'mengxr' has created a pull request for

[jira] [Comment Edited] (SPARK-5558) pySpark zip function unexpected errors

2015-02-09 Thread Charles Hayden (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313434#comment-14313434 ] Charles Hayden edited comment on SPARK-5558 at 2/10/15 2:59 AM:

[jira] [Created] (SPARK-5706) Support inference schema from a single json string

2015-02-09 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-5706: Summary: Support inference schema from a single json string Key: SPARK-5706 URL: https://issues.apache.org/jira/browse/SPARK-5706 Project: Spark Issue Type:

[jira] [Commented] (SPARK-5682) Reuse hadoop encrypted shuffle algorithm to enable spark encrypted shuffle

2015-02-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313380#comment-14313380 ] Apache Spark commented on SPARK-5682: - User 'kellyzly' has created a pull request for

[jira] [Updated] (SPARK-5682) Reuse hadoop encrypted shuffle algorithm to enable spark encrypted shuffle

2015-02-09 Thread liyunzhang_intel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liyunzhang_intel updated SPARK-5682: Attachment: (was: encrypted_shuffle.patch.4) Reuse hadoop encrypted shuffle algorithm

[jira] [Commented] (SPARK-5705) Explore GPU-accelerated Linear Algebra Libraries

2015-02-09 Thread Evan Sparks (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313398#comment-14313398 ] Evan Sparks commented on SPARK-5705: This JIRA is a continuation of this thread:

[jira] [Commented] (SPARK-5558) pySpark zip function unexpected errors

2015-02-09 Thread Charles Hayden (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313434#comment-14313434 ] Charles Hayden commented on SPARK-5558: --- This seems to be working as expected in 1.3

[jira] [Resolved] (SPARK-5597) Model import/export for DecisionTree and ensembles

2015-02-09 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-5597. -- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4493

[jira] [Updated] (SPARK-5704) createDataFrame replace applySchema/inferSchema

2015-02-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-5704: -- Issue Type: Sub-task (was: New Feature) Parent: SPARK-5166 createDataFrame replace

[jira] [Created] (SPARK-5705) Explore GPU-accelerated Linear Algebra Libraries

2015-02-09 Thread Evan Sparks (JIRA)
Evan Sparks created SPARK-5705: -- Summary: Explore GPU-accelerated Linear Algebra Libraries Key: SPARK-5705 URL: https://issues.apache.org/jira/browse/SPARK-5705 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-2653) Heap size should be the sum of driver.memory and executor.memory in local mode

2015-02-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313680#comment-14313680 ] Davies Liu commented on SPARK-2653: --- Right now, in local mode, only one of

[jira] [Created] (SPARK-5709) Add EXPLAIN support for DataFrame API for debugging purpose

2015-02-09 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-5709: Summary: Add EXPLAIN support for DataFrame API for debugging purpose Key: SPARK-5709 URL: https://issues.apache.org/jira/browse/SPARK-5709 Project: Spark Issue

[jira] [Created] (SPARK-5702) Allow short names for built-in data sources

2015-02-09 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-5702: -- Summary: Allow short names for built-in data sources Key: SPARK-5702 URL: https://issues.apache.org/jira/browse/SPARK-5702 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-5703) AllJobsPage throws empty.max error

2015-02-09 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5703: - Summary: AllJobsPage throws empty.max error (was: JobProgressListener throws empty.max error)

[jira] [Commented] (SPARK-2653) Heap size should be the sum of driver.memory and executor.memory in local mode

2015-02-09 Thread liu chang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313663#comment-14313663 ] liu chang commented on SPARK-2653: -- Hi, Davies, what's wrong with this? Heap size

[jira] [Created] (SPARK-5710) Combines two adjacent `Cast` expressions into one

2015-02-09 Thread guowei (JIRA)
guowei created SPARK-5710: - Summary: Combines two adjacent `Cast` expressions into one Key: SPARK-5710 URL: https://issues.apache.org/jira/browse/SPARK-5710 Project: Spark Issue Type: Improvement

[jira] [Comment Edited] (SPARK-5570) No docs stating that `new SparkConf().set(spark.driver.memory, ...)` will not work

2015-02-09 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312416#comment-14312416 ] Ilya Ganelin edited comment on SPARK-5570 at 2/9/15 4:27 PM: -

[jira] [Commented] (SPARK-823) spark.default.parallelism's default is inconsistent across scheduler backends

2015-02-09 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312382#comment-14312382 ] Ilya Ganelin commented on SPARK-823: Hi [~joshrosen] I believe the documentation is up

[jira] [Commented] (SPARK-4600) org.apache.spark.graphx.VertexRDD.diff does not work

2015-02-09 Thread Brennon York (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312589#comment-14312589 ] Brennon York commented on SPARK-4600: - I can take this, thanks!

[jira] [Resolved] (SPARK-1142) Allow adding jars on app submission, outside of code

2015-02-09 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-1142. Resolution: Not a Problem Allow adding jars on app submission, outside of code

[jira] [Created] (SPARK-5691) Preventing duplicate registering of an application has incorrect logic

2015-02-09 Thread Matt Cheah (JIRA)
Matt Cheah created SPARK-5691: - Summary: Preventing duplicate registering of an application has incorrect logic Key: SPARK-5691 URL: https://issues.apache.org/jira/browse/SPARK-5691 Project: Spark

[jira] [Created] (SPARK-5692) Model import/export for Word2Vec

2015-02-09 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-5692: Summary: Model import/export for Word2Vec Key: SPARK-5692 URL: https://issues.apache.org/jira/browse/SPARK-5692 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-5692) Model import/export for Word2Vec

2015-02-09 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5692: - Description: Support save and load for Word2VecModel. We may want to discuss whether we want to

[jira] [Created] (SPARK-5693) Install Pandas on Jenkins machines and enable to_pandas doctest for DataFrames

2015-02-09 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-5693: -- Summary: Install Pandas on Jenkins machines and enable to_pandas doctest for DataFrames Key: SPARK-5693 URL: https://issues.apache.org/jira/browse/SPARK-5693 Project:

[jira] [Created] (SPARK-5694) Python API for evaluation metrics

2015-02-09 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-5694: Summary: Python API for evaluation metrics Key: SPARK-5694 URL: https://issues.apache.org/jira/browse/SPARK-5694 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-4900) MLlib SingularValueDecomposition ARPACK IllegalStateException

2015-02-09 Thread Mike Beyer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312769#comment-14312769 ] Mike Beyer commented on SPARK-4900: --- put a snapshot test data 1000x1000 matrix to

[jira] [Updated] (SPARK-5195) when hive table is query with alias the cache data lose effectiveness.

2015-02-09 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5195: --- Assignee: yixiaohua when hive table is query with alias the cache data lose effectiveness.

[jira] [Commented] (SPARK-5343) ShortestPaths traverses backwards

2015-02-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312651#comment-14312651 ] Apache Spark commented on SPARK-5343: - User 'brennonyork' has created a pull request

[jira] [Updated] (SPARK-5679) Flaky tests in InputOutputMetricsSuite: input metrics with interleaved reads and input metrics with mixed read method

2015-02-09 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-5679: -- Labels: flaky-test (was: ) Flaky tests in InputOutputMetricsSuite: input metrics with interleaved

[jira] [Resolved] (SPARK-5664) Restore stty settings when exiting for launching spark-shell from SBT

2015-02-09 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-5664. - Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4451

[jira] [Resolved] (SPARK-3242) Spark 1.0.2 ec2 scripts creates clusters with Spark 1.0.1 installed by default

2015-02-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-3242. -- Resolution: Fixed Resolved by

[jira] [Closed] (SPARK-5311) EventLoggingListener throws exception if log directory does not exist

2015-02-09 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-5311. Resolution: Won't Fix Assignee: Josh Rosen EventLoggingListener throws exception if log directory

[jira] [Commented] (SPARK-5589) Split pyspark/sql.py into multiple files

2015-02-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312781#comment-14312781 ] Apache Spark commented on SPARK-5589: - User 'davies' has created a pull request for

[jira] [Created] (SPARK-5695) Check GBT caching logic

2015-02-09 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-5695: Summary: Check GBT caching logic Key: SPARK-5695 URL: https://issues.apache.org/jira/browse/SPARK-5695 Project: Spark Issue Type: Task Components:

[jira] [Updated] (SPARK-4900) MLlib SingularValueDecomposition ARPACK IllegalStateException

2015-02-09 Thread Mike Beyer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Beyer updated SPARK-4900: -- Affects Version/s: 1.2.1 MLlib SingularValueDecomposition ARPACK IllegalStateException

[jira] [Commented] (SPARK-4900) MLlib SingularValueDecomposition ARPACK IllegalStateException

2015-02-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312795#comment-14312795 ] Sean Owen commented on SPARK-4900: -- So I think there is at least a small problem in the

[jira] [Created] (SPARK-5683) Improve the json serialization for DataFrame API

2015-02-09 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-5683: Summary: Improve the json serialization for DataFrame API Key: SPARK-5683 URL: https://issues.apache.org/jira/browse/SPARK-5683 Project: Spark Issue Type:

[jira] [Commented] (SPARK-5683) Improve the json serialization for DataFrame API

2015-02-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311923#comment-14311923 ] Apache Spark commented on SPARK-5683: - User 'chenghao-intel' has created a pull

[jira] [Updated] (SPARK-5684) Key not found exception is thrown in case location of added partition to a parquet table is different than a path containing the partition values

2015-02-09 Thread Yash Datta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yash Datta updated SPARK-5684: -- Priority: Major (was: Critical) Key not found exception is thrown in case location of added partition

[jira] [Created] (SPARK-5685) Show warning when users open text files compressed with non-splittable algorithms like gzip

2015-02-09 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-5685: --- Summary: Show warning when users open text files compressed with non-splittable algorithms like gzip Key: SPARK-5685 URL: https://issues.apache.org/jira/browse/SPARK-5685

[jira] [Commented] (SPARK-5710) Combines two adjacent `Cast` expressions into one

2015-02-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313729#comment-14313729 ] Apache Spark commented on SPARK-5710: - User 'guowei2' has created a pull request for

[jira] [Created] (SPARK-5711) Sort Shuffle performance issues about using AppendOnlyMap for large data sets

2015-02-09 Thread Sun Fulin (JIRA)
Sun Fulin created SPARK-5711: Summary: Sort Shuffle performance issues about using AppendOnlyMap for large data sets Key: SPARK-5711 URL: https://issues.apache.org/jira/browse/SPARK-5711 Project: Spark

[jira] [Commented] (SPARK-5700) Bump jets3t version from 0.9.2 to 0.9.3 in hadoop-2.3 and hadoop-2.4 profiles

2015-02-09 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313746#comment-14313746 ] Josh Rosen commented on SPARK-5700: --- Looks like 0.9.3 is now on Maven Central:

[jira] [Commented] (SPARK-5704) createDataFrame replace applySchema/inferSchema

2015-02-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313763#comment-14313763 ] Apache Spark commented on SPARK-5704: - User 'davies' has created a pull request for

[jira] [Updated] (SPARK-4655) Split Stage into ShuffleMapStage and ResultStage subclasses

2015-02-09 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-4655: -- Target Version/s: 1.4.0 (was: 1.3.0) Assignee: Ilya Ganelin (was: Josh Rosen) Hi

[jira] [Commented] (SPARK-5678) DataFrame.to_pandas

2015-02-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312568#comment-14312568 ] Apache Spark commented on SPARK-5678: - User 'davies' has created a pull request for

[jira] [Closed] (SPARK-4267) Failing to launch jobs on Spark on YARN with Hadoop 2.5.0 or later

2015-02-09 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-4267. Resolution: Fixed Fix Version/s: 1.3.0 Assignee: Sean Owen Failing to launch jobs on Spark

[jira] [Commented] (SPARK-5679) Flaky tests in InputOutputMetricsSuite: input metrics with interleaved reads and input metrics with mixed read method

2015-02-09 Thread Kostas Sakellis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312600#comment-14312600 ] Kostas Sakellis commented on SPARK-5679: I have tried to repro this in a number of

[jira] [Updated] (SPARK-5690) Flaky test: org.apache.spark.deploy.rest.StandaloneRestSubmitSuite.simple submit until completion

2015-02-09 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5690: - Affects Version/s: 1.3.0 Flaky test: org.apache.spark.deploy.rest.StandaloneRestSubmitSuite.simple

[jira] [Updated] (SPARK-5690) Flaky test: org.apache.spark.deploy.rest.StandaloneRestSubmitSuite.simple submit until completion

2015-02-09 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5690: - Priority: Critical (was: Major) Flaky test:

[jira] [Created] (SPARK-5690) Flaky test:

2015-02-09 Thread Patrick Wendell (JIRA)
Patrick Wendell created SPARK-5690: -- Summary: Flaky test: Key: SPARK-5690 URL: https://issues.apache.org/jira/browse/SPARK-5690 Project: Spark Issue Type: Bug Components: Tests

[jira] [Updated] (SPARK-5690) Flaky test: org.apache.spark.deploy.rest.StandaloneRestSubmitSuite.simple submit until completion

2015-02-09 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5690: --- Summary: Flaky test: org.apache.spark.deploy.rest.StandaloneRestSubmitSuite.simple submit

[jira] [Updated] (SPARK-5689) Document what can be run in different YARN modes

2015-02-09 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5689: --- Issue Type: Documentation (was: Improvement) Document what can be run in different YARN

[jira] [Commented] (SPARK-1142) Allow adding jars on app submission, outside of code

2015-02-09 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312608#comment-14312608 ] Patrick Wendell commented on SPARK-1142: This already exists - you can use the

[jira] [Commented] (SPARK-5685) Show warning when users open text files compressed with non-splittable algorithms like gzip

2015-02-09 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312622#comment-14312622 ] Josh Rosen commented on SPARK-5685: --- [~nchammas], in general I'm a huge fan of runtime

[jira] [Commented] (SPARK-5343) ShortestPaths traverses backwards

2015-02-09 Thread Brennon York (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312591#comment-14312591 ] Brennon York commented on SPARK-5343: - I'll take this issue, thanks. ShortestPaths

[jira] [Commented] (SPARK-3355) Allow running maven tests in run-tests

2015-02-09 Thread Brennon York (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312588#comment-14312588 ] Brennon York commented on SPARK-3355: - I've started this and should have the fix up

[jira] [Commented] (SPARK-5691) Preventing duplicate registering of an application has incorrect logic

2015-02-09 Thread Matt Cheah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312632#comment-14312632 ] Matt Cheah commented on SPARK-5691: --- I've determined that this is a pretty simple bug in

[jira] [Assigned] (SPARK-4600) org.apache.spark.graphx.VertexRDD.diff does not work

2015-02-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-4600: Assignee: Brennon York org.apache.spark.graphx.VertexRDD.diff does not work

[jira] [Commented] (SPARK-5691) Preventing duplicate registering of an application has incorrect logic

2015-02-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312644#comment-14312644 ] Apache Spark commented on SPARK-5691: - User 'mccheah' has created a pull request for

[jira] [Commented] (SPARK-5676) License missing from spark-ec2 repo

2015-02-09 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14313254#comment-14313254 ] Nicholas Chammas commented on SPARK-5676: - It ended up in Mesos because [Spark

[jira] [Commented] (SPARK-5676) License missing from spark-ec2 repo

2015-02-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14313260#comment-14313260 ] Sean Owen commented on SPARK-5676: -- Ah you're saying it isn't even part of Mesos. I see

[jira] [Commented] (SPARK-5676) License missing from spark-ec2 repo

2015-02-09 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14313269#comment-14313269 ] Shivaram Venkataraman commented on SPARK-5676: -- Yes - it is managed as a

[jira] [Resolved] (SPARK-5648) support alter ... unset tblproperties(key)

2015-02-09 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-5648. - Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4424

[jira] [Commented] (SPARK-5676) License missing from spark-ec2 repo

2015-02-09 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14313271#comment-14313271 ] Nicholas Chammas commented on SPARK-5676: - Yeah, AFAIK it has nothing to do with
