[jira] [Commented] (SPARK-7045) Word2Vec: avoid intermediate representation when creating model

2015-04-22 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507320#comment-14507320 ] Joseph K. Bradley commented on SPARK-7045: -- Oh, ok, I bet tmp() needs to be

[jira] [Commented] (SPARK-5206) Accumulators are not re-registered during recovering from checkpoint

2015-04-22 Thread Zhichao Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507322#comment-14507322 ] Zhichao Zhang commented on SPARK-5206: --- according to the way what you said, it

[jira] [Comment Edited] (SPARK-5206) Accumulators are not re-registered during recovering from checkpoint

2015-04-22 Thread Zhichao Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507322#comment-14507322 ] Zhichao Zhang edited comment on SPARK-5206 at 4/22/15 4:10 PM:

[jira] [Comment Edited] (SPARK-5206) Accumulators are not re-registered during recovering from checkpoint

2015-04-22 Thread Zhichao Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507322#comment-14507322 ] Zhichao Zhang edited comment on SPARK-5206 at 4/22/15 4:23 PM:

[jira] [Commented] (SPARK-7043) KryoSerializer cannot be used with REPL to interpret code in which case class definition and its shipping are in the same line

2015-04-22 Thread Peng Cheng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507318#comment-14507318 ] Peng Cheng commented on SPARK-7043: --- Sorry, didn't read all the issues. Some of them may

[jira] [Commented] (SPARK-6923) Get invalid hive table columns after save DataFrame to hive table

2015-04-22 Thread pin_zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507182#comment-14507182 ] pin_zhang commented on SPARK-6923: -- Hi, Michael Can you help to comment. we have a such

[jira] [Comment Edited] (SPARK-7045) Word2Vec: avoid intermediate representation when creating model

2015-04-22 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507199#comment-14507199 ] Manoj Kumar edited comment on SPARK-7045 at 4/22/15 2:56 PM: -

[jira] [Assigned] (SPARK-7039) JdbcRdd doesn't support java.sql.Types.NVARCHAR

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7039: --- Assignee: (was: Apache Spark) JdbcRdd doesn't support java.sql.Types.NVARCHAR

[jira] [Commented] (SPARK-7056) Make the WriteAheadLog pluggable

2015-04-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507057#comment-14507057 ] Tathagata Das commented on SPARK-7056: -- Users may want the WAL data to be written to

[jira] [Assigned] (SPARK-7039) JdbcRdd doesn't support java.sql.Types.NVARCHAR

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7039: --- Assignee: Apache Spark JdbcRdd doesn't support java.sql.Types.NVARCHAR

[jira] [Commented] (SPARK-7039) JdbcRdd doesn't support java.sql.Types.NVARCHAR

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507162#comment-14507162 ] Apache Spark commented on SPARK-7039: - User 'szheng79' has created a pull request for

[jira] [Commented] (SPARK-7045) Word2Vec: avoid intermediate representation when creating model

2015-04-22 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507199#comment-14507199 ] Manoj Kumar commented on SPARK-7045: I did try the method you suggested, but it does

[jira] [Commented] (SPARK-7042) Spark version of akka-actor_2.11 is not compatible with the official akka-actor_2.11 2.3.x

2015-04-22 Thread Konstantin Shaposhnikov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507243#comment-14507243 ] Konstantin Shaposhnikov commented on SPARK-7042: Is my understanding

[jira] [Resolved] (SPARK-7048) No toRDD or saveToText method for matrices

2015-04-22 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-7048. -- Resolution: Not A Problem Fix Version/s: (was: 1.3.1) Target Version/s: (was:

[jira] [Assigned] (SPARK-6231) Join on two tables (generated from same one) is broken

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-6231: --- Assignee: (was: Apache Spark) Join on two tables (generated from same one) is broken

[jira] [Commented] (SPARK-6231) Join on two tables (generated from same one) is broken

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507214#comment-14507214 ] Apache Spark commented on SPARK-6231: - User 'yhuai' has created a pull request for

[jira] [Commented] (SPARK-7048) No toRDD or saveToText method for matrices

2015-04-22 Thread Ted Fujimoto (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507274#comment-14507274 ] Ted Fujimoto commented on SPARK-7048: - My problem is the one mentioned in the link in

[jira] [Assigned] (SPARK-6231) Join on two tables (generated from same one) is broken

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-6231: --- Assignee: Apache Spark Join on two tables (generated from same one) is broken

[jira] [Commented] (SPARK-5206) Accumulators are not re-registered during recovering from checkpoint

2015-04-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507545#comment-14507545 ] Tathagata Das commented on SPARK-5206: -- 1. Yes, there is no built-in way to persist

[jira] [Commented] (SPARK-7017) Refactor dev/run-tests into Python

2015-04-22 Thread Brennon York (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507558#comment-14507558 ] Brennon York commented on SPARK-7017: - [~pwendell] can you assign this to me? Seems I

[jira] [Assigned] (SPARK-7058) Task deserialization time metric does not include time to deserialize broadcasted RDDs

2015-04-22 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-7058: - Assignee: Josh Rosen Task deserialization time metric does not include time to deserialize

[jira] [Updated] (SPARK-6797) Add support for YARN cluster mode

2015-04-22 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shivaram Venkataraman updated SPARK-6797: - Assignee: Sun Rui Add support for YARN cluster mode

[jira] [Commented] (SPARK-7058) Task deserialization time metric does not include time to deserialize broadcasted RDDs

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507565#comment-14507565 ] Apache Spark commented on SPARK-7058: - User 'JoshRosen' has created a pull request for

[jira] [Created] (SPARK-7058) Task deserialization time metric does not include time to deserialize broadcasted RDDs

2015-04-22 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-7058: - Summary: Task deserialization time metric does not include time to deserialize broadcasted RDDs Key: SPARK-7058 URL: https://issues.apache.org/jira/browse/SPARK-7058

[jira] [Resolved] (SPARK-7052) Add ThreadUtils and move thread methods from Utils to ThreadUtils

2015-04-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-7052. Resolution: Fixed Fix Version/s: 1.4.0 Assignee: Shixiong Zhu Add ThreadUtils and

[jira] [Updated] (SPARK-7017) Refactor dev/run-tests into Python

2015-04-22 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-7017: - Assignee: Brennon York Refactor dev/run-tests into Python

[jira] [Commented] (SPARK-5945) Spark should not retry a stage infinitely on a FetchFailedException

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507583#comment-14507583 ] Apache Spark commented on SPARK-5945: - User 'ilganeli' has created a pull request for

[jira] [Created] (SPARK-7059) Create a join API to facilitate equijoin and self join

2015-04-22 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-7059: -- Summary: Create a join API to facilitate equijoin and self join Key: SPARK-7059 URL: https://issues.apache.org/jira/browse/SPARK-7059 Project: Spark Issue Type:

[jira] [Assigned] (SPARK-7060) Missing alias function on Python DataFrame

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7060: --- Assignee: Yin Huai (was: Apache Spark) Missing alias function on Python DataFrame

[jira] [Commented] (SPARK-7060) Missing alias function on Python DataFrame

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507638#comment-14507638 ] Apache Spark commented on SPARK-7060: - User 'yhuai' has created a pull request for

[jira] [Updated] (SPARK-7035) Drop __getattr__ on pyspark.sql.DataFrame

2015-04-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7035: --- Issue Type: Sub-task (was: Improvement) Parent: SPARK-6116 Drop __getattr__ on

[jira] [Updated] (SPARK-7058) Task deserialization time metric does not include time to deserialize broadcasted RDDs

2015-04-22 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-7058: -- Affects Version/s: 1.4.0 1.2.3 1.1.2

[jira] [Updated] (SPARK-6231) Missing alias function on Python DataFrame

2015-04-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-6231: --- Summary: Missing alias function on Python DataFrame (was: Join on two tables (generated from same

[jira] [Updated] (SPARK-6231) Join on two tables (generated from same one) is broken

2015-04-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-6231: --- Issue Type: Sub-task (was: Bug) Parent: SPARK-6116 Join on two tables (generated from same

[jira] [Assigned] (SPARK-2315) drop, dropRight and dropWhile which take RDD input and return RDD

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-2315: --- Assignee: Erik Erlandson (was: Apache Spark) drop, dropRight and dropWhile which take RDD

[jira] [Assigned] (SPARK-2315) drop, dropRight and dropWhile which take RDD input and return RDD

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-2315: --- Assignee: Apache Spark (was: Erik Erlandson) drop, dropRight and dropWhile which take RDD

[jira] [Commented] (SPARK-6923) Spark SQL CLI does not read Data Source schema correctly

2015-04-22 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507629#comment-14507629 ] Michael Armbrust commented on SPARK-6923: - It sounds like you have hit a bug in

[jira] [Reopened] (SPARK-6923) Get invalid hive table columns after save DataFrame to hive table

2015-04-22 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust reopened SPARK-6923: - Get invalid hive table columns after save DataFrame to hive table

[jira] [Updated] (SPARK-6923) Spark SQL CLI does not read Data Source schema correctly

2015-04-22 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-6923: Summary: Spark SQL CLI does not read Data Source schema correctly (was: Get invalid hive

[jira] [Created] (SPARK-7060) Missing alias function on Python DataFrame

2015-04-22 Thread Yin Huai (JIRA)
Yin Huai created SPARK-7060: --- Summary: Missing alias function on Python DataFrame Key: SPARK-7060 URL: https://issues.apache.org/jira/browse/SPARK-7060 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-4865) Include temporary tables in SHOW TABLES

2015-04-22 Thread Abhishek Tripathi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507651#comment-14507651 ] Abhishek Tripathi commented on SPARK-4865: -- Hi, It seems like issue is resolved

[jira] [Assigned] (SPARK-5945) Spark should not retry a stage infinitely on a FetchFailedException

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-5945: --- Assignee: Ilya Ganelin (was: Apache Spark) Spark should not retry a stage infinitely on a

[jira] [Assigned] (SPARK-5945) Spark should not retry a stage infinitely on a FetchFailedException

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-5945: --- Assignee: Apache Spark (was: Ilya Ganelin) Spark should not retry a stage infinitely on a

[jira] [Commented] (SPARK-6231) Join on two tables (generated from same one) is broken

2015-04-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507596#comment-14507596 ] Reynold Xin commented on SPARK-6231: I'm wondering if we should have a self join

[jira] [Assigned] (SPARK-7060) Missing alias function on Python DataFrame

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7060: --- Assignee: Apache Spark (was: Yin Huai) Missing alias function on Python DataFrame

[jira] [Updated] (SPARK-6231) Join on two tables (generated from same one) is broken

2015-04-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-6231: --- Summary: Join on two tables (generated from same one) is broken (was: Missing alias function on

[jira] [Updated] (SPARK-6116) Making DataFrame API non-experimental

2015-04-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-6116: --- Target Version/s: 1.5.0 (was: 1.4.0) Making DataFrame API non-experimental

[jira] [Commented] (SPARK-7053) KafkaUtils.createStream leaks resources

2015-04-22 Thread Platon Potapov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507781#comment-14507781 ] Platon Potapov commented on SPARK-7053: --- no, i don't think memory is being leaked.

[jira] [Resolved] (SPARK-7039) JdbcRdd doesn't support java.sql.Types.NVARCHAR

2015-04-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-7039. Resolution: Fixed Assignee: Shuai Zheng Target Version/s: (was: 1.3.1)

[jira] [Commented] (SPARK-7053) KafkaUtils.createStream leaks resources

2015-04-22 Thread Platon Potapov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507754#comment-14507754 ] Platon Potapov commented on SPARK-7053: --- i've reproduced the problem again - ran the

[jira] [Commented] (SPARK-7053) KafkaUtils.createStream leaks resources

2015-04-22 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507774#comment-14507774 ] Sean Owen commented on SPARK-7053: -- Looks like 100MB of byte arrays. You're not out of

[jira] [Commented] (SPARK-7053) KafkaUtils.createStream leaks resources

2015-04-22 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508153#comment-14508153 ] Sean Owen commented on SPARK-7053: -- Fair point, that's hard to explain if you definitely

[jira] [Updated] (SPARK-7056) Make the WriteAheadLog pluggable

2015-04-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-7056: - Description: Users may want the WAL data to be written to non-HDFS data storage systems. To

[jira] [Comment Edited] (SPARK-7056) Make the WriteAheadLog pluggable

2015-04-22 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507057#comment-14507057 ] Tathagata Das edited comment on SPARK-7056 at 4/23/15 12:44 AM:

[jira] [Commented] (SPARK-6290) spark.ml.param.Params.checkInputColumn bug upon error

2015-04-22 Thread Glenn Weidner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508246#comment-14508246 ] Glenn Weidner commented on SPARK-6290: -- I would like to work on this. In a test

[jira] [Commented] (SPARK-6891) ExecutorAllocationManager will request negative number executors

2015-04-22 Thread meiyoula (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508308#comment-14508308 ] meiyoula commented on SPARK-6891: - I test the master branch of github. I think you can run

[jira] [Resolved] (SPARK-6967) Internal DateType not handled correctly in caching

2015-04-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-6967. Resolution: Fixed Fix Version/s: 1.4.0 1.3.2 Internal DateType not

[jira] [Comment Edited] (SPARK-6967) Internal DateType not handled correctly in caching

2015-04-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499574#comment-14499574 ] Reynold Xin edited comment on SPARK-6967 at 4/23/15 2:19 AM: -

[jira] [Created] (SPARK-7066) VectorAssembler should use NumericType and StringType, not NativeType

2015-04-22 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-7066: -- Summary: VectorAssembler should use NumericType and StringType, not NativeType Key: SPARK-7066 URL: https://issues.apache.org/jira/browse/SPARK-7066 Project: Spark

[jira] [Assigned] (SPARK-7066) VectorAssembler should use NumericType and StringType, not NativeType

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7066: --- Assignee: Reynold Xin (was: Apache Spark) VectorAssembler should use NumericType and

[jira] [Assigned] (SPARK-7066) VectorAssembler should use NumericType and StringType, not NativeType

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7066: --- Assignee: Apache Spark (was: Reynold Xin) VectorAssembler should use NumericType and

[jira] [Updated] (SPARK-5288) Stabilize Spark SQL data type API followup

2015-04-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-5288: --- Parent Issue: SPARK-6116 (was: SPARK-5166) Stabilize Spark SQL data type API followup

[jira] [Updated] (SPARK-7064) Adding binary sparse vector support

2015-04-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-7064: --- Component/s: ML Adding binary sparse vector support

[jira] [Created] (SPARK-7062) Parquet compression does not work for Spark SQL loading

2015-04-22 Thread Yi Yao (JIRA)
Yi Yao created SPARK-7062: - Summary: Parquet compression does not work for Spark SQL loading Key: SPARK-7062 URL: https://issues.apache.org/jira/browse/SPARK-7062 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-6378) srcAttr in graph.triplets don't update when the size of graph is huge

2015-04-22 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508262#comment-14508262 ] Liang-Chi Hsieh commented on SPARK-6378: After checking the codes, looks like that

[jira] [Commented] (SPARK-7063) when lz4 compression is used, it causes core dump

2015-04-22 Thread Jenny MA (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508299#comment-14508299 ] Jenny MA commented on SPARK-7063: - will send out a pull request shortly. when lz4

[jira] [Commented] (SPARK-5659) Flaky test: o.a.s.streaming.ReceiverSuite.block

2015-04-22 Thread Fei Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508302#comment-14508302 ] Fei Wang commented on SPARK-5659: - my locally test with dev/run-tests also go into this

[jira] [Assigned] (SPARK-7063) when lz4 compression is used, it causes core dump

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7063: --- Assignee: (was: Apache Spark) when lz4 compression is used, it causes core dump

[jira] [Commented] (SPARK-7063) when lz4 compression is used, it causes core dump

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508323#comment-14508323 ] Apache Spark commented on SPARK-7063: - User 'linlin200605' has created a pull request

[jira] [Updated] (SPARK-7064) Adding binary sparse vector support

2015-04-22 Thread Julien Pierre (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Pierre updated SPARK-7064: - Issue Type: Improvement (was: Bug) Adding binary sparse vector support

[jira] [Assigned] (SPARK-7056) Make the WriteAheadLog pluggable

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7056: --- Assignee: Apache Spark (was: Tathagata Das) Make the WriteAheadLog pluggable

[jira] [Assigned] (SPARK-7056) Make the WriteAheadLog pluggable

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7056: --- Assignee: Tathagata Das (was: Apache Spark) Make the WriteAheadLog pluggable

[jira] [Commented] (SPARK-7056) Make the WriteAheadLog pluggable

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508360#comment-14508360 ] Apache Spark commented on SPARK-7056: - User 'tdas' has created a pull request for this

[jira] [Updated] (SPARK-7065) Clear the cached locations mapping after every stage to avoid inconsistent status

2015-04-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-7065: --- Component/s: Spark Core Clear the cached locations mapping after every stage to avoid

[jira] [Commented] (SPARK-7065) Clear the cached locations mapping after every stage to avoid inconsistent status

2015-04-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508433#comment-14508433 ] Patrick Wendell commented on SPARK-7065: It would be helpful to have a bit more

[jira] [Commented] (SPARK-5206) Accumulators are not re-registered during recovering from checkpoint

2015-04-22 Thread Zhichao Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508271#comment-14508271 ] Zhichao Zhang commented on SPARK-5206: --- [~tdas], thanks for your reply. Looking

[jira] [Comment Edited] (SPARK-5659) Flaky test: o.a.s.streaming.ReceiverSuite.block

2015-04-22 Thread Fei Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508302#comment-14508302 ] Fei Wang edited comment on SPARK-5659 at 4/23/15 2:02 AM: -- my

[jira] [Comment Edited] (SPARK-5659) Flaky test: o.a.s.streaming.ReceiverSuite.block

2015-04-22 Thread Fei Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508302#comment-14508302 ] Fei Wang edited comment on SPARK-5659 at 4/23/15 2:03 AM: -- my

[jira] [Created] (SPARK-7064) Adding binary sparse vector support

2015-04-22 Thread Julien Pierre (JIRA)
Julien Pierre created SPARK-7064: Summary: Adding binary sparse vector support Key: SPARK-7064 URL: https://issues.apache.org/jira/browse/SPARK-7064 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-7066) VectorAssembler should use NumericType, not NativeType

2015-04-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7066: --- Summary: VectorAssembler should use NumericType, not NativeType (was: VectorAssembler should use

[jira] [Commented] (SPARK-7026) LeftSemiJoin can not work when it has both equal condition and not equal condition.

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508345#comment-14508345 ] Apache Spark commented on SPARK-7026: - User 'adrian-wang' has created a pull request

[jira] [Resolved] (SPARK-7054) Spark jobs hang for ~15 mins when a node goes down

2015-04-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-7054. Resolution: Invalid Hey There, Please send this to the Spark users list to get feedback

[jira] [Updated] (SPARK-4867) UDF clean up

2015-04-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-4867: --- Parent Issue: SPARK-6116 (was: SPARK-5166) UDF clean up Key:

[jira] [Updated] (SPARK-5517) Add input types for Java UDFs

2015-04-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-5517: --- Parent Issue: SPARK-6116 (was: SPARK-5166) Add input types for Java UDFs

[jira] [Comment Edited] (SPARK-5659) Flaky test: o.a.s.streaming.ReceiverSuite.block

2015-04-22 Thread Fei Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508302#comment-14508302 ] Fei Wang edited comment on SPARK-5659 at 4/23/15 2:01 AM: -- my

[jira] [Assigned] (SPARK-7063) when lz4 compression is used, it causes core dump

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7063: --- Assignee: Apache Spark when lz4 compression is used, it causes core dump

[jira] [Updated] (SPARK-7063) when lz4 compression is used, it causes core dump

2015-04-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-7063: --- Environment: IBM JDK when lz4 compression is used, it causes core dump

[jira] [Updated] (SPARK-6827) Wrap FPGrowthModel.freqItemsets with namedtuples (or document the return type) in PySpark

2015-04-22 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-6827: - Fix Version/s: (was: 1.6.0) 1.4.0 Wrap FPGrowthModel.freqItemsets with

[jira] [Resolved] (SPARK-6827) Wrap FPGrowthModel.freqItemsets with namedtuples (or document the return type) in PySpark

2015-04-22 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-6827. -- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 5614

[jira] [Updated] (SPARK-6917) Broken data returned to PySpark dataframe if any large numbers used in Scala land

2015-04-22 Thread Harry Brundage (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harry Brundage updated SPARK-6917: -- Description: When trying to access data stored in a Parquet file with an INT96 column (read:

[jira] [Commented] (SPARK-6290) spark.ml.param.Params.checkInputColumn bug upon error

2015-04-22 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508278#comment-14508278 ] Joseph K. Bradley commented on SPARK-6290: -- There should be a complaint from

[jira] [Created] (SPARK-7063) when lz4 compression is used, it causes core dump

2015-04-22 Thread Jenny MA (JIRA)
Jenny MA created SPARK-7063: --- Summary: when lz4 compression is used, it causes core dump Key: SPARK-7063 URL: https://issues.apache.org/jira/browse/SPARK-7063 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-7063) when lz4 compression is used, it causes core dump

2015-04-22 Thread Jenny MA (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508297#comment-14508297 ] Jenny MA commented on SPARK-7063: - we should bump the version to

[jira] [Issue Comment Deleted] (SPARK-5295) Stabilize data types

2015-04-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-5295: --- Comment: was deleted (was: User 'JaysonSunshine' has created a pull request for this issue:

[jira] [Updated] (SPARK-5295) Stabilize data types

2015-04-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-5295: --- Parent Issue: SPARK-6116 (was: SPARK-5166) Stabilize data types

[jira] [Resolved] (SPARK-7062) Parquet compression does not work for Spark SQL loading

2015-04-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-7062. Resolution: Duplicate Parquet compression does not work for Spark SQL loading

[jira] [Commented] (SPARK-5945) Spark should not retry a stage infinitely on a FetchFailedException

2015-04-22 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508019#comment-14508019 ] Ilya Ganelin commented on SPARK-5945: - [~kayousterhout] - thanks for the review. If I

[jira] [Commented] (SPARK-5945) Spark should not retry a stage infinitely on a FetchFailedException

2015-04-22 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508091#comment-14508091 ] Kay Ousterhout commented on SPARK-5945: --- I realized there might be a cleaner

[jira] [Updated] (SPARK-7059) Create a DataFrame join API to facilitate equijoin and self join

2015-04-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7059: --- Summary: Create a DataFrame join API to facilitate equijoin and self join (was: Create a join API to

[jira] [Commented] (SPARK-7059) Create a DataFrame join API to facilitate equijoin and self join

2015-04-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14507838#comment-14507838 ] Apache Spark commented on SPARK-7059: - User 'rxin' has created a pull request for this
