[jira] [Updated] (SPARK-12741) DataFrame count method return wrong size.

2016-01-21 Thread Sasi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sasi updated SPARK-12741: - Description: Hi, I'm updating my report. I'm working with Spark 1.5.2, (used to be 1.5.0), I have a DataFrame

[jira] [Created] (SPARK-12954) pyspark API 1.3.0 how we can patitionning by columns

2016-01-21 Thread malouke (JIRA)
malouke created SPARK-12954: --- Summary: pyspark API 1.3.0 how we can patitionning by columns Key: SPARK-12954 URL: https://issues.apache.org/jira/browse/SPARK-12954 Project: Spark Issue Type:
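
The question is how to partition a DataFrame by columns from PySpark. A minimal sketch of the usual answer, assuming a hypothetical DataFrame with a "country" column (note that DataFrameWriter.partitionBy requires Spark 1.4+, so it is not available in the 1.3.0 API the reporter mentions):

{code}
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("partition-by-columns").getOrCreate()
df = spark.createDataFrame(
    [("fr", 1), ("fr", 2), ("us", 3)], ["country", "value"])

# One output directory per distinct value of the partitioning column
df.write.partitionBy("country").mode("overwrite").parquet("/tmp/by_country")
{code}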

[jira] [Resolved] (SPARK-12954) pyspark API 1.3.0 how we can patitionning by columns

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-12954. --- Resolution: Invalid Target Version/s: (was: 1.3.0) [~Malouke] a lot is wrong with this.

[jira] [Commented] (SPARK-2309) Generalize the binary logistic regression into multinomial logistic regression

2016-01-21 Thread Daniel Darabos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110628#comment-15110628 ] Daniel Darabos commented on SPARK-2309: ---

[jira] [Commented] (SPARK-12741) DataFrame count method return wrong size.

2016-01-21 Thread Sasi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110523#comment-15110523 ] Sasi commented on SPARK-12741: -- I changed the way I used the DataFrame from my last ticket. Now, I have

[jira] [Commented] (SPARK-12741) DataFrame count method return wrong size.

2016-01-21 Thread Sasi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110553#comment-15110553 ] Sasi commented on SPARK-12741: -- Creating a new DataFrame didn't resolve the issue. I still think it's a bug.

[jira] [Commented] (SPARK-12741) DataFrame count method return wrong size.

2016-01-21 Thread Sasi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110584#comment-15110584 ] Sasi commented on SPARK-12741: -- I checked my DB which is Aerospike, and I got the same results of my

[jira] [Updated] (SPARK-12741) DataFrame count method return wrong size.

2016-01-21 Thread Sasi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sasi updated SPARK-12741: - Description: Hi, I'm updating my report. I'm working with Spark 1.5.2, (used to be 1.5.0), I have a DataFrame

[jira] [Commented] (SPARK-12843) Spark should avoid scanning all partitions when limit is set

2016-01-21 Thread dileep (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110512#comment-15110512 ] dileep commented on SPARK-12843: I will look into this issue > Spark should avoid scanning all

[jira] [Commented] (SPARK-12954) pyspark API 1.3.0 how we can patitionning by columns

2016-01-21 Thread malouke (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110596#comment-15110596 ] malouke commented on SPARK-12954: - ok sorry, > pyspark API 1.3.0 how we can patitionning by columns >

[jira] [Commented] (SPARK-12741) DataFrame count method return wrong size.

2016-01-21 Thread Sasi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110563#comment-15110563 ] Sasi commented on SPARK-12741: -- If I'm running the following code: {code} dataFrame.where("...").count()
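
For context, the discrepancy under discussion is between the value of count() and the number of rows actually returned for the same filter. A minimal PySpark sketch of that comparison, using a hypothetical in-memory DataFrame rather than the reporter's Aerospike-backed one:

{code}
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("count-vs-collect").getOrCreate()
df = spark.createDataFrame([(1, "a"), (2, "b"), (3, "a")], ["id", "tag"])

filtered = df.where("tag = 'a'")

# These two numbers should agree; the ticket claims they differ for the
# reporter's data source.
print(filtered.count())         # size reported by the aggregate
print(len(filtered.collect()))  # number of Row objects actually fetched
{code}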

[jira] [Commented] (SPARK-12954) pyspark API 1.3.0 how we can patitionning by columns

2016-01-21 Thread malouke (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110603#comment-15110603 ] malouke commented on SPARK-12954: - hi sean, where i can ask question ? > pyspark API 1.3.0 how we can

[jira] [Commented] (SPARK-12741) DataFrame count method return wrong size.

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110572#comment-15110572 ] Sean Owen commented on SPARK-12741: --- I can't reproduce this. I always get the same count and collect

[jira] [Commented] (SPARK-12741) DataFrame count method return wrong size.

2016-01-21 Thread Sasi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110503#comment-15110503 ] Sasi commented on SPARK-12741: -- I updated the report, can you verify it again. Thanks! Sasi > DataFrame

[jira] [Commented] (SPARK-12741) DataFrame count method return wrong size.

2016-01-21 Thread Sasi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110613#comment-15110613 ] Sasi commented on SPARK-12741: -- Additional update: If I use the following code, then I get the same length

[jira] [Commented] (SPARK-12741) DataFrame count method return wrong size.

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110511#comment-15110511 ] Sean Owen commented on SPARK-12741: --- I recall from other JIRAs that you're not updating the DataFrame /

[jira] [Updated] (SPARK-12247) Documentation for spark.ml's ALS and collaborative filtering in general

2016-01-21 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-12247: --- Assignee: Benjamin Fradet > Documentation for spark.ml's ALS and collaborative filtering in

[jira] [Assigned] (SPARK-12953) RDDRelation write set mode will be better to avoid error "pair.parquet already exists"

2016-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12953: Assignee: Apache Spark > RDDRelation write set mode will be better to avoid error

[jira] [Commented] (SPARK-12953) RDDRelation write set mode will be better to avoid error "pair.parquet already exists"

2016-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110301#comment-15110301 ] Apache Spark commented on SPARK-12953: -- User 'shijinkui' has created a pull request for this issue:

[jira] [Created] (SPARK-12953) RDDRelation write set mode will be better to avoid error "pair.parquet already exists"

2016-01-21 Thread shijinkui (JIRA)
shijinkui created SPARK-12953: - Summary: RDDRelation write set mode will be better to avoid error "pair.parquet already exists" Key: SPARK-12953 URL: https://issues.apache.org/jira/browse/SPARK-12953
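
The error in the summary is the standard "path already exists" failure that occurs when an example writes output without an explicit save mode. A minimal sketch of the kind of fix being proposed, assuming a hypothetical DataFrame (the actual change targets the RDDRelation example in the Spark repo):

{code}
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("save-mode").getOrCreate()
df = spark.range(10)

# Without a SaveMode, a second run fails with "... pair.parquet already exists";
# an explicit mode makes the example re-runnable.
df.write.mode("overwrite").parquet("/tmp/pair.parquet")
{code}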

[jira] [Commented] (SPARK-12906) LongSQLMetricValue cause memory leak on Spark 1.5.1

2016-01-21 Thread Sasi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110354#comment-15110354 ] Sasi commented on SPARK-12906: -- Looks like fixed on 1.5.2. Thanks! > LongSQLMetricValue cause memory leak

[jira] [Resolved] (SPARK-12906) LongSQLMetricValue cause memory leak on Spark 1.5.1

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-12906. --- Resolution: Duplicate > LongSQLMetricValue cause memory leak on Spark 1.5.1 >

[jira] [Updated] (SPARK-12247) Documentation for spark.ml's ALS and collaborative filtering in general

2016-01-21 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-12247: --- Affects Version/s: (was: 1.5.2) 2.0.0 > Documentation for

[jira] [Commented] (SPARK-6817) DataFrame UDFs in R

2016-01-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110299#comment-15110299 ] Felix Cheung commented on SPARK-6817: - Thanks for putting together on the doc [~sunrui] In this

[jira] [Updated] (SPARK-12953) RDDRelation write set mode will be better to avoid error "pair.parquet already exists"

2016-01-21 Thread shijinkui (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shijinkui updated SPARK-12953: -- Component/s: (was: SQL) Examples > RDDRelation write set mode will be better to

[jira] [Assigned] (SPARK-12953) RDDRelation write set mode will be better to avoid error "pair.parquet already exists"

2016-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12953: Assignee: (was: Apache Spark) > RDDRelation write set mode will be better to avoid

[jira] [Commented] (SPARK-6817) DataFrame UDFs in R

2016-01-21 Thread Sun Rui (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110359#comment-15110359 ] Sun Rui commented on SPARK-6817: for dapply(), user can call repartition() to set an appropriate number of
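
dapply() belongs to the SparkR API under discussion; purely to illustrate the point about repartition() controlling how many partitions (and therefore per-partition UDF invocations) there are, here is a rough PySpark analogue of the same pattern, not the SparkR call itself:

{code}
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("per-partition-apply").getOrCreate()
df = spark.range(100)

def summarize(rows):
    # Runs once per partition, like the function passed to SparkR's dapply()
    rows = list(rows)
    yield (len(rows),)

# repartition() decides how many partitions the per-partition function sees
print(df.repartition(4).rdd.mapPartitions(summarize).collect())
{code}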

[jira] [Comment Edited] (SPARK-12843) Spark should avoid scanning all partitions when limit is set

2016-01-21 Thread dileep (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110712#comment-15110712 ] dileep edited comment on SPARK-12843 at 1/21/16 2:57 PM: - Its a caching issue,

[jira] [Commented] (SPARK-12932) Bad error message with trying to create Dataset from RDD of Java objects that are not bean-compliant

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110786#comment-15110786 ] Sean Owen commented on SPARK-12932: --- OK, do you want to make a PR for that? > Bad error message with

[jira] [Updated] (SPARK-12932) Bad error message with trying to create Dataset from RDD of Java objects that are not bean-compliant

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-12932: -- Priority: Minor (was: Major) Issue Type: Improvement (was: Bug) > Bad error message with

[jira] [Commented] (SPARK-9740) first/last aggregate NULL behavior

2016-01-21 Thread Emlyn Corrin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110688#comment-15110688 ] Emlyn Corrin commented on SPARK-9740: - How do you use FIRST/LAST from the Java API with ignoreNulls
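
In current Spark versions, first()/last() take an ignore-nulls flag (the Java API exposes the same behaviour via an overload of functions.first/last). A minimal PySpark sketch of what is being asked about, with a hypothetical DataFrame:

{code}
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("first-last-ignorenulls").getOrCreate()
df = spark.createDataFrame([(1, None), (1, "x"), (1, None)], ["k", "v"])

df.groupBy("k").agg(
    F.first("v", ignorenulls=True).alias("first_v"),
    F.last("v", ignorenulls=True).alias("last_v"),
).show()
{code}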

[jira] [Comment Edited] (SPARK-12741) DataFrame count method return wrong size.

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110676#comment-15110676 ] Sean Owen edited comment on SPARK-12741 at 1/21/16 2:40 PM: Wait, is this

[jira] [Commented] (SPARK-12843) Spark should avoid scanning all partitions when limit is set

2016-01-21 Thread dileep (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110709#comment-15110709 ] dileep commented on SPARK-12843: Please see the above Code. We need to make use of caching mechanism of

[jira] [Comment Edited] (SPARK-12843) Spark should avoid scanning all partitions when limit is set

2016-01-21 Thread dileep (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110709#comment-15110709 ] dileep edited comment on SPARK-12843 at 1/21/16 3:08 PM: - When I verified with 2

[jira] [Assigned] (SPARK-12760) inaccurate description for difference between local vs cluster mode in closure handling

2016-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12760: Assignee: Apache Spark > inaccurate description for difference between local vs cluster

[jira] [Assigned] (SPARK-12760) inaccurate description for difference between local vs cluster mode in closure handling

2016-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12760: Assignee: (was: Apache Spark) > inaccurate description for difference between local

[jira] [Commented] (SPARK-12843) Spark should avoid scanning all partitions when limit is set

2016-01-21 Thread dileep (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110712#comment-15110712 ] dileep commented on SPARK-12843: Its a caching issue, while scanning the table need to cache the records,

[jira] [Commented] (SPARK-10262) Add @Since annotation to ml.attribute

2016-01-21 Thread Tommy Yu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110743#comment-15110743 ] Tommy Yu commented on SPARK-10262: -- HI Xiangrui Meng I take a look all class under ml.attribute, those

[jira] [Commented] (SPARK-12760) inaccurate description for difference between local vs cluster mode in closure handling

2016-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110814#comment-15110814 ] Apache Spark commented on SPARK-12760: -- User 'srowen' has created a pull request for this issue:

[jira] [Commented] (SPARK-12741) DataFrame count method return wrong size.

2016-01-21 Thread Sasi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110701#comment-15110701 ] Sasi commented on SPARK-12741: -- That's not what I meant. I just set an example for each case, SQL way and

[jira] [Commented] (SPARK-12741) DataFrame count method return wrong size.

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110762#comment-15110762 ] Sean Owen commented on SPARK-12741: --- OK, that's what you wrote at the outset though. Then I can't

[jira] [Comment Edited] (SPARK-12741) DataFrame count method return wrong size.

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110762#comment-15110762 ] Sean Owen edited comment on SPARK-12741 at 1/21/16 3:26 PM: OK, that's

[jira] [Commented] (SPARK-12843) Spark should avoid scanning all partitions when limit is set

2016-01-21 Thread dileep (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110704#comment-15110704 ] dileep commented on SPARK-12843: public class JavaSparkSQL { public static class Person

[jira] [Issue Comment Deleted] (SPARK-12843) Spark should avoid scanning all partitions when limit is set

2016-01-21 Thread dileep (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dileep updated SPARK-12843: --- Comment: was deleted (was: public class JavaSparkSQL { public static class Person implements

[jira] [Commented] (SPARK-12932) Bad error message with trying to create Dataset from RDD of Java objects that are not bean-compliant

2016-01-21 Thread Andy Grove (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110772#comment-15110772 ] Andy Grove commented on SPARK-12932: After reviewing the code for this, I think it is just a case of

[jira] [Resolved] (SPARK-12534) Document missing command line options to Spark properties mapping

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-12534. --- Resolution: Fixed Fix Version/s: 2.0.0 Resolved by https://github.com/apache/spark/pull/10491

[jira] [Comment Edited] (SPARK-12843) Spark should avoid scanning all partitions when limit is set

2016-01-21 Thread dileep (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110712#comment-15110712 ] dileep edited comment on SPARK-12843 at 1/21/16 2:59 PM: - Its a caching issue,

[jira] [Commented] (SPARK-12932) Bad error message with trying to create Dataset from RDD of Java objects that are not bean-compliant

2016-01-21 Thread Andy Grove (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110785#comment-15110785 ] Andy Grove commented on SPARK-12932: Here is a pull request to change the error message:

[jira] [Updated] (SPARK-12534) Document missing command line options to Spark properties mapping

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-12534: -- Assignee: Felix Cheung (was: Apache Spark) > Document missing command line options to Spark

[jira] [Assigned] (SPARK-12932) Bad error message with trying to create Dataset from RDD of Java objects that are not bean-compliant

2016-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12932: Assignee: Apache Spark > Bad error message with trying to create Dataset from RDD of Java

[jira] [Assigned] (SPARK-12932) Bad error message with trying to create Dataset from RDD of Java objects that are not bean-compliant

2016-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12932: Assignee: (was: Apache Spark) > Bad error message with trying to create Dataset from

[jira] [Commented] (SPARK-12932) Bad error message with trying to create Dataset from RDD of Java objects that are not bean-compliant

2016-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110796#comment-15110796 ] Apache Spark commented on SPARK-12932: -- User 'andygrove' has created a pull request for this issue:

[jira] [Commented] (SPARK-12741) DataFrame count method return wrong size.

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110676#comment-15110676 ] Sean Owen commented on SPARK-12741: --- Wait, is this what you mean? "select count(*) ..." returns 1 row,
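
The distinction being drawn: a SQL "select count(*)" query returns a single row that contains the count, while DataFrame.count() returns the number itself, so comparing the sizes of the two results compares different things. A minimal sketch using the current SparkSession API (in the 1.5.x the ticket targets this would be sqlContext and registerTempTable):

{code}
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("count-star-vs-count").getOrCreate()
df = spark.createDataFrame([(i,) for i in range(5)], ["id"])
df.createOrReplaceTempView("t")

rows = spark.sql("SELECT COUNT(*) AS n FROM t").collect()
print(len(rows))     # 1 -- the query result is a single row
print(rows[0]["n"])  # 5 -- the count lives inside that row
print(df.count())    # 5 -- DataFrame.count() returns the number directly
{code}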

[jira] [Comment Edited] (SPARK-12741) DataFrame count method return wrong size.

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110676#comment-15110676 ] Sean Owen edited comment on SPARK-12741 at 1/21/16 2:40 PM: Wait, is this

[jira] [Comment Edited] (SPARK-12843) Spark should avoid scanning all partitions when limit is set

2016-01-21 Thread dileep (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110709#comment-15110709 ] dileep edited comment on SPARK-12843 at 1/21/16 2:57 PM: - Please see the below

[jira] [Updated] (SPARK-12534) Document missing command line options to Spark properties mapping

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-12534: -- Issue Type: Improvement (was: Bug) > Document missing command line options to Spark properties

[jira] [Resolved] (SPARK-5929) Pyspark: Register a pip requirements file with spark_context

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-5929. -- Resolution: Won't Fix > Pyspark: Register a pip requirements file with spark_context >

[jira] [Resolved] (SPARK-5647) Output metrics do not show up for older hadoop versions (< 2.5)

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-5647. -- Resolution: Duplicate I think this is nearly moot for Spark 2.x, given that Hadoop support may get to

[jira] [Resolved] (SPARK-4247) [SQL] use beeline execute "create table as" thriftserver is not use "hive" user ,but the new hdfs dir's owner is "hive"

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-4247. -- Resolution: Not A Problem I think this is stale anyway, but I think this is a question about Hive and

[jira] [Commented] (SPARK-12941) Spark-SQL JDBC Oracle dialect fails to map string datatypes to Oracle VARCHAR datatype

2016-01-21 Thread Jose Martinez Poblete (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110906#comment-15110906 ] Jose Martinez Poblete commented on SPARK-12941: --- Thanks, let us know if this can be worked

[jira] [Created] (SPARK-12956) add spark.yarn.hdfs.home.directory property

2016-01-21 Thread PJ Fanning (JIRA)
PJ Fanning created SPARK-12956: -- Summary: add spark.yarn.hdfs.home.directory property Key: SPARK-12956 URL: https://issues.apache.org/jira/browse/SPARK-12956 Project: Spark Issue Type:
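
Note that spark.yarn.hdfs.home.directory is the property this ticket proposes, not a setting that exists in released Spark. Purely as an illustration of how such a property would be supplied if it were added (hypothetical value):

{code}
from pyspark import SparkConf
from pyspark.sql import SparkSession

# spark.yarn.hdfs.home.directory is the *proposed* property from SPARK-12956,
# shown here only for illustration.
conf = SparkConf().set("spark.yarn.hdfs.home.directory", "hdfs:///user/example")
spark = SparkSession.builder.config(conf=conf).getOrCreate()
{code}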

[jira] [Updated] (SPARK-12946) The SQL page is empty

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-12946: -- Target Version/s: (was: 1.6.1) Fix Version/s: (was: 1.6.1) > The SQL page is empty >

[jira] [Resolved] (SPARK-6137) G-Means clustering algorithm implementation

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-6137. -- Resolution: Won't Fix > G-Means clustering algorithm implementation >

[jira] [Resolved] (SPARK-6056) Unlimit offHeap memory use cause RM killing the container

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-6056. -- Resolution: Not A Problem > Unlimit offHeap memory use cause RM killing the container >

[jira] [Commented] (SPARK-4878) driverPropsFetcher causes spurious Akka disassociate errors

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110865#comment-15110865 ] Sean Owen commented on SPARK-4878: -- I think this may be defunct anyway, but, the code in question does

[jira] [Commented] (SPARK-12760) inaccurate description for difference between local vs cluster mode in closure handling

2016-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110908#comment-15110908 ] Apache Spark commented on SPARK-12760: -- User 'mortada' has created a pull request for this issue:

[jira] [Commented] (SPARK-12650) No means to specify Xmx settings for SparkSubmit in yarn-cluster mode

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110964#comment-15110964 ] Sean Owen commented on SPARK-12650: --- [~vanzin] is that the intended way to set this? if so it sounds

[jira] [Commented] (SPARK-12843) Spark should avoid scanning all partitions when limit is set

2016-01-21 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110836#comment-15110836 ] Maciej Bryński commented on SPARK-12843: [~dileep] I think you miss the point of this Jira. >

[jira] [Commented] (SPARK-5629) Add spark-ec2 action to return info about an existing cluster

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110853#comment-15110853 ] Sean Owen commented on SPARK-5629: -- Are all of the EC2 tickets becoming essentially "wont fix" as the

[jira] [Resolved] (SPARK-6009) IllegalArgumentException thrown by TimSort when SQL ORDER BY RAND ()

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-6009. -- Resolution: Duplicate > IllegalArgumentException thrown by TimSort when SQL ORDER BY RAND () >

[jira] [Resolved] (SPARK-4171) StreamingContext.actorStream throws serializationError

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-4171. -- Resolution: Won't Fix I think this is obsolete now that the Akka actor bits are being removed. >

[jira] [Created] (SPARK-12955) Spark-HiveSQL: It fail when is quering a nested structure

2016-01-21 Thread Gerardo Villarroel (JIRA)
Gerardo Villarroel created SPARK-12955: -- Summary: Spark-HiveSQL: It fail when is quering a nested structure Key: SPARK-12955 URL: https://issues.apache.org/jira/browse/SPARK-12955 Project: Spark

[jira] [Commented] (SPARK-5629) Add spark-ec2 action to return info about an existing cluster

2016-01-21 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15111056#comment-15111056 ] Shivaram Venkataraman commented on SPARK-5629: -- Yes - though I think its beneficial to see if

[jira] [Commented] (SPARK-1680) Clean up use of setExecutorEnvs in SparkConf

2016-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110979#comment-15110979 ] Apache Spark commented on SPARK-1680: - User 'weineran' has created a pull request for this issue:

[jira] [Commented] (SPARK-12177) Update KafkaDStreams to new Kafka 0.9 Consumer API

2016-01-21 Thread Mario Briggs (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110994#comment-15110994 ] Mario Briggs commented on SPARK-12177: -- bq. If one uses the kafka v9 jar even when using the old

[jira] [Created] (SPARK-12957) Derive and propagate data constrains in logical plan

2016-01-21 Thread Yin Huai (JIRA)
Yin Huai created SPARK-12957: Summary: Derive and propagate data constrains in logical plan Key: SPARK-12957 URL: https://issues.apache.org/jira/browse/SPARK-12957 Project: Spark Issue Type:

[jira] [Commented] (SPARK-12650) No means to specify Xmx settings for SparkSubmit in yarn-cluster mode

2016-01-21 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15111024#comment-15111024 ] Marcelo Vanzin commented on SPARK-12650: If it works it's a workaround; that's a pretty obscure

[jira] [Commented] (SPARK-11045) Contributing Receiver based Low Level Kafka Consumer from Spark-Packages to Apache Spark Project

2016-01-21 Thread Dan Dutrow (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15111044#comment-15111044 ] Dan Dutrow commented on SPARK-11045: +1 to Dibyendu's comment that "Being at spark-packages, many

[jira] [Assigned] (SPARK-10911) Executors should System.exit on clean shutdown

2016-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10911: Assignee: Zhuo Liu (was: Apache Spark) > Executors should System.exit on clean shutdown

[jira] [Commented] (SPARK-11045) Contributing Receiver based Low Level Kafka Consumer from Spark-Packages to Apache Spark Project

2016-01-21 Thread Dibyendu Bhattacharya (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15111206#comment-15111206 ] Dibyendu Bhattacharya commented on SPARK-11045: --- Thanks Dan for your comments . Same

[jira] [Updated] (SPARK-12797) Aggregation without grouping keys

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-12797: -- Assignee: Davies Liu > Aggregation without grouping keys > - > >

[jira] [Updated] (SPARK-12953) RDDRelation write set mode will be better to avoid error "pair.parquet already exists"

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-12953: -- Priority: Minor (was: Major) Fix Version/s: (was: 1.6.1) Issue Type: Improvement

[jira] [Updated] (SPARK-12945) ERROR LiveListenerBus: Listener JobProgressListener threw an exception

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-12945: -- Component/s: Web UI > ERROR LiveListenerBus: Listener JobProgressListener threw an exception >

[jira] [Resolved] (SPARK-6034) DESCRIBE EXTENDED viewname is not supported for HiveContext

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-6034. -- Resolution: Won't Fix > DESCRIBE EXTENDED viewname is not supported for HiveContext >

[jira] [Resolved] (SPARK-9282) Filter on Spark DataFrame with multiple columns

2016-01-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-9282. -- Resolution: Not A Problem > Filter on Spark DataFrame with multiple columns >

[jira] [Commented] (SPARK-11045) Contributing Receiver based Low Level Kafka Consumer from Spark-Packages to Apache Spark Project

2016-01-21 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15111243#comment-15111243 ] Saisai Shao commented on SPARK-11045: - Hi [~dibbhatt], I'm afraid I could not agree with your comment

[jira] [Assigned] (SPARK-10498) Add requirements file for create dev python tools

2016-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10498: Assignee: (was: Apache Spark) > Add requirements file for create dev python tools >

[jira] [Commented] (SPARK-10498) Add requirements file for create dev python tools

2016-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15111527#comment-15111527 ] Apache Spark commented on SPARK-10498: -- User 'holdenk' has created a pull request for this issue:

[jira] [Assigned] (SPARK-10498) Add requirements file for create dev python tools

2016-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10498: Assignee: Apache Spark > Add requirements file for create dev python tools >

[jira] [Created] (SPARK-12960) Some examples are missing support for python2

2016-01-21 Thread Mark Grover (JIRA)
Mark Grover created SPARK-12960: --- Summary: Some examples are missing support for python2 Key: SPARK-12960 URL: https://issues.apache.org/jira/browse/SPARK-12960 Project: Spark Issue Type: Bug
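
Assuming the incompatibility is the common one for this era (example scripts written with Python-3-only print syntax), the conventional fix is a __future__ import so the same file runs under both interpreters; a minimal sketch, not the actual patch:

{code}
# At the top of an example script, before any other statements:
from __future__ import print_function

# print() now behaves the same under Python 2 and Python 3.
print("Pi is roughly", 3.14159)
{code}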

[jira] [Commented] (SPARK-11045) Contributing Receiver based Low Level Kafka Consumer from Spark-Packages to Apache Spark Project

2016-01-21 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15111398#comment-15111398 ] Cody Koeninger commented on SPARK-11045: There's already work being done on 0.9

[jira] [Commented] (SPARK-12946) The SQL page is empty

2016-01-21 Thread Alex Bozarth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15111598#comment-15111598 ] Alex Bozarth commented on SPARK-12946: -- Can you give more details on this? Such as what commands did

[jira] [Commented] (SPARK-12957) Derive and propagate data constrains in logical plan

2016-01-21 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15111597#comment-15111597 ] Xiao Li commented on SPARK-12957: - I have two related PRs that require a general null filtering function:
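
As a small illustration of the kind of constraint the ticket wants the optimizer to derive and propagate automatically: a predicate such as a > 5 already implies a IS NOT NULL, which today must be written out by hand if the null filter is to appear explicitly (hypothetical DataFrame below):

{code}
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, IntegerType

spark = SparkSession.builder.appName("constraints").getOrCreate()
schema = StructType([StructField("a", IntegerType(), True)])
df = spark.createDataFrame([(None,), (3,), (7,)], schema)

# The filter a > 5 implies a IS NOT NULL; SPARK-12957 is about deriving such
# constraints in the logical plan rather than relying on explicit null filters.
df.filter(df.a.isNotNull() & (df.a > 5)).show()
{code}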

[jira] [Commented] (SPARK-12946) The SQL page is empty

2016-01-21 Thread Alex Bozarth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15111645#comment-15111645 ] Alex Bozarth commented on SPARK-12946: -- These may be the same problem > The SQL page is empty >

[jira] [Issue Comment Deleted] (SPARK-12946) The SQL page is empty

2016-01-21 Thread Alex Bozarth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Bozarth updated SPARK-12946: - Comment: was deleted (was: These may be the same problem) > The SQL page is empty >

[jira] [Assigned] (SPARK-12859) Names of input streams with receivers don't fit in Streaming page

2016-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12859: Assignee: (was: Apache Spark) > Names of input streams with receivers don't fit in

[jira] [Assigned] (SPARK-12859) Names of input streams with receivers don't fit in Streaming page

2016-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12859: Assignee: Apache Spark > Names of input streams with receivers don't fit in Streaming

[jira] [Commented] (SPARK-12859) Names of input streams with receivers don't fit in Streaming page

2016-01-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15111660#comment-15111660 ] Apache Spark commented on SPARK-12859: -- User 'ajbozarth' has created a pull request for this issue:

[jira] [Created] (SPARK-12959) Silent switch to normal table writing when writing bucketed data with bucketing disabled

2016-01-21 Thread Xiao Li (JIRA)
Xiao Li created SPARK-12959: --- Summary: Silent switch to normal table writing when writing bucketed data with bucketing disabled Key: SPARK-12959 URL: https://issues.apache.org/jira/browse/SPARK-12959

[jira] [Commented] (SPARK-9721) TreeTests.checkEqual should compare predictions on data

2016-01-21 Thread Seth Hendrickson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15111579#comment-15111579 ] Seth Hendrickson commented on SPARK-9721: - I assume there is some motivation behind this JIRA, but
