[jira] [Created] (SPARK-13007) Document where configuration / properties are read and applied

2016-01-26 Thread Alan Braithwaite (JIRA)
Alan Braithwaite created SPARK-13007: Summary: Document where configuration / properties are read and applied Key: SPARK-13007 URL: https://issues.apache.org/jira/browse/SPARK-13007 Project: Spark

[jira] [Commented] (SPARK-8697) MatchIterator not serializable exception in RegexTokenizer

2016-01-26 Thread Oksana Romankova (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117789#comment-15117789 ] Oksana Romankova commented on SPARK-8697: - Spark 1.4.1 It seems like the issue ha

[jira] [Created] (SPARK-13006) Use Log4j for Spark Standalone Executor Logging

2016-01-26 Thread Alan Braithwaite (JIRA)
Alan Braithwaite created SPARK-13006: Summary: Use Log4j for Spark Standalone Executor Logging Key: SPARK-13006 URL: https://issues.apache.org/jira/browse/SPARK-13006 Project: Spark Issue

[jira] [Created] (SPARK-13005) Don't require spark.shuffle.service.port to be set on job conf

2016-01-26 Thread Alan Braithwaite (JIRA)
Alan Braithwaite created SPARK-13005: Summary: Don't require spark.shuffle.service.port to be set on job conf Key: SPARK-13005 URL: https://issues.apache.org/jira/browse/SPARK-13005 Project: Spark

[jira] [Commented] (SPARK-12945) ERROR LiveListenerBus: Listener JobProgressListener threw an exception

2016-01-26 Thread Ben Huntley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117770#comment-15117770 ] Ben Huntley commented on SPARK-12945: - Hi Shixiong, Sorry to be less than helpful.

[jira] [Created] (SPARK-13004) Support Non-Volatile Data and Operations

2016-01-26 Thread Wang, Gang (JIRA)
Wang, Gang created SPARK-13004: -- Summary: Support Non-Volatile Data and Operations Key: SPARK-13004 URL: https://issues.apache.org/jira/browse/SPARK-13004 Project: Spark Issue Type: Epic

[jira] [Created] (SPARK-13003) Provide a option to trim InputRelation metadata in QueryPlan.toJSON

2016-01-26 Thread Yin Huai (JIRA)
Yin Huai created SPARK-13003: Summary: Provide a option to trim InputRelation metadata in QueryPlan.toJSON Key: SPARK-13003 URL: https://issues.apache.org/jira/browse/SPARK-13003 Project: Spark

[jira] [Commented] (SPARK-9740) first/last aggregate NULL behavior

2016-01-26 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117651#comment-15117651 ] Yin Huai commented on SPARK-9740: - [~hvanhovell] Will you have time to take a look? Thanks

[jira] [Commented] (SPARK-12850) Support bucket pruning (predicate pushdown for bucketed tables)

2016-01-26 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117613#comment-15117613 ] Xiao Li commented on SPARK-12850: - Will submit a PR today. Hive cannot prune buckets wh

[jira] [Commented] (SPARK-12682) Hive will fail if the schema of a parquet table has a very wide schema

2016-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117518#comment-15117518 ] Apache Spark commented on SPARK-12682: -- User 'yhuai' has created a pull request for

[jira] [Commented] (SPARK-12988) Can't drop columns that contain dots

2016-01-26 Thread Dilip Biswal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117509#comment-15117509 ] Dilip Biswal commented on SPARK-12988: -- I would like to work on this one. > Can't d

[jira] [Resolved] (SPARK-12682) Hive will fail if the schema of a parquet table has a very wide schema

2016-01-26 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-12682. -- Resolution: Fixed Fix Version/s: 1.6.1 2.0.0 Issue resolved by pull request 1

[jira] [Updated] (SPARK-12682) Hive will fail if the schema of a parquet table has a very wide schema

2016-01-26 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-12682: - Assignee: Sameer Agarwal > Hive will fail if the schema of a parquet table has a very wide schema > -

[jira] [Resolved] (SPARK-10911) Executors should System.exit on clean shutdown

2016-01-26 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-10911. --- Resolution: Fixed Fix Version/s: 2.0.0 > Executors should System.exit on clean shutdow

[jira] [Commented] (SPARK-2629) Improved state management for Spark Streaming

2016-01-26 Thread Rodrigo Boavida (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117404#comment-15117404 ] Rodrigo Boavida commented on SPARK-2629: Hi, I've experimented the new API method

[jira] [Commented] (SPARK-12261) pyspark crash for large dataset

2016-01-26 Thread Christopher Bourez (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117345#comment-15117345 ] Christopher Bourez commented on SPARK-12261: The solution "Increase driver me

[jira] [Commented] (SPARK-13001) Coarse-grained Mesos scheduler should reject offers for longer period of time when reached max cores

2016-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117341#comment-15117341 ] Apache Spark commented on SPARK-13001: -- User 'sebastienrainville' has created a pull

[jira] [Assigned] (SPARK-13001) Coarse-grained Mesos scheduler should reject offers for longer period of time when reached max cores

2016-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13001: Assignee: (was: Apache Spark) > Coarse-grained Mesos scheduler should reject offers fo

[jira] [Assigned] (SPARK-13001) Coarse-grained Mesos scheduler should reject offers for longer period of time when reached max cores

2016-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13001: Assignee: Apache Spark > Coarse-grained Mesos scheduler should reject offers for longer pe

[jira] [Commented] (SPARK-12695) java.lang.ClassCastException: [B cannot be cast to java.lang.String

2016-01-26 Thread e.birukov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117327#comment-15117327 ] e.birukov commented on SPARK-12695: --- python 2.7 I use rdd: t = sqlContext.

[jira] [Resolved] (SPARK-12999) Guidance on adding a stopping criterion (maximul literal length or itemset count) for FPGrowth

2016-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-12999. --- Resolution: Duplicate Questions should go to user@. I think this is a duplicate of SPARK-12163 as yo

[jira] [Updated] (SPARK-12991) Establish correspondence between SparkPlan and LogicalPlan nodes

2016-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-12991: -- Component/s: SQL > Establish correspondence between SparkPlan and LogicalPlan nodes > -

[jira] [Updated] (SPARK-12985) Spark Hive thrift server big decimal data issue

2016-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-12985: -- Component/s: SQL > Spark Hive thrift server big decimal data issue > --

[jira] [Updated] (SPARK-12879) improve unsafe row writing framework

2016-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-12879: -- Assignee: Wenchen Fan > improve unsafe row writing framework > > >

[jira] [Commented] (SPARK-7889) Jobs progress of apps on complete page of HistoryServer shows uncompleted

2016-01-26 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117288#comment-15117288 ] Steve Loughran commented on SPARK-7889: --- I'm going to note something something probl

[jira] [Created] (SPARK-13002) Mesos scheduler backend does not follow the property spark.dynamicAllocation.initialExecutors

2016-01-26 Thread Luc Bourlier (JIRA)
Luc Bourlier created SPARK-13002: Summary: Mesos scheduler backend does not follow the property spark.dynamicAllocation.initialExecutors Key: SPARK-13002 URL: https://issues.apache.org/jira/browse/SPARK-13002

[jira] [Assigned] (SPARK-12979) Paths are resolved relative to the local file system

2016-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12979: Assignee: Apache Spark > Paths are resolved relative to the local file system > --

[jira] [Commented] (SPARK-12979) Paths are resolved relative to the local file system

2016-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117238#comment-15117238 ] Apache Spark commented on SPARK-12979: -- User 'dragos' has created a pull request for

[jira] [Assigned] (SPARK-12979) Paths are resolved relative to the local file system

2016-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12979: Assignee: (was: Apache Spark) > Paths are resolved relative to the local file system >

[jira] [Comment Edited] (SPARK-12989) Bad interaction between StarExpansion and ExtractWindowExpressions

2016-01-26 Thread Denton Cockburn (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117220#comment-15117220 ] Denton Cockburn edited comment on SPARK-12989 at 1/26/16 2:03 PM: -

[jira] [Commented] (SPARK-12989) Bad interaction between StarExpansion and ExtractWindowExpressions

2016-01-26 Thread Denton Cockburn (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117220#comment-15117220 ] Denton Cockburn commented on SPARK-12989: - It should be noted that it works if gi

[jira] [Created] (SPARK-13001) Coarse-grained Mesos scheduler should reject offers for longer period of time when reached max cores

2016-01-26 Thread Sebastien Rainville (JIRA)
Sebastien Rainville created SPARK-13001: --- Summary: Coarse-grained Mesos scheduler should reject offers for longer period of time when reached max cores Key: SPARK-13001 URL: https://issues.apache.org/jira/br

[jira] [Updated] (SPARK-13000) Corrupted results when using LIMIT clause via JDBC connections to ThriftServer

2016-01-26 Thread Daniel Harper (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Harper updated SPARK-13000: -- Description: h2. Steps to reproduce # Create table in HIVE (see below for definition) # Inser

[jira] [Created] (SPARK-13000) Corrupted results when using LIMIT clause via JDBC connections to ThriftServer

2016-01-26 Thread Daniel Harper (JIRA)
Daniel Harper created SPARK-13000: - Summary: Corrupted results when using LIMIT clause via JDBC connections to ThriftServer Key: SPARK-13000 URL: https://issues.apache.org/jira/browse/SPARK-13000 Proj

[jira] [Created] (SPARK-12999) Guidance on adding a stopping criterion (maximul literal length or itemset count) for FPGrowth

2016-01-26 Thread Tomas Kliegr (JIRA)
Tomas Kliegr created SPARK-12999: Summary: Guidance on adding a stopping criterion (maximul literal length or itemset count) for FPGrowth Key: SPARK-12999 URL: https://issues.apache.org/jira/browse/SPARK-12999

[jira] [Commented] (SPARK-12637) Print stage info of finished stages properly

2016-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117101#comment-15117101 ] Apache Spark commented on SPARK-12637: -- User 'srowen' has created a pull request for

[jira] [Resolved] (SPARK-3369) Java mapPartitions Iterator->Iterable is inconsistent with Scala's Iterator->Iterator

2016-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-3369. -- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 10413 [https://github.com/a

[jira] [Updated] (SPARK-12961) Work around memory leak in Snappy library

2016-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-12961: -- Assignee: Liang-Chi Hsieh > Work around memory leak in Snappy library > ---

[jira] [Resolved] (SPARK-12961) Work around memory leak in Snappy library

2016-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-12961. --- Resolution: Fixed Fix Version/s: 1.6.1 2.0.0 Issue resolved by pull request

[jira] [Updated] (SPARK-12998) Enable OrcRelation when connecting via spark thrift server

2016-01-26 Thread Rajesh Balamohan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated SPARK-12998: - Description: When a user connects via spark-thrift server to execute SQL, it does not ena

[jira] [Updated] (SPARK-12998) Enable OrcRelation when connecting via spark thrift server

2016-01-26 Thread Rajesh Balamohan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated SPARK-12998: - Description: When a user connects via spark-thrift server to execute SQL, it does not ena

[jira] [Created] (SPARK-12998) Enable OrcRelation when connecting via spark thrift server

2016-01-26 Thread Rajesh Balamohan (JIRA)
Rajesh Balamohan created SPARK-12998: Summary: Enable OrcRelation when connecting via spark thrift server Key: SPARK-12998 URL: https://issues.apache.org/jira/browse/SPARK-12998 Project: Spark

[jira] [Commented] (SPARK-12265) Spark calls System.exit inside driver instead of throwing exception

2016-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117026#comment-15117026 ] Apache Spark commented on SPARK-12265: -- User 'dragos' has created a pull request for

[jira] [Updated] (SPARK-12995) Remove deprecate APIs from GraphX

2016-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-12995: -- Summary: Remove deprecate APIs from GraphX (was: Remove deprecate APIs from Pregel) > Remove deprecat

[jira] [Commented] (SPARK-12995) Remove deprecate APIs from Pregel

2016-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117013#comment-15117013 ] Sean Owen commented on SPARK-12995: --- I support this. It seems like we're removing depre

[jira] [Resolved] (SPARK-5623) Replace an obsolete mapReduceTriplets with a new aggregateMessages in GraphSuite

2016-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-5623. -- Resolution: Duplicate Let's call this a subset of removing the deprecated API. > Replace an obsolete ma

[jira] [Updated] (SPARK-12993) Remove usage of ADD_FILES in pyspark

2016-01-26 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-12993: --- Description: environment variable ADD_FILES is created for adding python files on spark context to be

[jira] [Updated] (SPARK-12993) Remove usage of ADD_FILES in pyspark

2016-01-26 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-12993: --- Description: environment variable ADD_FILES is created for adding python files to spark context (SPAR

[jira] [Comment Edited] (SPARK-9740) first/last aggregate NULL behavior

2016-01-26 Thread Emlyn Corrin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15116977#comment-15116977 ] Emlyn Corrin edited comment on SPARK-9740 at 1/26/16 9:40 AM: --

[jira] [Closed] (SPARK-12261) pyspark crash for large dataset

2016-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen closed SPARK-12261. - > pyspark crash for large dataset > --- > > Key: SPARK-12261 >

[jira] [Comment Edited] (SPARK-9740) first/last aggregate NULL behavior

2016-01-26 Thread Emlyn Corrin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15116977#comment-15116977 ] Emlyn Corrin edited comment on SPARK-9740 at 1/26/16 9:32 AM: --

[jira] [Comment Edited] (SPARK-9740) first/last aggregate NULL behavior

2016-01-26 Thread Emlyn Corrin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15116977#comment-15116977 ] Emlyn Corrin edited comment on SPARK-9740 at 1/26/16 9:33 AM: --

[jira] [Commented] (SPARK-9740) first/last aggregate NULL behavior

2016-01-26 Thread Emlyn Corrin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15116977#comment-15116977 ] Emlyn Corrin commented on SPARK-9740: - I've put together a minimal example to demonstr

[jira] [Resolved] (SPARK-12937) Bloom filter serialization

2016-01-26 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-12937. - Resolution: Fixed Fix Version/s: 2.0.0 > Bloom filter serialization >

[jira] [Comment Edited] (SPARK-12261) pyspark crash for large dataset

2016-01-26 Thread Christopher Bourez (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15116907#comment-15116907 ] Christopher Bourez edited comment on SPARK-12261 at 1/26/16 8:11 AM: --

[jira] [Comment Edited] (SPARK-12261) pyspark crash for large dataset

2016-01-26 Thread Christopher Bourez (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15116907#comment-15116907 ] Christopher Bourez edited comment on SPARK-12261 at 1/26/16 8:10 AM: --

[jira] [Commented] (SPARK-12261) pyspark crash for large dataset

2016-01-26 Thread Christopher Bourez (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15116907#comment-15116907 ] Christopher Bourez commented on SPARK-12261: To reproduce you can follow the
