[jira] [Updated] (SPARK-13744) Dataframe RDD caching increases the input size for subsequent stages

2016-03-08 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stavros Kontopoulos updated SPARK-13744: Attachment: stages.png > Dataframe RDD caching increases the input size for subsequ

[jira] [Updated] (SPARK-13702) Use diamond operator for generic instance creation in Java code

2016-03-08 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-13702: -- Description: In order to make docs/example (and other related code) more simple and readable,

[jira] [Resolved] (SPARK-13738) Clean up ResolveDataSource

2016-03-08 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-13738. - Resolution: Fixed Fix Version/s: 2.0.0 > Clean up ResolveDataSource >

[jira] [Updated] (SPARK-13485) (Dataset-oriented) API evolution in Spark 2.0

2016-03-08 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-13485: Description: As part of Spark 2.0, we want to create a stable API foundation for Dataset to become

[jira] [Updated] (SPARK-13625) PySpark-ML method to get list of params for an obj should not check property attr

2016-03-08 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-13625: -- Shepherd: Joseph K. Bradley Assignee: Bryan Cutler Target Versio

[jira] [Resolved] (SPARK-13400) Stop using deprecated Octal escape literals

2016-03-08 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-13400. - Resolution: Fixed Assignee: Dongjoon Hyun Fix Version/s: 2.0.0 > Stop using depre

[jira] [Assigned] (SPARK-13754) Keep old data source name for backwards compatibility

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13754: Assignee: Apache Spark > Keep old data source name for backwards compatibility > -

[jira] [Commented] (SPARK-13754) Keep old data source name for backwards compatibility

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15186001#comment-15186001 ] Apache Spark commented on SPARK-13754: -- User 'falaki' has created a pull request for

[jira] [Assigned] (SPARK-13754) Keep old data source name for backwards compatibility

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13754: Assignee: (was: Apache Spark) > Keep old data source name for backwards compatibility

[jira] [Updated] (SPARK-11838) Spark SQL query fragment RDD reuse across queries

2016-03-08 Thread Mark Hamstra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hamstra updated SPARK-11838: - Summary: Spark SQL query fragment RDD reuse across queries (was: Spark SQL query fragment RDD re

[jira] [Comment Edited] (SPARK-13756) Reuse Query Fragments

2016-03-08 Thread Mark Hamstra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185991#comment-15185991 ] Mark Hamstra edited comment on SPARK-13756 at 3/8/16 10:42 PM:

[jira] [Updated] (SPARK-13756) Reuse Query Fragments

2016-03-08 Thread Mark Hamstra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hamstra updated SPARK-13756: - Description: Query fragments that have been materialized in RDDs can and should be reused either

[jira] [Commented] (SPARK-13756) Reuse Query Fragments

2016-03-08 Thread Mark Hamstra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185991#comment-15185991 ] Mark Hamstra commented on SPARK-13756: -- Fragment reuse across queries > Reuse Query

[jira] [Comment Edited] (SPARK-13523) Reuse the exchanges in a query

2016-03-08 Thread Mark Hamstra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185967#comment-15185967 ] Mark Hamstra edited comment on SPARK-13523 at 3/8/16 10:36 PM:

[jira] [Comment Edited] (SPARK-13523) Reuse the exchanges in a query

2016-03-08 Thread Mark Hamstra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185967#comment-15185967 ] Mark Hamstra edited comment on SPARK-13523 at 3/8/16 10:36 PM:

[jira] [Created] (SPARK-13756) Reuse Query Fragments

2016-03-08 Thread Mark Hamstra (JIRA)
Mark Hamstra created SPARK-13756: Summary: Reuse Query Fragments Key: SPARK-13756 URL: https://issues.apache.org/jira/browse/SPARK-13756 Project: Spark Issue Type: Umbrella Componen

[jira] [Commented] (SPARK-13523) Reuse the exchanges in a query

2016-03-08 Thread Mark Hamstra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185967#comment-15185967 ] Mark Hamstra commented on SPARK-13523: -- Yes that is a good point. But they are clos

[jira] [Commented] (SPARK-13733) Support initial weight distribution in personalized PageRank

2016-03-08 Thread Gayathri Murali (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185945#comment-15185945 ] Gayathri Murali commented on SPARK-13733: - I can work on this > Support initial

[jira] [Commented] (SPARK-7286) Precedence of operator not behaving properly

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185924#comment-15185924 ] Apache Spark commented on SPARK-7286: - User 'jodersky' has created a pull request for

[jira] [Resolved] (SPARK-13593) improve the `createDataFrame` method to accept data type string and verify the data

2016-03-08 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-13593. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11444 [https://github.

[jira] [Updated] (SPARK-13740) add null check for _verify_type in types.py

2016-03-08 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-13740: - Assignee: Wenchen Fan > add null check for _verify_type in types.py > ---

[jira] [Resolved] (SPARK-13740) add null check for _verify_type in types.py

2016-03-08 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-13740. -- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11574 [https://github.com/

[jira] [Updated] (SPARK-13755) Escape quotes in SQL plan visualization node labels

2016-03-08 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-13755: --- Summary: Escape quotes in SQL plan visualization node labels (was: Escape quotes in Graphviz / DOT n

[jira] [Assigned] (SPARK-11102) Uninformative exception when specifing non-exist input for JSON data source

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11102: Assignee: (was: Apache Spark) > Uninformative exception when specifing non-exist input

[jira] [Assigned] (SPARK-11102) Uninformative exception when specifing non-exist input for JSON data source

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11102: Assignee: Apache Spark > Uninformative exception when specifing non-exist input for JSON d

[jira] [Commented] (SPARK-7129) Add generic boosting algorithm to spark.ml

2016-03-08 Thread Seth Hendrickson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185861#comment-15185861 ] Seth Hendrickson commented on SPARK-7129: - cc [~josephkb] [~meihuawu] This has be

[jira] [Commented] (SPARK-13755) Escape quotes in Graphviz / DOT node labels

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185845#comment-15185845 ] Apache Spark commented on SPARK-13755: -- User 'JoshRosen' has created a pull request

[jira] [Assigned] (SPARK-13755) Escape quotes in Graphviz / DOT node labels

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13755: Assignee: Josh Rosen (was: Apache Spark) > Escape quotes in Graphviz / DOT node labels >

[jira] [Assigned] (SPARK-13755) Escape quotes in Graphviz / DOT node labels

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13755: Assignee: Apache Spark (was: Josh Rosen) > Escape quotes in Graphviz / DOT node labels >

[jira] [Created] (SPARK-13755) Escape quotes in Graphviz / DOT node labels

2016-03-08 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-13755: -- Summary: Escape quotes in Graphviz / DOT node labels Key: SPARK-13755 URL: https://issues.apache.org/jira/browse/SPARK-13755 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-13102) Run query using ThriftServer, and open web using IE11, i click "+detail" in SQLPage, but not response

2016-03-08 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185824#comment-15185824 ] Josh Rosen commented on SPARK-13102: Can you please check the JavaScript console for

[jira] [Assigned] (SPARK-13750) Fix sizeInBytes for HadoopFSRelation

2016-03-08 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-13750: -- Assignee: Davies Liu > Fix sizeInBytes for HadoopFSRelation >

[jira] [Created] (SPARK-13754) Keep old data source name for backwards compatibility

2016-03-08 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-13754: -- Summary: Keep old data source name for backwards compatibility Key: SPARK-13754 URL: https://issues.apache.org/jira/browse/SPARK-13754 Project: Spark Iss

[jira] [Commented] (SPARK-13744) Dataframe RDD caching increases the input size for subsequent stages

2016-03-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185758#comment-15185758 ] Sean Owen commented on SPARK-13744: --- It's reporting the number of bytes read, which doe

[jira] [Assigned] (SPARK-13747) Concurrent execution in SQL doesn't work with Scala ForkJoinPool

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13747: Assignee: Apache Spark (was: Andrew Or) > Concurrent execution in SQL doesn't work with S

[jira] [Assigned] (SPARK-13747) Concurrent execution in SQL doesn't work with Scala ForkJoinPool

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13747: Assignee: Andrew Or (was: Apache Spark) > Concurrent execution in SQL doesn't work with S

[jira] [Commented] (SPARK-13747) Concurrent execution in SQL doesn't work with Scala ForkJoinPool

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185743#comment-15185743 ] Apache Spark commented on SPARK-13747: -- User 'andrewor14' has created a pull request

[jira] [Created] (SPARK-13753) Column nullable is derived incorrectly

2016-03-08 Thread Jingwei Lu (JIRA)
Jingwei Lu created SPARK-13753: -- Summary: Column nullable is derived incorrectly Key: SPARK-13753 URL: https://issues.apache.org/jira/browse/SPARK-13753 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-13752) JSON array type parsing error

2016-03-08 Thread Jingwei Lu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingwei Lu updated SPARK-13752: --- Attachment: sparkissue.scala This is a repro case. > JSON array type parsing error > --

[jira] [Created] (SPARK-13752) JSON array type parsing error

2016-03-08 Thread Jingwei Lu (JIRA)
Jingwei Lu created SPARK-13752: -- Summary: JSON array type parsing error Key: SPARK-13752 URL: https://issues.apache.org/jira/browse/SPARK-13752 Project: Spark Issue Type: Bug Component

[jira] [Updated] (SPARK-13747) Concurrent execution in SQL doesn't work with Scala ForkJoinPool

2016-03-08 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-13747: - Assignee: Andrew Or > Concurrent execution in SQL doesn't work with Scala ForkJoinPool >

[jira] [Assigned] (SPARK-13751) Generate better code for Filter

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13751: Assignee: Apache Spark (was: Davies Liu) > Generate better code for Filter >

[jira] [Assigned] (SPARK-13751) Generate better code for Filter

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13751: Assignee: Davies Liu (was: Apache Spark) > Generate better code for Filter >

[jira] [Commented] (SPARK-13751) Generate better code for Filter

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185660#comment-15185660 ] Apache Spark commented on SPARK-13751: -- User 'davies' has created a pull request for

[jira] [Commented] (SPARK-13744) Dataframe RDD caching increases the input size for subsequent stages

2016-03-08 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185656#comment-15185656 ] Stavros Kontopoulos commented on SPARK-13744: - The problem is that the UI shows l

[jira] [Created] (SPARK-13751) Generate better code for Filter

2016-03-08 Thread Davies Liu (JIRA)
Davies Liu created SPARK-13751: -- Summary: Generate better code for Filter Key: SPARK-13751 URL: https://issues.apache.org/jira/browse/SPARK-13751 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-13400) Stop using deprecated Octal escape literals

2016-03-08 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185638#comment-15185638 ] Dongjoon Hyun commented on SPARK-13400: --- Thank you! > Stop using deprecated Octal

[jira] [Assigned] (SPARK-13400) Stop using deprecated Octal escape literals

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13400: Assignee: (was: Apache Spark) > Stop using deprecated Octal escape literals >

[jira] [Assigned] (SPARK-13400) Stop using deprecated Octal escape literals

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13400: Assignee: Apache Spark > Stop using deprecated Octal escape literals > ---

[jira] [Commented] (SPARK-13400) Stop using deprecated Octal escape literals

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185635#comment-15185635 ] Apache Spark commented on SPARK-13400: -- User 'dongjoon-hyun' has created a pull requ

[jira] [Commented] (SPARK-4940) Support more evenly distributing cores for Mesos mode

2016-03-08 Thread Martin Tapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185633#comment-15185633 ] Martin Tapp commented on SPARK-4940: Any update or anyone working on this issue? > Su

[jira] [Resolved] (SPARK-12727) SQL generation support for distinct aggregation patterns that fit DistinctAggregationRewriter analysis rule

2016-03-08 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-12727. - Resolution: Fixed Assignee: Wenchen Fan Fix Version/s: 2.0.0 > SQL generation sup

[jira] [Commented] (SPARK-13400) Stop using deprecated Octal escape literals

2016-03-08 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185619#comment-15185619 ] holdenk commented on SPARK-13400: - +1 :) (Sorry I haven't had the chance to follow up on

[jira] [Commented] (SPARK-13400) Stop using deprecated Octal escape literals

2016-03-08 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185594#comment-15185594 ] Dongjoon Hyun commented on SPARK-13400: --- Oh, thank you! > Stop using deprecated Oc

[jira] [Comment Edited] (SPARK-13731) expression evaluation for NaN in select statement

2016-03-08 Thread Ian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183869#comment-15183869 ] Ian edited comment on SPARK-13731 at 3/8/16 7:29 PM: - We saw SPARK-9

[jira] [Assigned] (SPARK-13749) Faster pivot implementation for many distinct values with two phase aggregation

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13749: Assignee: Apache Spark > Faster pivot implementation for many distinct values with two pha

[jira] [Commented] (SPARK-13749) Faster pivot implementation for many distinct values with two phase aggregation

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185571#comment-15185571 ] Apache Spark commented on SPARK-13749: -- User 'aray' has created a pull request for t

[jira] [Assigned] (SPARK-13749) Faster pivot implementation for many distinct values with two phase aggregation

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13749: Assignee: (was: Apache Spark) > Faster pivot implementation for many distinct values w

[jira] [Updated] (SPARK-13748) createDataFrame and rows with omitted fields

2016-03-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-13748: -- Component/s: Documentation Issue Type: Improvement (was: Bug) Sure, open a PR with proposed doc u

[jira] [Commented] (SPARK-13400) Stop using deprecated Octal escape literals

2016-03-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185568#comment-15185568 ] Sean Owen commented on SPARK-13400: --- Seems OK to proceed, yes. > Stop using deprecated

[jira] [Created] (SPARK-13750) Fix sizeInBytes for HadoopFSRelation

2016-03-08 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-13750: Summary: Fix sizeInBytes for HadoopFSRelation Key: SPARK-13750 URL: https://issues.apache.org/jira/browse/SPARK-13750 Project: Spark Issue Type: Sub-

[jira] [Created] (SPARK-13749) Faster pivot implementation for many distinct values with two phase aggregation

2016-03-08 Thread Andrew Ray (JIRA)
Andrew Ray created SPARK-13749: -- Summary: Faster pivot implementation for many distinct values with two phase aggregation Key: SPARK-13749 URL: https://issues.apache.org/jira/browse/SPARK-13749 Project:

[jira] [Commented] (SPARK-13400) Stop using deprecated Octal escape literals

2016-03-08 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185535#comment-15185535 ] Dongjoon Hyun commented on SPARK-13400: --- Hi, [~holdenk] and [~srowen]. If you don'

[jira] [Commented] (SPARK-13747) Concurrent execution in SQL doesn't work with Scala ForkJoinPool

2016-03-08 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185533#comment-15185533 ] Shixiong Zhu commented on SPARK-13747: -- FYI, I switched to branch-1.6, and ran the s

[jira] [Created] (SPARK-13748) createDataFrame and rows with omitted fields

2016-03-08 Thread Ethan Aubin (JIRA)
Ethan Aubin created SPARK-13748: --- Summary: createDataFrame and rows with omitted fields Key: SPARK-13748 URL: https://issues.apache.org/jira/browse/SPARK-13748 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-13747) Concurrent execution in SQL doesn't work with Scala ForkJoinPool

2016-03-08 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-13747: - Description: Running the following code may fail {code} (1 to 100).par.foreach { _ => println(sc.p

[jira] [Commented] (SPARK-10548) Concurrent execution in SQL does not work

2016-03-08 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185528#comment-15185528 ] Shixiong Zhu commented on SPARK-10548: -- Open SPARK-13747 for further discussion > C

[jira] [Created] (SPARK-13747) Concurrent execution in SQL doesn't work with Scala ForkJoinPool

2016-03-08 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-13747: Summary: Concurrent execution in SQL doesn't work with Scala ForkJoinPool Key: SPARK-13747 URL: https://issues.apache.org/jira/browse/SPARK-13747 Project: Spark

[jira] [Commented] (SPARK-13034) PySpark ml.classification support export/import

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185524#comment-15185524 ] Apache Spark commented on SPARK-13034: -- User 'wangmiao1981' has created a pull reque

[jira] [Resolved] (SPARK-10548) Concurrent execution in SQL does not work

2016-03-08 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-10548. -- Resolution: Fixed Target Version/s: 1.6.0, 1.5.1 (was: 1.5.1, 1.6.0) In a second tho

[jira] [Updated] (SPARK-12555) Datasets: data is corrupted when input data is reordered

2016-03-08 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-12555: Description: Testcase --- {code} import org.apache.spark.sql.expressions.Aggregator import

[jira] [Commented] (SPARK-11857) Remove Mesos fine-grained mode subject to discussions

2016-03-08 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185511#comment-15185511 ] Andrew Or commented on SPARK-11857: --- [~dragos][~tnachen] any thoughts on at least depre

[jira] [Commented] (SPARK-11857) Remove Mesos fine-grained mode subject to discussions

2016-03-08 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185505#comment-15185505 ] Reynold Xin commented on SPARK-11857: - Maybe we should deprecate this first. > Remo

[jira] [Updated] (SPARK-13728) Fix ORC PPD

2016-03-08 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-13728: - Assignee: Hyukjin Kwon > Fix ORC PPD > --- > > Key: SPARK-13728 >

[jira] [Commented] (SPARK-13728) Fix ORC PPD

2016-03-08 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185463#comment-15185463 ] Michael Armbrust commented on SPARK-13728: -- That sounds like a good lead to foll

[jira] [Resolved] (SPARK-13695) Don't cache MEMORY_AND_DISK blocks as bytes in memory store when reading spills

2016-03-08 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-13695. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11533 [https://github.

[jira] [Commented] (SPARK-13665) Initial separation of concerns in HadoopFSRelation

2016-03-08 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185460#comment-15185460 ] Michael Armbrust commented on SPARK-13665: -- I think what everyone is going to wa

[jira] [Commented] (SPARK-12177) Update KafkaDStreams to new Kafka 0.9 Consumer API

2016-03-08 Thread Mark Grover (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185461#comment-15185461 ] Mark Grover commented on SPARK-12177: - For a) I think it's a larger discussion, that

[jira] [Updated] (SPARK-13665) Initial separation of concerns in HadoopFSRelation

2016-03-08 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-13665: - Summary: Initial separation of concerns in HadoopFSRelation (was: Initial separation of

[jira] [Commented] (SPARK-11888) Model export/import for spark.ml: DecisionTreeClassifier,Regressor

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185455#comment-15185455 ] Apache Spark commented on SPARK-11888: -- User 'jkbradley' has created a pull request

[jira] [Updated] (SPARK-13734) SparkR histogram

2016-03-08 Thread Oscar D. Lara Yejas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oscar D. Lara Yejas updated SPARK-13734: Description: Create method histogram() on SparkR to render a histogram of a given C

[jira] [Commented] (SPARK-11569) StringIndexer transform fails when column contains nulls

2016-03-08 Thread Timothy Hunter (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185444#comment-15185444 ] Timothy Hunter commented on SPARK-11569: Also, I suggest looking at Pandas' index

[jira] [Resolved] (SPARK-13657) Support parsing very long AND/OR expression

2016-03-08 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-13657. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11501 [https://github.

[jira] [Assigned] (SPARK-13746) Stop using depredated SynchronizedSet

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13746: Assignee: (was: Apache Spark) > Stop using depredated SynchronizedSet > --

[jira] [Assigned] (SPARK-13746) Stop using depredated SynchronizedSet

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13746: Assignee: Apache Spark > Stop using depredated SynchronizedSet > -

[jira] [Commented] (SPARK-13746) Stop using depredated SynchronizedSet

2016-03-08 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185374#comment-15185374 ] Apache Spark commented on SPARK-13746: -- User 'wilson8' has created a pull re

[jira] [Resolved] (SPARK-13744) Dataframe RDD caching increases the input size for subsequent stages

2016-03-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-13744. --- Resolution: Not A Problem Confirmed this on the stage detail page, where it shows some tasks reading

[jira] [Resolved] (SPARK-13715) Remove last usages of jblas in tests

2016-03-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-13715. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11560 [https://github.co

[jira] [Created] (SPARK-13746) Stop using depredated SynchronizedSet

2016-03-08 Thread Wilson Wu (JIRA)
Wilson Wu created SPARK-13746: - Summary: Stop using depredated SynchronizedSet Key: SPARK-13746 URL: https://issues.apache.org/jira/browse/SPARK-13746 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-13745) Support columnar in memory representation on Big Endian platforms

2016-03-08 Thread Tim Preece (JIRA)
Tim Preece created SPARK-13745: -- Summary: Support columnar in memory representation on Big Endian platforms Key: SPARK-13745 URL: https://issues.apache.org/jira/browse/SPARK-13745 Project: Spark

[jira] [Commented] (SPARK-13702) Use diamond operator for generic instance creation in Java code

2016-03-08 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185192#comment-15185192 ] Dongjoon Hyun commented on SPARK-13702: --- Hi, [~mengxr], Could you take a look at t

[jira] [Updated] (SPARK-13702) Use diamond operator for generic instance creation in Java code

2016-03-08 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-13702: -- Description: In order to make docs/example (and other related code) more simple and readable,

[jira] [Updated] (SPARK-12555) Datasets: data is corrupted when input data is reordered

2016-03-08 Thread Luciano Resende (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luciano Resende updated SPARK-12555: Labels: big-endian (was: ) > Datasets: data is corrupted when input data is reordered > --

[jira] [Updated] (SPARK-12319) ExchangeCoordinatorSuite fails on big-endian platforms

2016-03-08 Thread Luciano Resende (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luciano Resende updated SPARK-12319: Labels: big-endian (was: ) > ExchangeCoordinatorSuite fails on big-endian platforms >

[jira] [Commented] (SPARK-13736) Big-Endian platform issues

2016-03-08 Thread Luciano Resende (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185173#comment-15185173 ] Luciano Resende commented on SPARK-13736: - We want to start testing proactively o

[jira] [Commented] (SPARK-13736) Big-Endian platform issues

2016-03-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185182#comment-15185182 ] Sean Owen commented on SPARK-13736: --- I think this is what labels and search are for, re

[jira] [Commented] (SPARK-8360) Streaming DataFrames

2016-03-08 Thread Praveen Devarao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185168#comment-15185168 ] Praveen Devarao commented on SPARK-8360: Hi [~tdas],[~marmbrus],[~rxin] Any docs

[jira] [Commented] (SPARK-13744) Dataframe RDD caching increases the input size for subsequent stages

2016-03-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185133#comment-15185133 ] Sean Owen commented on SPARK-13744: --- Oh, I think I get it. The size of this data on dis

[jira] [Commented] (SPARK-12313) getPartitionsByFilter doesn't handle predicates on all / multiple Partition Columns

2016-03-08 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185139#comment-15185139 ] Imran Rashid commented on SPARK-12313: -- [~harshg] I think the issue here is just fr

[jira] [Updated] (SPARK-13744) Dataframe RDD caching increases the input size for subsequent stages

2016-03-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-13744: -- Component/s: Web UI I get it, it's not the size in memory but the bytes read. I think ~500MB is correc
