[jira] [Commented] (SPARK-14037) count(df) is very slow for dataframe constructed using SparkR::createDataFrame

2016-03-22 Thread Sun Rui (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207916#comment-15207916 ] Sun Rui commented on SPARK-14037: - spark 1.6.1 release, standalone mode. bin/sparkR --master spark:// run

[jira] [Commented] (SPARK-10925) Exception when joining DataFrames

2016-03-22 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207915#comment-15207915 ] Wenchen Fan commented on SPARK-10925: - If you wanna remove duplicated join keys, you can do

[jira] [Assigned] (SPARK-14091) Consider improving performance of SparkContext.getCallSite()

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14091: Assignee: Apache Spark > Consider improving performance of SparkContext.getCallSite() >

[jira] [Commented] (SPARK-14091) Consider improving performance of SparkContext.getCallSite()

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207914#comment-15207914 ] Apache Spark commented on SPARK-14091: -- User 'rajeshbalamohan' has created a pull request for this

[jira] [Assigned] (SPARK-14091) Consider improving performance of SparkContext.getCallSite()

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14091: Assignee: (was: Apache Spark) > Consider improving performance of

[jira] [Commented] (SPARK-11231) join returns schema with duplicated and ambiguous join columns

2016-03-22 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207877#comment-15207877 ] Wenchen Fan commented on SPARK-11231: - I'm not familiar with R or Spark R API, but for scala version,

[jira] [Commented] (SPARK-14074) Do not use install_github in SparkR build

2016-03-22 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207866#comment-15207866 ] Shivaram Venkataraman commented on SPARK-14074: --- [~sunrui] Would you have a chance to check

[jira] [Created] (SPARK-14091) Consider improving performance of SparkContext.getCallSite()

2016-03-22 Thread Rajesh Balamohan (JIRA)
Rajesh Balamohan created SPARK-14091: Summary: Consider improving performance of SparkContext.getCallSite() Key: SPARK-14091 URL: https://issues.apache.org/jira/browse/SPARK-14091 Project: Spark

[jira] [Updated] (SPARK-14085) Star Expansion for Hash

2016-03-22 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-14085: Description: Support star expansion in hash and concat. For example {code} val structDf =

[jira] [Updated] (SPARK-14085) Star Expansion for Hash

2016-03-22 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-14085: Description: Support star expansion in hash. For example {code} val structDf = testData2.select("a",

[jira] [Updated] (SPARK-14085) Star Expansion for Hash

2016-03-22 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-14085: Summary: Star Expansion for Hash (was: Star Expansion for Hash and Concat) > Star Expansion for Hash >

[jira] [Resolved] (SPARK-10146) Have an easy way to set data source reader/writer specific confs

2016-03-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-10146. - Resolution: Fixed Fix Version/s: 2.0.0 > Have an easy way to set data source

[jira] [Commented] (SPARK-10146) Have an easy way to set data source reader/writer specific confs

2016-03-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207858#comment-15207858 ] Reynold Xin commented on SPARK-10146: - I think we are already doing this. I'm going to close the

[jira] [Closed] (SPARK-12769) Remove If expression

2016-03-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-12769. --- Resolution: Won't Fix Closing as won't fix for now since doing this change would make the explain

[jira] [Resolved] (SPARK-12767) Improve conditional expressions

2016-03-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-12767. - Resolution: Fixed Assignee: Reynold Xin Fix Version/s: 2.0.0 > Improve

[jira] [Closed] (SPARK-12997) Use cast expression to perform type cast in csv

2016-03-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-12997. --- Resolution: Not A Problem > Use cast expression to perform type cast in csv >

[jira] [Resolved] (SPARK-13401) Fix SQL test warnings

2016-03-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-13401. - Resolution: Fixed Assignee: Yong Tang Fix Version/s: 2.0.0 > Fix SQL test

[jira] [Commented] (SPARK-12855) Remove parser pluggability

2016-03-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207844#comment-15207844 ] Reynold Xin commented on SPARK-12855: - Got it - we can add this back, but we need to wait till we

[jira] [Commented] (SPARK-14081) DataFrameNaFunctions fill should not convert float fields to double

2016-03-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207843#comment-15207843 ] Reynold Xin commented on SPARK-14081: - Yes a pull request would be great. Probably an one line

[jira] [Resolved] (SPARK-14072) Show JVM information when we run Benchmark

2016-03-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-14072. - Resolution: Fixed Assignee: Kazuaki Ishizaki Fix Version/s: 2.0.0 > Show JVM

[jira] [Created] (SPARK-14090) The optimization method of convex function

2016-03-22 Thread chenalong (JIRA)
chenalong created SPARK-14090: - Summary: The optimization method of convex function Key: SPARK-14090 URL: https://issues.apache.org/jira/browse/SPARK-14090 Project: Spark Issue Type: Task

[jira] [Commented] (SPARK-14081) DataFrameNaFunctions fill should not convert float fields to double

2016-03-22 Thread Travis Crawford (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207823#comment-15207823 ] Travis Crawford commented on SPARK-14081: - Agreed all data types should allow filling without

[jira] [Commented] (SPARK-14074) Do not use install_github in SparkR build

2016-03-22 Thread Sun Rui (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207780#comment-15207780 ] Sun Rui commented on SPARK-14074: - yes, unstable source may cause un-expected test failures. I agree we

[jira] [Commented] (SPARK-14089) Remove methods that has been deprecated since 1.1.x, 1.2.x and 1.3.x

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207773#comment-15207773 ] Apache Spark commented on SPARK-14089: -- User 'lw-lin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-14089) Remove methods that has been deprecated since 1.1.x, 1.2.x and 1.3.x

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14089: Assignee: Apache Spark > Remove methods that has been deprecated since 1.1.x, 1.2.x and

[jira] [Assigned] (SPARK-14089) Remove methods that has been deprecated since 1.1.x, 1.2.x and 1.3.x

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14089: Assignee: (was: Apache Spark) > Remove methods that has been deprecated since 1.1.x,

[jira] [Created] (SPARK-14089) Remove methods that has been deprecated since 1.1.x, 1.2.x and 1.3.x

2016-03-22 Thread Liwei Lin (JIRA)
Liwei Lin created SPARK-14089: - Summary: Remove methods that has been deprecated since 1.1.x, 1.2.x and 1.3.x Key: SPARK-14089 URL: https://issues.apache.org/jira/browse/SPARK-14089 Project: Spark

[jira] [Commented] (SPARK-12855) Remove parser pluggability

2016-03-22 Thread Joseph Levin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207765#comment-15207765 ] Joseph Levin commented on SPARK-12855: -- Reynold - We would except grudgingly. Of course we don't

[jira] [Commented] (SPARK-12855) Remove parser pluggability

2016-03-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207729#comment-15207729 ] Reynold Xin commented on SPARK-12855: - Joseph - in the case of creating your own parser, you are

[jira] [Commented] (SPARK-12855) Remove parser pluggability

2016-03-22 Thread Joseph Levin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207725#comment-15207725 ] Joseph Levin commented on SPARK-12855: -- I have some concern about this task. I am working on a

[jira] [Commented] (SPARK-14081) DataFrameNaFunctions fill should not convert float fields to double

2016-03-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207692#comment-15207692 ] Reynold Xin commented on SPARK-14081: - This is actually somewhat tricky, because we will lose

[jira] [Assigned] (SPARK-14088) Some Dataset API touch-up

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14088: Assignee: Apache Spark (was: Reynold Xin) > Some Dataset API touch-up >

[jira] [Assigned] (SPARK-14088) Some Dataset API touch-up

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14088: Assignee: Reynold Xin (was: Apache Spark) > Some Dataset API touch-up >

[jira] [Commented] (SPARK-14088) Some Dataset API touch-up

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207674#comment-15207674 ] Apache Spark commented on SPARK-14088: -- User 'rxin' has created a pull request for this issue:

[jira] [Created] (SPARK-14088) Some Dataset API touch-up

2016-03-22 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-14088: --- Summary: Some Dataset API touch-up Key: SPARK-14088 URL: https://issues.apache.org/jira/browse/SPARK-14088 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-14088) Some Dataset API touch-up

2016-03-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-14088: Description: 1. Deprecated unionAll. It is pretty confusing to have both "union" and "unionAll"

[jira] [Updated] (SPARK-14066) Set "spark.sql.dialect=sql", there is a problem in running query "select percentile(d,array(0,0.2,0.3,1)) as a from t;"

2016-03-22 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-14066: -- Description: In spark 1.5.1, I run "sh bin/spark-sql --conf spark.sql.dialect=sql", and run

[jira] [Comment Edited] (SPARK-14066) Set "spark.sql.dialect=sql", there is a problem in running query "select percentile(d,array(0,0.2,0.3,1)) as a from t;"

2016-03-22 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206280#comment-15206280 ] KaiXinXIaoLei edited comment on SPARK-14066 at 3/23/16 1:19 AM: In the

[jira] [Updated] (SPARK-14033) Merging Estimator & Model

2016-03-22 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14033: -- Summary: Merging Estimator & Model (was: Merging Estimator, Model, & Transformer) >

[jira] [Assigned] (SPARK-14087) PySpark ML JavaModel does not properly own params after being fit

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14087: Assignee: (was: Apache Spark) > PySpark ML JavaModel does not properly own params

[jira] [Commented] (SPARK-14087) PySpark ML JavaModel does not properly own params after being fit

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207604#comment-15207604 ] Apache Spark commented on SPARK-14087: -- User 'BryanCutler' has created a pull request for this

[jira] [Assigned] (SPARK-14087) PySpark ML JavaModel does not properly own params after being fit

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14087: Assignee: Apache Spark > PySpark ML JavaModel does not properly own params after being

[jira] [Resolved] (SPARK-13806) SQL round() produces incorrect results for negative values

2016-03-22 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-13806. Resolution: Fixed Fix Version/s: 1.6.2 1.5.3 2.1.0

[jira] [Commented] (SPARK-14087) PySpark ML JavaModel does not properly own params after being fit

2016-03-22 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207555#comment-15207555 ] Bryan Cutler commented on SPARK-14087: -- I can post a PR for this > PySpark ML JavaModel does not

[jira] [Updated] (SPARK-14087) PySpark ML JavaModel does not properly own params after being fit

2016-03-22 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-14087: - Attachment: feature.py > PySpark ML JavaModel does not properly own params after being fit >

[jira] [Created] (SPARK-14087) PySpark ML JavaModel does not properly own params after being fit

2016-03-22 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-14087: Summary: PySpark ML JavaModel does not properly own params after being fit Key: SPARK-14087 URL: https://issues.apache.org/jira/browse/SPARK-14087 Project: Spark

[jira] [Commented] (SPARK-5991) Python API for ML model import/export

2016-03-22 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207537#comment-15207537 ] Joseph K. Bradley commented on SPARK-5991: -- Reopening since we'll need to add items once more

[jira] [Reopened] (SPARK-5991) Python API for ML model import/export

2016-03-22 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reopened SPARK-5991: -- > Python API for ML model import/export > - > >

[jira] [Updated] (SPARK-5991) Python API for ML model import/export

2016-03-22 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5991: - Fix Version/s: (was: 2.0.0) > Python API for ML model import/export >

[jira] [Assigned] (SPARK-14086) Add DDL commands to ANTLR4 Parser

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14086: Assignee: (was: Apache Spark) > Add DDL commands to ANTLR4 Parser >

[jira] [Commented] (SPARK-14086) Add DDL commands to ANTLR4 Parser

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207528#comment-15207528 ] Apache Spark commented on SPARK-14086: -- User 'hvanhovell' has created a pull request for this issue:

[jira] [Assigned] (SPARK-14086) Add DDL commands to ANTLR4 Parser

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14086: Assignee: Apache Spark > Add DDL commands to ANTLR4 Parser >

[jira] [Resolved] (SPARK-5991) Python API for ML model import/export

2016-03-22 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-5991. -- Resolution: Fixed Fix Version/s: 2.0.0 > Python API for ML model import/export >

[jira] [Assigned] (SPARK-14085) Star Expansion for Hash and Concat

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14085: Assignee: (was: Apache Spark) > Star Expansion for Hash and Concat >

[jira] [Created] (SPARK-14086) Add DDL commands to ANTLR4 Parser

2016-03-22 Thread Herman van Hovell (JIRA)
Herman van Hovell created SPARK-14086: - Summary: Add DDL commands to ANTLR4 Parser Key: SPARK-14086 URL: https://issues.apache.org/jira/browse/SPARK-14086 Project: Spark Issue Type:

[jira] [Commented] (SPARK-14085) Star Expansion for Hash and Concat

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207515#comment-15207515 ] Apache Spark commented on SPARK-14085: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Assigned] (SPARK-14085) Star Expansion for Hash and Concat

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14085: Assignee: Apache Spark > Star Expansion for Hash and Concat >

[jira] [Updated] (SPARK-14085) Star Expansion for Hash and Concat

2016-03-22 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-14085: Description: Support star expansion in hash and concat. For example {code} val structDf =

[jira] [Created] (SPARK-14085) Star Expansion for Hash and Concat

2016-03-22 Thread Xiao Li (JIRA)
Xiao Li created SPARK-14085: --- Summary: Star Expansion for Hash and Concat Key: SPARK-14085 URL: https://issues.apache.org/jira/browse/SPARK-14085 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-14084) Parallel training jobs in model selection

2016-03-22 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-14084: - Summary: Parallel training jobs in model selection Key: SPARK-14084 URL: https://issues.apache.org/jira/browse/SPARK-14084 Project: Spark Issue Type: New

[jira] [Updated] (SPARK-14084) Parallel training jobs in model selection

2016-03-22 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-14084: -- Description: In CrossValidator and TrainValidationSplit, we run training jobs one by one. If

[jira] [Commented] (SPARK-6717) Clear shuffle files after checkpointing in ALS

2016-03-22 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207459#comment-15207459 ] holdenk commented on SPARK-6717: So looking at the code a little bit I think its probably better to need

[jira] [Commented] (SPARK-14079) Limit the number of queries on SQL UI

2016-03-22 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207455#comment-15207455 ] Shixiong Zhu commented on SPARK-14079: -- It's already there. See "spark.sql.ui.retainedExecutions" in

[jira] [Assigned] (SPARK-13952) spark.ml GBT algs need to use random seed

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13952: Assignee: (was: Apache Spark) > spark.ml GBT algs need to use random seed >

[jira] [Commented] (SPARK-13952) spark.ml GBT algs need to use random seed

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207444#comment-15207444 ] Apache Spark commented on SPARK-13952: -- User 'sethah' has created a pull request for this issue:

[jira] [Updated] (SPARK-14083) Analyze JVM bytecode and turn closures into Catalyst expressions

2016-03-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-14083: Description: One big advantage of the Dataset API is the type safety, at the cost of performance

[jira] [Assigned] (SPARK-13952) spark.ml GBT algs need to use random seed

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13952: Assignee: Apache Spark > spark.ml GBT algs need to use random seed >

[jira] [Created] (SPARK-14083) Analyze JVM bytecode and turn closures into Catalyst expressions

2016-03-22 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-14083: --- Summary: Analyze JVM bytecode and turn closures into Catalyst expressions Key: SPARK-14083 URL: https://issues.apache.org/jira/browse/SPARK-14083 Project: Spark

[jira] [Commented] (SPARK-14041) Locate possible duplicates and group them into subtasks

2016-03-22 Thread Xusen Yin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207417#comment-15207417 ] Xusen Yin commented on SPARK-14041: --- [~mengxr] Maybe no need to divide them into several JIRAs, since

[jira] [Updated] (SPARK-14041) Locate possible duplicates and group them into subtasks

2016-03-22 Thread Xusen Yin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xusen Yin updated SPARK-14041: -- Description: Please go through the current example code and list possible duplicates. Duplicates need

[jira] [Updated] (SPARK-14041) Locate possible duplicates and group them into subtasks

2016-03-22 Thread Xusen Yin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xusen Yin updated SPARK-14041: -- Description: Please go through the current example code and list possible duplicates. Duplicates need

[jira] [Assigned] (SPARK-13019) Replace example code in mllib-statistics.md using include_example

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13019: Assignee: Xin Ren (was: Apache Spark) > Replace example code in mllib-statistics.md

[jira] [Commented] (SPARK-13019) Replace example code in mllib-statistics.md using include_example

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207399#comment-15207399 ] Apache Spark commented on SPARK-13019: -- User 'keypointt' has created a pull request for this issue:

[jira] [Created] (SPARK-14082) Add support for GPU resource when running on Mesos

2016-03-22 Thread Timothy Chen (JIRA)
Timothy Chen created SPARK-14082: Summary: Add support for GPU resource when running on Mesos Key: SPARK-14082 URL: https://issues.apache.org/jira/browse/SPARK-14082 Project: Spark Issue

[jira] [Commented] (SPARK-11666) Find the best `k` by cutting bisecting k-means cluster tree without recomputation

2016-03-22 Thread Burak KÖSE (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207390#comment-15207390 ] Burak KÖSE commented on SPARK-11666: Hi, can you share links for references about that? > Find the

[jira] [Created] (SPARK-14081) DataFrameNaFunctions fill should not convert float fields to double

2016-03-22 Thread Travis Crawford (JIRA)
Travis Crawford created SPARK-14081: --- Summary: DataFrameNaFunctions fill should not convert float fields to double Key: SPARK-14081 URL: https://issues.apache.org/jira/browse/SPARK-14081 Project:

[jira] [Assigned] (SPARK-14080) Improve the codegen for Filter

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14080: Assignee: (was: Apache Spark) > Improve the codegen for Filter >

[jira] [Commented] (SPARK-14080) Improve the codegen for Filter

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207383#comment-15207383 ] Apache Spark commented on SPARK-14080: -- User 'bomeng' has created a pull request for this issue:

[jira] [Assigned] (SPARK-14080) Improve the codegen for Filter

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14080: Assignee: Apache Spark > Improve the codegen for Filter > --

[jira] [Created] (SPARK-14080) Improve the codegen for Filter

2016-03-22 Thread Bo Meng (JIRA)
Bo Meng created SPARK-14080: --- Summary: Improve the codegen for Filter Key: SPARK-14080 URL: https://issues.apache.org/jira/browse/SPARK-14080 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-14079) Limit the number of queries on SQL UI

2016-03-22 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207349#comment-15207349 ] Davies Liu commented on SPARK-14079: Yes, that's what I meant. > Limit the number of queries on SQL

[jira] [Reopened] (SPARK-13019) Replace example code in mllib-statistics.md using include_example

2016-03-22 Thread Xin Ren (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xin Ren reopened SPARK-13019: - need to fix scala-2.10 compile > Replace example code in mllib-statistics.md using include_example >

[jira] [Resolved] (SPARK-13449) Naive Bayes wrapper in SparkR

2016-03-22 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-13449. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11890

[jira] [Commented] (SPARK-14040) Null-safe and equality join produces incorrect result with filtered dataframe

2016-03-22 Thread Sunitha Kambhampati (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207294#comment-15207294 ] Sunitha Kambhampati commented on SPARK-14040: - I can reproduce this on my master ( v2.0

[jira] [Commented] (SPARK-14079) Limit the number of queries on SQL UI

2016-03-22 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207275#comment-15207275 ] Andrew Or commented on SPARK-14079: --- we should do `maxRetainedQueries` or something, similar to what we

[jira] [Commented] (SPARK-14075) Refactor MemoryStore to be testable independent of BlockManager

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207266#comment-15207266 ] Apache Spark commented on SPARK-14075: -- User 'JoshRosen' has created a pull request for this issue:

[jira] [Comment Edited] (SPARK-13514) Spark Shuffle Service 1.6.0 issue in Yarn

2016-03-22 Thread Satish Kolli (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207236#comment-15207236 ] Satish Kolli edited comment on SPARK-13514 at 3/22/16 8:37 PM: --- I just

[jira] [Comment Edited] (SPARK-13514) Spark Shuffle Service 1.6.0 issue in Yarn

2016-03-22 Thread Satish Kolli (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207236#comment-15207236 ] Satish Kolli edited comment on SPARK-13514 at 3/22/16 8:34 PM: --- I just

[jira] [Commented] (SPARK-13864) TPCDS query 74 returns wrong results compared to TPC official result set

2016-03-22 Thread JESSE CHEN (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207239#comment-15207239 ] JESSE CHEN commented on SPARK-13864: Tried on two recent builds having issues running to completion.

[jira] [Commented] (SPARK-13514) Spark Shuffle Service 1.6.0 issue in Yarn

2016-03-22 Thread Satish Kolli (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207236#comment-15207236 ] Satish Kolli commented on SPARK-13514: -- I just upgraded the shuffle service with 1.6.1 and the *YARN

[jira] [Updated] (SPARK-14079) Limit the number of queries on SQL UI

2016-03-22 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14079: --- Description: The SQL UI become very very slow if there are hundreds of SQL queries on it. > Limit

[jira] [Commented] (SPARK-14079) Limit the number of queries on SQL UI

2016-03-22 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207231#comment-15207231 ] Davies Liu commented on SPARK-14079: cc [~zsxwing] [~andrewor14] > Limit the number of queries on

[jira] [Commented] (SPARK-13887) PyLint should fail fast to make errors easier to discover

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207227#comment-15207227 ] Apache Spark commented on SPARK-13887: -- User 'holdenk' has created a pull request for this issue:

[jira] [Created] (SPARK-14079) Limit the number of queries on SQL UI

2016-03-22 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14079: -- Summary: Limit the number of queries on SQL UI Key: SPARK-14079 URL: https://issues.apache.org/jira/browse/SPARK-14079 Project: Spark Issue Type: Improvement

[jira] [Assigned] (SPARK-13887) PyLint should fail fast to make errors easier to discover

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13887: Assignee: (was: Apache Spark) > PyLint should fail fast to make errors easier to

[jira] [Assigned] (SPARK-13887) PyLint should fail fast to make errors easier to discover

2016-03-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13887: Assignee: Apache Spark > PyLint should fail fast to make errors easier to discover >

[jira] [Comment Edited] (SPARK-13733) Support initial weight distribution in personalized PageRank

2016-03-22 Thread Gayathri Murali (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207219#comment-15207219 ] Gayathri Murali edited comment on SPARK-13733 at 3/22/16 8:28 PM: --

[jira] [Commented] (SPARK-13733) Support initial weight distribution in personalized PageRank

2016-03-22 Thread Gayathri Murali (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207219#comment-15207219 ] Gayathri Murali commented on SPARK-13733: - [~mengxr] [~dwmclary]

[jira] [Closed] (SPARK-13858) TPCDS query 21 returns wrong results compared to TPC official result set

2016-03-22 Thread JESSE CHEN (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] JESSE CHEN closed SPARK-13858. -- Resolution: Not A Bug Schema updates generated correct results in both spark 1.6 and 2.0. Good to

[jira] [Commented] (SPARK-13971) Implicit group by with distinct modifier on having raises an unexpected error

2016-03-22 Thread Sunitha Kambhampati (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207168#comment-15207168 ] Sunitha Kambhampati commented on SPARK-13971: - fwiw, it is not the exact same env but I tried
