[jira] [Assigned] (SPARK-14374) PySpark ml GBTClassifier, Regressor support export/import

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14374: Assignee: (was: Apache Spark) > PySpark ml GBTClassifier, Regressor support

[jira] [Assigned] (SPARK-14374) PySpark ml GBTClassifier, Regressor support export/import

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14374: Assignee: Apache Spark > PySpark ml GBTClassifier, Regressor support export/import >

[jira] [Commented] (SPARK-14374) PySpark ml GBTClassifier, Regressor support export/import

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240652#comment-15240652 ] Apache Spark commented on SPARK-14374: -- User 'yanboliang' has created a pull request for this issue:

[jira] [Commented] (SPARK-14463) read.text broken for partitioned tables

2016-04-13 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240640#comment-15240640 ] Cheng Lian commented on SPARK-14463: Should we simply throw an exception when text data source is

[jira] [Created] (SPARK-14624) Error at the end of installing Spark 1.6.1 using Spark-ec2 scipt

2016-04-13 Thread Mohaed Alibrahim (JIRA)
Mohaed Alibrahim created SPARK-14624: Summary: Error at the end of installing Spark 1.6.1 using Spark-ec2 scipt Key: SPARK-14624 URL: https://issues.apache.org/jira/browse/SPARK-14624 Project:

[jira] [Commented] (SPARK-14507) Decide if we should still support CREATE EXTERNAL TABLE AS SELECT

2016-04-13 Thread Yan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240616#comment-15240616 ] Yan commented on SPARK-14507: - In terms of Hive support vs Spark SQL support, the "external table" concept

[jira] [Commented] (SPARK-14603) SessionCatalog needs to check if a metadata operation is valid

2016-04-13 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240608#comment-15240608 ] Xiao Li commented on SPARK-14603: - To verify the error messages we issued from SessionCatalog and

[jira] [Commented] (SPARK-14409) Investigate adding a RankingEvaluator to ML

2016-04-13 Thread Yong Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240607#comment-15240607 ] Yong Tang commented on SPARK-14409: --- Thanks [~mlnick] [~josephkb]. Yes I think wrapping RankingMetrics

[jira] [Commented] (SPARK-10574) HashingTF should use MurmurHash3

2016-04-13 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240601#comment-15240601 ] Yanbo Liang commented on SPARK-10574: - Sure, I will sent a PR in a few days. Thanks! > HashingTF

[jira] [Commented] (SPARK-14622) Retain lost executors status

2016-04-13 Thread hujiayin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240550#comment-15240550 ] hujiayin commented on SPARK-14622: -- I think it is also better to maintain the number of lost executors.

[jira] [Assigned] (SPARK-14623) add label binarizer

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14623: Assignee: (was: Apache Spark) > add label binarizer > > >

[jira] [Commented] (SPARK-14623) add label binarizer

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240543#comment-15240543 ] Apache Spark commented on SPARK-14623: -- User 'hujy' has created a pull request for this issue:

[jira] [Assigned] (SPARK-14623) add label binarizer

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14623: Assignee: Apache Spark > add label binarizer > > >

[jira] [Created] (SPARK-14623) add label binarizer

2016-04-13 Thread hujiayin (JIRA)
hujiayin created SPARK-14623: Summary: add label binarizer Key: SPARK-14623 URL: https://issues.apache.org/jira/browse/SPARK-14623 Project: Spark Issue Type: Improvement Components:

[jira] [Created] (SPARK-14622) Retain lost executors status

2016-04-13 Thread Qingyang Hong (JIRA)
Qingyang Hong created SPARK-14622: - Summary: Retain lost executors status Key: SPARK-14622 URL: https://issues.apache.org/jira/browse/SPARK-14622 Project: Spark Issue Type: Improvement

[jira] [Issue Comment Deleted] (SPARK-7445) StringIndexer should handle binary labels properly

2016-04-13 Thread hujiayin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hujiayin updated SPARK-7445: Comment: was deleted (was: If no one works on it, I'd like to submit a code for this issue.) >

[jira] [Commented] (SPARK-14609) LOAD DATA

2016-04-13 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240509#comment-15240509 ] Xiao Li commented on SPARK-14609: - It is not hard, but we need to handle partitions and a few options.

[jira] [Updated] (SPARK-14621) add oracle hint optimizer

2016-04-13 Thread Qingyang Hong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qingyang Hong updated SPARK-14621: -- Flags: Patch Priority: Minor (was: Major) Description: Current SQL parser in

[jira] [Resolved] (SPARK-12133) Support dynamic allocation in Spark Streaming

2016-04-13 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or resolved SPARK-12133. --- Resolution: Fixed Fix Version/s: 2.0.0 > Support dynamic allocation in Spark Streaming >

[jira] [Created] (SPARK-14621) add

2016-04-13 Thread Qingyang Hong (JIRA)
Qingyang Hong created SPARK-14621: - Summary: add Key: SPARK-14621 URL: https://issues.apache.org/jira/browse/SPARK-14621 Project: Spark Issue Type: Wish Components: SQL Affects

[jira] [Issue Comment Deleted] (SPARK-14592) Create table like

2016-04-13 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-14592: Comment: was deleted (was: I am working on this...) > Create table like >

[jira] [Issue Comment Deleted] (SPARK-14592) Create table like

2016-04-13 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-14592: Comment: was deleted (was: Will submit PR soon.) > Create table like > -

[jira] [Commented] (SPARK-12133) Support dynamic allocation in Spark Streaming

2016-04-13 Thread WilliamZhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240453#comment-15240453 ] WilliamZhu commented on SPARK-12133: Here have a new Design: http://www.jianshu.com/p/ae7fdd4746f6

[jira] [Comment Edited] (SPARK-14516) Clustering evaluator

2016-04-13 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240434#comment-15240434 ] zhengruifeng edited comment on SPARK-14516 at 4/14/16 1:56 AM: --- [~akamal]

[jira] [Commented] (SPARK-14516) Clustering evaluator

2016-04-13 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240434#comment-15240434 ] zhengruifeng commented on SPARK-14516: -- [~akamal] In my opinion, both supervised and unsupervised

[jira] [Commented] (SPARK-14620) Use/benchmark a better hash in AggregateHashMap

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240411#comment-15240411 ] Apache Spark commented on SPARK-14620: -- User 'sameeragarwal' has created a pull request for this

[jira] [Assigned] (SPARK-14620) Use/benchmark a better hash in AggregateHashMap

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14620: Assignee: (was: Apache Spark) > Use/benchmark a better hash in AggregateHashMap >

[jira] [Created] (SPARK-14620) Use/benchmark a better hash in AggregateHashMap

2016-04-13 Thread Sameer Agarwal (JIRA)
Sameer Agarwal created SPARK-14620: -- Summary: Use/benchmark a better hash in AggregateHashMap Key: SPARK-14620 URL: https://issues.apache.org/jira/browse/SPARK-14620 Project: Spark Issue

[jira] [Commented] (SPARK-14582) Increase the parallelism for small tables

2016-04-13 Thread Mark Hamstra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240389#comment-15240389 ] Mark Hamstra commented on SPARK-14582: -- The total absence of any description in both this JIRA and

[jira] [Updated] (SPARK-14619) Track internal accumulators (metrics) by stage attempt rather than stage

2016-04-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-14619: Description: When there are multiple attempts for a stage, we currently only reset internal

[jira] [Commented] (SPARK-14619) Track internal accumulators (metrics) by stage attempt rather than stage

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240377#comment-15240377 ] Apache Spark commented on SPARK-14619: -- User 'rxin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-14619) Track internal accumulators (metrics) by stage attempt rather than stage

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14619: Assignee: Reynold Xin (was: Apache Spark) > Track internal accumulators (metrics) by

[jira] [Assigned] (SPARK-14619) Track internal accumulators (metrics) by stage attempt rather than stage

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14619: Assignee: Apache Spark (was: Reynold Xin) > Track internal accumulators (metrics) by

[jira] [Assigned] (SPARK-14618) RegressionEvaluator doc out of date

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14618: Assignee: Joseph K. Bradley (was: Apache Spark) > RegressionEvaluator doc out of date >

[jira] [Commented] (SPARK-14618) RegressionEvaluator doc out of date

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240374#comment-15240374 ] Apache Spark commented on SPARK-14618: -- User 'jkbradley' has created a pull request for this issue:

[jira] [Assigned] (SPARK-14618) RegressionEvaluator doc out of date

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14618: Assignee: Apache Spark (was: Joseph K. Bradley) > RegressionEvaluator doc out of date >

[jira] [Created] (SPARK-14618) RegressionEvaluator doc out of date

2016-04-13 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-14618: - Summary: RegressionEvaluator doc out of date Key: SPARK-14618 URL: https://issues.apache.org/jira/browse/SPARK-14618 Project: Spark Issue Type:

[jira] [Created] (SPARK-14619) Track internal accumulators (metrics) by stage attempt rather than stage

2016-04-13 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-14619: --- Summary: Track internal accumulators (metrics) by stage attempt rather than stage Key: SPARK-14619 URL: https://issues.apache.org/jira/browse/SPARK-14619 Project:

[jira] [Commented] (SPARK-14489) RegressionEvaluator returns NaN for ALS in Spark ml

2016-04-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240364#comment-15240364 ] Joseph K. Bradley commented on SPARK-14489: --- (Oh, I had not refreshed the page before

[jira] [Commented] (SPARK-14489) RegressionEvaluator returns NaN for ALS in Spark ml

2016-04-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240360#comment-15240360 ] Joseph K. Bradley commented on SPARK-14489: --- I'd to try to separate a few issues here based on

[jira] [Assigned] (SPARK-14614) Add `bround` function

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14614: Assignee: (was: Apache Spark) > Add `bround` function > - > >

[jira] [Commented] (SPARK-14614) Add `bround` function

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240351#comment-15240351 ] Apache Spark commented on SPARK-14614: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Assigned] (SPARK-14614) Add `bround` function

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14614: Assignee: Apache Spark > Add `bround` function > - > >

[jira] [Assigned] (SPARK-14617) Remove deprecated APIs in TaskMetrics

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14617: Assignee: Apache Spark (was: Reynold Xin) > Remove deprecated APIs in TaskMetrics >

[jira] [Assigned] (SPARK-14617) Remove deprecated APIs in TaskMetrics

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14617: Assignee: Reynold Xin (was: Apache Spark) > Remove deprecated APIs in TaskMetrics >

[jira] [Commented] (SPARK-14617) Remove deprecated APIs in TaskMetrics

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240331#comment-15240331 ] Apache Spark commented on SPARK-14617: -- User 'rxin' has created a pull request for this issue:

[jira] [Updated] (SPARK-14617) Remove deprecated APIs in TaskMetrics

2016-04-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-14617: Summary: Remove deprecated APIs in TaskMetrics (was: Remove deprecated APIs in accumulators) >

[jira] [Created] (SPARK-14617) Remove deprecated APIs in accumulators

2016-04-13 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-14617: --- Summary: Remove deprecated APIs in accumulators Key: SPARK-14617 URL: https://issues.apache.org/jira/browse/SPARK-14617 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-13944) Separate out local linear algebra as a standalone module without Spark dependency

2016-04-13 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240322#comment-15240322 ] Xiangrui Meng commented on SPARK-13944: --- There are more production workflows using RDD-based APIs

[jira] [Assigned] (SPARK-14607) Partition pruning is case sensitive even with HiveContext

2016-04-13 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-14607: -- Assignee: Davies Liu > Partition pruning is case sensitive even with HiveContext >

[jira] [Resolved] (SPARK-14484) Fail to create parquet filter if the column name does not match exactly

2016-04-13 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14484. Resolution: Fixed Assignee: Davies Liu > Fail to create parquet filter if the column name

[jira] [Resolved] (SPARK-14607) Partition pruning is case sensitive even with HiveContext

2016-04-13 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14607. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12371

[jira] [Commented] (SPARK-14559) Netty RPC didn't check channel is active before sending message

2016-04-13 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240299#comment-15240299 ] Shixiong Zhu commented on SPARK-14559: -- When this happens? When you are stopping the SparkContext?

[jira] [Commented] (SPARK-14614) Add `bround` function

2016-04-13 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240300#comment-15240300 ] Dongjoon Hyun commented on SPARK-14614: --- Since 1.3.0. :) > Add `bround` function >

[jira] [Commented] (SPARK-14614) Add `bround` function

2016-04-13 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240303#comment-15240303 ] Dongjoon Hyun commented on SPARK-14614: --- I'll send a PR soon. Actually, I tested hive 2.0 today. >

[jira] [Commented] (SPARK-13944) Separate out local linear algebra as a standalone module without Spark dependency

2016-04-13 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240291#comment-15240291 ] DB Tsai commented on SPARK-13944: - Can you elaborate the automatic conversion in VectorUDT? We will

[jira] [Assigned] (SPARK-14610) Remove superfluous split from random forest findSplitsForContinousFeature

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14610: Assignee: Apache Spark > Remove superfluous split from random forest

[jira] [Assigned] (SPARK-14610) Remove superfluous split from random forest findSplitsForContinousFeature

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14610: Assignee: (was: Apache Spark) > Remove superfluous split from random forest

[jira] [Commented] (SPARK-14610) Remove superfluous split from random forest findSplitsForContinousFeature

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240274#comment-15240274 ] Apache Spark commented on SPARK-14610: -- User 'sethah' has created a pull request for this issue:

[jira] [Commented] (SPARK-14614) Add `bround` function

2016-04-13 Thread Bo Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240259#comment-15240259 ] Bo Meng commented on SPARK-14614: - I have tried on Hive 1.2.1, actually this function seems dropped out.

[jira] [Commented] (SPARK-14409) Investigate adding a RankingEvaluator to ML

2016-04-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240252#comment-15240252 ] Joseph K. Bradley commented on SPARK-14409: --- Thanks for writing this! I just made a few

[jira] [Updated] (SPARK-14616) TreeNodeException running Q44 and 58 on Parquet tables

2016-04-13 Thread JESSE CHEN (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] JESSE CHEN updated SPARK-14616: --- Description: {code:title=tpcds q44} select asceding.rnk, i1.i_product_name best_performing,

[jira] [Updated] (SPARK-14616) TreeNodeException running Q44 and 58 on Parquet tables

2016-04-13 Thread JESSE CHEN (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] JESSE CHEN updated SPARK-14616: --- Environment: (was: spark 1.5.1 (official binary distribution) running on hadoop yarn 2.6 with

[jira] [Updated] (SPARK-14616) TreeNodeException running Q44 and 58 on Parquet tables

2016-04-13 Thread JESSE CHEN (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] JESSE CHEN updated SPARK-14616: --- Affects Version/s: (was: 1.5.1) 2.0.0 > TreeNodeException running Q44 and

[jira] [Created] (SPARK-14616) TreeNodeException running Q44 and 58 on Parquet tables

2016-04-13 Thread JESSE CHEN (JIRA)
JESSE CHEN created SPARK-14616: -- Summary: TreeNodeException running Q44 and 58 on Parquet tables Key: SPARK-14616 URL: https://issues.apache.org/jira/browse/SPARK-14616 Project: Spark Issue

[jira] [Assigned] (SPARK-14615) Use the new ML Vector and Matrix in the ML pipeline based algorithms

2016-04-13 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DB Tsai reassigned SPARK-14615: --- Assignee: DB Tsai > Use the new ML Vector and Matrix in the ML pipeline based algorithms >

[jira] [Created] (SPARK-14615) Use the new ML Vector and Matrix in the ML pipeline based algorithms

2016-04-13 Thread DB Tsai (JIRA)
DB Tsai created SPARK-14615: --- Summary: Use the new ML Vector and Matrix in the ML pipeline based algorithms Key: SPARK-14615 URL: https://issues.apache.org/jira/browse/SPARK-14615 Project: Spark

[jira] [Assigned] (SPARK-14541) SQL function: IFNULL, NULLIF, NVL and NVL2

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14541: Assignee: Apache Spark > SQL function: IFNULL, NULLIF, NVL and NVL2 >

[jira] [Assigned] (SPARK-14541) SQL function: IFNULL, NULLIF, NVL and NVL2

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14541: Assignee: (was: Apache Spark) > SQL function: IFNULL, NULLIF, NVL and NVL2 >

[jira] [Commented] (SPARK-14541) SQL function: IFNULL, NULLIF, NVL and NVL2

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240206#comment-15240206 ] Apache Spark commented on SPARK-14541: -- User 'bomeng' has created a pull request for this issue:

[jira] [Created] (SPARK-14614) Add `bound` function

2016-04-13 Thread Dongjoon Hyun (JIRA)
Dongjoon Hyun created SPARK-14614: - Summary: Add `bound` function Key: SPARK-14614 URL: https://issues.apache.org/jira/browse/SPARK-14614 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-14614) Add `bround` function

2016-04-13 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-14614: -- Summary: Add `bround` function (was: Add `bound` function) > Add `bround` function >

[jira] [Created] (SPARK-14613) Add @Since into the matrix and vector classes in spark-mllib-local

2016-04-13 Thread DB Tsai (JIRA)
DB Tsai created SPARK-14613: --- Summary: Add @Since into the matrix and vector classes in spark-mllib-local Key: SPARK-14613 URL: https://issues.apache.org/jira/browse/SPARK-14613 Project: Spark

[jira] [Created] (SPARK-14612) Consolidate the version of dependencies in mllib and mllib-local into one place

2016-04-13 Thread DB Tsai (JIRA)
DB Tsai created SPARK-14612: --- Summary: Consolidate the version of dependencies in mllib and mllib-local into one place Key: SPARK-14612 URL: https://issues.apache.org/jira/browse/SPARK-14612 Project:

[jira] [Closed] (SPARK-14457) Write a end to end test for DataSet with UDT

2016-04-13 Thread Joan Goyeau (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joan Goyeau closed SPARK-14457. --- Resolution: Fixed > Write a end to end test for DataSet with UDT >

[jira] [Updated] (SPARK-7861) Python wrapper for OneVsRest

2016-04-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7861: - Shepherd: Joseph K. Bradley Assignee: Xusen Yin (was: Ram Sriharsha)

[jira] [Created] (SPARK-14611) Second attempt observed after AM fails due to max number of executor failure in first attempt

2016-04-13 Thread Kshitij Badani (JIRA)
Kshitij Badani created SPARK-14611: -- Summary: Second attempt observed after AM fails due to max number of executor failure in first attempt Key: SPARK-14611 URL: https://issues.apache.org/jira/browse/SPARK-14611

[jira] [Commented] (SPARK-14541) SQL function: IFNULL, NULLIF, NVL and NVL2

2016-04-13 Thread Bo Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240157#comment-15240157 ] Bo Meng commented on SPARK-14541: - I will try to do it one by one. > SQL function: IFNULL, NULLIF, NVL

[jira] [Commented] (SPARK-14607) Partition pruning is case sensitive even with HiveContext

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240130#comment-15240130 ] Apache Spark commented on SPARK-14607: -- User 'davies' has created a pull request for this issue:

[jira] [Assigned] (SPARK-14484) Fail to create parquet filter if the column name does not match exactly

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14484: Assignee: Apache Spark > Fail to create parquet filter if the column name does not match

[jira] [Assigned] (SPARK-14484) Fail to create parquet filter if the column name does not match exactly

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14484: Assignee: (was: Apache Spark) > Fail to create parquet filter if the column name does

[jira] [Assigned] (SPARK-14607) Partition pruning is case sensitive even with HiveContext

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14607: Assignee: Apache Spark > Partition pruning is case sensitive even with HiveContext >

[jira] [Commented] (SPARK-14484) Fail to create parquet filter if the column name does not match exactly

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240131#comment-15240131 ] Apache Spark commented on SPARK-14484: -- User 'davies' has created a pull request for this issue:

[jira] [Commented] (SPARK-7146) Should ML sharedParams be a public API?

2016-04-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240127#comment-15240127 ] Joseph K. Bradley commented on SPARK-7146: -- I just did an audit of our current shared params.

[jira] [Updated] (SPARK-7146) Should ML sharedParams be a public API?

2016-04-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7146: - Description: Discussion: Should the Param traits in sharedParams.scala be public? Pros:

[jira] [Commented] (SPARK-14599) BaggedPoint should support weighted instances.

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240115#comment-15240115 ] Apache Spark commented on SPARK-14599: -- User 'sethah' has created a pull request for this issue:

[jira] [Assigned] (SPARK-14599) BaggedPoint should support weighted instances.

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14599: Assignee: (was: Apache Spark) > BaggedPoint should support weighted instances. >

[jira] [Assigned] (SPARK-14599) BaggedPoint should support weighted instances.

2016-04-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14599: Assignee: Apache Spark > BaggedPoint should support weighted instances. >

[jira] [Updated] (SPARK-14610) Remove superfluous split from random forest findSplitsForContinousFeature

2016-04-13 Thread Seth Hendrickson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Seth Hendrickson updated SPARK-14610: - Description: Currently, the method findSplitsForContinuousFeature in random forest

[jira] [Commented] (SPARK-14610) Remove superfluous split from random forest findSplitsForContinousFeature

2016-04-13 Thread Seth Hendrickson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240105#comment-15240105 ] Seth Hendrickson commented on SPARK-14610: -- One thing to note, is that fixing this actually

[jira] [Created] (SPARK-14610) Remove superfluous split from random forest findSplitsForContinousFeature

2016-04-13 Thread Seth Hendrickson (JIRA)
Seth Hendrickson created SPARK-14610: Summary: Remove superfluous split from random forest findSplitsForContinousFeature Key: SPARK-14610 URL: https://issues.apache.org/jira/browse/SPARK-14610

[jira] [Resolved] (SPARK-14574) Pure Java modules should not have _2.xx suffixes in their package names

2016-04-13 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-14574. Resolution: Later Resolving as "later" since this is prohibitively costly to fix and I don't have

[jira] [Created] (SPARK-14609) LOAD DATA

2016-04-13 Thread Yin Huai (JIRA)
Yin Huai created SPARK-14609: Summary: LOAD DATA Key: SPARK-14609 URL: https://issues.apache.org/jira/browse/SPARK-14609 Project: Spark Issue Type: Sub-task Components: SQL

[jira] [Updated] (SPARK-9961) ML prediction abstractions should have defaultEvaluator fields

2016-04-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-9961: - Target Version/s: 2.0.0 (was: ) > ML prediction abstractions should have

[jira] [Updated] (SPARK-9961) ML prediction abstractions should have defaultEvaluator fields

2016-04-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-9961: - Description: Predictor and PredictionModel should have abstract defaultEvaluator methods

[jira] [Commented] (SPARK-14606) Different maxBins value for categorical and continuous features in RandomForest implementation.

2016-04-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240058#comment-15240058 ] Joseph K. Bradley commented on SPARK-14606: --- We should choose a good way to support this

[jira] [Updated] (SPARK-14606) Different maxBins value for categorical and continuous features in RandomForest implementation.

2016-04-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14606: -- Fix Version/s: (was: 2.0.0) > Different maxBins value for categorical and

[jira] [Updated] (SPARK-14606) Different maxBins value for categorical and continuous features in RandomForest implementation.

2016-04-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14606: -- Affects Version/s: (was: 1.6.1) (was: 1.5.2)

[jira] [Commented] (SPARK-10574) HashingTF should use MurmurHash3

2016-04-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240048#comment-15240048 ] Joseph K. Bradley commented on SPARK-10574: --- [~yanboliang] Will you have time to work on this?

[jira] [Updated] (SPARK-14606) Different maxBins value for categorical and continuous features in RandomForest implementation.

2016-04-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14606: -- Shepherd: (was: Xiangrui Meng) > Different maxBins value for categorical and
