[jira] [Commented] (SPARK-14409) Investigate adding a RankingEvaluator to ML

2017-02-23 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882174#comment-15882174 ] Nick Pentreath commented on SPARK-14409: The other option is to work with [~danilo.ascione] PR

[jira] [Commented] (SPARK-14409) Investigate adding a RankingEvaluator to ML

2017-02-23 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882163#comment-15882163 ] Nick Pentreath commented on SPARK-14409: [~roberto.mirizzi] the {{goodThreshold}} param seems

[jira] [Commented] (SPARK-17495) Hive hash implementation

2017-02-23 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882161#comment-15882161 ] Tejas Patil commented on SPARK-17495: - I am looking into using hive-hash when `hash()` is called in a

[jira] [Commented] (SPARK-17495) Hive hash implementation

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882152#comment-15882152 ] Apache Spark commented on SPARK-17495: -- User 'tejasapatil' has created a pull request for this

[jira] [Assigned] (SPARK-19723) create table for data source tables should work with an non-existent location

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19723: Assignee: (was: Apache Spark) > create table for data source tables should work with

[jira] [Commented] (SPARK-19723) create table for data source tables should work with an non-existent location

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882147#comment-15882147 ] Apache Spark commented on SPARK-19723: -- User 'windpiger' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19723) create table for data source tables should work with an non-existent location

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19723: Assignee: Apache Spark > create table for data source tables should work with an

[jira] [Created] (SPARK-19723) create table for data source tables should work with an non-existent location

2017-02-23 Thread Song Jun (JIRA)
Song Jun created SPARK-19723: Summary: create table for data source tables should work with an non-existent location Key: SPARK-19723 URL: https://issues.apache.org/jira/browse/SPARK-19723 Project: Spark

[jira] [Resolved] (SPARK-14084) Parallel training jobs in model selection

2017-02-23 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14084. Resolution: Duplicate Target Version/s: (was: ) > Parallel training jobs in

[jira] [Commented] (SPARK-14084) Parallel training jobs in model selection

2017-02-23 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882123#comment-15882123 ] Nick Pentreath commented on SPARK-14084: I guess we could have put SPARK-19071 into this ticket

[jira] [Comment Edited] (SPARK-3246) Support weighted SVMWithSGD for classification of unbalanced dataset

2017-02-23 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882113#comment-15882113 ] Nick Pentreath edited comment on SPARK-3246 at 2/24/17 7:15 AM: Since

[jira] [Comment Edited] (SPARK-3246) Support weighted SVMWithSGD for classification of unbalanced dataset

2017-02-23 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882113#comment-15882113 ] Nick Pentreath edited comment on SPARK-3246 at 2/24/17 7:16 AM: Since

[jira] [Closed] (SPARK-3246) Support weighted SVMWithSGD for classification of unbalanced dataset

2017-02-23 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath closed SPARK-3246. - Resolution: Won't Fix > Support weighted SVMWithSGD for classification of unbalanced dataset >

[jira] [Commented] (SPARK-3246) Support weighted SVMWithSGD for classification of unbalanced dataset

2017-02-23 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882113#comment-15882113 ] Nick Pentreath commented on SPARK-3246: --- Since {{mllib}} is in maintenance mode and {{LinearSVC}}

[jira] [Assigned] (SPARK-19664) put 'hive.metastore.warehouse.dir' in hadoopConf place

2017-02-23 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-19664: --- Assignee: Song Jun > put 'hive.metastore.warehouse.dir' in hadoopConf place >

[jira] [Resolved] (SPARK-19664) put 'hive.metastore.warehouse.dir' in hadoopConf place

2017-02-23 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-19664. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16996

[jira] [Assigned] (SPARK-18939) Timezone support in partition values.

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18939: Assignee: (was: Apache Spark) > Timezone support in partition values. >

[jira] [Assigned] (SPARK-18939) Timezone support in partition values.

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18939: Assignee: Apache Spark > Timezone support in partition values. >

[jira] [Commented] (SPARK-18939) Timezone support in partition values.

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882083#comment-15882083 ] Apache Spark commented on SPARK-18939: -- User 'ueshin' has created a pull request for this issue:

[jira] [Commented] (SPARK-19690) Join a streaming DataFrame with a batch DataFrame may not work

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882073#comment-15882073 ] Apache Spark commented on SPARK-19690: -- User 'uncleGen' has created a pull request for this issue:

[jira] [Commented] (SPARK-17075) Cardinality Estimation of Predicate Expressions

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882043#comment-15882043 ] Apache Spark commented on SPARK-17075: -- User 'lins05' has created a pull request for this issue:

[jira] [Commented] (SPARK-19721) Good error message for version mismatch in log files

2017-02-23 Thread Liwei Lin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882034#comment-15882034 ] Liwei Lin commented on SPARK-19721: --- I'd like to work on this too. Thanks. > Good error message for

[jira] [Commented] (SPARK-19688) Spark on Yarn Credentials File set to different application directory

2017-02-23 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882027#comment-15882027 ] Saisai Shao commented on SPARK-19688: - According to my test, "spark.yarn.credentials.file" will be

[jira] [Commented] (SPARK-19698) Race condition in stale attempt task completion vs current attempt task completion when task is doing persistent state changes

2017-02-23 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881991#comment-15881991 ] Mridul Muralidharan commented on SPARK-19698: - Depending on ordering and semantics of task

[jira] [Comment Edited] (SPARK-19698) Race condition in stale attempt task completion vs current attempt task completion when task is doing persistent state changes

2017-02-23 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881991#comment-15881991 ] Mridul Muralidharan edited comment on SPARK-19698 at 2/24/17 5:25 AM:

[jira] [Assigned] (SPARK-19722) Clean up the usage of EliminateSubqueryAliases

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19722: Assignee: Xiao Li (was: Apache Spark) > Clean up the usage of EliminateSubqueryAliases >

[jira] [Assigned] (SPARK-19722) Clean up the usage of EliminateSubqueryAliases

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19722: Assignee: Apache Spark (was: Xiao Li) > Clean up the usage of EliminateSubqueryAliases >

[jira] [Commented] (SPARK-19722) Clean up the usage of EliminateSubqueryAliases

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881990#comment-15881990 ] Apache Spark commented on SPARK-19722: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Created] (SPARK-19722) Clean up the usage of EliminateSubqueryAliases

2017-02-23 Thread Xiao Li (JIRA)
Xiao Li created SPARK-19722: --- Summary: Clean up the usage of EliminateSubqueryAliases Key: SPARK-19722 URL: https://issues.apache.org/jira/browse/SPARK-19722 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-14703) Spark uses SLF4J, but actually relies quite heavily on Log4J

2017-02-23 Thread Sheng Luo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881981#comment-15881981 ] Sheng Luo commented on SPARK-14703: --- For a workaround, log4j-over-slf4j.jar can be used as a drop in

[jira] [Commented] (SPARK-17495) Hive hash implementation

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881980#comment-15881980 ] Apache Spark commented on SPARK-17495: -- User 'tejasapatil' has created a pull request for this

[jira] [Commented] (SPARK-19715) Option to Strip Paths in FileSource

2017-02-23 Thread Liwei Lin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881965#comment-15881965 ] Liwei Lin commented on SPARK-19715: --- I'll work on this. Thanks! > Option to Strip Paths in FileSource

[jira] [Commented] (SPARK-14772) Python ML Params.copy treats uid, paramMaps differently than Scala

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881927#comment-15881927 ] Apache Spark commented on SPARK-14772: -- User 'BryanCutler' has created a pull request for this

[jira] [Assigned] (SPARK-17075) Cardinality Estimation of Predicate Expressions

2017-02-23 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-17075: --- Assignee: Ron Hu > Cardinality Estimation of Predicate Expressions >

[jira] [Resolved] (SPARK-17075) Cardinality Estimation of Predicate Expressions

2017-02-23 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-17075. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16395

[jira] [Comment Edited] (SPARK-4563) Allow spark driver to bind to different ip then advertise ip

2017-02-23 Thread Danny Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881893#comment-15881893 ] Danny Robinson edited comment on SPARK-4563 at 2/24/17 3:54 AM: Updated my

[jira] [Commented] (SPARK-4563) Allow spark driver to bind to different ip then advertise ip

2017-02-23 Thread Danny Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881893#comment-15881893 ] Danny Robinson commented on SPARK-4563: --- Updated my solution for Spark 1.6.3 which seemed to develop

[jira] [Commented] (SPARK-4563) Allow spark driver to bind to different ip then advertise ip

2017-02-23 Thread Alex Hanson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881819#comment-15881819 ] Alex Hanson commented on SPARK-4563: I have a similar issue where I'm running Spark v1.6.3 and wanted

[jira] [Updated] (SPARK-14772) Python ML Params.copy treats uid, paramMaps differently than Scala

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14772: -- Fix Version/s: 2.2.0 > Python ML Params.copy treats uid, paramMaps differently than

[jira] [Assigned] (SPARK-19720) Redact sensitive information from SparkSubmit console output

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19720: Assignee: (was: Apache Spark) > Redact sensitive information from SparkSubmit console

[jira] [Assigned] (SPARK-19720) Redact sensitive information from SparkSubmit console output

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19720: Assignee: Apache Spark > Redact sensitive information from SparkSubmit console output >

[jira] [Commented] (SPARK-19720) Redact sensitive information from SparkSubmit console output

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881773#comment-15881773 ] Apache Spark commented on SPARK-19720: -- User 'markgrover' has created a pull request for this issue:

[jira] [Commented] (SPARK-19698) Race condition in stale attempt task completion vs current attempt task completion when task is doing persistent state changes

2017-02-23 Thread Jisoo Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881771#comment-15881771 ] Jisoo Kim commented on SPARK-19698: --- Ah, I see what you mean. I don't use Spark's speculation feature,

[jira] [Updated] (SPARK-14772) Python ML Params.copy treats uid, paramMaps differently than Scala

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14772: -- Shepherd: Joseph K. Bradley Affects Version/s: 2.1.0 Target

[jira] [Assigned] (SPARK-14772) Python ML Params.copy treats uid, paramMaps differently than Scala

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-14772: - Assignee: Bryan Cutler > Python ML Params.copy treats uid, paramMaps

[jira] [Created] (SPARK-19721) Good error message for version mismatch in log files

2017-02-23 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-19721: Summary: Good error message for version mismatch in log files Key: SPARK-19721 URL: https://issues.apache.org/jira/browse/SPARK-19721 Project: Spark

[jira] [Created] (SPARK-19720) Redact sensitive information from SparkSubmit console output

2017-02-23 Thread Mark Grover (JIRA)
Mark Grover created SPARK-19720: --- Summary: Redact sensitive information from SparkSubmit console output Key: SPARK-19720 URL: https://issues.apache.org/jira/browse/SPARK-19720 Project: Spark

[jira] [Commented] (SPARK-19715) Option to Strip Paths in FileSource

2017-02-23 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881731#comment-15881731 ] Michael Armbrust commented on SPARK-19715: -- [~lwlin] another file source feature you might want

[jira] [Commented] (SPARK-19596) After a Stage is completed, all Tasksets for the stage should be marked as zombie

2017-02-23 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881726#comment-15881726 ] Kay Ousterhout commented on SPARK-19596: I agree that this is an issue (although it would be

[jira] [Commented] (SPARK-19714) Bucketizer Bug Regarding Handling Unbucketed Inputs

2017-02-23 Thread Wojciech Szymanski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881723#comment-15881723 ] Wojciech Szymanski commented on SPARK-19714: I fully agree with you Bill, that "invalid" is

[jira] [Commented] (SPARK-19691) Calculating percentile of decimal column fails with ClassCastException

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881694#comment-15881694 ] Apache Spark commented on SPARK-19691: -- User 'maropu' has created a pull request for this issue:

[jira] [Commented] (SPARK-19698) Race condition in stale attempt task completion vs current attempt task completion when task is doing persistent state changes

2017-02-23 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881688#comment-15881688 ] Kay Ousterhout commented on SPARK-19698: My concern is that there are other cases in Spark where

[jira] [Commented] (SPARK-19714) Bucketizer Bug Regarding Handling Unbucketed Inputs

2017-02-23 Thread Bill Chambers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881676#comment-15881676 ] Bill Chambers commented on SPARK-19714: --- "Invalid" is a poor descriptor IMO. Invalid should be

[jira] [Commented] (SPARK-16122) Spark History Server REST API missing an environment endpoint per application

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881674#comment-15881674 ] Apache Spark commented on SPARK-16122: -- User 'uncleGen' has created a pull request for this issue:

[jira] [Commented] (SPARK-19709) CSV datasource fails to read empty file

2017-02-23 Thread Wojciech Szymanski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881672#comment-15881672 ] Wojciech Szymanski commented on SPARK-19709: Thanks, I will try to fix it soon. > CSV

[jira] [Commented] (SPARK-19709) CSV datasource fails to read empty file

2017-02-23 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881648#comment-15881648 ] Hyukjin Kwon commented on SPARK-19709: -- Please go ahead. (but I _personally_ recommend you open a PR

[jira] [Commented] (SPARK-19698) Race condition in stale attempt task completion vs current attempt task completion when task is doing persistent state changes

2017-02-23 Thread Jisoo Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881644#comment-15881644 ] Jisoo Kim commented on SPARK-19698: --- [~kayousterhout] If the failed task gets re-tried, as long as

[jira] [Commented] (SPARK-19709) CSV datasource fails to read empty file

2017-02-23 Thread Wojciech Szymanski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881633#comment-15881633 ] Wojciech Szymanski commented on SPARK-19709: [~hyukjin.kwon] I can also look at this if you

[jira] [Updated] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-02-23 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18057: - Summary: Update structured streaming kafka from 10.0.1 to 10.2.0 (was: Update

[jira] [Commented] (SPARK-7354) Flaky test: o.a.s.deploy.SparkSubmitSuite --jars

2017-02-23 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881593#comment-15881593 ] Andrew Ash commented on SPARK-7354: --- We saw a flake for this test in the k8s repo's Travis builds too:

[jira] [Assigned] (SPARK-19373) Mesos implementation of spark.scheduler.minRegisteredResourcesRatio looks at acquired cores rather than registerd cores

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19373: Assignee: (was: Apache Spark) > Mesos implementation of

[jira] [Assigned] (SPARK-19373) Mesos implementation of spark.scheduler.minRegisteredResourcesRatio looks at acquired cores rather than registerd cores

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19373: Assignee: Apache Spark > Mesos implementation of

[jira] [Commented] (SPARK-19373) Mesos implementation of spark.scheduler.minRegisteredResourcesRatio looks at acquired cores rather than registerd cores

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881572#comment-15881572 ] Apache Spark commented on SPARK-19373: -- User 'mgummelt' has created a pull request for this issue:

[jira] [Commented] (SPARK-19714) Bucketizer Bug Regarding Handling Unbucketed Inputs

2017-02-23 Thread Wojciech Szymanski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881568#comment-15881568 ] Wojciech Szymanski commented on SPARK-19714: IMHO Bucketizer works as expected. I guess that

[jira] [Closed] (SPARK-14523) Feature parity for Statistics ML with MLlib

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley closed SPARK-14523. - Resolution: Done > Feature parity for Statistics ML with MLlib >

[jira] [Commented] (SPARK-14523) Feature parity for Statistics ML with MLlib

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881567#comment-15881567 ] Joseph K. Bradley commented on SPARK-14523: --- Alright, given that there are now 3 more subtasks

[jira] [Commented] (SPARK-19698) Race condition in stale attempt task completion vs current attempt task completion when task is doing persistent state changes

2017-02-23 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881564#comment-15881564 ] Kay Ousterhout commented on SPARK-19698: I see -- I agree that everything in your description is

[jira] [Commented] (SPARK-16920) Investigate and fix issues introduced in SPARK-15858

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881537#comment-15881537 ] Joseph K. Bradley commented on SPARK-16920: --- Thanks for adding that gist! I agree with your

[jira] [Resolved] (SPARK-16920) Investigate and fix issues introduced in SPARK-15858

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-16920. --- Resolution: Done Fix Version/s: 2.2.0 Target Version/s: 2.2.0 >

[jira] [Assigned] (SPARK-16920) Investigate and fix issues introduced in SPARK-15858

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-16920: - Assignee: Mahmoud Rawas > Investigate and fix issues introduced in SPARK-15858

[jira] [Updated] (SPARK-16920) Investigate and fix issues introduced in SPARK-15858

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-16920: -- Target Version/s: (was: 2.2.0) > Investigate and fix issues introduced in

[jira] [Commented] (SPARK-19698) Race condition in stale attempt task completion vs current attempt task completion when task is doing persistent state changes

2017-02-23 Thread Jisoo Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881525#comment-15881525 ] Jisoo Kim commented on SPARK-19698: --- [~kayousterhout] Thanks for linking the JIRA ticket, I agree that

[jira] [Updated] (SPARK-19459) ORC tables cannot be read when they contain char/varchar columns

2017-02-23 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-19459: Fix Version/s: 2.1.1 > ORC tables cannot be read when they contain char/varchar columns >

[jira] [Commented] (SPARK-19716) Dataset should allow by-name resolution for struct type elements in array

2017-02-23 Thread Simeon Simeonov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881490#comment-15881490 ] Simeon Simeonov commented on SPARK-19716: - This is an important issue because it prevents schema

[jira] [Commented] (SPARK-19635) Feature parity for Chi-square hypothesis testing in MLlib

2017-02-23 Thread Timothy Hunter (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881471#comment-15881471 ] Timothy Hunter commented on SPARK-19635: After working on it, I realized that Column operations

[jira] [Commented] (SPARK-19636) Feature parity for correlation statistics in MLlib

2017-02-23 Thread Timothy Hunter (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881457#comment-15881457 ] Timothy Hunter commented on SPARK-19636: After working on it, I realized that Column operations

[jira] [Updated] (SPARK-14658) when executor lost DagScheduer may submit one stage twice even if the first running taskset for this stage is not finished

2017-02-23 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout updated SPARK-14658: --- Fix Version/s: 2.2.0 > when executor lost DagScheduer may submit one stage twice even if the

[jira] [Assigned] (SPARK-19674) Ignore driver accumulator updates don't belong to the execution when merging all accumulator updates

2017-02-23 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-19674: --- Assignee: Carson Wang > Ignore driver accumulator updates don't belong to the execution

[jira] [Resolved] (SPARK-19674) Ignore driver accumulator updates don't belong to the execution when merging all accumulator updates

2017-02-23 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-19674. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17009

[jira] [Commented] (SPARK-19718) Fix flaky test: org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite: stress test for failOnDataLoss=false

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881418#comment-15881418 ] Apache Spark commented on SPARK-19718: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19718) Fix flaky test: org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite: stress test for failOnDataLoss=false

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19718: Assignee: Apache Spark > Fix flaky test: >

[jira] [Assigned] (SPARK-19718) Fix flaky test: org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite: stress test for failOnDataLoss=false

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19718: Assignee: (was: Apache Spark) > Fix flaky test: >

[jira] [Resolved] (SPARK-14658) when executor lost DagScheduer may submit one stage twice even if the first running taskset for this stage is not finished

2017-02-23 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout resolved SPARK-14658. Resolution: Duplicate I'm fairly sure this duplicates SPARK-19263, as Mark mentioned on

[jira] [Commented] (SPARK-19263) DAGScheduler should avoid sending conflicting task set.

2017-02-23 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881406#comment-15881406 ] Kay Ousterhout commented on SPARK-19263: Just noting that this was fixed by

[jira] [Updated] (SPARK-19718) Fix flaky test: org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite: stress test for failOnDataLoss=false

2017-02-23 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-19718: - Description: SPARK-19617 changed HDFSMetadataLog to enable interrupts when using the local file

[jira] [Comment Edited] (SPARK-19698) Race condition in stale attempt task completion vs current attempt task completion when task is doing persistent state changes

2017-02-23 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881358#comment-15881358 ] Kay Ousterhout edited comment on SPARK-19698 at 2/23/17 9:57 PM: - I think

[jira] [Commented] (SPARK-19698) Race condition in stale attempt task completion vs current attempt task completion when task is doing persistent state changes

2017-02-23 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881358#comment-15881358 ] Kay Ousterhout commented on SPARK-19698: I think this is the same issue as SPARK-19263 -- can you

[jira] [Commented] (SPARK-19719) Structured Streaming write to Kafka

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881346#comment-15881346 ] Apache Spark commented on SPARK-19719: -- User 'tcondie' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19719) Structured Streaming write to Kafka

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19719: Assignee: Apache Spark > Structured Streaming write to Kafka >

[jira] [Assigned] (SPARK-19719) Structured Streaming write to Kafka

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19719: Assignee: (was: Apache Spark) > Structured Streaming write to Kafka >

[jira] [Created] (SPARK-19719) Structured Streaming write to Kafka

2017-02-23 Thread Tyson Condie (JIRA)
Tyson Condie created SPARK-19719: Summary: Structured Streaming write to Kafka Key: SPARK-19719 URL: https://issues.apache.org/jira/browse/SPARK-19719 Project: Spark Issue Type: New Feature

[jira] [Resolved] (SPARK-19717) Expanding Spark ML under Different Namespace

2017-02-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-19717. --- Resolution: Duplicate > Expanding Spark ML under Different Namespace >

[jira] [Reopened] (SPARK-19717) Expanding Spark ML under Different Namespace

2017-02-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reopened SPARK-19717: --- > Expanding Spark ML under Different Namespace > > >

[jira] [Closed] (SPARK-19717) Expanding Spark ML under Different Namespace

2017-02-23 Thread Shouheng Yi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shouheng Yi closed SPARK-19717. --- Resolution: Fixed Duplicated issue https://issues.apache.org/jira/browse/SPARK-19498 > Expanding

[jira] [Commented] (SPARK-19717) Expanding Spark ML under Different Namespace

2017-02-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881331#comment-15881331 ] Sean Owen commented on SPARK-19717: --- I don't know that this should be a JIRA. What are you specifically

[jira] [Created] (SPARK-19718) Fix flaky test: org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite: stress test for failOnDataLoss=false

2017-02-23 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-19718: Summary: Fix flaky test: org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite: stress test for failOnDataLoss=false Key: SPARK-19718 URL:

[jira] [Updated] (SPARK-19716) Dataset should allow by-name resolution for struct type elements in array

2017-02-23 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-19716: Description: if we have a DataFrame with schema {{a: int, b: int, c: int}}, and convert it to

[jira] [Updated] (SPARK-19716) Dataset should allow by-name resolution for struct type elements in array

2017-02-23 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-19716: Description: if we have a DataFrame with schema {{a: int, b: int, c: int}}, and convert it to

[jira] [Created] (SPARK-19717) Expanding Spark ML under Different Namespace

2017-02-23 Thread Shouheng Yi (JIRA)
Shouheng Yi created SPARK-19717: --- Summary: Expanding Spark ML under Different Namespace Key: SPARK-19717 URL: https://issues.apache.org/jira/browse/SPARK-19717 Project: Spark Issue Type: Wish

[jira] [Resolved] (SPARK-19684) Move info about running specific tests to developer website

2017-02-23 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout resolved SPARK-19684. Resolution: Fixed Fix Version/s: 2.2.0 > Move info about running specific tests to
