[jira] [Commented] (SPARK-19691) Calculating percentile of decimal column fails with ClassCastException

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881694#comment-15881694 ] Apache Spark commented on SPARK-19691: User 'maropu' has created a pull request for this issue:

[jira] [Commented] (SPARK-19714) Bucketizer Bug Regarding Handling Unbucketed Inputs

2017-02-23 Thread Wojciech Szymanski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881723#comment-15881723 ] Wojciech Szymanski commented on SPARK-19714: I fully agree with you Bill, that "invalid" is

[jira] [Commented] (SPARK-19698) Race condition in stale attempt task completion vs current attempt task completion when task is doing persistent state changes

2017-02-23 Thread Jisoo Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881771#comment-15881771 ] Jisoo Kim commented on SPARK-19698: Ah, I see what you mean. I don't use Spark's speculation feature,

[jira] [Commented] (SPARK-19720) Redact sensitive information from SparkSubmit console output

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881773#comment-15881773 ] Apache Spark commented on SPARK-19720: User 'markgrover' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19720) Redact sensitive information from SparkSubmit console output

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19720: Assignee: Apache Spark > Redact sensitive information from SparkSubmit console output

[jira] [Assigned] (SPARK-19720) Redact sensitive information from SparkSubmit console output

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19720: Assignee: (was: Apache Spark) > Redact sensitive information from SparkSubmit console

[jira] [Commented] (SPARK-19263) DAGScheduler should avoid sending conflicting task set.

2017-02-23 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881406#comment-15881406 ] Kay Ousterhout commented on SPARK-19263: Just noting that this was fixed by

[jira] [Resolved] (SPARK-19674) Ignore driver accumulator updates don't belong to the execution when merging all accumulator updates

2017-02-23 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-19674. Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17009

[jira] [Commented] (SPARK-19636) Feature parity for correlation statistics in MLlib

2017-02-23 Thread Timothy Hunter (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881457#comment-15881457 ] Timothy Hunter commented on SPARK-19636: After working on it, I realized that Column operations

[jira] [Updated] (SPARK-16920) Investigate and fix issues introduced in SPARK-15858

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-16920: Target Version/s: (was: 2.2.0) > Investigate and fix issues introduced in

[jira] [Assigned] (SPARK-16920) Investigate and fix issues introduced in SPARK-15858

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-16920: Assignee: Mahmoud Rawas > Investigate and fix issues introduced in SPARK-15858

[jira] [Commented] (SPARK-16920) Investigate and fix issues introduced in SPARK-15858

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881537#comment-15881537 ] Joseph K. Bradley commented on SPARK-16920: Thanks for adding that gist! I agree with your

[jira] [Resolved] (SPARK-16920) Investigate and fix issues introduced in SPARK-15858

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-16920. Resolution: Done Fix Version/s: 2.2.0 Target Version/s: 2.2.0

[jira] [Commented] (SPARK-19698) Race condition in stale attempt task completion vs current attempt task completion when task is doing persistent state changes

2017-02-23 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881564#comment-15881564 ] Kay Ousterhout commented on SPARK-19698: I see -- I agree that everything in your description is

[jira] [Commented] (SPARK-19596) After a Stage is completed, all Tasksets for the stage should be marked as zombie

2017-02-23 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881726#comment-15881726 ] Kay Ousterhout commented on SPARK-19596: I agree that this is an issue (although it would be

[jira] [Created] (SPARK-19720) Redact sensitive information from SparkSubmit console output

2017-02-23 Thread Mark Grover (JIRA)
Mark Grover created SPARK-19720: Summary: Redact sensitive information from SparkSubmit console output Key: SPARK-19720 URL: https://issues.apache.org/jira/browse/SPARK-19720 Project: Spark

[jira] [Updated] (SPARK-14772) Python ML Params.copy treats uid, paramMaps differently than Scala

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14772: Shepherd: Joseph K. Bradley Affects Version/s: 2.1.0 Target

[jira] [Assigned] (SPARK-14772) Python ML Params.copy treats uid, paramMaps differently than Scala

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-14772: Assignee: Bryan Cutler > Python ML Params.copy treats uid, paramMaps

[jira] [Commented] (SPARK-4563) Allow spark driver to bind to different ip then advertise ip

2017-02-23 Thread Alex Hanson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881819#comment-15881819 ] Alex Hanson commented on SPARK-4563: I have a similar issue where I'm running Spark v1.6.3 and wanted

[jira] [Commented] (SPARK-19698) Race condition in stale attempt task completion vs current attempt task completion when task is doing persistent state changes

2017-02-23 Thread Jisoo Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881525#comment-15881525 ] Jisoo Kim commented on SPARK-19698: [~kayousterhout] Thanks for linking the JIRA ticket, I agree that

[jira] [Commented] (SPARK-19715) Option to Strip Paths in FileSource

2017-02-23 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881731#comment-15881731 ] Michael Armbrust commented on SPARK-19715: [~lwlin] another file source features you might want

[jira] [Created] (SPARK-19721) Good error message for version mismatch in log files

2017-02-23 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-19721: Summary: Good error message for version mismatch in log files Key: SPARK-19721 URL: https://issues.apache.org/jira/browse/SPARK-19721 Project: Spark

[jira] [Updated] (SPARK-14772) Python ML Params.copy treats uid, paramMaps differently than Scala

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14772: Fix Version/s: 2.2.0 > Python ML Params.copy treats uid, paramMaps differently than

[jira] [Updated] (SPARK-18812) Clarify "Spark ML"

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18812: Target Version/s: 2.1.0 (was: 2.1.1, 2.2.0) > Clarify "Spark ML"

[jira] [Commented] (SPARK-18822) Support ML Pipeline in SparkR

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881098#comment-15881098 ] Joseph K. Bradley commented on SPARK-18822: How's this going? Just checking in; I know

[jira] [Updated] (SPARK-13786) Pyspark ml.tuning support export/import

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-13786: Target Version/s: 2.3.0 (was: 2.2.0) > Pyspark ml.tuning support export/import

[jira] [Resolved] (SPARK-18699) Spark CSV parsing types other than String throws exception when malformed

2017-02-23 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-18699. Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16928

[jira] [Assigned] (SPARK-18699) Spark CSV parsing types other than String throws exception when malformed

2017-02-23 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-18699: Assignee: Takeshi Yamamuro > Spark CSV parsing types other than String throws exception

[jira] [Resolved] (SPARK-19706) add Column.contains in pyspark

2017-02-23 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-19706. Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17036

[jira] [Comment Edited] (SPARK-19711) Bug in gapply function

2017-02-23 Thread Luis Felipe Sant Ana (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881108#comment-15881108 ] Luis Felipe Sant Ana edited comment on SPARK-19711 at 2/23/17 7:45 PM:

[jira] [Closed] (SPARK-19717) Expanding Spark ML under Different Namespace

2017-02-23 Thread Shouheng Yi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shouheng Yi closed SPARK-19717. Resolution: Fixed Duplicated issue https://issues.apache.org/jira/browse/SPARK-19498 > Expanding

[jira] [Updated] (SPARK-15571) Pipeline unit test improvements for 2.3

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-15571: Target Version/s: 2.3.0 (was: 2.2.0) > Pipeline unit test improvements for 2.3

[jira] [Commented] (SPARK-15571) Pipeline unit test improvements for 2.2

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881102#comment-15881102 ] Joseph K. Bradley commented on SPARK-15571: [~rowanv] Thanks, and sorry for the long delay!

[jira] [Updated] (SPARK-15571) Pipeline unit test improvements for 2.3

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-15571: Summary: Pipeline unit test improvements for 2.3 (was: Pipeline unit test

[jira] [Created] (SPARK-19717) Expanding Spark ML under Different Namespace

2017-02-23 Thread Shouheng Yi (JIRA)
Shouheng Yi created SPARK-19717: Summary: Expanding Spark ML under Different Namespace Key: SPARK-19717 URL: https://issues.apache.org/jira/browse/SPARK-19717 Project: Spark Issue Type: Wish

[jira] [Assigned] (SPARK-19719) Structured Streaming write to Kafka

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19719: Assignee: Apache Spark > Structured Streaming write to Kafka

[jira] [Commented] (SPARK-19719) Structured Streaming write to Kafka

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881346#comment-15881346 ] Apache Spark commented on SPARK-19719: User 'tcondie' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19719) Structured Streaming write to Kafka

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19719: Assignee: (was: Apache Spark) > Structured Streaming write to Kafka

[jira] [Comment Edited] (SPARK-19698) Race condition in stale attempt task completion vs current attempt task completion when task is doing persistent state changes

2017-02-23 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881358#comment-15881358 ] Kay Ousterhout edited comment on SPARK-19698 at 2/23/17 9:57 PM: I think

[jira] [Resolved] (SPARK-19684) Move info about running specific tests to developer website

2017-02-23 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout resolved SPARK-19684. Resolution: Fixed Fix Version/s: 2.2.0 > Move info about running specific tests to

[jira] [Updated] (SPARK-19716) Dataset should allow by-name resolution for struct type elements in array

2017-02-23 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-19716: Description: if we have a DataFrame with schema {{a: int, b: int, c: int}}, and convert it to

[jira] [Commented] (SPARK-19717) Expanding Spark ML under Different Namespace

2017-02-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881331#comment-15881331 ] Sean Owen commented on SPARK-19717: I don't know that this should be a JIRA. What are you specifically

[jira] [Created] (SPARK-19719) Structured Streaming write to Kafka

2017-02-23 Thread Tyson Condie (JIRA)
Tyson Condie created SPARK-19719: Summary: Structured Streaming write to Kafka Key: SPARK-19719 URL: https://issues.apache.org/jira/browse/SPARK-19719 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-19711) Bug in gapply function

2017-02-23 Thread Luis Felipe Sant Ana (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881108#comment-15881108 ] Luis Felipe Sant Ana commented on SPARK-19711: Hi Felix, I have removed the DATA column

[jira] [Updated] (SPARK-18618) SparkR GLM model predict should support type as a argument

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18618: Labels: (was: 2.2.0) > SparkR GLM model predict should support type as a argument

[jira] [Updated] (SPARK-18592) Move DT/RF/GBT Param setter methods to subclasses

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18592: Target Version/s: 2.1.0 (was: 2.1.0, 2.2.0) > Move DT/RF/GBT Param setter methods to

[jira] [Updated] (SPARK-18924) Improve collect/createDataFrame performance in SparkR

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18924: Target Version/s: (was: 2.2.0) > Improve collect/createDataFrame performance in

[jira] [Comment Edited] (SPARK-19711) Bug in gapply function

2017-02-23 Thread Luis Felipe Sant Ana (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881108#comment-15881108 ] Luis Felipe Sant Ana edited comment on SPARK-19711 at 2/23/17 7:47 PM:

[jira] [Created] (SPARK-19716) Dataset should allow by-name resolution for struct type elements in array

2017-02-23 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-19716: Summary: Dataset should allow by-name resolution for struct type elements in array Key: SPARK-19716 URL: https://issues.apache.org/jira/browse/SPARK-19716 Project:

[jira] [Updated] (SPARK-19716) Dataset should allow by-name resolution for struct type elements in array

2017-02-23 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-19716: Description: if we have a DataFrame with schema {{}}, and convert it to Dataset with {{case class

[jira] [Reopened] (SPARK-19717) Expanding Spark ML under Different Namespace

2017-02-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reopened SPARK-19717: > Expanding Spark ML under Different Namespace

[jira] [Resolved] (SPARK-19717) Expanding Spark ML under Different Namespace

2017-02-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-19717. Resolution: Duplicate > Expanding Spark ML under Different Namespace

[jira] [Updated] (SPARK-17822) JVMObjectTracker.objMap may leak JVM objects

2017-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-17822: Target Version/s: 2.1.0, 2.0.3 (was: 2.0.3, 2.1.1, 2.2.0) > JVMObjectTracker.objMap

[jira] [Created] (SPARK-19718) Fix flaky test: org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite: stress test for failOnDataLoss=false

2017-02-23 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-19718: Summary: Fix flaky test: org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite: stress test for failOnDataLoss=false Key: SPARK-19718 URL:

[jira] [Updated] (SPARK-19716) Dataset should allow by-name resolution for struct type elements in array

2017-02-23 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-19716: Description: if we have a DataFrame with schema {{a: int, b: int, c: int}}, and convert it to

[jira] [Commented] (SPARK-19698) Race condition in stale attempt task completion vs current attempt task completion when task is doing persistent state changes

2017-02-23 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881358#comment-15881358 ] Kay Ousterhout commented on SPARK-19698: I think this is the same issue as SPARK-19263 -- can you

[jira] [Commented] (SPARK-19459) ORC tables cannot be read when they contain char/varchar columns

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881176#comment-15881176 ] Apache Spark commented on SPARK-19459: User 'hvanhovell' has created a pull request for this issue:

[jira] [Updated] (SPARK-19716) Dataset should allow by-name resolution for struct type elements in array

2017-02-23 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-19716: Description: if we have a DataFrame with schema {{a: int, b: int, c: int}}, and convert it to

[jira] [Updated] (SPARK-19718) Fix flaky test: org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite: stress test for failOnDataLoss=false

2017-02-23 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-19718: Description: SPARK-19617 changed HDFSMetadataLog to enable interrupts when using the local file

[jira] [Commented] (SPARK-19715) Option to Strip Paths in FileSource

2017-02-23 Thread Liwei Lin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881965#comment-15881965 ] Liwei Lin commented on SPARK-19715: I'll work on this. Thanks! > Option to Strip Paths in FileSource

[jira] [Commented] (SPARK-19690) Join a streaming DataFrame with a batch DataFrame may not work

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882073#comment-15882073 ] Apache Spark commented on SPARK-19690: User 'uncleGen' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19723) create table for data source tables should work with an non-existent location

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19723: Assignee: (was: Apache Spark) > create table for data source tables should work with

[jira] [Commented] (SPARK-4563) Allow spark driver to bind to different ip then advertise ip

2017-02-23 Thread Danny Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881893#comment-15881893 ] Danny Robinson commented on SPARK-4563: Updated my solution for Spark 1.6.3 which seemed to develop

[jira] [Commented] (SPARK-14772) Python ML Params.copy treats uid, paramMaps differently than Scala

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881927#comment-15881927 ] Apache Spark commented on SPARK-14772: User 'BryanCutler' has created a pull request for this

[jira] [Comment Edited] (SPARK-19698) Race condition in stale attempt task completion vs current attempt task completion when task is doing persistent state changes

2017-02-23 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881991#comment-15881991 ] Mridul Muralidharan edited comment on SPARK-19698 at 2/24/17 5:25 AM:

[jira] [Commented] (SPARK-19698) Race condition in stale attempt task completion vs current attempt task completion when task is doing persistent state changes

2017-02-23 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881991#comment-15881991 ] Mridul Muralidharan commented on SPARK-19698: Depending on ordering and semantics of task

[jira] [Assigned] (SPARK-19722) Clean up the usage of EliminateSubqueryAliases

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19722: Assignee: Apache Spark (was: Xiao Li) > Clean up the usage of EliminateSubqueryAliases

[jira] [Assigned] (SPARK-19722) Clean up the usage of EliminateSubqueryAliases

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19722: Assignee: Xiao Li (was: Apache Spark) > Clean up the usage of EliminateSubqueryAliases

[jira] [Commented] (SPARK-19722) Clean up the usage of EliminateSubqueryAliases

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881990#comment-15881990 ] Apache Spark commented on SPARK-19722: User 'gatorsmile' has created a pull request for this issue:

[jira] [Commented] (SPARK-19721) Good error message for version mismatch in log files

2017-02-23 Thread Liwei Lin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882034#comment-15882034 ] Liwei Lin commented on SPARK-19721: I'd like to work on this too. Thanks. > Good error message for

[jira] [Commented] (SPARK-19723) create table for data source tables should work with an non-existent location

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882147#comment-15882147 ] Apache Spark commented on SPARK-19723: User 'windpiger' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19723) create table for data source tables should work with an non-existent location

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19723: Assignee: Apache Spark > create table for data source tables should work with an

[jira] [Commented] (SPARK-14409) Investigate adding a RankingEvaluator to ML

2017-02-23 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882163#comment-15882163 ] Nick Pentreath commented on SPARK-14409: [~roberto.mirizzi] the {{goodThreshold}} param seems

[jira] [Commented] (SPARK-17495) Hive hash implementation

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881980#comment-15881980 ] Apache Spark commented on SPARK-17495: User 'tejasapatil' has created a pull request for this

[jira] [Commented] (SPARK-14703) Spark uses SLF4J, but actually relies quite heavily on Log4J

2017-02-23 Thread Sheng Luo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881981#comment-15881981 ] Sheng Luo commented on SPARK-14703: For a workaround, log4j-over-slf4j.jar can be used as a drop in

[jira] [Commented] (SPARK-19688) Spark on Yarn Credentials File set to different application directory

2017-02-23 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882027#comment-15882027 ] Saisai Shao commented on SPARK-19688: According to my test, "spark.yarn.credentials.file" will be

[jira] [Assigned] (SPARK-19664) put 'hive.metastore.warehouse.dir' in hadoopConf place

2017-02-23 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-19664: Assignee: Song Jun > put 'hive.metastore.warehouse.dir' in hadoopConf place

[jira] [Resolved] (SPARK-19664) put 'hive.metastore.warehouse.dir' in hadoopConf place

2017-02-23 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-19664. Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16996

[jira] [Comment Edited] (SPARK-3246) Support weighted SVMWithSGD for classification of unbalanced dataset

2017-02-23 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882113#comment-15882113 ] Nick Pentreath edited comment on SPARK-3246 at 2/24/17 7:16 AM: Since

[jira] [Closed] (SPARK-3246) Support weighted SVMWithSGD for classification of unbalanced dataset

2017-02-23 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath closed SPARK-3246. Resolution: Won't Fix > Support weighted SVMWithSGD for classification of unbalanced dataset

[jira] [Comment Edited] (SPARK-3246) Support weighted SVMWithSGD for classification of unbalanced dataset

2017-02-23 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882113#comment-15882113 ] Nick Pentreath edited comment on SPARK-3246 at 2/24/17 7:15 AM: Since

[jira] [Resolved] (SPARK-14084) Parallel training jobs in model selection

2017-02-23 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14084. Resolution: Duplicate Target Version/s: (was: ) > Parallel training jobs in

[jira] [Commented] (SPARK-14409) Investigate adding a RankingEvaluator to ML

2017-02-23 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882174#comment-15882174 ] Nick Pentreath commented on SPARK-14409: The other option is to work with [~danilo.ascione] PR

[jira] [Resolved] (SPARK-17075) Cardinality Estimation of Predicate Expressions

2017-02-23 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-17075. Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16395

[jira] [Created] (SPARK-19722) Clean up the usage of EliminateSubqueryAliases

2017-02-23 Thread Xiao Li (JIRA)
Xiao Li created SPARK-19722: Summary: Clean up the usage of EliminateSubqueryAliases Key: SPARK-19722 URL: https://issues.apache.org/jira/browse/SPARK-19722 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-17075) Cardinality Estimation of Predicate Expressions

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882043#comment-15882043 ] Apache Spark commented on SPARK-17075: User 'lins05' has created a pull request for this issue:

[jira] [Commented] (SPARK-17495) Hive hash implementation

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882152#comment-15882152 ] Apache Spark commented on SPARK-17495: -- User 'tejasapatil' has created a pull request for this

[jira] [Assigned] (SPARK-18939) Timezone support in partition values.

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18939: Assignee: (was: Apache Spark) > Timezone support in partition values. >

[jira] [Assigned] (SPARK-18939) Timezone support in partition values.

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18939: Assignee: Apache Spark > Timezone support in partition values. >

[jira] [Commented] (SPARK-18939) Timezone support in partition values.

2017-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882083#comment-15882083 ] Apache Spark commented on SPARK-18939: -- User 'ueshin' has created a pull request for this issue:

[jira] [Commented] (SPARK-3246) Support weighted SVMWithSGD for classification of unbalanced dataset

2017-02-23 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882113#comment-15882113 ] Nick Pentreath commented on SPARK-3246: --- Since {{mllib}} is in maintenance mode and {{LinearSVC}}

[jira] [Created] (SPARK-19723) create table for data source tables should work with an non-existent location

2017-02-23 Thread Song Jun (JIRA)
Song Jun created SPARK-19723: Summary: create table for data source tables should work with an non-existent location Key: SPARK-19723 URL: https://issues.apache.org/jira/browse/SPARK-19723 Project: Spark

[jira] [Comment Edited] (SPARK-4563) Allow spark driver to bind to different ip then advertise ip

2017-02-23 Thread Danny Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881893#comment-15881893 ] Danny Robinson edited comment on SPARK-4563 at 2/24/17 3:54 AM: Updated my

[jira] [Assigned] (SPARK-17075) Cardinality Estimation of Predicate Expressions

2017-02-23 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-17075: --- Assignee: Ron Hu > Cardinality Estimation of Predicate Expressions >

[jira] [Commented] (SPARK-14084) Parallel training jobs in model selection

2017-02-23 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882123#comment-15882123 ] Nick Pentreath commented on SPARK-14084: I guess we could have put SPARK-19071 into this ticket

[jira] [Commented] (SPARK-17495) Hive hash implementation

2017-02-23 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882161#comment-15882161 ] Tejas Patil commented on SPARK-17495: - I am looking into using hive-hash when `hash()` is called in a

[jira] [Updated] (SPARK-18966) NOT IN subquery with correlated expressions may return incorrect result

2017-02-23 Thread Nattavut Sutyanyong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nattavut Sutyanyong updated SPARK-18966: Issue Type: Sub-task (was: Bug) Parent: SPARK-18455 > NOT IN subquery

[jira] [Updated] (SPARK-14658) when executor lost DagScheduer may submit one stage twice even if the first running taskset for this stage is not finished

2017-02-23 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout updated SPARK-14658: --- Fix Version/s: 2.2.0 > when executor lost DagScheduer may submit one stage twice even if the

[jira] [Commented] (SPARK-19716) Dataset should allow by-name resolution for struct type elements in array

2017-02-23 Thread Simeon Simeonov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881490#comment-15881490 ] Simeon Simeonov commented on SPARK-19716: - This is an important issue because it prevents schema

[jira] [Updated] (SPARK-19459) ORC tables cannot be read when they contain char/varchar columns

2017-02-23 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-19459: Fix Version/s: 2.1.1 > ORC tables cannot be read when they contain char/varchar columns >
