[jira] [Commented] (SPARK-24959) Do not invoke the CSV/JSON parser for empty schema

2019-01-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756983#comment-16756983 ] Apache Spark commented on SPARK-24959: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Assigned] (SPARK-24959) Do not invoke the CSV/JSON parser for empty schema

2019-01-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24959: Assignee: Apache Spark > Do not invoke the CSV/JSON parser for empty schema >

[jira] [Assigned] (SPARK-24959) Do not invoke the CSV/JSON parser for empty schema

2019-01-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24959: Assignee: (was: Apache Spark) > Do not invoke the CSV/JSON parser for empty schema >

[jira] [Resolved] (SPARK-26745) Non-parsing Dataset.count() optimization causes inconsistent results for JSON inputs with empty lines

2019-01-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-26745. -- Resolution: Fixed Assignee: Hyukjin Kwon Fix Version/s: 3.0.0 Fixed in

[jira] [Reopened] (SPARK-24959) Do not invoke the CSV/JSON parser for empty schema

2019-01-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reopened SPARK-24959: -- Assignee: (was: Maxim Gekk) Reverted by SPARK-26745. > Do not invoke the CSV/JSON

[jira] [Resolved] (SPARK-24360) Support Hive 3.1 metastore

2019-01-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-24360. --- Resolution: Fixed Assignee: Dongjoon Hyun Fix Version/s: 3.0.0 This is

[jira] [Commented] (SPARK-12216) Spark failed to delete temp directory

2019-01-30 Thread Kingsley Jones (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756826#comment-16756826 ] Kingsley Jones commented on SPARK-12216: Well, that fits. No source-code formatter for

[jira] [Commented] (SPARK-12216) Spark failed to delete temp directory

2019-01-30 Thread Kingsley Jones (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756825#comment-16756825 ] Kingsley Jones commented on SPARK-12216: {code:powershell} # # Shell to launch local Apache

[jira] [Updated] (SPARK-26791) Some scala codes doesn't show friendly and some description about foreachBatch is misleading

2019-01-30 Thread chaiyongqiang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chaiyongqiang updated SPARK-26791: -- Attachment: multi-watermark.jpg foreachBatch.jpg > Some scala codes doesn't

[jira] [Commented] (SPARK-17998) Reading Parquet files coalesces parts into too few in-memory partitions

2019-01-30 Thread Nicholas Resnick (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756801#comment-16756801 ] Nicholas Resnick commented on SPARK-17998: -- I reproduced the OP's steps above on my local

[jira] [Commented] (SPARK-26786) Handle to treat escaped newline characters('\r','\n') in spark csv

2019-01-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756795#comment-16756795 ] Hyukjin Kwon commented on SPARK-26786: -- Are you saying the newlines should be escaped even they are

[jira] [Assigned] (SPARK-26793) Remove spark.shuffle.manager

2019-01-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26793: Assignee: (was: Apache Spark) > Remove spark.shuffle.manager >

[jira] [Assigned] (SPARK-26793) Remove spark.shuffle.manager

2019-01-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26793: Assignee: Apache Spark > Remove spark.shuffle.manager > > >

[jira] [Created] (SPARK-26793) Remove spark.shuffle.manager

2019-01-30 Thread liuxian (JIRA)
liuxian created SPARK-26793: --- Summary: Remove spark.shuffle.manager Key: SPARK-26793 URL: https://issues.apache.org/jira/browse/SPARK-26793 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-26792) Apply custom log URL to Spark UI

2019-01-30 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756754#comment-16756754 ] Jungtaek Lim commented on SPARK-26792: -- OK. Maybe initiating discussion on dev. mailing list (or

[jira] [Commented] (SPARK-26792) Apply custom log URL to Spark UI

2019-01-30 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756750#comment-16756750 ] Thomas Graves commented on SPARK-26792: --- don't see a problem with changing the default in 3.0, its

[jira] [Commented] (SPARK-26792) Apply custom log URL to Spark UI

2019-01-30 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756747#comment-16756747 ] Jungtaek Lim commented on SPARK-26792: -- cc. [~tgraves] [~jira.shegalov] While I'm not sure we can

[jira] [Created] (SPARK-26792) Apply custom log URL to Spark UI

2019-01-30 Thread Jungtaek Lim (JIRA)
Jungtaek Lim created SPARK-26792: Summary: Apply custom log URL to Spark UI Key: SPARK-26792 URL: https://issues.apache.org/jira/browse/SPARK-26792 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-26791) Some scala codes doesn't show friendly and some description about foreachBatch is misleading

2019-01-30 Thread chaiyongqiang (JIRA)
chaiyongqiang created SPARK-26791: - Summary: Some scala codes doesn't show friendly and some description about foreachBatch is misleading Key: SPARK-26791 URL: https://issues.apache.org/jira/browse/SPARK-26791

[jira] [Commented] (SPARK-26154) Stream-stream joins - left outer join gives inconsistent output

2019-01-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756716#comment-16756716 ] Sean Owen commented on SPARK-26154: --- Sure, it's a judgment call. If you see people contemplating it as

[jira] [Commented] (SPARK-26154) Stream-stream joins - left outer join gives inconsistent output

2019-01-30 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756709#comment-16756709 ] Jungtaek Lim commented on SPARK-26154: -- [~srowen] Yeah, I just wanted to avoid changing priority by

[jira] [Updated] (SPARK-26154) Stream-stream joins - left outer join gives inconsistent output

2019-01-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26154: -- Labels: correctness (was: ) Priority: Critical (was: Major) [~kabhwan] I think everyone is

[jira] [Updated] (SPARK-26786) Handle to treat escaped newline characters('\r','\n') in spark csv

2019-01-30 Thread vishnuram selvaraj (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vishnuram selvaraj updated SPARK-26786: --- Affects Version/s: (was: 2.4.0) 2.3.0 > Handle to treat

[jira] [Assigned] (SPARK-26790) Yarn executor to self-retrieve log urls and attributes

2019-01-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26790: Assignee: Apache Spark > Yarn executor to self-retrieve log urls and attributes >

[jira] [Updated] (SPARK-26786) Handle to treat escaped newline characters('\r','\n') in spark csv

2019-01-30 Thread vishnuram selvaraj (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vishnuram selvaraj updated SPARK-26786: --- Description: There are some systems like AWS redshift which writes csv files by

[jira] [Updated] (SPARK-26786) Handle to treat escaped newline characters('\r','\n') in spark csv

2019-01-30 Thread vishnuram selvaraj (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vishnuram selvaraj updated SPARK-26786: --- Component/s: SQL PySpark > Handle to treat escaped newline

[jira] [Assigned] (SPARK-26790) Yarn executor to self-retrieve log urls and attributes

2019-01-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26790: Assignee: (was: Apache Spark) > Yarn executor to self-retrieve log urls and

[jira] [Commented] (SPARK-23155) YARN-aggregated executor/driver logs appear unavailable when NM is down

2019-01-30 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756667#comment-16756667 ] Jungtaek Lim commented on SPARK-23155: -- Just added the link of effective PR here. Thanks Vanzin for

[jira] [Updated] (SPARK-26786) Handle to treat escaped newline characters('\r','\n') in spark csv

2019-01-30 Thread vishnuram selvaraj (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vishnuram selvaraj updated SPARK-26786: --- Issue Type: Bug (was: New Feature) > Handle to treat escaped newline

[jira] [Commented] (SPARK-26677) Incorrect results of not(eqNullSafe) when data read from Parquet file

2019-01-30 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756654#comment-16756654 ] Ryan Blue commented on SPARK-26677: --- Thanks, sorry about the mistake. > Incorrect results of

[jira] [Created] (SPARK-26790) Yarn executor to self-retrieve log urls and attributes

2019-01-30 Thread Jungtaek Lim (JIRA)
Jungtaek Lim created SPARK-26790: Summary: Yarn executor to self-retrieve log urls and attributes Key: SPARK-26790 URL: https://issues.apache.org/jira/browse/SPARK-26790 Project: Spark Issue

[jira] [Commented] (SPARK-23155) YARN-aggregated executor/driver logs appear unavailable when NM is down

2019-01-30 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756602#comment-16756602 ] Marcelo Vanzin commented on SPARK-23155: For those looking the patch is actually linked from

[jira] [Commented] (SPARK-26677) Incorrect results of not(eqNullSafe) when data read from Parquet file

2019-01-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756609#comment-16756609 ] Dongjoon Hyun commented on SPARK-26677: --- Hi, [~rdblue]. I moved `2.4.1` from `Fixed Versions`

[jira] [Updated] (SPARK-26677) Incorrect results of not(eqNullSafe) when data read from Parquet file

2019-01-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-26677: -- Target Version/s: 2.4.1 Fix Version/s: (was: 2.4.1) > Incorrect results of

[jira] [Commented] (SPARK-23155) YARN-aggregated executor/driver logs appear unavailable when NM is down

2019-01-30 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756593#comment-16756593 ] Jungtaek Lim commented on SPARK-23155: -- [~vanzin] Could we mark this as resolved and fix version

[jira] [Resolved] (SPARK-23155) YARN-aggregated executor/driver logs appear unavailable when NM is down

2019-01-30 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-23155. Resolution: Fixed Assignee: Jungtaek Lim Fix Version/s: 3.0.0 >

[jira] [Commented] (SPARK-22798) Add multiple column support to PySpark StringIndexer

2019-01-30 Thread Huaxin Gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756562#comment-16756562 ] Huaxin Gao commented on SPARK-22798: I will submit a PR soon. Thanks. > Add multiple column

[jira] [Updated] (SPARK-26771) Make .unpersist(), .destroy() consistently non-blocking by default

2019-01-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26771: -- Docs Text: The RDD and DataFrame .unpersist() method, and Broadcast .destroy() method, take an

[jira] [Created] (SPARK-26789) [k8s] pyspark needs to upload local resources to driver and executor pods

2019-01-30 Thread Oleg Frenkel (JIRA)
Oleg Frenkel created SPARK-26789: Summary: [k8s] pyspark needs to upload local resources to driver and executor pods Key: SPARK-26789 URL: https://issues.apache.org/jira/browse/SPARK-26789 Project:

[jira] [Resolved] (SPARK-26784) Allow running driver pod as provided user

2019-01-30 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-26784. Resolution: Won't Fix As far as I know you can do that with pod templates. So no point in

[jira] [Commented] (SPARK-26777) SQL worked in 2.3.2 and fails in 2.4.0

2019-01-30 Thread Yuri Budilov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756515#comment-16756515 ] Yuri Budilov commented on SPARK-26777: -- good luck. > SQL worked in 2.3.2 and fails in 2.4.0 >

[jira] [Created] (SPARK-26788) Remove SchedulerExtensionService

2019-01-30 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-26788: -- Summary: Remove SchedulerExtensionService Key: SPARK-26788 URL: https://issues.apache.org/jira/browse/SPARK-26788 Project: Spark Issue Type: Task

[jira] [Updated] (SPARK-26737) Executor/Task STDERR & STDOUT log urls are not correct in Yarn deployment mode

2019-01-30 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-26737: --- Fix Version/s: 3.0.0 > Executor/Task STDERR & STDOUT log urls are not correct in Yarn

[jira] [Resolved] (SPARK-26737) Executor/Task STDERR & STDOUT log urls are not correct in Yarn deployment mode

2019-01-30 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-26737. Resolution: Fixed Assignee: Jungtaek Lim The patch for SPARK-26311 ended up fixing

[jira] [Assigned] (SPARK-26311) [YARN] New feature: custom log URL for stdout/stderr

2019-01-30 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-26311: -- Assignee: Jungtaek Lim > [YARN] New feature: custom log URL for stdout/stderr >

[jira] [Updated] (SPARK-26787) Fix standardization error message in WeightedLeastSquares

2019-01-30 Thread Brian Scannell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Scannell updated SPARK-26787: --- Environment: Tested in Spark 2.4.0 on DataBricks running in 5.1 ML Beta. was: Tested

[jira] [Assigned] (SPARK-26787) Fix standardization error message in WeightedLeastSquares

2019-01-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26787: Assignee: Apache Spark > Fix standardization error message in WeightedLeastSquares >

[jira] [Assigned] (SPARK-26787) Fix standardization error message in WeightedLeastSquares

2019-01-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26787: Assignee: (was: Apache Spark) > Fix standardization error message in

[jira] [Comment Edited] (SPARK-26775) Update Jenkins nodes to support local volumes for K8s integration tests

2019-01-30 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756415#comment-16756415 ] shane knapp edited comment on SPARK-26775 at 1/30/19 7:23 PM: -- ill install

[jira] [Created] (SPARK-26787) Fix standardization error message in WeightedLeastSquares

2019-01-30 Thread Brian Scannell (JIRA)
Brian Scannell created SPARK-26787: -- Summary: Fix standardization error message in WeightedLeastSquares Key: SPARK-26787 URL: https://issues.apache.org/jira/browse/SPARK-26787 Project: Spark

[jira] [Updated] (SPARK-26718) Fixed integer overflow in SS kafka rateLimit calculation

2019-01-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-26718: -- Fix Version/s: 2.3.3 > Fixed integer overflow in SS kafka rateLimit calculation >

[jira] [Resolved] (SPARK-26753) Log4j customization not working for spark-shell

2019-01-30 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-26753. Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23675

[jira] [Assigned] (SPARK-26753) Log4j customization not working for spark-shell

2019-01-30 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-26753: -- Assignee: Ankur Gupta > Log4j customization not working for spark-shell >

[jira] [Commented] (SPARK-26775) Update Jenkins nodes to support local volumes for K8s integration tests

2019-01-30 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756415#comment-16756415 ] shane knapp commented on SPARK-26775: - ill install the latest version on my test worker and run a

[jira] [Commented] (SPARK-26775) Update Jenkins nodes to support local volumes for K8s integration tests

2019-01-30 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756388#comment-16756388 ] shane knapp commented on SPARK-26775: - all done! {noformat} -bash-4.1$ pssh -i -h

[jira] [Commented] (SPARK-26775) Update Jenkins nodes to support local volumes for K8s integration tests

2019-01-30 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756399#comment-16756399 ] Stavros Kontopoulos commented on SPARK-26775: - Yeah v0.25.0 is ancient history I guess :) >

[jira] [Commented] (SPARK-26775) Update Jenkins nodes to support local volumes for K8s integration tests

2019-01-30 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756390#comment-16756390 ] shane knapp commented on SPARK-26775: - we're quite behind w/the minikube version however: v0.25.0

[jira] [Updated] (SPARK-26677) Incorrect results of not(eqNullSafe) when data read from Parquet file

2019-01-30 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-26677: -- Fix Version/s: 2.4.1 > Incorrect results of not(eqNullSafe) when data read from Parquet file >

[jira] [Commented] (SPARK-26775) Update Jenkins nodes to support local volumes for K8s integration tests

2019-01-30 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756375#comment-16756375 ] shane knapp commented on SPARK-26775: - okie dokie, i should get this sorted today. > Update Jenkins

[jira] [Updated] (SPARK-26784) Allow running driver pod as provided user

2019-01-30 Thread Alexander Mukhopad (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Mukhopad updated SPARK-26784: --- Priority: Major (was: Minor) > Allow running driver pod as provided user >

[jira] [Updated] (SPARK-26786) Handle to treat escaped newline characters('\r','\n') in spark csv

2019-01-30 Thread vishnuram selvaraj (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vishnuram selvaraj updated SPARK-26786: --- Description: There are some systems like AWS redshift which writes csv files by

[jira] [Updated] (SPARK-26786) Handle to treat escaped newline characters('\r','\n') in spark csv

2019-01-30 Thread vishnuram selvaraj (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vishnuram selvaraj updated SPARK-26786: --- Description: There are some systems like AWS redshift which writes csv files by

[jira] [Assigned] (SPARK-26775) Update Jenkins nodes to support local volumes for K8s integration tests

2019-01-30 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shane knapp reassigned SPARK-26775: --- Assignee: shane knapp > Update Jenkins nodes to support local volumes for K8s integration

[jira] [Created] (SPARK-26786) Handle to treat escaped newline characters('\r','\n') in spark csv

2019-01-30 Thread vishnuram selvaraj (JIRA)
vishnuram selvaraj created SPARK-26786: -- Summary: Handle to treat escaped newline characters('\r','\n') in spark csv Key: SPARK-26786 URL: https://issues.apache.org/jira/browse/SPARK-26786

[jira] [Assigned] (SPARK-26785) data source v2 API refactor: streaming write

2019-01-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26785: Assignee: Apache Spark (was: Wenchen Fan) > data source v2 API refactor: streaming

[jira] [Assigned] (SPARK-26785) data source v2 API refactor: streaming write

2019-01-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26785: Assignee: Wenchen Fan (was: Apache Spark) > data source v2 API refactor: streaming

[jira] [Assigned] (SPARK-26741) Analyzer incorrectly resolves aggregate function outside of Aggregate operators

2019-01-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26741: Assignee: (was: Apache Spark) > Analyzer incorrectly resolves aggregate function

[jira] [Created] (SPARK-26785) data source v2 API refactor: streaming write

2019-01-30 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-26785: --- Summary: data source v2 API refactor: streaming write Key: SPARK-26785 URL: https://issues.apache.org/jira/browse/SPARK-26785 Project: Spark Issue Type:

[jira] [Assigned] (SPARK-26741) Analyzer incorrectly resolves aggregate function outside of Aggregate operators

2019-01-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26741: Assignee: Apache Spark > Analyzer incorrectly resolves aggregate function outside of

[jira] [Commented] (SPARK-26176) Verify column name when creating table via `STORED AS`

2019-01-30 Thread kevin yu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756245#comment-16756245 ] kevin yu commented on SPARK-26176: -- Hi Mikhail: Sorry for the delay, yes, I am still looking into it.

[jira] [Commented] (SPARK-25692) Flaky test: ChunkFetchIntegrationSuite.fetchBothChunks

2019-01-30 Thread Sanket Reddy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756226#comment-16756226 ] Sanket Reddy commented on SPARK-25692: -- Created a pr [https://github.com/apache/spark/pull/23700]

[jira] [Assigned] (SPARK-25692) Flaky test: ChunkFetchIntegrationSuite.fetchBothChunks

2019-01-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25692: Assignee: Apache Spark > Flaky test: ChunkFetchIntegrationSuite.fetchBothChunks >

[jira] [Assigned] (SPARK-25692) Flaky test: ChunkFetchIntegrationSuite.fetchBothChunks

2019-01-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25692: Assignee: (was: Apache Spark) > Flaky test:

[jira] [Resolved] (SPARK-26732) Flaky test: SparkContextInfoSuite.getRDDStorageInfo only reports on RDDs that actually persist data

2019-01-30 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro resolved SPARK-26732. -- Resolution: Fixed Assignee: Marcelo Vanzin Fix Version/s: 3.0.0

[jira] [Commented] (SPARK-25692) Flaky test: ChunkFetchIntegrationSuite.fetchBothChunks

2019-01-30 Thread Sanket Reddy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756210#comment-16756210 ] Sanket Reddy commented on SPARK-25692: -- Did some further digging How to reproduce ./build/mvn test

[jira] [Commented] (SPARK-26765) Avro: Validate input and output schema

2019-01-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756178#comment-16756178 ] Apache Spark commented on SPARK-26765: -- User 'gengliangwang' has created a pull request for this

[jira] [Created] (SPARK-26784) Allow running driver pod as provided user

2019-01-30 Thread Alexander Mukhopad (JIRA)
Alexander Mukhopad created SPARK-26784: -- Summary: Allow running driver pod as provided user Key: SPARK-26784 URL: https://issues.apache.org/jira/browse/SPARK-26784 Project: Spark Issue

[jira] [Assigned] (SPARK-26766) Remove the list of filesystems from HadoopDelegationTokenProvider.obtainDelegationTokens

2019-01-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26766: Assignee: (was: Apache Spark) > Remove the list of filesystems from >

[jira] [Assigned] (SPARK-26766) Remove the list of filesystems from HadoopDelegationTokenProvider.obtainDelegationTokens

2019-01-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26766: Assignee: Apache Spark > Remove the list of filesystems from >

[jira] [Updated] (SPARK-26783) Kafka parameter documentation doesn't match with the reality (upper/lowercase)

2019-01-30 Thread Gabor Somogyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi updated SPARK-26783: -- Priority: Minor (was: Major) > Kafka parameter documentation doesn't match with the reality

[jira] [Commented] (SPARK-23685) Spark Structured Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction)

2019-01-30 Thread Gabor Somogyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756104#comment-16756104 ] Gabor Somogyi commented on SPARK-23685: --- Filed SPARK-26783. > Spark Structured Streaming Kafka

[jira] [Created] (SPARK-26783) Kafka parameter documentation doesn't match with the reality (upper/lowercase)

2019-01-30 Thread Gabor Somogyi (JIRA)
Gabor Somogyi created SPARK-26783: - Summary: Kafka parameter documentation doesn't match with the reality (upper/lowercase) Key: SPARK-26783 URL: https://issues.apache.org/jira/browse/SPARK-26783

[jira] [Comment Edited] (SPARK-23685) Spark Structured Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction)

2019-01-30 Thread Gabor Somogyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756077#comment-16756077 ] Gabor Somogyi edited comment on SPARK-23685 at 1/30/19 1:19 PM: Comment

[jira] [Resolved] (SPARK-23685) Spark Structured Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction)

2019-01-30 Thread Gabor Somogyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi resolved SPARK-23685. --- Resolution: Information Provided > Spark Structured Streaming Kafka 0.10 Consumer Can't

[jira] [Comment Edited] (SPARK-23685) Spark Structured Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction)

2019-01-30 Thread Gabor Somogyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756077#comment-16756077 ] Gabor Somogyi edited comment on SPARK-23685 at 1/30/19 1:22 PM: Comment

[jira] [Commented] (SPARK-26758) Idle Executors are not getting killed after spark.dynamicAllocation.executorIdleTimeout value

2019-01-30 Thread sandeep katta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756070#comment-16756070 ] sandeep katta commented on SPARK-26758: --- I am able to reproduce this issue and soon will be

[jira] [Commented] (SPARK-23685) Spark Structured Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction)

2019-01-30 Thread Gabor Somogyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756077#comment-16756077 ] Gabor Somogyi commented on SPARK-23685: --- Comment from [~sindiri] on the PR: {quote}Originally this

[jira] [Assigned] (SPARK-26758) Idle Executors are not getting killed after spark.dynamicAllocation.executorIdleTimeout value

2019-01-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26758: Assignee: (was: Apache Spark) > Idle Executors are not getting killed after >

[jira] [Assigned] (SPARK-26758) Idle Executors are not getting killed after spark.dynamicAllocation.executorIdleTimeout value

2019-01-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26758: Assignee: Apache Spark > Idle Executors are not getting killed after >

[jira] [Commented] (SPARK-26176) Verify column name when creating table via `STORED AS`

2019-01-30 Thread Mikhail (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756002#comment-16756002 ] Mikhail commented on SPARK-26176: - Hello [~kevinyu98] Are you still looking into it? > Verify column

[jira] [Resolved] (SPARK-26782) Wrong column resolved when joining twice with the same dataframe

2019-01-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-26782. - Resolution: Duplicate > Wrong column resolved when joining twice with the same dataframe >

[jira] [Commented] (SPARK-26782) Wrong column resolved when joining twice with the same dataframe

2019-01-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755987#comment-16755987 ] Marco Gaido commented on SPARK-26782: - This is a duplicate of many others. I also started a thread

[jira] [Created] (SPARK-26782) Wrong column resolved when joining twice with the same dataframe

2019-01-30 Thread Vladimir Prus (JIRA)
Vladimir Prus created SPARK-26782: - Summary: Wrong column resolved when joining twice with the same dataframe Key: SPARK-26782 URL: https://issues.apache.org/jira/browse/SPARK-26782 Project: Spark

[jira] [Commented] (SPARK-26777) SQL worked in 2.3.2 and fails in 2.4.0

2019-01-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755947#comment-16755947 ] Hyukjin Kwon commented on SPARK-26777: -- Please provide a _minimised_, _self-runnable_ reproducer

[jira] [Commented] (SPARK-26777) SQL worked in 2.3.2 and fails in 2.4.0

2019-01-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755949#comment-16755949 ] Hyukjin Kwon commented on SPARK-26777: -- Please don't reopen. Take a look

[jira] [Resolved] (SPARK-26777) SQL worked in 2.3.2 and fails in 2.4.0

2019-01-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-26777. -- Resolution: Incomplete > SQL worked in 2.3.2 and fails in 2.4.0 >

[jira] [Commented] (SPARK-26766) Remove the list of filesystems from HadoopDelegationTokenProvider.obtainDelegationTokens

2019-01-30 Thread Gabor Somogyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755934#comment-16755934 ] Gabor Somogyi commented on SPARK-26766: --- Considering the size of the hadoopFSsToAccess dependency,

[jira] [Commented] (SPARK-25420) Dataset.count() every time is different.

2019-01-30 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755910#comment-16755910 ] Jungtaek Lim commented on SPARK-25420: -- [~jeffrey.mak] Could you rerun your query against master

[jira] [Commented] (SPARK-26779) NullPointerException when disable wholestage codegen

2019-01-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755881#comment-16755881 ] Marco Gaido commented on SPARK-26779: - I'd say this is most likely just a duplicate of SPARK-23731.

[jira] [Commented] (SPARK-25420) Dataset.count() every time is different.

2019-01-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755837#comment-16755837 ] Marco Gaido commented on SPARK-25420: - [~jeffrey.mak] I cannot reproduce your issue on current
