[jira] [Updated] (SPARK-31186) toPandas fails on simple query (collect() works)

2020-04-14 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-31186: Fix Version/s: 2.4.6 > toPandas fails on simple query (collect() works)

[jira] [Commented] (SPARK-31447) DATE_PART functions produces incorrect result

2020-04-14 Thread Sathyaprakash Govindasamy (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083672#comment-17083672 ] Sathyaprakash Govindasamy commented on SPARK-31447: I am proposing in

[jira] [Created] (SPARK-31447) DATE_PART functions produces incorrect result

2020-04-14 Thread Sathyaprakash Govindasamy (Jira)
Sathyaprakash Govindasamy created SPARK-31447: Summary: DATE_PART functions produces incorrect result Key: SPARK-31447 URL: https://issues.apache.org/jira/browse/SPARK-31447 Project:

[jira] [Commented] (SPARK-31256) Dropna doesn't work for struct columns

2020-04-14 Thread Terry Kim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083666#comment-17083666 ] Terry Kim commented on SPARK-31256: Let me look into this. > Dropna doesn't work for struct columns

[jira] [Commented] (SPARK-31256) Dropna doesn't work for struct columns

2020-04-14 Thread Sunitha Kambhampati (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083659#comment-17083659 ] Sunitha Kambhampati commented on SPARK-31256: I can repro the issue using the Scala api on

[jira] [Commented] (SPARK-9621) Closure inside RDD doesn't properly close over environment

2020-04-14 Thread Guillaume Martres (Jira)
[ https://issues.apache.org/jira/browse/SPARK-9621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083639#comment-17083639 ] Guillaume Martres commented on SPARK-9621: > unfixed as of scala 2.12.10 / 2.13.1 ... but

[jira] [Comment Edited] (SPARK-2620) case class cannot be used as key for reduce

2020-04-14 Thread Alexandre Archambault (Jira)
[ https://issues.apache.org/jira/browse/SPARK-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083606#comment-17083606 ] Alexandre Archambault edited comment on SPARK-2620 at 4/14/20, 10:02 PM:

[jira] [Commented] (SPARK-31423) DATES and TIMESTAMPS for a certain range are off by 10 days when stored in ORC

2020-04-14 Thread Bruce Robbins (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083611#comment-17083611 ] Bruce Robbins commented on SPARK-31423: {quote}It is questionable how to handle the date in such

[jira] [Commented] (SPARK-9621) Closure inside RDD doesn't properly close over environment

2020-04-14 Thread Alexandre Archambault (Jira)
[ https://issues.apache.org/jira/browse/SPARK-9621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083608#comment-17083608 ] Alexandre Archambault commented on SPARK-9621: FYI, like in

[jira] [Commented] (SPARK-2620) case class cannot be used as key for reduce

2020-04-14 Thread Alexandre Archambault (Jira)
[ https://issues.apache.org/jira/browse/SPARK-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083606#comment-17083606 ] Alexandre Archambault commented on SPARK-2620: FYI, adding "final" in front of "case class

[jira] [Commented] (SPARK-31423) DATES and TIMESTAMPS for a certain range are off by 10 days when stored in ORC

2020-04-14 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083595#comment-17083595 ] Maxim Gekk commented on SPARK-31423: I have debugged this slightly on Spark 2.4, so, '1582-10-14'

[jira] [Commented] (SPARK-31403) TreeNode asCode function incorrectly handles null literals

2020-04-14 Thread Carl Sverre (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083576#comment-17083576 ] Carl Sverre commented on SPARK-31403: Thanks for checking this on master [~hyukjin.kwon]! If you

[jira] [Resolved] (SPARK-31445) Avoid floating-point division in millisToDays

2020-04-14 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk resolved SPARK-31445. Resolution: Won't Fix > Avoid floating-point division in millisToDays

[jira] [Created] (SPARK-31446) Make html elements for a paged table possible to have different id attribute.

2020-04-14 Thread Kousuke Saruta (Jira)
Kousuke Saruta created SPARK-31446: Summary: Make html elements for a paged table possible to have different id attribute. Key: SPARK-31446 URL: https://issues.apache.org/jira/browse/SPARK-31446

[jira] [Created] (SPARK-31445) Avoid floating-point division in millisToDays

2020-04-14 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31445: Summary: Avoid floating-point division in millisToDays Key: SPARK-31445 URL: https://issues.apache.org/jira/browse/SPARK-31445 Project: Spark Issue Type:

[jira] [Commented] (SPARK-31236) Spark error while consuming data from Kinesis direct end point

2020-04-14 Thread Thukarama Prabhu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083327#comment-17083327 ] Thukarama Prabhu commented on SPARK-31236: Raised this issue with AWS support and got below

[jira] [Commented] (SPARK-31423) DATES and TIMESTAMPS for a certain range are off by 10 days when stored in ORC

2020-04-14 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083314#comment-17083314 ] Maxim Gekk commented on SPARK-31423: I am working on the issue. > DATES and TIMESTAMPS for a

[jira] [Resolved] (SPARK-31439) Perf regression of fromJavaDate

2020-04-14 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-31439. Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 28205

[jira] [Assigned] (SPARK-31439) Perf regression of fromJavaDate

2020-04-14 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-31439: Assignee: Maxim Gekk > Perf regression of fromJavaDate

[jira] [Commented] (SPARK-30466) remove dependency on jackson-mapper-asl-1.9.13 and jackson-core-asl-1.9.13

2020-04-14 Thread Nicholas Marion (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083270#comment-17083270 ] Nicholas Marion commented on SPARK-30466: Also there were two more CVEs opened late last year

[jira] [Created] (SPARK-31444) Pyspark memory and cores calculation doesn't account for task cpus

2020-04-14 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-31444: Summary: Pyspark memory and cores calculation doesn't account for task cpus Key: SPARK-31444 URL: https://issues.apache.org/jira/browse/SPARK-31444 Project: Spark

[jira] [Commented] (SPARK-31437) Try assigning tasks to existing executors by which required resources in ResourceProfile are satisfied

2020-04-14 Thread Thomas Graves (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083250#comment-17083250 ] Thomas Graves commented on SPARK-31437: This is something I wanted to eventually do but I have

[jira] [Updated] (SPARK-31437) Try assigning tasks to existing executors by which required resources in ResourceProfile are satisfied

2020-04-14 Thread Thomas Graves (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-31437: Affects Version/s: (was: 3.0.0) 3.1.0 > Try assigning tasks to

[jira] [Comment Edited] (SPARK-31443) Perf regression of toJavaDate

2020-04-14 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083217#comment-17083217 ] Maxim Gekk edited comment on SPARK-31443 at 4/14/20, 1:21 PM: FYI

[jira] [Commented] (SPARK-31443) Perf regression of toJavaDate

2020-04-14 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083217#comment-17083217 ] Maxim Gekk commented on SPARK-31443: FYI [~cloud_fan] > Perf regression of toJavaDate

[jira] [Updated] (SPARK-31443) Perf regression of toJavaDate

2020-04-14 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-31443: Description: DateTimeBenchmark shows the regression Spark 2.4.6-SNAPSHOT at the PR

[jira] [Created] (SPARK-31443) Perf regression of toJavaDate

2020-04-14 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31443: Summary: Perf regression of toJavaDate Key: SPARK-31443 URL: https://issues.apache.org/jira/browse/SPARK-31443 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-31429) Add additional fields in ExpressionDescription for more granular category in documentation

2020-04-14 Thread Takeshi Yamamuro (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083156#comment-17083156 ] Takeshi Yamamuro commented on SPARK-31429: Thanks! > Add additional fields in

[jira] [Commented] (SPARK-31429) Add additional fields in ExpressionDescription for more granular category in documentation

2020-04-14 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083154#comment-17083154 ] Hyukjin Kwon commented on SPARK-31429: I haven't started it yet. Please go ahead, I would

[jira] [Comment Edited] (SPARK-27296) Efficient User Defined Aggregators

2020-04-14 Thread Patrick Cording (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083098#comment-17083098 ] Patrick Cording edited comment on SPARK-27296 at 4/14/20, 10:48 AM:

[jira] [Comment Edited] (SPARK-27296) Efficient User Defined Aggregators

2020-04-14 Thread Patrick Cording (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083098#comment-17083098 ] Patrick Cording edited comment on SPARK-27296 at 4/14/20, 10:43 AM:

[jira] [Comment Edited] (SPARK-27296) Efficient User Defined Aggregators

2020-04-14 Thread Patrick Cording (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083098#comment-17083098 ] Patrick Cording edited comment on SPARK-27296 at 4/14/20, 10:42 AM:

[jira] [Commented] (SPARK-27296) Efficient User Defined Aggregators

2020-04-14 Thread Patrick Cording (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083098#comment-17083098 ] Patrick Cording commented on SPARK-27296: I've been trying this out, and I have a couple of

[jira] [Commented] (SPARK-31429) Add additional fields in ExpressionDescription for more granular category in documentation

2020-04-14 Thread Takeshi Yamamuro (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083024#comment-17083024 ] Takeshi Yamamuro commented on SPARK-31429: [~hyukjin.kwon] You've already started to work on

[jira] [Created] (SPARK-31442) Print shuffle id at coalesce partitions target size

2020-04-14 Thread ulysses you (Jira)
ulysses you created SPARK-31442: Summary: Print shuffle id at coalesce partitions target size Key: SPARK-31442 URL: https://issues.apache.org/jira/browse/SPARK-31442 Project: Spark Issue

[jira] [Comment Edited] (SPARK-31437) Try assigning tasks to existing executors by which required resources in ResourceProfile are satisfied

2020-04-14 Thread Hongze Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17082985#comment-17082985 ] Hongze Zhang edited comment on SPARK-31437 at 4/14/20, 8:28 AM: Hi

[jira] [Commented] (SPARK-31437) Try assigning tasks to existing executors by which required resources in ResourceProfile are satisfied

2020-04-14 Thread Hongze Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17082985#comment-17082985 ] Hongze Zhang commented on SPARK-31437: Hi [~tgraves], is there already any plan on this or similar

[jira] [Updated] (SPARK-31425) UnsafeKVExternalSorter/VariableLengthRowBasedKeyValueBatch should also respect UnsafeAlignedOffset

2020-04-14 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-31425: Summary: UnsafeKVExternalSorter/VariableLengthRowBasedKeyValueBatch should also respect UnsafeAlignedOffset

[jira] [Commented] (SPARK-31336) Support Oracle Kerberos login in JDBC connector

2020-04-14 Thread Gabor Somogyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17082931#comment-17082931 ] Gabor Somogyi commented on SPARK-31336: Started to work on this... > Support Oracle Kerberos

[jira] [Resolved] (SPARK-31301) flatten the result dataframe of tests in stat

2020-04-14 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-31301. Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 28176

[jira] [Assigned] (SPARK-31301) flatten the result dataframe of tests in stat

2020-04-14 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-31301: Assignee: zhengruifeng > flatten the result dataframe of tests in stat

[jira] [Commented] (SPARK-26385) YARN - Spark Stateful Structured streaming HDFS_DELEGATION_TOKEN not found in cache

2020-04-14 Thread Zhou Jiashuai (Jira)
[ https://issues.apache.org/jira/browse/SPARK-26385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17082896#comment-17082896 ] Zhou Jiashuai commented on SPARK-26385: I enable the log with -Dsun.security.krb5.debug=true and