[jira] [Resolved] (SPARK-30492) Eliminate deprecation warnings in ORC datasource

2020-01-13 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk resolved SPARK-30492. Resolution: Won't Fix > Eliminate deprecation warnings in ORC datasource >

[jira] [Created] (SPARK-30505) Deprecate Avro option `ignoreExtension` in a doc

2020-01-13 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-30505: -- Summary: Deprecate Avro option `ignoreExtension` in a doc Key: SPARK-30505 URL: https://issues.apache.org/jira/browse/SPARK-30505 Project: Spark Issue Type:

[jira] [Created] (SPARK-30606) Applying the `like` function with 2 parameters fails

2020-01-22 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-30606: -- Summary: Applying the `like` function with 2 parameters fails Key: SPARK-30606 URL: https://issues.apache.org/jira/browse/SPARK-30606 Project: Spark Issue Type:

[jira] [Commented] (SPARK-30530) CSV load followed by "is null" filter produces incorrect results

2020-01-17 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17017806#comment-17017806 ] Maxim Gekk commented on SPARK-30530: [~jlowe] I prepared a fix for the issue. [~hyukjin.kwon]

[jira] [Created] (SPARK-30554) Return Iterable from FailureSafeParser.rawParser

2020-01-17 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-30554: -- Summary: Return Iterable from FailureSafeParser.rawParser Key: SPARK-30554 URL: https://issues.apache.org/jira/browse/SPARK-30554 Project: Spark Issue Type:

[jira] [Created] (SPARK-30587) Run test suites for CSV v1 and JSON v1

2020-01-20 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-30587: -- Summary: Run test suites for CSV v1 and JSON v1 Key: SPARK-30587 URL: https://issues.apache.org/jira/browse/SPARK-30587 Project: Spark Issue Type: Test

[jira] [Commented] (SPARK-30485) Remove SQL configs deprecated before v2.4

2020-01-10 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17012678#comment-17012678 ] Maxim Gekk commented on SPARK-30485: [~dongjoon] [~srowen] [~cloud_fan] [~hyukjin.kwon] WDYT of the

[jira] [Created] (SPARK-30485) Remove SQL configs deprecated before v2.4

2020-01-10 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-30485: -- Summary: Remove SQL configs deprecated before v2.4 Key: SPARK-30485 URL: https://issues.apache.org/jira/browse/SPARK-30485 Project: Spark Issue Type: Test

[jira] [Created] (SPARK-30492) Eliminate deprecation warning in ORC datasource

2020-01-12 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-30492: -- Summary: Eliminate deprecation warning in ORC datasource Key: SPARK-30492 URL: https://issues.apache.org/jira/browse/SPARK-30492 Project: Spark Issue Type:

[jira] [Updated] (SPARK-30492) Eliminate deprecation warnings in ORC datasource

2020-01-12 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-30492: --- Summary: Eliminate deprecation warnings in ORC datasource (was: Eliminate deprecation warning in

[jira] [Created] (SPARK-30509) Deprecation log warning is not printed in Avro schema inferring

2020-01-14 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-30509: -- Summary: Deprecation log warning is not printed in Avro schema inferring Key: SPARK-30509 URL: https://issues.apache.org/jira/browse/SPARK-30509 Project: Spark

[jira] [Created] (SPARK-30482) Add sub-class of AppenderSkeleton reusable in tests

2020-01-10 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-30482: -- Summary: Add sub-class of AppenderSkeleton reusable in tests Key: SPARK-30482 URL: https://issues.apache.org/jira/browse/SPARK-30482 Project: Spark Issue Type:

[jira] [Updated] (SPARK-30482) Add sub-class of AppenderSkeleton reusable in tests

2020-01-10 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-30482: --- Component/s: (was: SQL) > Add sub-class of AppenderSkeleton reusable in tests >

[jira] [Created] (SPARK-30409) Use `NoOp` datasource in SQL benchmarks

2020-01-02 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-30409: -- Summary: Use `NoOp` datasource in SQL benchmarks Key: SPARK-30409 URL: https://issues.apache.org/jira/browse/SPARK-30409 Project: Spark Issue Type: Test

[jira] [Commented] (SPARK-30174) Eliminate warnings :part 4

2020-01-02 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006953#comment-17006953 ] Maxim Gekk commented on SPARK-30174: [~shivuson...@gmail.com] Are you still working on this? If so,

[jira] [Commented] (SPARK-30172) Eliminate warnings: part3

2020-01-02 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006952#comment-17006952 ] Maxim Gekk commented on SPARK-30172: [~Ankitraj] Are you still working on this? > Eliminate

[jira] [Commented] (SPARK-30171) Eliminate warnings: part2

2020-01-02 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006949#comment-17006949 ] Maxim Gekk commented on SPARK-30171: [~srowen] SPARK-30258 fixes warnings AvroFunctionsSuite.scala

[jira] [Created] (SPARK-30412) Eliminate warnings in Java tests regarding to deprecated API

2020-01-02 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-30412: -- Summary: Eliminate warnings in Java tests regarding to deprecated API Key: SPARK-30412 URL: https://issues.apache.org/jira/browse/SPARK-30412 Project: Spark

[jira] [Created] (SPARK-30416) Log a warning for deprecated SQL config in `set()` and `unset()`

2020-01-03 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-30416: -- Summary: Log a warning for deprecated SQL config in `set()` and `unset()` Key: SPARK-30416 URL: https://issues.apache.org/jira/browse/SPARK-30416 Project: Spark

[jira] [Commented] (SPARK-30401) Call requireNonStaticConf() only once

2020-01-01 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006392#comment-17006392 ] Maxim Gekk commented on SPARK-30401: I am working on it > Call requireNonStaticConf() only once >

[jira] [Created] (SPARK-30401) Call requireNonStaticConf() only once

2020-01-01 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-30401: -- Summary: Call requireNonStaticConf() only once Key: SPARK-30401 URL: https://issues.apache.org/jira/browse/SPARK-30401 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-30429) WideSchemaBenchmark fails with OOM

2020-01-05 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-30429: -- Summary: WideSchemaBenchmark fails with OOM Key: SPARK-30429 URL: https://issues.apache.org/jira/browse/SPARK-30429 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-30429) WideSchemaBenchmark fails with OOM

2020-01-05 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-30429: --- Attachment: WideSchemaBenchmark_console.txt > WideSchemaBenchmark fails with OOM >

[jira] [Commented] (SPARK-30429) WideSchemaBenchmark fails with OOM

2020-01-06 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009189#comment-17009189 ] Maxim Gekk commented on SPARK-30429: [~dongjoon] I ran git bisect. Let's see what it will find during

[jira] [Commented] (SPARK-30429) WideSchemaBenchmark fails with OOM

2020-01-06 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009398#comment-17009398 ] Maxim Gekk commented on SPARK-30429: Bisect has found the first bad commit. I specified the recent

[jira] [Commented] (SPARK-30442) Write mode ignored when using CodecStreams

2020-01-07 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009963#comment-17009963 ] Maxim Gekk commented on SPARK-30442: > This can cause issues, particularly with aws tools, that make

[jira] [Commented] (SPARK-30565) Regression in the ORC benchmark

2020-03-11 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17057619#comment-17057619 ] Maxim Gekk commented on SPARK-30565: Per [~dongjoon] , default ORC reader doesn't fully cover

[jira] [Created] (SPARK-31076) Convert Catalyst's DATE/TIMESTAMP to Java Date/Timestamp via local date-time

2020-03-06 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31076: -- Summary: Convert Catalyst's DATE/TIMESTAMP to Java Date/Timestamp via local date-time Key: SPARK-31076 URL: https://issues.apache.org/jira/browse/SPARK-31076 Project:

[jira] [Created] (SPARK-31402) Incorrect rebasing of BCE dates

2020-04-09 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31402: -- Summary: Incorrect rebasing of BCE dates Key: SPARK-31402 URL: https://issues.apache.org/jira/browse/SPARK-31402 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-28624) make_date is inconsistent when reading from table

2020-04-10 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080312#comment-17080312 ] Maxim Gekk commented on SPARK-28624: toJavaDate is implemented differently in the master 

[jira] [Commented] (SPARK-31423) DATES and TIMESTAMPS for a certain range are off by 10 days when stored in ORC

2020-04-14 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083595#comment-17083595 ] Maxim Gekk commented on SPARK-31423: I have debugged this slightly on Spark 2.4, so, '1582-10-14'

[jira] [Resolved] (SPARK-31445) Avoid floating-point division in millisToDays

2020-04-14 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk resolved SPARK-31445. Resolution: Won't Fix > Avoid floating-point division in millisToDays >

[jira] [Commented] (SPARK-31423) DATES and TIMESTAMPS for a certain range are off by 10 days when stored in ORC

2020-04-15 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17084308#comment-17084308 ] Maxim Gekk commented on SPARK-31423: [~bersprockets] I think we should take the next valid date for

[jira] [Created] (SPARK-31449) Is there a difference between JDK and Spark's time zone offset calculation

2020-04-15 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31449: -- Summary: Is there a difference between JDK and Spark's time zone offset calculation Key: SPARK-31449 URL: https://issues.apache.org/jira/browse/SPARK-31449 Project:

[jira] [Created] (SPARK-31439) Perf regression of fromJavaDate

2020-04-13 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31439: -- Summary: Perf regression of fromJavaDate Key: SPARK-31439 URL: https://issues.apache.org/jira/browse/SPARK-31439 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-31426) Regression in loading/saving timestamps from/to ORC files

2020-04-13 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-31426: --- Parent: SPARK-31404 Issue Type: Sub-task (was: Bug) > Regression in loading/saving

[jira] [Created] (SPARK-31426) Regression in loading/saving timestamps from/to ORC files

2020-04-12 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31426: -- Summary: Regression in loading/saving timestamps from/to ORC files Key: SPARK-31426 URL: https://issues.apache.org/jira/browse/SPARK-31426 Project: Spark Issue

[jira] [Issue Comment Deleted] (SPARK-31423) DATES and TIMESTAMPS for a certain range are off by 10 days when stored in ORC

2020-04-12 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-31423: --- Comment: was deleted (was: This is intentional behavior because ORC format assumes the hybrid

[jira] [Commented] (SPARK-31423) DATES and TIMESTAMPS for a certain range are off by 10 days when stored in ORC

2020-04-12 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17082051#comment-17082051 ] Maxim Gekk commented on SPARK-31423: This is intentional behavior because ORC format assumes the

[jira] [Comment Edited] (SPARK-31443) Perf regression of toJavaDate

2020-04-14 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083217#comment-17083217 ] Maxim Gekk edited comment on SPARK-31443 at 4/14/20, 1:21 PM: -- FYI

[jira] [Created] (SPARK-31489) Failure on pushing down filters with java.time.LocalDate values in ORC

2020-04-19 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31489: -- Summary: Failure on pushing down filters with java.time.LocalDate values in ORC Key: SPARK-31489 URL: https://issues.apache.org/jira/browse/SPARK-31489 Project: Spark

[jira] [Updated] (SPARK-31488) Support `java.time.LocalDate` in Parquet filter pushdown

2020-04-19 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-31488: --- Description: Currently, ParquetFilters supports only java.sql.Date values of DateType, and

[jira] [Created] (SPARK-31490) Benchmark conversions to/from Java 8 date-time types

2020-04-19 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31490: -- Summary: Benchmark conversions to/from Java 8 date-time types Key: SPARK-31490 URL: https://issues.apache.org/jira/browse/SPARK-31490 Project: Spark Issue Type:

[jira] [Created] (SPARK-31488) Support `java.time.LocalDate` in Parquet filter pushdown

2020-04-19 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31488: -- Summary: Support `java.time.LocalDate` in Parquet filter pushdown Key: SPARK-31488 URL: https://issues.apache.org/jira/browse/SPARK-31488 Project: Spark Issue

[jira] [Created] (SPARK-31471) Add a script to run multiple benchmarks

2020-04-17 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31471: -- Summary: Add a script to run multiple benchmarks Key: SPARK-31471 URL: https://issues.apache.org/jira/browse/SPARK-31471 Project: Spark Issue Type: New Feature

[jira] [Created] (SPARK-31398) Speed up reading dates in ORC

2020-04-09 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31398: -- Summary: Speed up reading dates in ORC Key: SPARK-31398 URL: https://issues.apache.org/jira/browse/SPARK-31398 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-31385) Results of Julian-Gregorian rebasing don't match to Gregorian-Julian rebasing

2020-04-08 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31385: -- Summary: Results of Julian-Gregorian rebasing don't match to Gregorian-Julian rebasing Key: SPARK-31385 URL: https://issues.apache.org/jira/browse/SPARK-31385 Project:

[jira] [Created] (SPARK-31159) Incompatible Parquet dates/timestamps with Spark 2.4

2020-03-15 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31159: -- Summary: Incompatible Parquet dates/timestamps with Spark 2.4 Key: SPARK-31159 URL: https://issues.apache.org/jira/browse/SPARK-31159 Project: Spark Issue Type:

[jira] [Commented] (SPARK-31159) Incompatible Parquet dates/timestamps with Spark 2.4

2020-03-15 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17059617#comment-17059617 ] Maxim Gekk commented on SPARK-31159: [~cloud_fan] FYI > Incompatible Parquet dates/timestamps with

[jira] [Created] (SPARK-31328) Incorrect timestamps rebasing on autumn daylight saving time

2020-04-02 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31328: -- Summary: Incorrect timestamps rebasing on autumn daylight saving time Key: SPARK-31328 URL: https://issues.apache.org/jira/browse/SPARK-31328 Project: Spark

[jira] [Updated] (SPARK-31328) Incorrect timestamps rebasing on autumn daylight saving time

2020-04-02 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-31328: --- Description: Run the following code in the *America/Los_Angeles* time zone: {code:scala}

[jira] [Updated] (SPARK-31328) Incorrect timestamps rebasing on autumn daylight saving time

2020-04-02 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-31328: --- Description: Run the following code in the *America/Los_Angeles* time zone: {code:scala}

[jira] [Created] (SPARK-31353) Set time zone in DateTimeBenchmark and DateTimeRebaseBenchmark

2020-04-05 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31353: -- Summary: Set time zone in DateTimeBenchmark and DateTimeRebaseBenchmark Key: SPARK-31353 URL: https://issues.apache.org/jira/browse/SPARK-31353 Project: Spark

[jira] [Created] (SPARK-31277) Migrate `DateTimeTestUtils` from `TimeZone` to `ZoneId`

2020-03-26 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31277: -- Summary: Migrate `DateTimeTestUtils` from `TimeZone` to `ZoneId` Key: SPARK-31277 URL: https://issues.apache.org/jira/browse/SPARK-31277 Project: Spark Issue

[jira] [Created] (SPARK-31254) `HiveResult.toHiveString` does not use the current session time zone

2020-03-25 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31254: -- Summary: `HiveResult.toHiveString` does not use the current session time zone Key: SPARK-31254 URL: https://issues.apache.org/jira/browse/SPARK-31254 Project: Spark

[jira] [Created] (SPARK-31284) Check rebasing of timestamps in ORC datasource

2020-03-27 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31284: -- Summary: Check rebasing of timestamps in ORC datasource Key: SPARK-31284 URL: https://issues.apache.org/jira/browse/SPARK-31284 Project: Spark Issue Type: Test

[jira] [Updated] (SPARK-31286) Specify formats of time zone ID for JSON/CSV option and from/to_utc_timestamp

2020-03-27 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-31286: --- Description: There are two distinct types of ID (see

[jira] [Created] (SPARK-31286) Specify formats of time zone ID for JSON/CSV option and from/to_utc_timestamp

2020-03-27 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31286: -- Summary: Specify formats of time zone ID for JSON/CSV option and from/to_utc_timestamp Key: SPARK-31286 URL: https://issues.apache.org/jira/browse/SPARK-31286 Project:

[jira] [Created] (SPARK-31343) Check codegen does not fail on expressions with special characters in string parameters

2020-04-03 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31343: -- Summary: Check codegen does not fail on expressions with special characters in string parameters Key: SPARK-31343 URL: https://issues.apache.org/jira/browse/SPARK-31343

[jira] [Created] (SPARK-31296) Benchmark date-time rebasing to/from Julian calendar

2020-03-29 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31296: -- Summary: Benchmark date-time rebasing to/from Julian calendar Key: SPARK-31296 URL: https://issues.apache.org/jira/browse/SPARK-31296 Project: Spark Issue Type:

[jira] [Updated] (SPARK-31296) Benchmark date-time rebasing in Parquet datasource

2020-03-29 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-31296: --- Summary: Benchmark date-time rebasing in Parquet datasource (was: Benchmark date-time rebasing

[jira] [Created] (SPARK-31297) Speed-up date-time rebasing

2020-03-29 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31297: -- Summary: Speed-up date-time rebasing Key: SPARK-31297 URL: https://issues.apache.org/jira/browse/SPARK-31297 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-31297) Speed-up date-time rebasing

2020-03-29 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17070286#comment-17070286 ] Maxim Gekk commented on SPARK-31297: [~cloud_fan] [~hyukjin.kwon] [~dongjoon] WDYT? > Speed-up

[jira] [Created] (SPARK-31311) Benchmark date-time rebasing in ORC datasource

2020-03-31 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31311: -- Summary: Benchmark date-time rebasing in ORC datasource Key: SPARK-31311 URL: https://issues.apache.org/jira/browse/SPARK-31311 Project: Spark Issue Type:

[jira] [Updated] (SPARK-31311) Benchmark date-time rebasing in ORC datasource

2020-03-31 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-31311: --- Description: * Benchmark saving dates/timestamps before and after 1582-10-15 * Benchmark loading

[jira] [Created] (SPARK-31318) Split Parquet/Avro configs for rebasing dates/timestamps in read and in write

2020-03-31 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31318: -- Summary: Split Parquet/Avro configs for rebasing dates/timestamps in read and in write Key: SPARK-31318 URL: https://issues.apache.org/jira/browse/SPARK-31318 Project:

[jira] [Updated] (SPARK-31318) Split Parquet/Avro configs for rebasing dates/timestamps in read and in write

2020-03-31 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-31318: --- Parent: SPARK-30951 Issue Type: Sub-task (was: Improvement) > Split Parquet/Avro configs

[jira] [Commented] (SPARK-31297) Speed-up date-time rebasing

2020-03-29 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17070457#comment-17070457 ] Maxim Gekk commented on SPARK-31297: The rebasing of days doesn't depend on time zone, and has just

[jira] [Commented] (SPARK-31238) Incompatible ORC dates with Spark 2.4

2020-03-25 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066427#comment-17066427 ] Maxim Gekk commented on SPARK-31238: I am working on the issue. > Incompatible ORC dates with Spark

[jira] [Created] (SPARK-31359) Speed up timestamps rebasing

2020-04-06 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31359: -- Summary: Speed up timestamps rebasing Key: SPARK-31359 URL: https://issues.apache.org/jira/browse/SPARK-31359 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-31579) Replace floorDiv by / in localRebaseGregorianToJulianDays()

2020-04-27 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31579: -- Summary: Replace floorDiv by / in localRebaseGregorianToJulianDays() Key: SPARK-31579 URL: https://issues.apache.org/jira/browse/SPARK-31579 Project: Spark

[jira] [Commented] (SPARK-31463) Enhance JsonDataSource by replacing jackson with simdjson

2020-04-24 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091389#comment-17091389 ] Maxim Gekk commented on SPARK-31463: Parsing itself takes 10-20%. JSON datasource spends significant

[jira] [Updated] (SPARK-31449) Investigate the difference between JDK and Spark's time zone offset calculation

2020-04-24 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-31449: --- Summary: Investigate the difference between JDK and Spark's time zone offset calculation (was: Is

[jira] [Updated] (SPARK-31449) Investigate the difference between JDK and Spark's time zone offset calculation

2020-04-24 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-31449: --- Issue Type: Improvement (was: Question) > Investigate the difference between JDK and Spark's time

[jira] [Commented] (SPARK-31554) Flaky test suite org.apache.spark.sql.hive.thriftserver.CliSuite

2020-04-24 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091614#comment-17091614 ] Maxim Gekk commented on SPARK-31554: [~cloud_fan] [~hyukjin.kwon] Can we disable the flaky test

[jira] [Created] (SPARK-31554) Flaky test suite org.apache.spark.sql.hive.thriftserver.CliSuite

2020-04-24 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31554: -- Summary: Flaky test suite org.apache.spark.sql.hive.thriftserver.CliSuite Key: SPARK-31554 URL: https://issues.apache.org/jira/browse/SPARK-31554 Project: Spark

[jira] [Commented] (SPARK-31553) Wrong result of isInCollection for large collections

2020-04-24 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091490#comment-17091490 ] Maxim Gekk commented on SPARK-31553: I am working on the issue > Wrong result of isInCollection for

[jira] [Created] (SPARK-31553) Wrong result of isInCollection for large collections

2020-04-24 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31553: -- Summary: Wrong result of isInCollection for large collections Key: SPARK-31553 URL: https://issues.apache.org/jira/browse/SPARK-31553 Project: Spark Issue Type:

[jira] [Commented] (SPARK-31449) Investigate the difference between JDK and Spark's time zone offset calculation

2020-04-26 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17092824#comment-17092824 ] Maxim Gekk commented on SPARK-31449: [~cloud_fan] [~hyukjin.kwon] I compared results of those 2

[jira] [Created] (SPARK-31623) Benchmark rebasing of INT96 and TIMESTAMP_MILLIS timestamps in read/write

2020-05-01 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31623: -- Summary: Benchmark rebasing of INT96 and TIMESTAMP_MILLIS timestamps in read/write Key: SPARK-31623 URL: https://issues.apache.org/jira/browse/SPARK-31623 Project: Spark

[jira] [Commented] (SPARK-31579) Replace floorDiv by / in localRebaseGregorianToJulianDays()

2020-05-05 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17099579#comment-17099579 ] Maxim Gekk commented on SPARK-31579: [~suddhuASF] Replacing floorDiv by / is trivial. Please,

[jira] [Created] (SPARK-31641) Incorrect days conversion by JSON legacy parser

2020-05-05 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31641: -- Summary: Incorrect days conversion by JSON legacy parser Key: SPARK-31641 URL: https://issues.apache.org/jira/browse/SPARK-31641 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-31563) Failure of InSet.sql for UTF8String collection

2020-04-25 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31563: -- Summary: Failure of InSet.sql for UTF8String collection Key: SPARK-31563 URL: https://issues.apache.org/jira/browse/SPARK-31563 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-31563) Failure of InSet.sql for UTF8String collection

2020-04-25 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17092168#comment-17092168 ] Maxim Gekk commented on SPARK-31563: I am working on the issue > Failure of InSet.sql for

[jira] [Resolved] (SPARK-31554) Flaky test suite org.apache.spark.sql.hive.thriftserver.CliSuite

2020-04-30 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk resolved SPARK-31554. Resolution: Not A Problem > Flaky test suite org.apache.spark.sql.hive.thriftserver.CliSuite >

[jira] [Commented] (SPARK-31423) DATES and TIMESTAMPS for a certain range are off by 10 days when stored in ORC

2020-04-14 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083314#comment-17083314 ] Maxim Gekk commented on SPARK-31423: I am working on the issue. > DATES and TIMESTAMPS for a

[jira] [Created] (SPARK-31443) Perf regression of toJavaDate

2020-04-14 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31443: -- Summary: Perf regression of toJavaDate Key: SPARK-31443 URL: https://issues.apache.org/jira/browse/SPARK-31443 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-31443) Perf regression of toJavaDate

2020-04-14 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083217#comment-17083217 ] Maxim Gekk commented on SPARK-31443: FYI [~cloud_fan] > Perf regression of toJavaDate >

[jira] [Updated] (SPARK-31443) Perf regression of toJavaDate

2020-04-14 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-31443: --- Description: DateTimeBenchmark shows the regression Spark 2.4.6-SNAPSHOT at the PR

[jira] [Created] (SPARK-31445) Avoid floating-point division in millisToDays

2020-04-14 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31445: -- Summary: Avoid floating-point division in millisToDays Key: SPARK-31445 URL: https://issues.apache.org/jira/browse/SPARK-31445 Project: Spark Issue Type:

[jira] [Created] (SPARK-31680) Support Java 8 datetime types by Random data generator

2020-05-11 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31680: -- Summary: Support Java 8 datetime types by Random data generator Key: SPARK-31680 URL: https://issues.apache.org/jira/browse/SPARK-31680 Project: Spark Issue

[jira] [Created] (SPARK-31738) Describe 'L' and 'M' month pattern letters

2020-05-17 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31738: -- Summary: Describe 'L' and 'M' month pattern letters Key: SPARK-31738 URL: https://issues.apache.org/jira/browse/SPARK-31738 Project: Spark Issue Type:

[jira] [Updated] (SPARK-31672) Reading wrong timestamps from dictionary encoded columns in Parquet files

2020-05-10 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-31672: --- Description: Write dates with dictionary encoding enabled to parquet files: {code:scala} Welcome to

[jira] [Created] (SPARK-31672) Reading wrong timestamps from dictionary encoded columns in Parquet files

2020-05-10 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31672: -- Summary: Reading wrong timestamps from dictionary encoded columns in Parquet files Key: SPARK-31672 URL: https://issues.apache.org/jira/browse/SPARK-31672 Project: Spark

[jira] [Created] (SPARK-31662) Reading wrong dates from dictionary encoded columns in Parquet files

2020-05-08 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31662: -- Summary: Reading wrong dates from dictionary encoded columns in Parquet files Key: SPARK-31662 URL: https://issues.apache.org/jira/browse/SPARK-31662 Project: Spark

[jira] [Updated] (SPARK-31662) Reading wrong dates from dictionary encoded columns in Parquet files

2020-05-08 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-31662: --- Description: Write dates with dictionary encoding enabled to parquet files: {code:scala} Welcome to

[jira] [Created] (SPARK-31727) Inconsistent error messages of casting timestamp to int

2020-05-15 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31727: -- Summary: Inconsistent error messages of casting timestamp to int Key: SPARK-31727 URL: https://issues.apache.org/jira/browse/SPARK-31727 Project: Spark Issue

[jira] [Created] (SPARK-31725) Set America/Los_Angeles time zone and Locale.US by default

2020-05-15 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31725: -- Summary: Set America/Los_Angeles time zone and Locale.US by default Key: SPARK-31725 URL: https://issues.apache.org/jira/browse/SPARK-31725 Project: Spark Issue

[jira] [Created] (SPARK-31712) Check casting timestamps to byte/short/int/long before 1970-01-01

2020-05-14 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31712: -- Summary: Check casting timestamps to byte/short/int/long before 1970-01-01 Key: SPARK-31712 URL: https://issues.apache.org/jira/browse/SPARK-31712 Project: Spark

[jira] [Created] (SPARK-31665) Test parquet dictionary encoding of random dates/timestamps

2020-05-08 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-31665: -- Summary: Test parquet dictionary encoding of random dates/timestamps Key: SPARK-31665 URL: https://issues.apache.org/jira/browse/SPARK-31665 Project: Spark
