[ 
https://issues.apache.org/jira/browse/PARQUET-2240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shani Elharrar updated PARQUET-2240:
------------------------------------
    Description: 
The date formatter (a java.text.SimpleDateFormat, per the stack trace below) 
is used in a static context but is not thread-safe. A formatter instance is 
created in PrimitiveStringifier.DateStringifier, and DateStringifier instances 
are held in the static final fields DATE_STRINGIFIER, 
TIMESTAMP_MILLIS_STRINGIFIER, TIMESTAMP_MICROS_STRINGIFIER, 
TIMESTAMP_NANOS_STRINGIFIER, TIMESTAMP_MILLIS_UTC_STRINGIFIER, 
TIMESTAMP_MICROS_UTC_STRINGIFIER, and TIMESTAMP_NANOS_UTC_STRINGIFIER, so the 
same formatter is shared by every thread that stringifies a value.

Under concurrent use, this causes exceptions like the following to be thrown 
from Parquet code:

java.lang.ArrayIndexOutOfBoundsException: Index 633 out of bounds for length 13

Stack trace:

    at java.base/sun.util.calendar.BaseCalendar.getCalendarDateFromFixedDate(BaseCalendar.java:457)
    at java.base/java.util.GregorianCalendar.computeFields(GregorianCalendar.java:2358)
    at java.base/java.util.GregorianCalendar.computeFields(GregorianCalendar.java:2273)
    at java.base/java.util.Calendar.setTimeInMillis(Calendar.java:1827)
    at java.base/java.util.Calendar.setTime(Calendar.java:1793)
    at java.base/java.text.SimpleDateFormat.format(SimpleDateFormat.java:978)
    at java.base/java.text.SimpleDateFormat.format(SimpleDateFormat.java:971)
    at java.base/java.text.DateFormat.format(DateFormat.java:339)
    at java.base/java.text.Format.format(Format.java:159)
    at org.apache.parquet.schema.PrimitiveStringifier$DateStringifier.toFormattedString(PrimitiveStringifier.java:265)
    at org.apache.parquet.schema.PrimitiveStringifier$DateStringifier.stringify(PrimitiveStringifier.java:256)
    at org.apache.parquet.column.statistics.IntStatistics.stringify(IntStatistics.java:92)
    at org.apache.parquet.column.statistics.IntStatistics.stringify(IntStatistics.java:25)
    at org.apache.parquet.column.statistics.Statistics.minAsString(Statistics.java:423)

    (... unrelated code)
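
The failure mode in the trace above can be sketched outside Parquet. The 
following is a minimal illustration, not Parquet code: the class name 
SharedFormatterRace, the "yyyy-MM-dd" pattern, and the thread and iteration 
counts are assumptions chosen for the example. It formats dates with one 
shared SimpleDateFormat from several threads and compares each result with 
the output of a thread-private instance. Because this is a race, the number 
of failures with several threads varies between runs and may be zero.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class SharedFormatterRace {
    // Hypothetical stand-in for a formatter reachable from a static final field.
    static final SimpleDateFormat SHARED = utcFormat();

    static SimpleDateFormat utcFormat() {
        SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd"); // assumed pattern
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        return f;
    }

    // Formats dates with SHARED from `threads` threads and counts the calls
    // that either threw or disagreed with a thread-private formatter.
    static int countFailures(int threads, int iterations) throws InterruptedException {
        AtomicInteger failures = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int seed = t;
            pool.submit(() -> {
                SimpleDateFormat local = utcFormat(); // private instance: reference result
                for (int i = 0; i < iterations; i++) {
                    Date d = new Date(TimeUnit.DAYS.toMillis(seed * 1000L + i));
                    try {
                        if (!local.format(d).equals(SHARED.format(d))) {
                            failures.incrementAndGet();
                        }
                    } catch (RuntimeException e) {
                        // e.g. ArrayIndexOutOfBoundsException from corrupted Calendar state
                        failures.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return failures.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("1 thread:  " + countFailures(1, 10_000) + " failures");
        System.out.println("8 threads: " + countFailures(8, 10_000) + " failures");
    }
}
```

With a single thread the shared and private formatters always agree, so any 
failures reported with more threads come from concurrent access to the shared 
instance.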

A simple solution would be to change those fields from static to non-static 
values.

I can create a PR if the maintainers are OK with this solution.
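
For discussion, here is a minimal sketch of two other ways to make the 
formatting thread-safe. It is not the project's code or an agreed fix: the 
class ThreadSafeDateStringifier, its method names, and the "uuuu-MM-dd" / 
"yyyy-MM-dd" patterns are assumptions for illustration. The first option 
uses java.time.format.DateTimeFormatter, which is immutable and documented 
as thread-safe, so it can stay in a static field. The second keeps 
SimpleDateFormat but gives each thread its own instance through a 
ThreadLocal. Note that making a field non-static helps only if the owning 
object is not itself shared between threads.

```java
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Date;
import java.util.TimeZone;
import java.util.concurrent.TimeUnit;

final class ThreadSafeDateStringifier {
    // Option 1: DateTimeFormatter is immutable, so one shared instance is safe.
    private static final DateTimeFormatter DATE =
        DateTimeFormatter.ofPattern("uuuu-MM-dd").withZone(ZoneOffset.UTC);

    // Option 2: keep SimpleDateFormat, but never share an instance across threads.
    private static final ThreadLocal<SimpleDateFormat> LEGACY_DATE =
        ThreadLocal.withInitial(() -> {
            SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd");
            f.setTimeZone(TimeZone.getTimeZone("UTC"));
            return f;
        });

    // Formats a day count since 1970-01-01 (UTC) with the shared DateTimeFormatter.
    static String stringifyDays(int daysSinceEpoch) {
        long millis = TimeUnit.DAYS.toMillis(daysSinceEpoch);
        return DATE.format(Instant.ofEpochMilli(millis));
    }

    // Same conversion using the calling thread's private SimpleDateFormat.
    static String stringifyDaysLegacy(int daysSinceEpoch) {
        long millis = TimeUnit.DAYS.toMillis(daysSinceEpoch);
        return LEGACY_DATE.get().format(new Date(millis));
    }

    public static void main(String[] args) {
        System.out.println(stringifyDays(0));       // 1970-01-01
        System.out.println(stringifyDaysLegacy(0)); // 1970-01-01
    }
}
```

Either option keeps the fields static, so callers would not need to change.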


> DateTimeFormatter is used in static context, but not thread safe
> ----------------------------------------------------------------
>
>                 Key: PARQUET-2240
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2240
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>         Environment: Linux, OpenJDK 17 (based on docker image openjdk:17-slim)
>  
>            Reporter: Shani Elharrar
>            Priority: Trivial
>              Labels: bug
>   Original Estimate: 24h
>  Remaining Estimate: 24h



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
