[ https://issues.apache.org/jira/browse/PARQUET-2240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shani Elharrar updated PARQUET-2240:
------------------------------------
    Description: 
A date formatter is used in a static context but is not thread safe (the stack trace below shows java.text.SimpleDateFormat). The formatter instance is created in PrimitiveStringifier.DateStringifier, and DateStringifier is created in the static final fields DATE_STRINGIFIER, TIMESTAMP_MILLIS_STRINGIFIER, TIMESTAMP_MICROS_STRINGIFIER, TIMESTAMP_NANOS_STRINGIFIER, TIMESTAMP_MILLIS_UTC_STRINGIFIER, TIMESTAMP_MICROS_UTC_STRINGIFIER, and TIMESTAMP_NANOS_UTC_STRINGIFIER.

This causes exceptions like the following to be thrown from Parquet code:

java.lang.ArrayIndexOutOfBoundsException: Index 633 out of bounds for length 13

stacktrace:
    at java.base/sun.util.calendar.BaseCalendar.getCalendarDateFromFixedDate(BaseCalendar.java:457)
    at java.base/java.util.GregorianCalendar.computeFields(GregorianCalendar.java:2358)
    at java.base/java.util.GregorianCalendar.computeFields(GregorianCalendar.java:2273)
    at java.base/java.util.Calendar.setTimeInMillis(Calendar.java:1827)
    at java.base/java.util.Calendar.setTime(Calendar.java:1793)
    at java.base/java.text.SimpleDateFormat.format(SimpleDateFormat.java:978)
    at java.base/java.text.SimpleDateFormat.format(SimpleDateFormat.java:971)
    at java.base/java.text.DateFormat.format(DateFormat.java:339)
    at java.base/java.text.Format.format(Format.java:159)
    at org.apache.parquet.schema.PrimitiveStringifier$DateStringifier.toFormattedString(PrimitiveStringifier.java:265)
    at org.apache.parquet.schema.PrimitiveStringifier$DateStringifier.stringify(PrimitiveStringifier.java:256)
    at org.apache.parquet.column.statistics.IntStatistics.stringify(IntStatistics.java:92)
    at org.apache.parquet.column.statistics.IntStatistics.stringify(IntStatistics.java:25)
    at org.apache.parquet.column.statistics.Statistics.minAsString(Statistics.java:423)
    (... unrelated code)

A simple solution would be to change those from static to non-static values.

I can create a PR if the maintainers of the library are OK with this solution.
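
For illustration only, here is a minimal sketch of one way a shared static formatter can be made safe. It is not the parquet-mr code: the class name, the `stringify` method, and the ISO date pattern are all assumptions made for this example. It also swaps in a different technique from the one proposed above: instead of making the fields non-static, it keeps a single static `java.time.format.DateTimeFormatter`, which is documented as immutable and thread-safe, whereas the `java.text.SimpleDateFormat` seen in the stack trace is not synchronized.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a date stringifier; NOT the parquet-mr class.
final class ThreadSafeDateStringifierSketch {
    // DateTimeFormatter is immutable and thread-safe, so one shared static
    // instance can be used from many threads without synchronization.
    private static final DateTimeFormatter FORMATTER = DateTimeFormatter.ISO_LOCAL_DATE;

    // Formats a date given as days since 1970-01-01 (assumed encoding).
    static String stringify(int daysSinceEpoch) {
        return LocalDate.ofEpochDay(daysSinceEpoch).format(FORMATTER);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(8);
        List<Future<Boolean>> results = new ArrayList<>();
        // Hammer the shared formatter from several threads at once.
        for (int t = 0; t < 8; t++) {
            results.add(pool.submit(() -> {
                for (int day = 0; day < 10_000; day++) {
                    String expected = LocalDate.ofEpochDay(day).toString();
                    if (!stringify(day).equals(expected)) {
                        return false;
                    }
                }
                return true;
            }));
        }
        for (Future<Boolean> result : results) {
            if (!result.get()) {
                throw new AssertionError("formatter produced a wrong date");
            }
        }
        pool.shutdown();
        System.out.println(stringify(0));
    }
}
```

If the legacy formatter has to stay, creating a new `SimpleDateFormat` per call or holding one per thread in a `ThreadLocal` would be the usual alternatives; which of these fits the library is for the maintainers to decide.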
> DateTimeFormatter is used in static context, but not thread safe
> ----------------------------------------------------------------
>
>                 Key: PARQUET-2240
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2240
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>         Environment: Linux, OpenJDK 17 (based on docker image openjdk:17-slim)
>            Reporter: Shani Elharrar
>            Priority: Trivial
>              Labels: bug
>   Original Estimate: 24h
>  Remaining Estimate: 24h