[
https://issues.apache.org/jira/browse/FLINK-17091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105267#comment-17105267
]
Dawid Wysakowicz commented on FLINK-17091:
------------------------------------------
First of all, let's use the full type names, as we have three different timestamp
types:
* {{TIMESTAMP_WITH_LOCAL_TIME_ZONE}} - this has {{Instant}} semantics and thus
can be mapped to {{long}}
* {{TIMESTAMP_WITHOUT_TIME_ZONE/TIMESTAMP}} - this has {{LocalDateTime}}
semantics and *cannot* be mapped to {{long}}
* {{TIMESTAMP_WITH_TIME_ZONE}} - this has {{OffsetDateTime}} semantics and
*cannot* be mapped to {{long}}
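To make the distinction concrete, here is a small plain-JDK sketch (no Flink or Avro involved; the class and helper names are mine): an {{Instant}} maps losslessly to a {{long}}, while turning a {{LocalDateTime}} into a {{long}} forces an arbitrary zone choice:

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

public class TimestampSemantics {

    // Instant semantics: a point on the time-line, losslessly representable
    // as epoch micros, hence the clean mapping to long.
    static long toEpochMicros(Instant instant) {
        return instant.getEpochSecond() * 1_000_000L + instant.getNano() / 1_000L;
    }

    // LocalDateTime semantics: a wall-clock reading without a zone. Producing
    // a long forces us to pick some zone, so no single mapping is correct.
    static long toEpochMillisAt(LocalDateTime wallClock, ZoneOffset zone) {
        return wallClock.toInstant(zone).toEpochMilli();
    }

    public static void main(String[] args) {
        LocalDateTime wallClock = LocalDateTime.of(2020, 4, 15, 10, 30);
        // Same wall-clock value, two different longs depending on the chosen zone:
        System.out.println(toEpochMillisAt(wallClock, ZoneOffset.UTC));
        System.out.println(toEpochMillisAt(wallClock, ZoneOffset.ofHours(2)));
    }
}
```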
I think it's vital to properly support these types in sources. This is explained
very nicely in this document:
https://docs.google.com/document/d/1gNRww9mZJcHvUDCXklzjFEQGpefsuR_akCDfWsdE35Q
This effort is already being implemented in multiple Hadoop formats, see e.g.
https://issues.apache.org/jira/browse/AVRO-2328
With this introduction, I agree we will need to support mapping of e.g. {{long,
logicalType: local-timestamp-micros}} to {{LocalDateTime}} or {{long,
logicalType: timestamp-micros}} to {{Instant}}.
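These two decodings can be sketched with plain {{java.time}} (helper names are mine, not Avro's or Flink's):

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;

public class AvroTimestampDecoding {

    // {long, logicalType: local-timestamp-micros} -> LocalDateTime:
    // the long counts wall-clock micros since 1970-01-01T00:00, so we decode
    // it against a fixed offset without applying any zone shift.
    static LocalDateTime localTimestampMicros(long micros) {
        return LocalDateTime.ofEpochSecond(
                Math.floorDiv(micros, 1_000_000L),
                (int) (Math.floorMod(micros, 1_000_000L) * 1_000L),
                ZoneOffset.UTC);
    }

    // {long, logicalType: timestamp-micros} -> Instant:
    // the long counts real elapsed micros since the epoch.
    static Instant timestampMicros(long micros) {
        return Instant.EPOCH.plus(micros, ChronoUnit.MICROS);
    }

    public static void main(String[] args) {
        System.out.println(localTimestampMicros(1_000_000L)); // 1970-01-01T00:00:01
        System.out.println(timestampMicros(1_000_000L));      // 1970-01-01T00:00:01Z
    }
}
```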
My argument is that we do not need to support conversions between all the
possible conversion classes. The source simply tells us which conversion class
it uses for a particular logical type.
Example:
(long-term) If a column was created with the DDL {{colName TIMESTAMP(3)}}, then
the source should imply that the field in Avro is of type {{long, logicalType:
local-timestamp-millis}} and it should produce {{LocalDateTime}}. It does not
also need to be able to produce {{java.sql.Timestamp}}.
Currently it uses the {{java.sql.Timestamp}} conversion class for a
{{TIMESTAMP(3)}}, and in my opinion the problem is that this information flow
is not respected, but inverted.
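The direction I mean can be sketched as follows; {{conversionClassFor}} and the string keys are purely illustrative, not Flink API - the point is only that the declared SQL type decides the conversion class, not the other way around:

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;

public class ConversionClassResolver {

    // Hypothetical mapping: declared SQL type -> conversion class the source
    // should use (and therefore which Avro logical type it should expect).
    static Class<?> conversionClassFor(String sqlType) {
        switch (sqlType) {
            case "TIMESTAMP_WITH_LOCAL_TIME_ZONE":
                return Instant.class;
            case "TIMESTAMP": // a.k.a. TIMESTAMP_WITHOUT_TIME_ZONE
                return LocalDateTime.class;
            case "TIMESTAMP_WITH_TIME_ZONE":
                return OffsetDateTime.class;
            default:
                throw new IllegalArgumentException("Unknown type: " + sqlType);
        }
    }

    public static void main(String[] args) {
        System.out.println(conversionClassFor("TIMESTAMP")); // class java.time.LocalDateTime
    }
}
```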
As a last comment, and as I already said in my first reply, it's probably worth
adding a case:
{code}
case LONG:
    if (info == Types.SQL_TIMESTAMP) {
        return convertToTimestamp(object);
    }
    if (info == Types.LOCAL_DATE_TIME) {
        return convertToTimestamp(object);
    }
    return object;
{code}
or at least it will do no harm. I still think this is not a correct long-term
solution, but just a patch for the current mishandling of the conversion classes.
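By contrast, a longer-term branch could decode the Avro {{long}} into the class the type information actually asks for. This is only a hedged sketch under the assumption that the long carries epoch millis; {{decodeLong}} and the string-keyed dispatch are illustrative, not the actual Flink code:

```java
import java.sql.Timestamp;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

public class LongDecoding {

    // Sketch: pick the produced class from the requested type information
    // instead of always routing through java.sql.Timestamp.
    static Object decodeLong(long epochMillis, String typeInfo) {
        switch (typeInfo) {
            case "SQL_TIMESTAMP":
                return new Timestamp(epochMillis);
            case "LOCAL_DATE_TIME":
                return LocalDateTime.ofEpochSecond(
                        Math.floorDiv(epochMillis, 1_000L),
                        (int) (Math.floorMod(epochMillis, 1_000L) * 1_000_000L),
                        ZoneOffset.UTC);
            default:
                return epochMillis; // plain BIGINT stays a long
        }
    }

    public static void main(String[] args) {
        System.out.println(decodeLong(1_000L, "LOCAL_DATE_TIME")); // 1970-01-01T00:00:01
    }
}
```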
> AvroRow(De)SerializationSchema doesn't support new Timestamp conversion
> classes
> -------------------------------------------------------------------------------
>
> Key: FLINK-17091
> URL: https://issues.apache.org/jira/browse/FLINK-17091
> Project: Flink
> Issue Type: Bug
> Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
> Affects Versions: 1.10.0
> Reporter: Paul Lin
> Priority: Major
>
> AvroRow(De)SerializationSchema doesn't know how to convert the new physical
> classes of Timestamp (e.g. java.time.LocalDateTime) to/from Avro's int/long-based
> timestamp. Currently, when encountering objects of the new physical classes,
> AvroRow(De)SerializationSchema just ignores them and passes them to Avro's
> GenericDatumWriter/Reader, which leads to ClassCastException thrown by
> GenericDatumWriter/Reader. See
> [AvroRowSerializationSchema|https://github.com/apache/flink/blob/master/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/AvroRowSerializationSchema.java#L251].
> To fix this problem, we should support LocalTime/LocalDate/LocalDateTime
> conversion to int/long in AvroRowSerializationSchema, and support int/long
> conversion to LocalTime/LocalDate/LocalDateTime based on logical
> types (Types.LOCAL_TIME/Types.LOCAL_DATE/Types.LOCAL_DATE_TIME) in
> AvroRowDeserializationSchema.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)