[
https://issues.apache.org/jira/browse/FLINK-23911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17403000#comment-17403000
]
Ingo Bürk edited comment on FLINK-23911 at 8/23/21, 7:38 AM:
-------------------------------------------------------------
I debugged this a bit more. Only the second case you mentioned is actually a
problem (when no metadata columns remain after the projection). If any metadata
columns are selected, #applyReadableMetadata is called twice, and the second
call correctly takes the projection into account. However, if only physical
columns are selected, #applyReadableMetadata is called only once, with all
metadata keys.
I am also quite surprised that #applyReadableMetadata is called multiple times
on a source. Depending on the implementation this can cause unexpected
behavior, so we should probably document that this method (and those of all
other abilities?) must be idempotent.
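To illustrate the idempotency concern, here is a minimal sketch. It is a
simplified stand-in and does not use Flink's real interfaces: the class
SketchSource and its accessor are hypothetical, and the DataType argument of
#applyReadableMetadata is left out. The point is only that the method replaces
its state instead of accumulating it, so a second call with the projected keys
overrides the first call with all keys.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of a source; NOT Flink's actual SupportsReadableMetadata API.
public class IdempotentMetadataSketch {

    /** Hypothetical source that remembers which metadata keys it must produce. */
    static class SketchSource {
        private List<String> metadataKeys = new ArrayList<>();

        // Modeled after applyReadableMetadata(List, DataType); DataType is omitted (assumption).
        void applyReadableMetadata(List<String> keys) {
            // Replace instead of append, so a repeated call overwrites earlier state.
            this.metadataKeys = new ArrayList<>(keys);
        }

        List<String> metadataKeys() {
            // Return a copy so callers cannot mutate the source's state.
            return new ArrayList<>(metadataKeys);
        }
    }

    public static void main(String[] args) {
        SketchSource source = new SketchSource();
        // First call: all declared metadata keys.
        source.applyReadableMetadata(Arrays.asList("m1", "m2"));
        // Second call: only the keys that remain after the projection.
        source.applyReadableMetadata(Arrays.asList("m1"));
        System.out.println(source.metadataKeys()); // prints [m1]
    }
}
```

An implementation that appended to the list instead would end up with m1, m2
and m1 again after the two calls, which is the kind of unexpected behavior
meant above.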
-Edit: It seems this might also differ between 1.13 and master? In my 1.13
project I couldn't reproduce the same behavior, but continuing to look into
this-
Edit 2: That was not correct; this just depends on whether
SupportsProjectionPushDown is implemented.
> Projections are not considered when pushing readable metadata into a source
> ---------------------------------------------------------------------------
>
> Key: FLINK-23911
> URL: https://issues.apache.org/jira/browse/FLINK-23911
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / Planner
> Affects Versions: 1.13.2
> Reporter: Ingo Bürk
> Priority: Major
>
> Given a table with a declared schema containing some metadata columns, if we
> select only some of those metadata columns (or none), the interface of
> SupportsReadableMetadata states that the planner will perform the projection
> and only push required metadata keys into the source:
> {quote}The planner will select required metadata columns (i.e. perform
> projection push down) and will call {@link #applyReadableMetadata(List,
> DataType)} with a list of metadata keys.{quote}
> However, it seems that this doesn't happen, and the planner always applies
> all metadata declared in the schema instead. This can be a problem because
> the source has to do unnecessary work, and some metadata might be more
> expensive to compute than others.
> For reference, SupportsProjectionPushDown cannot be used to work around this
> because it operates only on physical columns, i.e. #applyProjection will
> never be called with a projection for the metadata columns, even if they are
> selected.
> The following test case can be executed to debug into #applyReadableMetadata
> of the values table source:
> {code:java}
> @Test
> def test(): Unit = {
>   // Register an empty data set; only the planner's calls into the source matter here.
>   val tableId = TestValuesTableFactory.registerData(Seq())
>   tEnv.createTemporaryTable("T", TableDescriptor.forConnector("values")
>     .schema(Schema.newBuilder()
>       .column("f0", DataTypes.INT())
>       .columnByMetadata("m1", DataTypes.STRING())
>       .columnByMetadata("m2", DataTypes.STRING())
>       .build())
>     .option("data-id", tableId)
>     .option("bounded", "true")
>     .option("readable-metadata", "m1:STRING,m2:STRING")
>     .build())
>   // Only m1 is selected, so ideally only the key m1 would be pushed into the source.
>   tEnv.sqlQuery("SELECT f0, m1 FROM T").execute().collect().toList
> }
> {code}