[
https://issues.apache.org/jira/browse/FLINK-39608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated FLINK-39608:
-----------------------------------
Labels: pull-request-available (was: )
> KafkaDatasetFacet should extend DatasetConfigFacet to avoid classloader
> issues with OpenLineage
> ------------------------------------------------------------------------------------------------
>
> Key: FLINK-39608
> URL: https://issues.apache.org/jira/browse/FLINK-39608
> Project: Flink
> Issue Type: Bug
> Components: Connectors / Kafka
> Reporter: Swapna Marru
> Priority: Minor
> Labels: pull-request-available
>
> The OpenLineage event emitter relies on the availability of the custom
> KafkaDatasetFacet class in the classpath
> (https://github.com/OpenLineage/OpenLineage/blob/main/integration/flink/flink2/src/main/java/io/openlineage/flink/util/KafkaDatasetFacetUtil.java#L25).
> This
> fails when the Kafka connector is loaded as part of a user jar while the
> OpenLineage library is loaded as part of the Flink distribution.
>
>
>
>
> In this scenario:
>
>
>
> - OpenLineage emitters are loaded in ApplicationClassLoader
> - Kafka connector libs are loaded in FlinkUserClassLoader
>
>
> This classloader boundary cannot load
> KafkaDatasetFacet and not emit kafka dataset information , even though the
> Kafka source is available in lineage datasets information.
>
>
>
>
>
>
> Proposed Solution:
>
>
>
> Make KafkaDatasetFacet extend
> org.apache.flink.streaming.api.lineage.DatasetConfigFacet (which is part of
> Flink core and available to both classloaders). The OpenLineage integration
> can then:
>
> 1. Identify the facet by name
> 2. Cast to DatasetConfigFacet (a Flink core interface) to extract
> configuration
>
>
>
> This eliminates the dependency on Kafka connector classes being in the same
> classloader as the lineage emitter libs.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)