With your data structured the way it is now, this isn't possible.
The rows aren't duplicates of each other: "a" and "b" both exist in the
array, so Spark is performing the join correctly. It looks like you'll need
to model this data differently to get the result you want.
Are the values of

Hi folks,
I'm contributing to the OpenLineage project, specifically the Apache Spark
integration. My current focus is on extending the project to support data
lineage extraction for Spark Streaming, beginning with Apache Kafka sources
and sinks.
I've encountered an obstacle when attempting to acc