[ 
https://issues.apache.org/jira/browse/KAFKA-12305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17282492#comment-17282492
 ] 

Chris Egerton commented on KAFKA-12305:
---------------------------------------

I think the fact that the documentation doesn't call out arrays could go both 
ways, though: maps and structs are still maps and structs, regardless of whether 
they're top-level or contained in an array.

However, the wording in the docs is that the SMT will "Flatten *a* nested data 
structure" (emphasis mine). The singular "a" there implies only one such nested 
data structure. Considering that the SMT only accepts top-level map/struct 
values, I think that gives enough weight to the position that {{Flatten}} 
should only concern itself with top-level maps/structs and, after encountering 
something it cannot flatten (i.e., an array or a primitive), stop descending 
further.

In other words, I'm also satisfied with the naive approach (option B outlined 
above).
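
To make the naive approach concrete, here is a minimal sketch of the behavior 
described above, written against plain {{java.util.Map}} values rather than the 
Connect data API. The class and method names ({{NaiveFlattenSketch}}, 
{{flatten}}, {{flattenInto}}) are hypothetical and this is not the actual 
{{Flatten}} code; it only illustrates "descend through maps, stop at anything 
else":

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; NOT the Kafka Connect Flatten implementation.
final class NaiveFlattenSketch {

    // Flattens nested maps into one level, joining field names with the delimiter.
    static Map<String, Object> flatten(Map<String, Object> value, String delimiter) {
        Map<String, Object> result = new LinkedHashMap<>();
        flattenInto(result, "", value, delimiter);
        return result;
    }

    private static void flattenInto(Map<String, Object> out, String prefix,
                                    Map<?, ?> value, String delimiter) {
        for (Map.Entry<?, ?> entry : value.entrySet()) {
            String name = prefix.isEmpty()
                    ? String.valueOf(entry.getKey())
                    : prefix + delimiter + entry.getKey();
            Object field = entry.getValue();
            if (field instanceof Map) {
                // Nested map: keep descending.
                flattenInto(out, name, (Map<?, ?>) field, delimiter);
            } else {
                // Arrays (lists) and primitives are copied unchanged. Descent stops
                // here, so maps inside a list are deliberately left untouched.
                out.put(name, field);
            }
        }
    }
}
```

With a "." delimiter, a value like {{a -> {b -> 1, c -> [{d -> 2}]}}} would come 
out as {{a.b -> 1}} and {{a.c -> [{d -> 2}]}}: the array survives intact instead 
of causing a failure.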

If this causes anyone difficulty down the road, they have plenty of options, 
including forking the open-source {{Flatten}} SMT, writing their own SMT from 
scratch, and publishing a KIP to alter the behavior of {{Flatten}}.

> Flatten SMT fails on arrays
> ---------------------------
>
>                 Key: KAFKA-12305
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12305
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>    Affects Versions: 2.0.1, 2.1.1, 2.2.2, 2.3.1, 2.4.1, 2.5.1, 2.7.0, 2.6.1, 
> 2.8.0
>            Reporter: Chris Egerton
>            Assignee: Chris Egerton
>            Priority: Major
>
> The {{Flatten}} SMT fails for array types. A sophisticated approach that 
> tries to flatten arrays might be desirable in some cases, and may have been 
> punted during the early design phase of the transform, but in the interim, 
> it's probably not worth it to make array data and the SMT mutually exclusive.
> A naive approach that preserves arrays as-is and doesn't attempt to flatten 
> them seems fair for now, but one alternative could be to traverse array 
> elements and, if any are maps or structs, flatten those as well.
> Adding behavior to fully flatten arrays by essentially transforming them into 
> maps whose elements are the elements of the array and whose keys are the 
> indices of each element is likely out of scope for a bug fix and, although 
> useful, might have to wait for a KIP.
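
For contrast, the "fully flatten" alternative from the description (treating an 
array as a map keyed by element index) could look roughly like the sketch 
below. Again, this uses plain {{java.util}} collections and hypothetical names 
({{IndexedFlattenSketch}}, {{flatten}}, {{flattenInto}}); it is an assumption 
about how such behavior might work, not something the SMT does today:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; the real Flatten SMT does not do this.
final class IndexedFlattenSketch {

    // Flattens nested maps AND lists; list elements are keyed by their index.
    static Map<String, Object> flatten(Map<String, Object> value, String delimiter) {
        Map<String, Object> result = new LinkedHashMap<>();
        flattenInto(result, "", value, delimiter);
        return result;
    }

    private static String join(String prefix, Object key, String delimiter) {
        return prefix.isEmpty() ? String.valueOf(key) : prefix + delimiter + key;
    }

    private static void flattenInto(Map<String, Object> out, String prefix,
                                    Object field, String delimiter) {
        if (field instanceof Map) {
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) field).entrySet()) {
                flattenInto(out, join(prefix, entry.getKey(), delimiter),
                        entry.getValue(), delimiter);
            }
        } else if (field instanceof List) {
            List<?> list = (List<?>) field;
            for (int i = 0; i < list.size(); i++) {
                // The element index plays the role of a map key.
                flattenInto(out, join(prefix, i, delimiter), list.get(i), delimiter);
            }
        } else {
            // Primitive (or any other leaf) value. Note that empty maps and
            // empty lists produce no output entries at all in this sketch.
            out.put(prefix, field);
        }
    }
}
```

Here {{a -> {b -> [{d -> 2}, 7]}}} would become {{a.b.0.d -> 2}} and 
{{a.b.1 -> 7}}. The output field names then depend on the array length of each 
record, which is one reason this kind of change seems better suited to a KIP 
than to a bug fix.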



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
