[GitHub] [nifi] markap14 commented on pull request #6763: NIFI-10956: Fix inference issues with mixed arrays

GitBox Wed, 07 Dec 2022 06:49:41 -0800


markap14 commented on PR #6763:
URL: https://github.com/apache/nifi/pull/6763#issuecomment-1341077872


   Thanks for the fix @mattyb149! Definitely not a simple problem. The code 
looks good. The unit tests look good. I also did some manual testing with some 
more complex JSON objects.
   
   The only thing that I don't love about this PR is the fact that it's doing 
some stuff that's not very obvious. I.e., the fact that it's creating an ARRAY 
data type with a `null` element type, and then it's going in and replace those 
- and there's no documentation as to exactly what it's doing or why.
   
   Do you mind adding some inline documentation that explains what's going on? 
Namely that:
   - If we encounter an empty array in the first record, we have no way of 
knowing the type of elements.
   - If we just decide to use STRING as the type like we previously did, this 
can cause problems because:
         - Anything can be coerced into a STRING.
         - If we later encounter an array of Records there, we end up inferring 
that as a STRING
         - So we end up converting the Record objects into STRINGs
   - Instead, we create an Array where the element type is null
   - We then consider ARRAY[x] wider than ARRAY[null] for any x (other than 
null)
   - We then have to wait until the very end, after making inferences based on 
all data
   - At that point if the type is still inferred to be null we can just change 
it to a STRING
   
   In short, code looks great but needs a bit of documentation :)
   Otherwise I'm a +1


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [nifi] markap14 commented on pull request #6763: NIFI-10956: Fix inference issues with mixed arrays

Reply via email to