BartMiki commented on PR #12915:
URL: https://github.com/apache/druid/pull/12915#issuecomment-1219295346

   @FrankChen021 I have new findings and I would like your comment on this. If 
we serialize the map as I've implemented it before, then the output is the same 
as the previous KafkaEmitter (of course with the `queryType`). So for native 
query parameters `sql` and `sqlQueryCtx` are absent from the output JSON.
   
   If we serialize the whole event to JSON first then for the native query the 
`sql` and `sqlQueryCtx` are set to `null` and `{}` respectively in the output 
JSON. Now the question is about backward compatibility.
   
   If I use the second implementation then the output will be different from 
the original implementation. This may lead to errors in the consumers of the 
emitted stream. If consumer code distinguishes the SQL from native queries 
solely by the presence of `native` or `sql`, then it will no longer work 
because both entries are present. Of course, corresponding fields are null, but 
previously they were totally absent from the output JSON. This is because 
`.toMap` removes null entries of "query" and "sql" from the resulting map.
   
   Example consumer in Python:
   ```python
   event = json.loads(rawEvent)
   if "query" in event:
       # do stuff with Native Query
   elif "sql" in event:
       # do stuff with SQL Query
   ```
   If I leave my implementation as is right now, then the above code would work 
correctly (backward compatible with the original KafkaEmitter logic). 
   However, if I update the code as described here: 
https://github.com/apache/druid/pull/12915#issuecomment-1219221809 then the 
above snippet will always execute first `if` branch as "query" is always 
present. 
    
   Should I leave the fixed implementation as is right now to maintain this 
compatibility or should I go with (the perhaps braking) approach of serializing 
the whole event at once? My take is to leave fix as is to have backward 
compatibility. What do you think @FrankChen021 ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to