Synforge commented on issue #9460: Issue with result from CONCAT expression 
when using Kafka streaming ingestion.
URL: https://github.com/apache/druid/issues/9460#issuecomment-605441813
 
 
   I've done a bit of digging on this, and the bug applies to all string 
dimension columns in the IncrementalIndexStorageAdapter. Regardless of whether 
a multi-value row was ever inserted into a column, this storage adapter 
reports every string column as multi-value.
   
   For example, while the data from the example above has not yet been 
persisted, a segment metadata query returns this:
   
   `"currency": {
       "cardinality": 2,
       "errorMessage": null,
       "hasMultipleValues": true,
       "maxValue": "GBP",
       "minValue": "EUR",
       "size": 0,
       "type": "STRING"
   }`
   
   The persisted data, by contrast, correctly reports hasMultipleValues as 
false. This seems to cause inconsistent results when any string function is 
applied to a dimension column that has not yet been persisted versus one that 
has. So I think this problem is bigger than just the above report.
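
   One way to spot the mismatch is to compare the hasMultipleValues flag per 
column across segment metadata results. Below is a minimal sketch in Python, 
assuming the results have already been fetched and parsed and that each entry 
keeps its per-column analysis under a "columns" key. The helper name 
multi_value_string_columns is hypothetical. The in-memory sample copies the 
"currency" entry above; the persisted sample is illustrative and fills in only 
the two fields the check reads.

```python
from typing import Dict, List


def multi_value_string_columns(segment_result: Dict) -> List[str]:
    """Return the names of STRING columns flagged as multi-value in one result entry."""
    columns = segment_result.get("columns", {})
    return sorted(
        name
        for name, info in columns.items()
        if info.get("type") == "STRING" and info.get("hasMultipleValues") is True
    )


# "currency" entry as reported for the not-yet-persisted (in-memory) data.
realtime_result = {
    "columns": {
        "currency": {
            "cardinality": 2,
            "errorMessage": None,
            "hasMultipleValues": True,
            "maxValue": "GBP",
            "minValue": "EUR",
            "size": 0,
            "type": "STRING",
        }
    }
}

# Illustrative persisted entry: only the fields this check reads are filled in.
persisted_result = {
    "columns": {"currency": {"hasMultipleValues": False, "type": "STRING"}}
}

print(multi_value_string_columns(realtime_result))
print(multi_value_string_columns(persisted_result))
```

   A column that shows up in the first list but not the second is a candidate 
for the inconsistency described here.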
   
   I verified this by changing the line linked below to return false; the 
query then correctly returns a single string value instead of an array. 
However, I'm not sure whether this would break multi-value ingestion.
   
   
https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/segment/incremental/IncrementalIndexStorageAdapter.java#L166
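
   To make the concern concrete, here is a toy model in Python. It is not 
Druid's code: it simply assumes, as the behaviour above suggests, that a string 
function returns an array whenever the column is reported as multi-value. The 
names ToyStringColumn and concat_suffix are hypothetical. It contrasts a 
hard-coded flag with one derived from the rows actually seen.

```python
from typing import List, Union


class ToyStringColumn:
    """Toy in-memory string column; each row holds a non-empty list of values."""

    def __init__(self, always_multi: bool) -> None:
        self.always_multi = always_multi  # mimics a hard-coded capability flag
        self.rows: List[List[str]] = []
        self.seen_multi = False  # set once any row carries more than one value

    def add(self, values: List[str]) -> None:
        self.rows.append(values)
        if len(values) > 1:
            self.seen_multi = True

    @property
    def has_multiple_values(self) -> bool:
        return self.always_multi or self.seen_multi


def concat_suffix(column: ToyStringColumn, row: int, suffix: str) -> Union[str, List[str]]:
    """Append a suffix to each value; return an array only for multi-value columns."""
    values = [value + suffix for value in column.rows[row]]
    if column.has_multiple_values:
        return values
    return values[0]


hard_coded = ToyStringColumn(always_multi=True)
hard_coded.add(["GBP"])
print(concat_suffix(hard_coded, 0, "-x"))  # array, although the row is single-valued

tracked = ToyStringColumn(always_multi=False)
tracked.add(["GBP"])
print(concat_suffix(tracked, 0, "-x"))  # plain string

tracked.add(["EUR", "GBP"])
print(concat_suffix(tracked, 1, "-x"))  # array once a multi-value row has been seen
```

   In this sketch the tracked flag can flip partway through ingestion, so the 
same column could be reported differently before and after a multi-value row 
arrives. Any real fix would need to account for that.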
   
   Happy to look into this further if anyone can offer advice on how to 
tackle the problem. I believe @gianm wrote some of this code, so I'm hoping 
you might be able to offer some guidance.
   
   Thanks
   
