a2l007 opened a new issue, #18437:
URL: https://github.com/apache/druid/issues/18437

   In heavily loaded production clusters with hundreds of realtime and MSQ 
tasks, frequent updates to the row signature have been observed on the brokers. 
These frequent row signature updates occasionally cause queries to fail or 
return incorrect results.
   
   ### Affected Version
   
   32.0, 33.0
   
   ### Description
   
   It appears there may be a race when the broker segment metadata cache is 
updated for a table at the same time a query is being planned. In these cases, 
the query returned values for a column that was not even part of the query 
projection. I've not managed to reproduce this locally yet, but here is a 
simplified version of the issue observed:
   
   Consider a datasource d1 with the following row signature:
   `{__time:LONG, dim7:STRING, dim0:STRING, dim3:STRING, line:STRING, 
dim2:STRING, dim8:STRING, dim5:STRING, dim1:STRING, dim4:STRING}`
   
   The following query is run:
   `SELECT dim0, dim2, line FROM d1`
   
   At the same time, the row signature for d1 gets updated to:
   `{__time:LONG, dim8:STRING, dim0:STRING, dim3:STRING, line:STRING, 
dim7:STRING, dim2:STRING, dim5:STRING, dim1:STRING, dim4:STRING}`
   
   Note that no columns are added or removed; the columns are only reordered 
in the signature.
   
   The query unexpectedly returns values for `dim0`, **dim7**, and `line` 
instead of values for `dim0`, **dim2** and `line`.
   
   It seems like something unexpected happened during the projection build 
where the projection indices were based off of the old signature and got 
applied to the new signature. This mismatch likely happens before the Calcite 
logical projection build phase is complete. Once the logical projection is 
built, the row signature should be immutable, so subsequent signature changes 
should not affect the query.
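
   The suspected mismatch can be checked by hand against the two signatures above. The sketch below is a standalone illustration, not Druid or Calcite code: `resolve_indices` and `apply_indices` are hypothetical helpers that model a projection resolved to positional indices against one signature and then read against another.

```python
# Illustration only: models the suspected failure mode, not Druid's planner.
# Column order is copied from the two row signatures in this report.
OLD_SIGNATURE = ["__time", "dim7", "dim0", "dim3", "line",
                 "dim2", "dim8", "dim5", "dim1", "dim4"]
NEW_SIGNATURE = ["__time", "dim8", "dim0", "dim3", "line",
                 "dim7", "dim2", "dim5", "dim1", "dim4"]


def resolve_indices(signature, columns):
    """Hypothetical helper: map projected column names to positions."""
    return [signature.index(name) for name in columns]


def apply_indices(signature, indices):
    """Hypothetical helper: read back the columns at the given positions."""
    return [signature[i] for i in indices]


projected = ["dim0", "dim2", "line"]

# Indices are resolved while the old signature is still in effect.
indices = resolve_indices(OLD_SIGNATURE, projected)

# Reading those indices against the same signature is consistent.
consistent_result = apply_indices(OLD_SIGNATURE, indices)

# Reading them against the reordered signature picks up dim7 for dim2.
stale_result = apply_indices(NEW_SIGNATURE, indices)

print(indices)
print(consistent_result)
print(stale_result)
```

   Under this model, stale indices applied to the reordered signature select `dim0`, `dim7`, and `line`, which matches the incorrect result observed. That is consistent with the hypothesis but does not show where in the planner the two signatures get mixed.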
   
   Any thoughts on what could be going on here?
   Should the table row signature change when no new columns are added or 
removed? Avoiding these frequent signature updates could help mitigate the 
query incorrectness issue.
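
   As a rough sketch of the mitigation asked about here, a signature refresh could be skipped when only the column order differs. The code below is an assumption-laden illustration, not the broker's actual cache logic: signatures are modeled as lists of `(name, type)` pairs with unique names, and `same_columns` is a hypothetical helper.

```python
# Illustration only: an order-insensitive signature comparison.
# Assumes column names are unique within a signature.
def same_columns(old_signature, new_signature):
    """Return True when both signatures hold the same name/type pairs,
    regardless of column order."""
    if len(old_signature) != len(new_signature):
        return False
    # dict equality ignores insertion order, so a pure reorder compares equal.
    return dict(old_signature) == dict(new_signature)


old = [("__time", "LONG"), ("dim7", "STRING"), ("dim0", "STRING"),
       ("dim2", "STRING")]
reordered = [("__time", "LONG"), ("dim2", "STRING"), ("dim0", "STRING"),
             ("dim7", "STRING")]
with_new_column = reordered + [("dim9", "STRING")]

# A pure reorder would not need to replace the cached signature.
reorder_only = same_columns(old, reordered)

# An added column is a real change and should still trigger an update.
real_change = not same_columns(old, with_new_column)
```

   Whether the cached column order matters elsewhere in the broker is not established in this report, so this only illustrates the comparison itself.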


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

