cryptoe commented on pull request #12078: URL: https://github.com/apache/druid/pull/12078#issuecomment-1009068625
> The use-case is that some of the values in the MV field are not interesting for a specific query, and we would like to ignore them for the purpose of the GROUP BY. They are kept there because those ignored tags might be used for filtering, or might be used for GROUP BY when performing a different query.
>
> An example query, using the example data appearing at the top of this PR:
>
> SELECT MV_FILTER_ONLY(tags, ARRAY['t3', 't4']), COUNT(*) FROM test GROUP BY 1
>
> with your new code enabled, I would expect the following to return:
>
> ```
> ["t3"], 1 (from row1)
> ["t3", "t4"], 1 (from row2)
> null, 2 (from row3+row4)
> ```

For the first cut, I have not enabled expressions as arguments to MV_TO_ARRAY(). The argument has to be a native multi-value string or string column.

FWIW @dbardbar, you can query like this:

```
SELECT MV_TO_ARRAY(dim3), SUM(cnt)
FROM druid.numfoo
WHERE MV_CONTAINS(dim3, ARRAY['b'])
GROUP BY 1
```
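For readers following the thread, here is a minimal Python sketch of the grouping semantics the quoted request describes: filter each row's multi-value field down to the tags of interest, then group on the resulting array as a whole. It only models the expected output listed above and is not Druid's implementation. The row contents are hypothetical (the PR's example data is not reproduced in this comment), and `mv_filter_only` and `group_by_filtered_array` are illustrative helper names, not Druid APIs.

```python
from collections import Counter

# Hypothetical rows, chosen only so that they reproduce the expected output
# quoted above; the PR's real example data is not shown in this comment.
rows = {
    "row1": ["t1", "t2", "t3"],
    "row2": ["t3", "t4", "t5"],
    "row3": ["t5", "t6", "t7"],
    "row4": [],
}


def mv_filter_only(values, allowed):
    """Keep only the values present in `allowed`, preserving their order.

    Returns None when nothing is left, mirroring the `null` group in the
    expected output (an assumption taken from the quoted request).
    """
    kept = tuple(v for v in values if v in allowed)
    return kept or None


def group_by_filtered_array(rows, allowed):
    """Count rows per filtered array, treating each array as one group key."""
    counts = Counter(mv_filter_only(tags, allowed) for tags in rows.values())
    return dict(counts)


result = group_by_filtered_array(rows, {"t3", "t4"})
for key, count in result.items():
    print(list(key) if key is not None else None, count)
```

The point of the sketch is that the whole filtered array acts as the group key, so `["t3"]` and `["t3", "t4"]` land in different groups, and rows with no matching tags collapse into a single null group.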
