thiyaga commented on PR #38001: URL: https://github.com/apache/spark/pull/38001#issuecomment-1258517989
We use grouping sets in our queries and rely on `grouping__id` as an identifier when querying the data for each group. If we use `grouping__id` directly, its values are prone to change whenever the grouping sets change (e.g. adding a new grouping set or adding a new column to an existing grouping set). Any change in the grouping ID makes things even more complex when this data is consumed directly from reporting tools like Tableau.

To mitigate the changing `grouping__id`, we need to do one of the following:

1. Transform `grouping__id` into something deterministic that is not affected when the grouping sets change (e.g. convert `grouping__id` to `group_name`).
2. Have some sort of logical DB view that handles the transformation at runtime (e.g. using `CASE WHEN`).

In essence, we always have a dependency on `grouping__id` when grouping sets are used in our queries, so any change in how the grouping ID is generated has an immediate impact. This new parameter will help us keep using the legacy logic.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org
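To illustrate the concern raised in the comment above, here is a minimal Python sketch of why a raw grouping ID is fragile and why a derived `group_name` is not. It assumes the bitmask formula documented for Spark SQL's `grouping_id()` (first GROUP BY column is the most significant bit, and a bit is 1 when the column is aggregated away) and assumes `grouping__id` follows the same formula. The column names, the `by_...`/`total` naming scheme, and the helper functions are hypothetical and not taken from the PR.

```python
from typing import Dict, FrozenSet, Sequence


def grouping_id(group_by_cols: Sequence[str], grouping_set: FrozenSet[str]) -> int:
    """Bitmask under the assumed formula: one bit per GROUP BY column,
    first column most significant, bit = 1 if the column is NOT in the set."""
    gid = 0
    for col in group_by_cols:
        gid = (gid << 1) | (0 if col in grouping_set else 1)
    return gid


def group_name(group_by_cols: Sequence[str], grouping_set: FrozenSet[str]) -> str:
    """Hypothetical stable label built only from the columns in the set."""
    kept = [c for c in group_by_cols if c in grouping_set]
    return "by_" + "_".join(kept) if kept else "total"


def id_to_name(group_by_cols: Sequence[str],
               grouping_sets: Sequence[FrozenSet[str]]) -> Dict[int, str]:
    """Lookup table a view could encode as CASE WHEN <id> THEN <name>."""
    return {grouping_id(group_by_cols, gs): group_name(group_by_cols, gs)
            for gs in grouping_sets}


# The same three grouping sets, before and after a column is added to GROUP BY.
SETS = [frozenset({"country", "city"}), frozenset({"country"}), frozenset()]

before = id_to_name(["country", "city"], SETS)
after = id_to_name(["country", "city", "device"], SETS)

print(before)  # {0: 'by_country_city', 1: 'by_country', 3: 'total'}
print(after)   # {1: 'by_country_city', 3: 'by_country', 7: 'total'}
```

Under these assumptions the numeric IDs shift as soon as a column is added, while the names stay the same, so a report keyed on `group_name` keeps working. The catch is that the ID-to-name table (or the equivalent `CASE WHEN` in a view) has to be regenerated whenever the ID generation changes.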