ilooner commented on issue #1330: DRILL-6147: Adding Columnar Parquet Batch Sizing functionality URL: https://github.com/apache/drill/pull/1330#issuecomment-401424230 @vrozov I discussed the logging with @sachouche. The issue is that some information needed to be logged for QA purposes on a per query basis, and for ease of testing needed to be toggled from the sqlline. This was the agreed on approach by the batch sizing team, so at this point we're coming in at the end of a series of discussions. Nevertheless I spent about half a day to find a way this use case could be met in a logback centric way, and I could not find a reasonable alternative that could be done without a large amount of work. The main issue is that slf4j / logback has global logging configurations, which makes it impossible to turn on logging on a per query bases. We can use MDC to include a query id in the log messages and we could filter by query, but even that is not ideal since you are generating logs you don't need for all queries. Then there is the issue of updating the logging configuration. The only logback native method is to enable periodic scanning of the logback.xml file and to update the logback.xml file on all nodes when you want to enable certain logging. That is not a great experience for someone automating tests, or debugging. The conclusion we came to was that Drill needs a real logging solution, and to really do it right you would have to go down the rabbit hole. 1. Enable per query logging configurations, not just a global logging configuration. To do this we would have to use thread local loggers instead of static loggers, or something like https://clogr.io/, or a custom ContextSelector https://stackoverflow.com/questions/38208617/practical-use-of-logback-context-selectors. 2. Allow either sql line configuration of log levels or web ui configuration of log levels (like Apex had). 3. Include a logging context with the query context to communicate the logging config for a query. These are non trivial, have a lot of corner cases, and require design discussions with the community. For the interim Salim is making the options controlling logging internal, so users will not see them. Then he is going to have a follow up PR to consolidate the logging across all the batch sizing operators. Later when we have a real logging solution we can delete these options and do the right thing.
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services