himanshug commented on issue #6834: [Proposal] Add published segment cache in broker
URL: https://github.com/apache/incubator-druid/issues/6834#issuecomment-507862581

> In particular, we had thought about moving the entire sys schema implementation to the Coordinator and having the Broker send any SQL queries on `sys` over there. Users could also query sys tables on the Coordinator directly if they wanted.

I like that because, as a user, I would like to use the feature introduced by the `sys` table, but it would be nice if it weren't at the expense of each Broker needing a whole bunch of extra memory that I would rather save for real data queries.

Regarding the counter arguments ...

> It is a somewhat common request from our users to add an option to the Broker to either fail a query, or provide a ....

We have a slightly different feature already available: the query context key "uncoveredIntervalsLimit", which can be used in a query to return any intervals not covered by the segments that were used to process the query. This adds a header to the response, and the user can discard the results. I think it was documented at some point in the Query Contexts doc. This should work for many users.

However, it is not exactly what you described. For that, a "cache" wouldn't be enough, because it could be stale and we wouldn't be able to guarantee whether results are really partial or not. But I get that "good enough" might be good enough.

> #6319 contemplates a design for finer-grained loc ...

Sorry, I haven't gone through it yet, so I don't understand it.

In any case, both counter arguments are basically saying that we need the cache at the Broker for other reasons. I would propose that we don't make a cache at the Broker a prerequisite for `sys` table functionality. Other features might need the cache when there is no other way, but as a user I would like to use `sys` without incurring extra memory at the Broker if possible.

That said, if we do decide otherwise, then I am fine with the Broker getting information from the Coordinator instead of going directly to the DB, for the reasons that @gianm mentioned. The general expectation from a cluster is that data queries should continue to work as much as possible in case of node failures, not that every feature needs to keep working. I wouldn't worry about the Coordinator being down leading to the Broker not having an up-to-date response for `sys` table queries. When the Coordinator is down, new segments are not loaded on Historicals (and many other big problems happen), so "all Coordinators down" is a pretty bad situation anyway, which cluster operators would want to resolve as quickly as possible.
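
For illustration, the "uncoveredIntervalsLimit" approach mentioned above might look roughly like the sketch below. It builds a native timeseries query with the context key set and then reads the uncovered intervals back from a response header. The header name `X-Druid-Response-Context` and the keys `uncoveredIntervals` and `uncoveredIntervalsOverflowed` are my recollection of the Query Contexts doc and should be treated as assumptions; the helper names, datasource, and interval are hypothetical.

```python
import json


def build_timeseries_query(datasource, interval, limit):
    """Build a native timeseries query that asks for uncovered intervals.

    `limit` is the maximum number of uncovered intervals to report back.
    """
    return {
        "queryType": "timeseries",
        "dataSource": datasource,
        "intervals": [interval],
        "granularity": "all",
        "aggregations": [{"type": "count", "name": "rows"}],
        # Context key discussed in the comment above.
        "context": {"uncoveredIntervalsLimit": limit},
    }


def uncovered_intervals(headers):
    """Return (intervals, overflowed) parsed from the response headers.

    Assumes the response context arrives as JSON in the
    X-Druid-Response-Context header (an assumption, see lead-in).
    """
    raw = headers.get("X-Druid-Response-Context")
    if not raw:
        return [], False
    ctx = json.loads(raw)
    intervals = ctx.get("uncoveredIntervals", [])
    overflowed = bool(ctx.get("uncoveredIntervalsOverflowed", False))
    return intervals, overflowed
```

A caller could POST the query to the Broker with any HTTP client and discard the results whenever the returned list is non-empty or the overflow flag is set.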
