gianm commented on issue #7233: Set "is_published" to false for overshadowed segments in sys.segments table
URL: https://github.com/apache/incubator-druid/issues/7233#issuecomment-471814859

`is_published` is meant to mean "has been published to the metadata store" according to the docs. And to my heart, in that I do think that's a useful thing for there to be a field for. I guess the question is what does that really mean?

1. All segments with `used = true` in the metadata store (what it means today)
2. All segments with `used = true` in the metadata store, that are also not overshadowed by any other segments with `used = true` in the metadata store (what this proposal is suggesting changing it to)

I think (2) makes sense, since as this proposal points out, (1) is unintuitive when doing an overwrite of a set of segments. The old set and new set will both have `used = true` for a while, but for all practical purposes the old set is considered overshadowed at that point.

> Also, I'm not sure it's most useful. I think most ppl wouldn't care what are in metadata store, but would care what segments are currently being queried.

This is sorta like `is_available`. Except I think that one also includes some overshadowed segments, so it's not really what's "currently being queried". (I'm not 100% sure about that but maybe 80% sure)

Maybe what is happening here is that we want versions of both `is_published` and `is_available` that take into account whether the segments have been overshadowed or not.
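
To make the difference between definitions (1) and (2) concrete, here is a minimal Python sketch of the overwrite case. It is not Druid's implementation: the `Segment` class, its fields, and the functions `is_overshadowed`, `published_v1`, and `published_v2` are hypothetical names made up for illustration, and overshadowing is simplified to "a single used segment of the same datasource with a higher version covers the whole interval".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    # Hypothetical, simplified stand-in for a row in the metadata store.
    datasource: str
    start: int      # interval start (inclusive)
    end: int        # interval end (exclusive)
    version: str    # compared as a string; a higher version wins
    used: bool


def is_overshadowed(seg, segments):
    # Simplifying assumption: one used segment with a higher version
    # that fully covers the interval is enough to overshadow `seg`.
    return any(
        other.used
        and other.datasource == seg.datasource
        and other.version > seg.version
        and other.start <= seg.start
        and other.end >= seg.end
        for other in segments
    )


def published_v1(seg, segments):
    # Definition (1): used = true in the metadata store.
    return seg.used


def published_v2(seg, segments):
    # Definition (2): used = true and not overshadowed by another used segment.
    return seg.used and not is_overshadowed(seg, segments)


# An overwrite: the old and new segments are both used = true for a while.
old = Segment("wiki", 0, 10, "2019-01-01", True)
new = Segment("wiki", 0, 10, "2019-03-01", True)
segments = [old, new]

for seg in segments:
    print(seg.version, published_v1(seg, segments), published_v2(seg, segments))
```

Under definition (1) both segments count as published during the overwrite; under definition (2) only the newer one does.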
