jihoonson commented on issue #7233: Set "is_published" to false for overshadowed segments in sys.segments table
URL: https://github.com/apache/incubator-druid/issues/7233#issuecomment-472101340

Hmm, ok. It looks like there are some misunderstandings and confusion here.

First of all, I think it makes sense to keep the `is_published` column as it is. But I also want to add a new column to the system schema for non-overshadowed segments (regardless of their `used` value), like `is_queryable`, `is_available`, or `is_overshadowed` (the last is the opposite of the first two names). With both columns, I think users can avoid confusion and query more flexibly.

I also have some questions about the meaning of "published segments".

> `is_published` is meant to mean "has been published to the metadata store" according to the docs.

I think this means any segment in the `segments` table of the metadata store, regardless of its `used` flag. This also matches the meaning of "publishing segments" in indexing tasks. But you said:

> All segments with used = true in the metadata store (what it means today)

This isn't exactly the same as the meaning of `is_published`, since overshadowed segments can still be in the metadata store with `used = false`. I'm not sure what you mean by "what it means today". Is there any documentation about this? I feel `is_active` or `is_enabled` is more appropriate for these segments.

`is_available` (or `is_queryable`) is a bit different. These segments would have a mix of `used` values if some historicals haven't unannounced them yet. Also, some segments with `used = true` in the metadata store can be missing from this set.

> This is sorta like is_available. Except I think that one also includes some overshadowed segments, so it's not really what's "currently being queried". (I'm not 100% sure about that but maybe 80% sure)

Yeah, if you're thinking of the lag until brokers refresh their cache, it might not be "currently" being queried.
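
To make the distinctions above concrete, here is a minimal Python sketch of how the candidate columns could differ for the same segment. Everything in it is an assumption for illustration: the `Segment` class, its field names, and the four helper functions are hypothetical and are not Druid code, and `is_active` and `is_available` follow the readings proposed in this comment rather than any agreed definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical model of one segment's state (not a Druid class)."""
    segment_id: str
    in_metadata_store: bool  # a row exists in the metadata store's segments table
    used: bool               # the `used` flag on that row
    overshadowed: bool       # a newer version covers this segment
    announced: bool          # some historical is still announcing it


def is_published(s: Segment) -> bool:
    # "Published" read as: present in the metadata store, regardless of `used`.
    return s.in_metadata_store


def is_active(s: Segment) -> bool:
    # The "used = true in the metadata store" reading (assumed name).
    return s.in_metadata_store and s.used


def is_overshadowed(s: Segment) -> bool:
    return s.overshadowed


def is_available(s: Segment) -> bool:
    # Depends on what historicals announce, not on the `used` flag.
    return s.announced


# An overshadowed segment already marked unused, but not yet unannounced.
old = Segment("old", in_metadata_store=True, used=False,
              overshadowed=True, announced=True)

# A freshly published segment that no historical has loaded yet.
new = Segment("new", in_metadata_store=True, used=True,
              overshadowed=False, announced=False)

for seg in (old, new):
    print(seg.segment_id, is_published(seg), is_active(seg),
          is_overshadowed(seg), is_available(seg))
```

Both example segments are published under the first reading, yet they disagree on every other flag, which is why a single `is_published` column cannot answer all of these questions at once.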
