suvodeep-pyne commented on code in PR #17159:
URL: https://github.com/apache/pinot/pull/17159#discussion_r2505833423
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/utils/ServerReloadJobStatusCache.java:
##########
@@ -35,26 +42,33 @@
*
* <p>Thread-safe for concurrent access. Uses Guava Cache with LRU eviction
* and time-based expiration.
+ *
+ * <p>Implements PinotClusterConfigChangeListener to support dynamic configuration
+ * updates from ZooKeeper cluster config. When config changes, cache is rebuilt
+ * with new settings and existing entries are migrated.
*/
@ThreadSafe
-public class ServerReloadJobStatusCache {
+public class ServerReloadJobStatusCache implements PinotClusterConfigChangeListener {
  private static final Logger LOG = LoggerFactory.getLogger(ServerReloadJobStatusCache.class);
+  private static final String CONFIG_PREFIX = "pinot.server.table.reload.status.cache";
+ private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();
- private final Cache<String, ReloadJobStatus> _cache;
+ private volatile Cache<String, ReloadJobStatus> _cache;
+ private ServerReloadJobStatusCacheConfig _currentConfig;
/**
- * Creates a cache with the given configuration.
- *
+ * Creates a cache with default configuration.
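The hunk above only shows the new declarations. As a rough illustration of the rebuild-and-migrate behavior its javadoc describes, the listener path could delegate to something like the sketch below; `ServerReloadJobStatusCacheConfig`'s accessors and the `parseConfig(...)` helper are assumptions for illustration, not the PR's actual code.

```java
// Sketch only: rebuild the Guava cache when the reload-cache cluster configs change,
// then migrate existing entries so in-flight job statuses are not lost.
// parseConfig(...), getMaxSize(), and getExpiryMinutes() are hypothetical names.
private synchronized void rebuildIfNeeded(Map<String, String> clusterConfigs) {
  ServerReloadJobStatusCacheConfig newConfig = parseConfig(clusterConfigs);
  if (newConfig.equals(_currentConfig)) {
    return;  // no relevant change
  }
  Cache<String, ReloadJobStatus> newCache = CacheBuilder.newBuilder()
      .maximumSize(newConfig.getMaxSize())
      .expireAfterWrite(newConfig.getExpiryMinutes(), TimeUnit.MINUTES)
      .build();
  newCache.putAll(_cache.asMap());  // migrate existing entries into the new cache
  _cache = newCache;                // volatile write publishes the rebuilt cache
  _currentConfig = newConfig;
  LOG.info("Rebuilt reload job status cache with config: {}", newConfig);
}
```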
Review Comment:
Thanks for sharing your thoughts. Btw, I was able to 'unresolve' this comment.
I want to go back to the context of the PR and where we are:
- The agenda is to solidify reload reporting, especially for the case where segments fail to load.
- Currently, table reload jobs are async, and the new cache provides a way to fetch the status of running/recent jobs from the respective servers (see the sketch after this list).
- The cache is intentionally kept small in terms of memory footprint. Its max size is < 1MB today and, with segment-level tracking in the future, may be ~100MB at the default size of 10k jobs.
- Default settings: the default history of 10k jobs should cover most cases. However, I did want to make this configurable for completeness' sake, hence this PR.
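For concreteness, here is a minimal sketch of the kind of bounded Guava cache described above. The class name, the 24h expiry, and the method names are illustrative; only the 10k default size comes from the discussion above.

```java
import java.util.concurrent.TimeUnit;

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

// Illustrative only: a size-bounded, time-expiring status cache keyed by reload job ID.
// ReloadJobStatus is the status POJO from this PR; the 24h expiry is a made-up value.
class ReloadStatusCacheSketch {
  private final Cache<String, ReloadJobStatus> _cache = CacheBuilder.newBuilder()
      .maximumSize(10_000)                    // default job history mentioned above
      .expireAfterWrite(24, TimeUnit.HOURS)   // time-based expiration (illustrative)
      .build();

  // A server records status as a reload job progresses.
  void record(String reloadJobId, ReloadJobStatus status) {
    _cache.put(reloadJobId, status);
  }

  // The controller (or an API handler) reads it back; null if evicted or expired.
  ReloadJobStatus get(String reloadJobId) {
    return _cache.getIfPresent(reloadJobId);
  }
}
```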
Now, coming back to your point: I agree there is value in adding conf support for these new configs, and I am in favor of addressing that in a subsequent PR based on need. However, I need a little help understanding why it must be either deployment-level configurability or no configurability at all (i.e., blocking this PR).
> Check out how we initialize SegmentDownloadThrottler in the server
starter. We don't hardcode, instead we check if a config exists and use that to
initialize or fallback to a default.
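For reference, the pattern being described is roughly the following. This is a generic illustration, not the actual SegmentDownloadThrottler code; the config key and the fallback value are hypothetical.

```java
import java.util.Map;

// Generic illustration: use the deployment config if the key is present,
// otherwise fall back to a hardcoded default.
// The key name and the 10_000 default are hypothetical.
static int resolveMaxCacheSize(Map<String, String> instanceConfigs) {
  String value = instanceConfigs.get("pinot.server.table.reload.status.cache.size");
  return value != null ? Integer.parseInt(value) : 10_000;
}
```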
Thanks for sharing; this helps me understand it better. I was thinking about a central way of resolving deployment confs and cluster configs, rather than coding this up for each config set. The code above introduces more constants, indirection, and a certain amount of complexity compared to a basic, readable POJO config class, IMO. It also introduces consistency challenges: e.g., if we have both a deployment config and a cluster config, which one takes priority? Also, from the example shared, it seems this handling logic can differ from config to config. Was that intentional in this case?
All in all, conf/cluster config resolution felt like a larger concern, orthogonal to reload, hence I was in favor of tackling it as a separate issue and solving (or centralizing) it there if required.
Looking forward to your and @Jackie-Jiang's thoughts.