zhangtongr created ZOOKEEPER-4914:
-------------------------------------

             Summary: [QUESTION] Strategies for monitoring and preventing "unreasonable length" errors in session closure
                 Key: ZOOKEEPER-4914
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4914
             Project: ZooKeeper
          Issue Type: Bug
          Components: server
    Affects Versions: 3.6.3
            Reporter: zhangtongr

Description:
===========
I'm encountering potential risks with "unreasonable length" errors during session closure, particularly when sessions have numerous ephemeral nodes. I'd like to discuss possible monitoring and prevention strategies.

Current Situation:
-----------------
1. When sessions with many ephemeral nodes are closed, all node paths are collected into a single transaction
2. If the combined size exceeds jute.maxbuffer, it results in "unreasonable length" errors
3. Currently lacking effective ways to predict or prevent this issue

Questions:
---------
1. Monitoring Strategy:
   * What metrics should we monitor to predict potential issues?
   * Are there existing metrics for tracking session transaction sizes?
   * How can we monitor the growth of ephemeral nodes per session?

2. Prevention Approaches:
   * What are the recommended approaches to prevent this issue?
   * Is there a way to estimate transaction size before session closure?
   * Are there best practices for managing large numbers of ephemeral nodes?

3. Configuration Guidelines:
   * What's the recommended jute.maxbuffer setting for different scenarios?
   * Are there other relevant configuration parameters?

---------------
Would appreciate insights on:
1. Additional metrics to monitor
2. Early warning indicators
3. Prevention strategies
4. Best practices for large-scale deployments

Thank you for any guidance or suggestions.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
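
To make the size concern under "Current Situation" concrete, here is a rough Python sketch of how the combined path size for one session could be estimated and compared against jute.maxbuffer. It is an approximation under stated assumptions, not ZooKeeper's actual accounting: it assumes each path is serialized as a 4-byte length prefix plus its UTF-8 bytes, that the list itself carries a 4-byte element count, and that the limit is the default of 0xfffff bytes. Transaction headers and any other fields are ignored, so the result is best read as a lower bound. The function names and the 0.5 warning threshold are hypothetical, and how the list of ephemeral paths for a session is obtained is left out.

```python
# Assumed default value of jute.maxbuffer, in bytes; pass your configured value instead.
JUTE_MAXBUFFER_DEFAULT = 0xFFFFF


def estimate_paths_payload(paths):
    """Approximate the serialized size, in bytes, of a list of znode paths.

    Assumption: a 4-byte count for the list, then for each path a 4-byte
    length prefix followed by the path's UTF-8 bytes.
    """
    total = 4  # list element count
    for path in paths:
        total += 4 + len(path.encode("utf-8"))
    return total


def usage_ratio(paths, limit=JUTE_MAXBUFFER_DEFAULT):
    """Return the estimated payload size as a fraction of the buffer limit."""
    return estimate_paths_payload(paths) / limit


def should_warn(paths, limit=JUTE_MAXBUFFER_DEFAULT, threshold=0.5):
    """Return True when the estimate reaches the (arbitrary) warning threshold."""
    return usage_ratio(paths, limit) >= threshold
```

Under these assumptions, a session holding 20,000 ephemeral nodes with 50-byte paths is estimated at 4 + 20,000 × 54 = 1,080,004 bytes, which is already above the assumed default limit of 1,048,575 bytes. Tracking this ratio per session over time would be one possible early warning indicator.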