findingrish commented on PR #13365: URL: https://github.com/apache/druid/pull/13365#issuecomment-1321738861
> Thanks for the contribution! I haven't reviewed the code, but I do have some comments based on building this branch on my machine and running it: > > 1. On my machine (64GB RAM) the script decided to set total memory = 52GB and then allocated that chunk of memory 100% to on-heap and direct memory for the various processes. It doesn't leave any room for page cache or for tasks on MMs. The calculations should be adjusted to take those items into account. (By "take page cache into account", I mean leave some memory free so the OS can use it to cache mmapped segments.) I suggest comparing the calculated numbers here against the single-server configs we have: https://druid.apache.org/docs/latest/operations/single-server.html. For example, 64GB RAM would match our "small" config, where we give the Historical a `4g` heap; the script currently generates `12752m`. > 2. The way the script accepts command-line arguments is nonstandard; consider using `argparse` instead and accepting `--x=y` style arguments. It would help with error messages, too. I tried an invalid `totalMemory` setting and noticed that the error message had a traceback in it, which makes it look messy. `argparse` wouldn't do that. > 3. I noticed that when I provide a very low `totalMemory` (like `40m`) the calculations generate negative numbers and then the processes crash. It would be better for the script to print a nice error message. > 4. The script should be less chatty in the normal case. Feel free to provide a `--verbose` mode, but normally, I don't think it should print anything. The first line the user should see is the one from `bin/greet`: "Starting Apache Druid." > 5. When you have scripts tail-calling out to other scripts, it's best to use `exec` (in bash) and `execv` or one of the other exec variants (in python). This way, each call replaces the original one rather than forming a chain. It is cleaner and uses fewer resources. For the bash script `start-druid`, refer to `post-index-task` for an example of how you can use `exec`. For python refer to https://docs.python.org/3/library/os.html for `os.exec*` functions. @gianm thanks for the feedback. Regarding `1`, after going over https://druid.apache.org/docs/latest/operations/single-server.html I tried to check the ratio (service memory (heap, direct) + task on MMs) / totalMemory nano -> (2 + 1)g / 4g micro -> (3 + 4)g / 16g small -> (25 + 6)g / 64g medium -> (44 + 8)g / 128g large -> (85 + 16)g / 256g xlarge -> (99 + 32)g / 512g except for nano and xlarge profiles, the ratio is close to 50% I am thinking of using 50% as default and allow the user to override this ratio -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
