Re: Three scripts needed to run the server, Why?
On Wed, Jul 12, 2017 at 5:49 AM, Tomas Repik wrote:
> Thanks guys for joining the discussion, I hope you don't mind if I continue
> to argue a bit more.
>
> The core intelligence and functionality of the Cassandra server lies in the
> Java classes, which reside in jar archives. This is the place where the main
> functionality updates take place. To ease the use of the classes there is a,
> let's call it, "wrapper" script (bin/cassandra), which sets up the
> environment for the classes to provide the functionality. This wrapper uses
> two other scripts: one of which sits in bin (the include) and the other in
> etc (the env file). I agree that the files in bin should not be edited by
> the users, but the following quotes from the wrapper script state the
> opposite:
> "Any serious use-case though will likely require customization of the
> include."
> "Developers and enthusiasts can put a customized include file at
> ~/.cassandra.in.sh."
> According to these the include file is no different from the environment
> file. But why would you have two separate files meant for the same purpose?

cassandra-env.sh is meant to be user configuration, whereas cassandra.in.sh is system configuration.

cassandra.in.sh can be used to customize the behavior of the startup script for the system you are deploying to; it is used to integrate. Packages can make customizations here, or you could template it for use with Puppet, Chef, etc. Once deployed, you would not edit this file again.

cassandra-env.sh is configuration for Cassandra that lives above what is reasonable to configure in the application. Heap size is a good example of the sort handled here: something to be passed as an argument to the JVM, not something you could use cassandra.yaml for.

--
Eric Evans
john.eric.ev...@gmail.com

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org
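To make the heap-size example concrete, here is a minimal sketch of the kind of fragment an environment file might contain. The variable names MAX_HEAP_SIZE, HEAP_NEWSIZE and JVM_OPTS are assumed to match those used in the stock cassandra-env.sh, and the values are purely illustrative, so check them against your own copy before relying on any of it.

```shell
#!/bin/sh
# Sketch of an env-file fragment (assumed variable names, illustrative values).
# Heap settings cannot live in cassandra.yaml because they must reach the JVM
# as command-line flags before any Java code runs.
MAX_HEAP_SIZE="4G"     # total heap; used for both -Xms and -Xmx
HEAP_NEWSIZE="800M"    # young-generation size; used for -Xmn

JVM_OPTS="-Xms${MAX_HEAP_SIZE} -Xmx${MAX_HEAP_SIZE} -Xmn${HEAP_NEWSIZE}"

echo "JVM_OPTS=$JVM_OPTS"
```

The point of the sketch is only that the environment file ends up producing JVM arguments, which is why it is a shell fragment rather than a set of key-value pairs.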
Re: Three scripts needed to run the server, Why?
Thanks guys for joining the discussion, I hope you don't mind if I continue to argue a bit more.

The core intelligence and functionality of the Cassandra server lies in the Java classes, which reside in jar archives. This is the place where the main functionality updates take place. To ease the use of the classes there is a, let's call it, "wrapper" script (bin/cassandra), which sets up the environment for the classes to provide the functionality. This wrapper uses two other scripts: one of which sits in bin (the include) and the other in etc (the env file). I agree that the files in bin should not be edited by the users, but the following quotes from the wrapper script state the opposite:

"Any serious use-case though will likely require customization of the include."
"Developers and enthusiasts can put a customized include file at ~/.cassandra.in.sh."

According to these the include file is no different from the environment file. But why would you have two separate files meant for the same purpose?

What is more important is that to "configure" the options in both scripts the user has to be somewhat familiar with bash. The "bashy" stuff could be well hidden from the user in the wrapper script, and the configuration options could sit in the cassandra.yaml file as key-value pairs like the other ones. When troubleshooting an issue, users would provide just a single configuration file and the maintainer could easily reproduce the issue by plugging in that single config file. Regarding updates, only the wrapper script would be updated, of course, and the user-modified config file would stay untouched in the etc directory.

As for flexibility and the use case where there is an upstream default, an admin-specific and a user-specific configuration, that is not a problem at all. Making the config file modular would do the job. There would not be any duplication.

In case the user does not care about the configuration and just wants to run the server out of the box, there are always default options embedded in the Java classes.

What do you think? I don't think my solution is ideal and I'd be glad to hear where my assumptions are wrong.

Tomas

----- Original Message -----
> Standard Unix/Linux system policy is that editable configuration files
> go under /etc. It is not proper to edit files under /{s}bin or
> /usr/{s}bin. $PATH contains /{s}bin and /usr/{s}bin, whose files are
> executables that can be run by a user; hence the basic separation
> between runnable files and tunable configuration files that are
> intended to be edited.
>
> There may be multiple executables in /{s}bin and /usr/{s}bin that use
> the common configurations under /etc - they may not be just single
> purpose. If all configs were contained in each executable script,
> we would be repeating ourselves, as well as possibly creating unexpected
> results, if they are not all aligned by the user.
>
> Additionally, package managers like apt and rpm should not overwrite
> configuration files if they have been edited, so hopefully upgrades
> won't hose a user-edited change under /etc. (Back them up, regardless.)
> If there is a fundamental change to the executables in /usr/{s}bin, they
> will be overwritten by package managers, since users are expected not to
> edit those.
>
> This is all really basic system administration and common policy for
> most software packages. Group common configs where they are
> meant to be edited and split out various configs when it makes sense or
> they may be utilized by various executables.
>
> The user may deviate from these common practices as they see fit, but
> may also introduce self-inflicted problems. :)
>
> --
> Kind regards,
> Michael
Re: Three scripts needed to run the server, Why?
Also, you can use bash to debug bin/cassandra:

    PS4=' $BASH_SOURCE:$LINENO: ' bash -x bin/cassandra

This should print the filename of the file being executed/sourced and the line number being currently executed, so it should be easier to find out what happened, where and when. Of course, /bin/sh need not be bash, but I'm not sure what the equivalent method would be for dash or other shells.

On 2017-07-12 00:15 (+0900), "Murukesh Mohanan" wrote:
> What you complain about may be useful to someone else who might appreciate
> the added flexibility. I'd personally be opposed to a single script, as I'd
> rather not edit something that might cause conflicts or be overwritten on
> upgrades (the location of the include and environment files being
> configurable means that they can be in an entirely different corner of the
> filesystem).
>
> I can also think of cases where having two configurable files is useful. For
> example, as an administrator, I'd keep everything in the cassandra install
> directory read-only except for upgrades, then keep a common include file for
> my users with some common configuration for my server, and let the users use
> `$CASSANDRA_CONF` (the directory where the environment file is) to configure
> everything else they wish for running their instances of Cassandra taking
> advantage of the common install and base setup. Admittedly this isn't a
> common use case.
>
> If you're modifying bin/cassandra, then you're doing it wrong, IMHO. Only two
> files need to be examined: the (an?) included file and the environment file.
> And if you simply need to override a setting, then you can just use the
> environment file as the ultimate override, since it is sourced after the
> include (not by it).
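The tracing tip can be tried without a Cassandra checkout. The sketch below applies the same PS4 setting to two throwaway scripts (main.sh and inc.sh are hypothetical names made up for this demo, not Cassandra files) and captures the trace that bash -x writes to stderr.

```shell
#!/usr/bin/env bash
# Demo of PS4 tracing on two throwaway scripts (not Cassandra's own).
demo_dir=$(mktemp -d)

# A stand-in for an include file: one assignment.
cat > "$demo_dir/inc.sh" <<'EOF'
GREETING=hello
EOF

# A stand-in for a wrapper script that sources the include.
cat > "$demo_dir/main.sh" <<'EOF'
. "$(dirname "$0")/inc.sh"
echo "$GREETING"
EOF

# bash -x writes the trace to stderr; keep the trace, discard normal output.
trace=$(PS4=' $BASH_SOURCE:$LINENO: ' bash -x "$demo_dir/main.sh" 2>&1 >/dev/null)
printf '%s\n' "$trace"

rm -rf "$demo_dir"
```

Each trace line should be prefixed with the script that produced it and the line number, which makes it visible when control passes from the wrapper into the sourced file. One caveat worth verifying locally: newer bash versions are reported to ignore a PS4 inherited from the environment when running as root, in which case the default "+" prefix shows up instead.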
Re: Three scripts needed to run the server, Why?
What you complain about may be useful to someone else who might appreciate the added flexibility. I'd personally be opposed to a single script, as I'd rather not edit something that might cause conflicts or be overwritten on upgrades (the location of the include and environment files being configurable means that they can be in an entirely different corner of the filesystem).

I can also think of cases where having two configurable files is useful. For example, as an administrator, I'd keep everything in the cassandra install directory read-only except for upgrades, then keep a common include file for my users with some common configuration for my server, and let the users use `$CASSANDRA_CONF` (the directory where the environment file is) to configure everything else they wish for running their instances of Cassandra taking advantage of the common install and base setup. Admittedly this isn't a common use case.

If you're modifying bin/cassandra, then you're doing it wrong, IMHO. Only two files need to be examined: the (an?) included file and the environment file. And if you simply need to override a setting, then you can just use the environment file as the ultimate override, since it is sourced after the include (not by it).

On 2017-07-11 23:39 (+0900), Tomas Repik wrote:
> Thanks for the answer, it did not help much. I have read this several times
> and this I already know. It still does not answer the question of why there
> is the need for three files instead of a single file. Not to mention
> multiple different config files.
> All these files are more or less configuration files which set up the
> environment and properties of the server. Why can't there be a single file
> that one would modify in order to tweak the server to his or her needs? In
> the current situation you have to search many different files to find the
> place where the option is configured.
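The "ultimate override" point rests on nothing more than shell sourcing order: when two sourced files assign the same variable, the later assignment wins. A minimal sketch, using made-up file names (demo.in.sh, demo-env.sh) and made-up variables rather than Cassandra's real ones:

```shell
#!/bin/sh
# Throwaway stand-ins for an include file and an environment file.
demo_dir=$(mktemp -d)
printf 'DEMO_HEAP=1G\nDEMO_LOG_DIR=/var/log/demo\n' > "$demo_dir/demo.in.sh"
printf 'DEMO_HEAP=4G\n' > "$demo_dir/demo-env.sh"

. "$demo_dir/demo.in.sh"    # sourced first: shared, site-wide settings
. "$demo_dir/demo-env.sh"   # sourced second: its assignments replace earlier ones

echo "DEMO_HEAP=$DEMO_HEAP DEMO_LOG_DIR=$DEMO_LOG_DIR"
rm -rf "$demo_dir"
```

Here DEMO_HEAP ends up as 4G while DEMO_LOG_DIR keeps the value from the first file, which is the administrator/user split described above in miniature.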
Re: Three scripts needed to run the server, Why?
Standard Unix/Linux system policy is that editable configuration files go under /etc. It is not proper to edit files under /{s}bin or /usr/{s}bin. $PATH contains /{s}bin and /usr/{s}bin, whose files are executables that can be run by a user; hence the basic separation between runnable files and tunable configuration files that are intended to be edited.

There may be multiple executables in /{s}bin and /usr/{s}bin that use the common configurations under /etc - they may not be just single purpose. If all configs were contained in each executable script, we would be repeating ourselves, as well as possibly creating unexpected results, if they are not all aligned by the user.

Additionally, package managers like apt and rpm should not overwrite configuration files if they have been edited, so hopefully upgrades won't hose a user-edited change under /etc. (Back them up, regardless.) If there is a fundamental change to the executables in /usr/{s}bin, they will be overwritten by package managers, since users are expected not to edit those.

This is all really basic system administration and common policy for most software packages. Group common configs where they are meant to be edited and split out various configs when it makes sense or they may be utilized by various executables.

The user may deviate from these common practices as they see fit, but may also introduce self-inflicted problems. :)

--
Kind regards,
Michael

On 07/11/2017 09:39 AM, Tomas Repik wrote:
> Thanks for the answer, it did not help much. I have read this several
> times and this I already know. It still does not answer the question of
> why there is the need for three files instead of a single file. Not
> to mention multiple different config files. All these files are more
> or less configuration files which set up the environment and
> properties of the server. Why can't there be a single file that one
> would modify in order to tweak the server to his or her needs? In the
> current situation you have to search many different files to find the
> place where the option is configured.
Re: Three scripts needed to run the server, Why?
Thanks for the answer, it did not help much. I have read this several times and this I already know. It still does not answer the question of why there is the need for three files instead of a single file. Not to mention multiple different config files.

All these files are more or less configuration files which set up the environment and properties of the server. Why can't there be a single file that one would modify in order to tweak the server to his or her needs? In the current situation you have to search many different files to find the place where the option is configured.

----- Original Message -----
>
> The bin/cassandra script has an explanation
> (https://github.com/apache/cassandra/blob/trunk/bin/cassandra#L24):
>
> # As a convenience, a fragment of shell is sourced in order to set one or
> # more of these variables. This so-called `include' can be placed in a
> # number of locations and will be searched for in order. The lowest
> # priority search path is the same directory as the startup script, and
> # since this is the location of the sample in the project tree, it should
> # almost work Out Of The Box.
> #
> # Any serious use-case though will likely require customization of the
> # include. For production installations, it is recommended that you copy
> # the sample to one of /usr/share/cassandra/cassandra.in.sh,
> # /usr/local/share/cassandra/cassandra.in.sh, or
> # /opt/cassandra/cassandra.in.sh and make your modifications there.
> #
> #[...]
> #
> # If you would rather configure startup entirely from the environment, you
> # can disable the include by exporting an empty CASSANDRA_INCLUDE, or by
> # ensuring that no include files exist in the aforementioned search list.
> # Be aware that you will be entirely responsible for populating the needed
> # environment variables.
>
> You can use just a single environment file, if you so wish.
Re: Three scripts needed to run the server, Why?
The bin/cassandra script has an explanation
(https://github.com/apache/cassandra/blob/trunk/bin/cassandra#L24):

# As a convenience, a fragment of shell is sourced in order to set one or
# more of these variables. This so-called `include' can be placed in a
# number of locations and will be searched for in order. The lowest
# priority search path is the same directory as the startup script, and
# since this is the location of the sample in the project tree, it should
# almost work Out Of The Box.
#
# Any serious use-case though will likely require customization of the
# include. For production installations, it is recommended that you copy
# the sample to one of /usr/share/cassandra/cassandra.in.sh,
# /usr/local/share/cassandra/cassandra.in.sh, or
# /opt/cassandra/cassandra.in.sh and make your modifications there.
#
#[...]
#
# If you would rather configure startup entirely from the environment, you
# can disable the include by exporting an empty CASSANDRA_INCLUDE, or by
# ensuring that no include files exist in the aforementioned search list.
# Be aware that you will be entirely responsible for populating the needed
# environment variables.

You can use just a single environment file, if you so wish.

On 2017-07-11 23:08 (+0900), Tomas Repik wrote:
> Greetings,
>
> I've been working with Cassandra for more than a year but I still wonder
> about one thing:
>
> To run the server there is a bash script (cassandra) which uses another
> script (cassandra.in.sh) which uses yet another bash script
> (cassandra-env.sh).
> What is the reason behind this?
> Why there is not only a single file setting up the environment and running
> the server?
>
> Thanks for your answers
>
> Tomas
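The "searched for in order" behaviour described in that comment can be illustrated with a generic first-match search. The sketch below is not the code from bin/cassandra: first_readable is a hypothetical helper, the candidate file names are invented, and it simply assumes that the first readable candidate in the list is the one that gets used.

```shell
#!/bin/sh
# first_readable: print the first readable path among the arguments and
# return 0; return 1 if none of them is readable. (Hypothetical helper.)
first_readable() {
    for candidate in "$@"; do
        if [ -r "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Two throwaway candidate files; a third candidate deliberately does not exist.
demo_dir=$(mktemp -d)
: > "$demo_dir/low-priority.in.sh"
: > "$demo_dir/high-priority.in.sh"

chosen=$(first_readable \
    "$demo_dir/missing.in.sh" \
    "$demo_dir/high-priority.in.sh" \
    "$demo_dir/low-priority.in.sh")

echo "would source: $(basename "$chosen")"
rm -rf "$demo_dir"
```

A real startup script would source the chosen file rather than print its name; which locations are listed, and in which order, is something to read from your own copy of bin/cassandra.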
Three scripts needed to run the server, Why?
Greetings,

I've been working with Cassandra for more than a year but I still wonder about one thing:

To run the server there is a bash script (cassandra) which uses another script (cassandra.in.sh) which uses yet another bash script (cassandra-env.sh).
What is the reason behind this?
Why is there not just a single file setting up the environment and running the server?

Thanks for your answers

Tomas