Re: Three scripts needed to run the server, Why?

2017-07-12 Thread Eric Evans
On Wed, Jul 12, 2017 at 5:49 AM, Tomas Repik  wrote:
> Thanks guys for joining the discussion, I hope you don't mind if I continue 
> to argue a bit more.
>
> The core intelligence and functionality of the Cassandra server lies in the Java 
> classes, which reside in jar archives. This is the place where the main 
> functionality updates take place. To ease the use of the classes there is, 
> let's call it "wrapper" script (bin/cassandra), which sets up the environment 
> for the classes to provide the functionality. This wrapper uses two other 
> scripts: one of which sits in bin (the include) and the other in etc (the env 
> file). I agree that the files in bin should not be edited by the users, but 
> the following quotes from the wrapper script state the opposite:
> "Any serious use-case though will likely require customization of the 
> include."
> "Developers and enthusiasts can put a customized include file at 
> ~/.cassandra.in.sh."
> According to these the include file is no different from the environment 
> file. But why would you have two separate files meant for the same purpose?

cassandra-env.sh is meant to be user configuration, whereas
cassandra.in.sh is system configuration.

cassandra.in.sh can be used to customize the behavior of the startup
script for the system you are deploying to; it is used to integrate.
Packages can make customizations here, or you could template it for
use with Puppet, Chef, etc.  Once deployed, you would not edit this
file again.
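
For illustration only, a packaged include could look roughly like the sketch
below. The variable names CASSANDRA_HOME, CASSANDRA_CONF and CLASSPATH are the
ones the stock include uses; the default paths are assumptions for a
hypothetical package layout, not what any particular package ships.

```shell
# Sketch of a packaged include (cassandra.in.sh); all default paths are assumed.
# Values already present in the environment are kept, otherwise defaults apply.
CASSANDRA_HOME="${CASSANDRA_HOME:-/usr/share/cassandra}"
CASSANDRA_CONF="${CASSANDRA_CONF:-/etc/cassandra}"

# The configuration directory goes first, followed by every installed jar.
CLASSPATH="$CASSANDRA_CONF"
for jar in "$CASSANDRA_HOME"/lib/*.jar; do
    # Skip the unexpanded pattern when no jars are present.
    [ -e "$jar" ] || continue
    CLASSPATH="$CLASSPATH:$jar"
done
```

A packager or a Puppet/Chef template would fill in the two defaults once; after
that the file stays as deployed.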

cassandra-env.sh is configuration for Cassandra that lives above what
is reasonable to configure in the application.  Heap size is a good
example of the sort handled here, something to be passed as an
argument to the JVM, not something you could use cassandra.yaml for.
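
As a rough sketch of what that looks like in practice, a heap setting in
cassandra-env.sh ends up as JVM command-line flags. MAX_HEAP_SIZE, HEAP_NEWSIZE
and JVM_OPTS are the names used by the stock file as far as I recall; the sizes
are made-up examples, not recommendations.

```shell
# Sketch of a user-edited cassandra-env.sh fragment; the sizes are examples only.
MAX_HEAP_SIZE="4G"     # assumed value; size it for your own hardware
HEAP_NEWSIZE="800M"    # assumed value

# Heap limits can only be given to the JVM on its command line, so they are
# appended to the options that the startup script later passes to java.
JVM_OPTS="${JVM_OPTS:-} -Xms${MAX_HEAP_SIZE} -Xmx${MAX_HEAP_SIZE} -Xmn${HEAP_NEWSIZE}"
```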


-- 
Eric Evans
john.eric.ev...@gmail.com

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Three scripts needed to run the server, Why?

2017-07-12 Thread Tomas Repik
Thanks guys for joining the discussion, I hope you don't mind if I continue to 
argue a bit more.

The core intelligence and functionality of the Cassandra server lies in the
Java classes, which reside in jar archives. This is where the main
functionality updates take place. To ease the use of the classes there is a,
let's call it, "wrapper" script (bin/cassandra), which sets up the environment
for the classes to provide the functionality. This wrapper uses two other
scripts: one sits in bin (the include) and the other in etc (the env file). I
agree that the files in bin should not be edited by users, but the following
quotes from the wrapper script state the opposite:
"Any serious use-case though will likely require customization of the include."
"Developers and enthusiasts can put a customized include file at
~/.cassandra.in.sh."
According to these, the include file is no different from the environment file.
But why would you have two separate files meant for the same purpose?

More importantly, to "configure" the options in both scripts the user has to
be somewhat familiar with bash. The "bashy" stuff could be well hidden from the
user in the wrapper script, and the configuration options could sit in the
cassandra.yaml file as key-value pairs like the other ones. When users run
into issues, they would provide just a single configuration file, and the
maintainer could easily reproduce the issue by plugging in that file.

Regarding updates, only the wrapper script would be updated, of course, and the
user-modified config file would stay untouched in the etc directory. As for
flexibility and the use case where there is an upstream default, an
admin-specific and a user-specific configuration, that is not a problem at
all: making the config file modular would do the job. There won't be any
duplication. If the user does not care about the configuration and just wants
to run the server out of the box, there are always default options embedded
in the Java classes.
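
To make the idea a bit more concrete, the wrapper could do something along
these lines. This is only a sketch of the proposal: the key names
max_heap_size and heap_newsize and the temporary file are hypothetical and do
not exist in Cassandra today.

```shell
# Hypothetical: read "key: value" pairs and turn known keys into JVM options.
# Neither the file nor the key names exist in Cassandra; they illustrate the idea.
conf_file="$(mktemp)"
cat > "$conf_file" <<'EOF'
# startup options (hypothetical keys)
max_heap_size: 4G
heap_newsize: 800M
EOF

JVM_OPTS=""
# Split each line on ':' and spaces: first field is the key, the rest the value.
while IFS=': ' read -r key value; do
    case "$key" in
        ''|'#'*)       continue ;;   # blank line or comment
        max_heap_size) JVM_OPTS="$JVM_OPTS -Xms$value -Xmx$value" ;;
        heap_newsize)  JVM_OPTS="$JVM_OPTS -Xmn$value" ;;
        *)             echo "unknown option: $key" >&2 ;;
    esac
done < "$conf_file"
rm -f "$conf_file"

echo "JVM options:$JVM_OPTS"
```

The user would only ever touch the key-value file; the shell logic stays in the
wrapper and can be replaced on upgrade.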

What do you think? I don't think my solution is ideal and I'd be glad to hear 
where my assumptions are wrong.

Tomas

- Original Message -
> Standard unix/linux systems policy is that editable configurable files
> go under /etc. It is not proper to edit files under /{s}bin or
> /usr/{s}bin. $PATH contains /{s}bin and /usr/{s}bin files as executables
> that can be run by a user, so that's why the basic separation of the
> runnable files and tunable configuration files that are intended to be
> edited.
> 
> There may be multiple executables in /{s}bin and /usr/{s}bin that use
> the common configurations under /etc - they may not be just single
> purpose. If there were all configs contained in each executable script,
> we would be repeating ourselves, as well as possibly creating unexpected
> results, if they are not all aligned by the user.
> 
> Additionally, package managers like apt and rpm should not overwrite
> configuration files, if they have been edited, so hopefully, upgrades
> won't hose a user-edited change under /etc. (Back them up, regardless).
> If there is a fundamental change to the executables in /usr/{s}bin, they
> will be overwritten by package managers, since users are expected to not
> edit those.
> 
> This is all really basic system administration and common policy for
> most different software packages. Group common configs where they are
> meant to be edited and split out various configs when it makes sense or
> they may be utilized by various executables.
> 
> The user may deviate from these common practices as they see fit, but
> may also introduce self inflicted problems. :)
> 
> --
> Kind regards,
> Michael




Re: Three scripts needed to run the server, Why?

2017-07-11 Thread Murukesh Mohanan
Also, you can use bash to debug bin/cassandra:

PS4=' $BASH_SOURCE:$LINENO:   ' bash -x bin/cassandra

This should print the filename of the file being executed/sourced and the line 
number being currently executed, so it should be easier to find out what 
happened, where and when. Of course, /bin/sh need not be bash, but I'm not sure 
what the equivalent method would be for dash or other shells.
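
A small self-contained way to try this without a Cassandra checkout is sketched
below. The two throwaway scripts and the GREETING variable are made up for the
demo; only the PS4 setting and bash -x come from the command above. Note that
$BASH_SOURCE is bash-specific, and that newer bash versions reportedly ignore a
PS4 inherited from the environment when running as root.

```shell
# Self-contained demo: a main script that sources a second file, traced with bash -x.
demo_dir="$(mktemp -d)"
printf 'GREETING=hello\n' > "$demo_dir/include.sh"
printf '. "%s/include.sh"\necho "$GREETING"\n' "$demo_dir" > "$demo_dir/main.sh"

# The trace goes to stderr; capture it and discard the script's normal output.
trace="$(PS4=' $BASH_SOURCE:$LINENO:   ' bash -x "$demo_dir/main.sh" 2>&1 >/dev/null)"
printf '%s\n' "$trace"
rm -rf "$demo_dir"
```

Each traced command should be prefixed with the file it came from and its line
number, which makes it visible when execution moves from main.sh into the
sourced include.sh.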

On 2017-07-12 00:15 (+0900), "Murukesh Mohanan" 
wrote: 
> What you complain about may be useful to someone else who might appreciate 
> the added flexibility. I'd personally be opposed to a single script, as I'd 
> rather not edit something that might cause conflicts or be overwritten on 
> upgrades (the location of the include and environment files being 
> configurable means that they can be in an entirely different corner of the 
> filesystem).
> 
> I can also think of cases where having two configurable files is useful. For 
> example, as an administrator, I'd keep everything in the cassandra install 
> directory read-only except for upgrades, then keep a common include file for 
> my users with some common configuration for my server, and let the users use  
> `$CASSANDRA_CONF` (the directory where the environment file is) to configure 
> everything else they wish for running their instances of Cassandra taking 
> advantage of the common install and base setup. Admittedly this isn't a 
> common use case.
> 
> If you're modifying bin/cassandra, then you're doing it wrong, IMHO. Only two 
> files need to be examined: the (an?) included file and the environment file. 
> And if you simply need to override a setting, then, you can just use the 
> environment file as the ultimate override, since it is sourced after the 
> include (not by it).
> 
> On 2017-07-11 23:39 (+0900), Tomas Repik  wrote: 
> > Thanks for the answer, but it did not help much. I have read this several
> > times and already know it. It still does not answer the question of why
> > there is a need for three files instead of a single file, not to mention
> > multiple different config files.
> > All these files are more or less configuration files that set up the
> > environment and properties of the server. Why can't there be a single file
> > that one would modify in order to tweak the server to his or her needs? In
> > the current situation you have to search many different files to find the
> > place where an option is configured.
> > 
> > - Original Message -
> > > 
> > > The bin/cassandra script has an explanation
> > > (https://github.com/apache/cassandra/blob/trunk/bin/cassandra#L24):
> > > 
> > > # As a convenience, a fragment of shell is sourced in order to set one or
> > > # more of these variables. This so-called `include' can be placed in a
> > > # number of locations and will be searched for in order. The lowest
> > > # priority search path is the same directory as the startup script, and
> > > # since this is the location of the sample in the project tree, it should
> > > # almost work Out Of The Box.
> > > #
> > > # Any serious use-case though will likely require customization of the
> > > # include. For production installations, it is recommended that you copy
> > > # the sample to one of /usr/share/cassandra/cassandra.in.sh,
> > > # /usr/local/share/cassandra/cassandra.in.sh, or
> > > # /opt/cassandra/cassandra.in.sh and make your modifications there.
> > > #
> > > #[...]
> > > #
> > > # If you would rather configure startup entirely from the environment, you
> > > # can disable the include by exporting an empty CASSANDRA_INCLUDE, or by
> > > # ensuring that no include files exist in the aforementioned search list.
> > > # Be aware that you will be entirely responsible for populating the needed
> > > # environment variables.
> > > 
> > > You can use just a single environment file, if you so wish.
> > > 
> > 



Re: Three scripts needed to run the server, Why?

2017-07-11 Thread Murukesh Mohanan
What you complain about may be useful to someone else who might appreciate the 
added flexibility. I'd personally be opposed to a single script, as I'd rather 
not edit something that might cause conflicts or be overwritten on upgrades 
(the location of the include and environment files being configurable means that 
they can be in an entirely different corner of the filesystem).

I can also think of cases where having two configurable files is useful. For 
example, as an administrator, I'd keep everything in the cassandra install 
directory read-only except for upgrades, then keep a common include file for my 
users with some common configuration for my server, and let the users use 
`$CASSANDRA_CONF` (the directory where the environment file is) to configure 
everything else they wish for running their instances of Cassandra, taking 
advantage of the common install and base setup. Admittedly this isn't a common 
use case.

If you're modifying bin/cassandra, then you're doing it wrong, IMHO. Only two 
files need to be examined: the (an?) included file and the environment file. 
And if you simply need to override a setting, then, you can just use the 
environment file as the ultimate override, since it is sourced after the 
include (not by it).
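
The override behaviour is just "last one sourced wins". A minimal sketch, using
throwaway stand-ins for the two files (the MAX_HEAP_SIZE values are made up, and
a real include would not necessarily set that variable):

```shell
# Demo of "last one sourced wins", using temporary stand-ins for the two files.
demo_dir="$(mktemp -d)"
printf 'MAX_HEAP_SIZE=2G\n' > "$demo_dir/cassandra.in.sh"    # stand-in include
printf 'MAX_HEAP_SIZE=8G\n' > "$demo_dir/cassandra-env.sh"   # stand-in environment file

. "$demo_dir/cassandra.in.sh"     # sourced first
. "$demo_dir/cassandra-env.sh"    # sourced second, so its value replaces the first

echo "MAX_HEAP_SIZE=$MAX_HEAP_SIZE"
rm -rf "$demo_dir"
```

This prints MAX_HEAP_SIZE=8G: whatever the environment file assigns replaces
the value from the include.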

On 2017-07-11 23:39 (+0900), Tomas Repik  wrote: 
> Thanks for the answer, but it did not help much. I have read this several 
> times and already know it. It still does not answer the question of why 
> there is a need for three files instead of a single file, not to mention 
> multiple different config files.
> All these files are more or less configuration files that set up the 
> environment and properties of the server. Why can't there be a single file 
> that one would modify in order to tweak the server to his or her needs? In 
> the current situation you have to search many different files to find the 
> place where an option is configured.
> 
> - Original Message -
> > 
> > The bin/cassandra script has an explanation
> > (https://github.com/apache/cassandra/blob/trunk/bin/cassandra#L24):
> > 
> > # As a convenience, a fragment of shell is sourced in order to set one or
> > # more of these variables. This so-called `include' can be placed in a
> > # number of locations and will be searched for in order. The lowest
> > # priority search path is the same directory as the startup script, and
> > # since this is the location of the sample in the project tree, it should
> > # almost work Out Of The Box.
> > #
> > # Any serious use-case though will likely require customization of the
> > # include. For production installations, it is recommended that you copy
> > # the sample to one of /usr/share/cassandra/cassandra.in.sh,
> > # /usr/local/share/cassandra/cassandra.in.sh, or
> > # /opt/cassandra/cassandra.in.sh and make your modifications there.
> > #
> > #[...]
> > #
> > # If you would rather configure startup entirely from the environment, you
> > # can disable the include by exporting an empty CASSANDRA_INCLUDE, or by
> > # ensuring that no include files exist in the aforementioned search list.
> > # Be aware that you will be entirely responsible for populating the needed
> > # environment variables.
> > 
> > You can use just a single environment file, if you so wish.
> > 
> 



Re: Three scripts needed to run the server, Why?

2017-07-11 Thread Michael Shuler
Standard unix/linux systems policy is that editable configuration files
go under /etc. It is not proper to edit files under /{s}bin or
/usr/{s}bin. $PATH contains /{s}bin and /usr/{s}bin, whose files are
executables that can be run by a user; hence the basic separation of
runnable files from tunable configuration files that are intended to be
edited.

There may be multiple executables in /{s}bin and /usr/{s}bin that use
the common configurations under /etc - they may not be just single
purpose. If all configs were contained in each executable script,
we would be repeating ourselves, as well as possibly creating unexpected
results if the user does not keep them all aligned.

Additionally, package managers like apt and rpm should not overwrite
configuration files if they have been edited, so hopefully upgrades
won't hose a user-edited change under /etc. (Back them up, regardless.)
If there is a fundamental change to the executables in /usr/{s}bin, they
will be overwritten by package managers, since users are expected not to
edit those.

This is all really basic system administration and common policy for
most different software packages. Group common configs where they are
meant to be edited and split out various configs when it makes sense or
they may be utilized by various executables.

The user may deviate from these common practices as they see fit, but
may also introduce self inflicted problems. :)

-- 
Kind regards,
Michael

On 07/11/2017 09:39 AM, Tomas Repik wrote:
> Thanks for the answer, but it did not help much. I have read this
> several times and already know it. It still does not answer the
> question of why there is a need for three files instead of a single
> file, not to mention multiple different config files. All these files
> are more or less configuration files that set up the environment and
> properties of the server. Why can't there be a single file that one
> would modify in order to tweak the server to his or her needs? In the
> current situation you have to search many different files to find the
> place where an option is configured.
> 
> - Original Message -
>> 
>> The bin/cassandra script has an explanation 
>> (https://github.com/apache/cassandra/blob/trunk/bin/cassandra#L24):
>> 
>> # As a convenience, a fragment of shell is sourced in order to set one or
>> # more of these variables. This so-called `include' can be placed in a
>> # number of locations and will be searched for in order. The lowest
>> # priority search path is the same directory as the startup script, and
>> # since this is the location of the sample in the project tree, it should
>> # almost work Out Of The Box.
>> #
>> # Any serious use-case though will likely require customization of the
>> # include. For production installations, it is recommended that you copy
>> # the sample to one of /usr/share/cassandra/cassandra.in.sh,
>> # /usr/local/share/cassandra/cassandra.in.sh, or
>> # /opt/cassandra/cassandra.in.sh and make your modifications there.
>> #
>> #[...]
>> #
>> # If you would rather configure startup entirely from the environment, you
>> # can disable the include by exporting an empty CASSANDRA_INCLUDE, or by
>> # ensuring that no include files exist in the aforementioned search list.
>> # Be aware that you will be entirely responsible for populating the needed
>> # environment variables.
>> 
>> You can use just a single environment file, if you so wish.
>> 
> 



Re: Three scripts needed to run the server, Why?

2017-07-11 Thread Tomas Repik
Thanks for the answer, but it did not help much. I have read this several times 
and already know it. It still does not answer the question of why there is a 
need for three files instead of a single file, not to mention multiple 
different config files.
All these files are more or less configuration files that set up the 
environment and properties of the server. Why can't there be a single file that 
one would modify in order to tweak the server to his or her needs? In the 
current situation you have to search many different files to find the place 
where an option is configured.

- Original Message -
> 
> The bin/cassandra script has an explanation
> (https://github.com/apache/cassandra/blob/trunk/bin/cassandra#L24):
> 
> # As a convenience, a fragment of shell is sourced in order to set one or
> # more of these variables. This so-called `include' can be placed in a
> # number of locations and will be searched for in order. The lowest
> # priority search path is the same directory as the startup script, and
> # since this is the location of the sample in the project tree, it should
> # almost work Out Of The Box.
> #
> # Any serious use-case though will likely require customization of the
> # include. For production installations, it is recommended that you copy
> # the sample to one of /usr/share/cassandra/cassandra.in.sh,
> # /usr/local/share/cassandra/cassandra.in.sh, or
> # /opt/cassandra/cassandra.in.sh and make your modifications there.
> #
> #[...]
> #
> # If you would rather configure startup entirely from the environment, you
> # can disable the include by exporting an empty CASSANDRA_INCLUDE, or by
> # ensuring that no include files exist in the aforementioned search list.
> # Be aware that you will be entirely responsible for populating the needed
> # environment variables.
> 
> You can use just a single environment file, if you so wish.
> 




Re: Three scripts needed to run the server, Why?

2017-07-11 Thread Murukesh Mohanan

The bin/cassandra script has an explanation 
(https://github.com/apache/cassandra/blob/trunk/bin/cassandra#L24):

# As a convenience, a fragment of shell is sourced in order to set one or
# more of these variables. This so-called `include' can be placed in a 
# number of locations and will be searched for in order. The lowest 
# priority search path is the same directory as the startup script, and
# since this is the location of the sample in the project tree, it should
# almost work Out Of The Box.
#
# Any serious use-case though will likely require customization of the
# include. For production installations, it is recommended that you copy
# the sample to one of /usr/share/cassandra/cassandra.in.sh,
# /usr/local/share/cassandra/cassandra.in.sh, or 
# /opt/cassandra/cassandra.in.sh and make your modifications there.
#
#[...]
# 
# If you would rather configure startup entirely from the environment, you
# can disable the include by exporting an empty CASSANDRA_INCLUDE, or by 
# ensuring that no include files exist in the aforementioned search list.
# Be aware that you will be entirely responsible for populating the needed
# environment variables.

You can use just a single environment file, if you so wish.
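
The "searched for in order" part boils down to picking the first readable file
from a list. A rough sketch of that idea follows; it is not the code from
bin/cassandra. The first_readable helper is made up for illustration, and the
order of the candidates is an assumption except that the startup script's own
directory comes last, as the comment says.

```shell
# Print the first readable file among the arguments (highest priority first).
first_readable() {
    for candidate in "$@"; do
        if [ -r "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Candidate list built from the paths named in the thread; the order is assumed,
# apart from the script's own directory being the lowest-priority entry.
script_dir="$(dirname "$0")"
if include="$(first_readable \
        "$HOME/.cassandra.in.sh" \
        /usr/share/cassandra/cassandra.in.sh \
        /usr/local/share/cassandra/cassandra.in.sh \
        /opt/cassandra/cassandra.in.sh \
        "$script_dir/cassandra.in.sh")"; then
    echo "would source: $include"
else
    echo "no include found; everything must come from the environment"
fi
```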

On 2017-07-11 23:08 (+0900), Tomas Repik  wrote: 
> Greetings,
> 
> I've been working with Cassandra for more than a year but I still wonder 
> about one thing:
> 
> To run the server there is a bash script (cassandra) which uses another 
> script (cassandra.in.sh) which uses yet another bash script 
> (cassandra-env.sh).
> What is the reason behind this?
> Why is there not just a single file setting up the environment and running 
> the server? 
> 
> Thanks for your answers
> 
> Tomas
> 



Three scripts needed to run the server, Why?

2017-07-11 Thread Tomas Repik
Greetings,

I've been working with Cassandra for more than a year but I still wonder about 
one thing:

To run the server there is a bash script (cassandra) which uses another script 
(cassandra.in.sh) which uses yet another bash script (cassandra-env.sh).
What is the reason behind this?
Why is there not just a single file setting up the environment and running the 
server? 

Thanks for your answers

Tomas
