[
https://issues.apache.org/jira/browse/METRON-609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15839422#comment-15839422
]
ASF GitHub Bot commented on METRON-609:
---------------------------------------
GitHub user mattf-horton opened a pull request:
https://github.com/apache/incubator-metron/pull/425
METRON-609 Enhance Mpack to handle single-node and small-cluster installs
of Elasticsearch
This PR is not ready for prime time, but is provided for ease of access to
work-in-progress for:
- METRON-609 Enhance Mpack to handle single-node and small-cluster installs
of Elasticsearch, and
- METRON-634 Mpack bug fixes and improvements (not related to singlenode
install).
These are presented as two separate commits, so you can look at them
separately if you wish.
These are the included enhancements from METRON-609:
- Enable 1-, 2-, and 3-node clusters to have a working Elasticsearch
install via the Mpack.
- Change constraints from 1+ Masters and 3+ Slaves, to 1+ and 0+.
- Allow non-dedicated master/datanodes via boolean
"masters_also_are_datanodes".
- Allow use of alternative single-node template via
"single_node_elasticsearch" boolean.
- Only the 1- and 4-node clusters have been tested, and that was last month.
- Improve various mouse-over Description fields in the GUI.
- I included the attempted validation check on (storm) num_slots =
slots_per_supervisor * num_supervisors. This doesn't currently work due to a
pre-existing bug in other parts of the validation check, so I haven't been
able to test it.
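As a rough illustration of the two checks described above, here is a minimal standalone Python sketch. It is not the Mpack's validation code: the function names, argument names, and error strings are hypothetical, and the rule that a cluster with zero slaves needs "masters_also_are_datanodes" set is my own assumption about what makes a layout usable, not something the PR states.

```python
def validate_storm_slots(num_slots, slots_per_supervisor, num_supervisors):
    """Return a list of error strings; an empty list means the values agree."""
    expected = slots_per_supervisor * num_supervisors
    if num_slots != expected:
        return ["num_slots is %d, but slots_per_supervisor * num_supervisors is %d"
                % (num_slots, expected)]
    return []


def validate_es_layout(num_masters, num_slaves, masters_also_are_datanodes):
    """Check the relaxed constraints: 1+ masters and 0+ slaves."""
    errors = []
    if num_masters < 1:
        errors.append("at least one Elasticsearch master is required")
    if num_slaves < 0:
        errors.append("the number of slaves cannot be negative")
    # Assumption (not from the PR): with no slaves, some node must still hold data.
    if num_slaves == 0 and not masters_also_are_datanodes:
        errors.append("with zero slaves, masters_also_are_datanodes must be true")
    return errors
```

For example, validate_es_layout(1, 0, True) describes a single-node install and returns no errors, while validate_storm_slots(5, 2, 2) reports a mismatch.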
These are the included enhancements and bug fixes from METRON-634:
NOT AFFECTING THE AMBARI DATABASE:
- ES pid_dir specification and usage:
- Currently pid_dir is multiply specified in elastic-env.xml and
params.py. The config parameter should not be over-ridden in params.py.
- PID_DIR failed to be included in /etc/sysconfig/elasticsearch. It
needs to be added to the template in elastic-sysconfig, as it must be provided
to ES at launch-time (else the default directory will be used).
- pid_file is specified in params.py, but is not used anywhere. (The ES
internal launcher synthesizes it from PID_DIR, and this is appropriate.)
- JAVA_HOME needs to be provided in /etc/sysconfig/elasticsearch (templated
in elastic-sysconfig.xml). Its absence causes CentOS 7 systemctl to fail the ES
launch, unless /bin/java is defined (which is not guaranteed).
- Also in the /etc/sysconfig/elasticsearch template in
elastic-sysconfig.xml, the value of ES_JAVA_OPTS incorrectly spans 3 lines. The
lines must be terminated with backslashes to effectively become a single line.
The current inclusion of newlines in the long string value is acceptable
(although unusual) in shellscript, but not in a systemd EnvironmentFile.
/etc/sysconfig/elasticsearch must function as both.
- Also in ES_JAVA_OPTS, the two instances of log_dir need to be followed
by a slash '/'.
- In elastic.py, when directories are being pre-created and permissions
set, the directory $CONF_DIR/scripts should also be pre-created. I intermittently
hit permissions issues with this directory being created later by root, and not
properly assigned to elastic_user.
- In several places in elastic.py, "params.elastic_user" is incorrectly
used when "params.user_group" should be used.
- Undefined "format()" method is used in elastic.py, unnecessarily in
File(format("/etc/sysconfig/elasticsearch")...
- Undefined "format()" method is similarly used several times unnecessarily
in elastic_master.py
- The comments and descriptions in elastic-site.xml have multiple suggested
improvements.
- Provide Quick Links in Ambari service page for Elasticsearch to the
self-report pages for ES health and ES node list. (very useful for debugging)
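To make the /etc/sysconfig/elasticsearch points above concrete, here is a minimal Python sketch of rendering such a file. It is an illustration only, not the elastic-sysconfig template: the function names are hypothetical, and the JVM options shown are illustrative stand-ins for the real ES_JAVA_OPTS value. It relies on a line ending in a backslash being joined with the next line, which is how shell handles it and, as far as I know, how a systemd EnvironmentFile does too.

```python
def with_trailing_slash(path):
    """Ensure a directory path ends with exactly one '/'."""
    return path.rstrip("/") + "/"


def render_sysconfig(pid_dir, java_home, log_dir):
    """Render PID_DIR, JAVA_HOME, and a backslash-continued ES_JAVA_OPTS."""
    log_prefix = with_trailing_slash(log_dir)
    # Illustrative options only; the real template's option list may differ.
    java_opts = [
        "-verbose:gc",
        "-Xloggc:%sgc.log" % log_prefix,
        "-XX:ErrorFile=%serr.log" % log_prefix,
    ]
    # Each physical line except the last ends in a backslash, so the
    # quoted value is logically a single line.
    opts = " \\\n".join(java_opts)
    lines = [
        "PID_DIR=%s" % pid_dir,
        "JAVA_HOME=%s" % java_home,
        'ES_JAVA_OPTS="%s"' % opts,
    ]
    return "\n".join(lines) + "\n"


def logical_lines(text):
    """Merge backslash-continued lines and return the non-empty results."""
    merged = text.replace("\\\n", "")
    return [line for line in merged.split("\n") if line]
```

Passing the rendered text through logical_lines() yields exactly three assignments, which is a quick way to confirm that the long option string no longer spans several logical lines.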
CHANGES THAT DO AFFECT THE AMBARI DATABASE:
- pid_dir SHOULD be specified in elastic-sysconfig.xml, rather than
elastic-env.xml, as it is a parameter that must be provided to ES at
launch-time, but is not something there's any reason for the admin to change in
usual circumstances.
- conf_dir SHOULD be specified in elastic-env.xml or elastic-site.xml, not
in elastic-sysconfig.xml. While it too is a parameter that must be provided to
ES at launch-time, it is typically left to the installing admin where to put
the config files.
- The Ambari configuration parameter names in elastic-site.xml should be
improved in several instances to make the semantics more obvious to the human
reader (who may not be very familiar with Elasticsearch configuration).
Mouse-over documentation will continue to provide the ES config parameter
equivalents. In particular, suggest:
- cluster_name -> es_cluster_name (to distinguish ES cluster from
Stack cluster)
- zen_discovery_ping_unicast_hosts -> es_cluster_hosts
- network_host -> network_bindings (these are in fact interface names,
not host names)
- There are at least two places in elasticsearch.master.yaml.j2
(zen_discovery_ping_unicast_hosts and network_host) where needed square
brackets are either missing or included in the configuration string. To be
consistent with other usages, and less prone to human error, the square
brackets should not be in the configuration string but rather should be
provided in the template text.
- In METRON/0.3.0/configuration/metron-env.xml and
METRON/0.3.0/package/scripts/params/params_linux.py, the value
"metron_apps_indexed_hdfs_dir" does not need to be settable by admin; it is
appropriate to require it to be subordinate to "metron_apps_hdfs_dir". Thus it
can be removed from metron-env.xml and set to
"{metron_apps_hdfs_dir}/indexing/indexed" in params_linux.py. This also
eliminates a really unacceptable use of "double format".
NOTE that these changes, because they affect the database, should properly
be accompanied by a database update script and a version increment in the Mpack
version number. This is not currently implemented.
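Two of the suggestions above lend themselves to a short sketch: deriving the indexed HDFS directory from "metron_apps_hdfs_dir", and keeping square brackets out of the configuration string so that the template side supplies them. The Python below is a standalone illustration with hypothetical function names; it uses plain string formatting rather than the formatting helper used in params_linux.py, and it is not the actual Jinja2 template logic.

```python
def indexed_hdfs_dir(metron_apps_hdfs_dir):
    """Derive the indexed directory as a fixed subdirectory of the apps dir."""
    return "%s/indexing/indexed" % metron_apps_hdfs_dir.rstrip("/")


def bracketed_list(config_value):
    """Wrap a bare comma-separated config string in square brackets.

    The admin enters only the items (for example host or interface names);
    the brackets are added here, standing in for the template text.
    """
    items = [item.strip() for item in config_value.split(",") if item.strip()]
    return "[ %s ]" % ", ".join(items)
```

With this split, an admin would enter a value such as "es1, es2" and the rendered line would carry the brackets, so a forgotten or doubled bracket in the configuration string is no longer possible.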
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mattf-horton/incubator-metron METRON-609
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-metron/pull/425.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #425
----
commit 0fd12a5bab2745e7c496657ef92b792b60faf2bf
Author: mattf-horton <[email protected]>
Date: 2017-01-25T22:39:23Z
METRON-609 Enhance Mpack to handle single-node and small-cluster installs
of Elasticsearch. Work in Progress, at request of David Lyle.
commit 1af5376d59fe4c1812bda519e9b960dc74fdb0d6
Author: mattf-horton <[email protected]>
Date: 2017-01-26T07:41:04Z
METRON-634 Mpack bug fixes and improvements (not related to singlenode
install). Partial: all improvements from METRON-634 already proved out in
METRON-608.
----
> Enhance Mpack to handle single-node and small-cluster installs of
> Elasticsearch
> -------------------------------------------------------------------------------
>
> Key: METRON-609
> URL: https://issues.apache.org/jira/browse/METRON-609
> Project: Metron
> Issue Type: Improvement
> Reporter: Matt Foley
> Assignee: Matt Foley
>
> The current Mpack for Ambari install of Metron requires that Elasticsearch be
> installed with 1+ dedicated Masters and 3+ dedicated Slaves. It does not
> support combined master/datanode configuration (non-"dedicated" Master),
> which is the recommended way to run a single-node configuration of
> Elasticsearch; nor does it allow fewer than 3 dedicated Slaves, thereby
> requiring that the cluster have at least 4 nodes.
> This jira proposes to enable 1-, 2-, and 3-node clusters to have a working
> Elasticsearch install via the Mpack. It will also allow non-dedicated
> master/datanodes, which are required for single-node and useful for few-node
> clusters.
> I intend to also include the following enhancements and fixes:
> * Determine whether elastic-sysconfig:data_dir and elastic-site:path_data
> should have same default value and if so fix it. (I think they should, and
> they don't. Probably there should only be one variable instead of two.)
> * Provide Quick Links in Ambari service page for Elasticsearch to the
> self-report pages for Elasticsearch health, node list, and other "_cat"
> status. May include some "_cluster" status.
> * Improve various mouse-over Description fields in the GUI
> * Improve the order of fields in the GUI, to group at the beginning the
> fields that must be modified or are commonly modified. Possibly separate
> "basic" and "advanced" configs.
> * Provide a README.md that describes what is going on with the various
> resource files in the Mpack src
> * Enhance the wiki page for cluster installation, with regard to single-node
> install with Mpack.
> * Database-preserving upgrade from version 1.0.0.0.
> Also see
> https://github.com/apache/incubator-metron/tree/master/metron-deployment#current-limitations
> for some other known issues. And the documentation in that README.md page
> is significantly out of date wrt the Mpack.