ctubbsii commented on a change in pull request #2238:
URL: https://github.com/apache/accumulo/pull/2238#discussion_r693302409
##########
File path: pom.xml
##########
@@ -566,6 +566,11 @@
<artifactId>slf4j-api</artifactId>
<version>${slf4j.version}</version>
</dependency>
+ <dependency>
+ <groupId>org.yaml</groupId>
+ <artifactId>snakeyaml</artifactId>
+ <version>1.29</version>
+ </dependency>
Review comment:
This looks too good to be true... please correct me if I'm wrong, but at
first glance, this appears to be a pure Java yaml implementation licensed as
Apache 2.0 without any new transitive dependencies to conflict with.
##########
File path:
core/src/test/java/org/apache/accumulo/core/conf/cluster/ClusterConfigParserTest.java
##########
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.accumulo.core.conf.cluster;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertFalse;
+import static org.junit.Assert.assertNotNull;
+import static org.junit.Assert.assertTrue;
+
+import java.io.File;
+import java.net.URL;
+import java.util.Map;
+
+import org.junit.Test;
+
+import edu.umd.cs.findbugs.annotations.SuppressFBWarnings;
+
+@SuppressFBWarnings(value = "PATH_TRAVERSAL_IN", justification = "paths
provided by test")
+public class ClusterConfigParserTest {
+
+ @Test
+ public void testParse() throws Exception {
+ URL configFile = ClusterConfigParserTest.class.getResource("/cluster.yml");
Review comment:
See other comment about top-level resources that are likely to collide.
##########
File path: assemble/bin/accumulo-cluster
##########
@@ -147,86 +172,141 @@ function start_all() {
start_tservers
fi
- grep -Ev '(^#|^\s*$)' "${conf}/$manager_file" | while read -r host; do
- start_service "$host" manager
+ IFS_ORIG=$IFS
+ IFS=","
+
+ for manager in ${HOSTS["MANAGERS"]}; do
+ start_service "$manager" manager
+ done
+
+ for gc in "${HOSTS["GC"]}"; do
+ start_service "$gc" gc
+ done
+
+ for monitor in ${HOSTS["MONITORS"]}; do
+ start_service "$monitor" monitor
done
- grep -Ev '(^#|^\s*$)' "${conf}/gc" | while read -r host; do
- start_service "$host" gc
+ for tracer in ${HOSTS["TRACER"]}; do
+ start_service "$tracer" tracer
done
- grep -Ev '(^#|^\s*$)' "${conf}/tracers" | while read -r host; do
- start_service "$host" tracer
+ for coordinator in ${HOSTS["COMPACTION_COORDINATORS"]}; do
+ start_service "$coordinator" compaction-coordinator
done
- start_service "$monitor" monitor
+ for queue in ${HOSTS["COMPACTION_QUEUES"]}; do
+ for compactor in ${HOSTS["COMPACTORS_$queue"]}; do
+ start_service "$compactor" compactor "-q $queue"
Review comment:
These are two arguments, not one, so should be quoted as two:
```suggestion
start_service "$compactor" compactor -q "$queue"
```
##########
File path: server/base/src/main/java/org/apache/accumulo/server/util/ZooZap.java
##########
@@ -61,6 +61,10 @@ private static void message(String msg, Opts opts) {
boolean zapTservers = false;
@Parameter(names = "-tracers", description = "remove tracer locks")
boolean zapTracers = false;
+ @Parameter(names = "-coordinators", description = "remove compaction
coordinator locks")
Review comment:
I wish `compaction-coordinators` wasn't so lengthy. `-coordinators` is
just so generic, when we're referring to a specific coordinator.
##########
File path: core/src/test/resources/cluster.yml
##########
@@ -0,0 +1,53 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+managers:
+ - localhost1
+ - localhost2
+
+monitors:
+ - localhost1
+ - localhost2
+
+tracer:
+ - localhost
+
+gc:
Review comment:
This can also be plural, since we can have standby accumulo-gc services,
just like the manager or monitor. If you make `gc` plural, it looks like `gcs`,
which reads as "jee - see - ess" rather than the intended "jee - sees". To
normalize, could just make them all singular. In English, instead of reading
the `managers:` section as "list of `managers`", we read the `manager:` section
as "list of hosts with `manager` role".
##########
File path: assemble/bin/accumulo-service
##########
@@ -77,7 +79,7 @@ function start_service() {
rotate_log "$outfile"
rotate_log "$errfile"
- nohup "${bin}/accumulo" "$service" >"$outfile" 2>"$errfile" < /dev/null &
+ nohup "${bin}/accumulo" "$service" "${@:2}" >"$outfile" 2>"$errfile" <
/dev/null &
Review comment:
Hmm, was this related to something needed in this PR, or just a nice add
you found? I'm okay either way, just curious what motivated this.
##########
File path:
core/src/main/java/org/apache/accumulo/core/conf/cluster/ClusterConfigParser.java
##########
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.accumulo.core.conf.cluster;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.nio.file.Files;
+import java.nio.file.Paths;
+import java.nio.file.StandardOpenOption;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.yaml.snakeyaml.Yaml;
+
+import edu.umd.cs.findbugs.annotations.SuppressFBWarnings;
+
+public class ClusterConfigParser {
Review comment:
This class looks about right. I'm not a big fan of when libraries force
you to work with `Object` types as generic params, but it looks like you don't
have a lot of choice there.
##########
File path: assemble/bin/accumulo-cluster
##########
@@ -43,52 +43,74 @@ function invalid_args {
exit 1
}
-function verify_config {
+function parse_config {
if [[ -f ${conf}/slaves ]]; then
echo "ERROR: A 'slaves' file was found in ${conf}/"
- echo "Accumulo now reads tablet server hosts from 'tservers' and requires
that the 'slaves' file not be present to reduce confusion."
+ echo "Accumulo now uses cluster host configuration information from
'cluster.yml' and requires that the 'slaves' file not be present to reduce
confusion."
echo "Please rename the 'slaves' file to 'tservers' or remove it if both
exist."
exit 1
fi
- if [[ ! -f ${conf}/tservers ]]; then
- echo "ERROR: A 'tservers' file was not found at ${conf}/tservers"
- echo "Please make sure it exists and is configured with tablet server
hosts."
+ if [[ ! -f ${conf}/cluster.yml ]]; then
+ echo "ERROR: A 'cluster.yml' file was not found at ${conf}/cluster.yml"
+ echo "Please make sure it exists and is configured with the host
information. Run 'create-config' to create an example configuration."
exit 1
fi
- unset manager1
- if [[ -f "${conf}/$manager_file" ]]; then
- manager1=$(grep -Ev '(^#|^\s*$)' "${conf}/$manager_file" | head -1)
+ config=$(${accumulo_cmd}
org.apache.accumulo.core.conf.cluster.ClusterConfigParser ${conf}/cluster.yml)
+
+ declare -A HOSTS
+
+ HOSTS["MANAGERS"]=$(grep managers $config | awk -F':' '{ print $2 }')
+ HOSTS["MONITORS"]=$(grep monitors $config | awk -F':' '{ print $2 }')
+ HOSTS["GC"]=$(grep gc $config | awk -F':' '{ print $2 }')
+ HOSTS["TRACER"]=$(grep tracer $config | awk -F':' '{ print $2 }')
+ HOSTS["TSERVERS"]=$(grep tservers $config | awk -F':' '{ print $2 }')
+
+ grep "compaction" $config $2>1 > /dev/null
+ compaction_in_config=$?
+
+ if [[ $compaction_in_config -eq 0 ]]; then
+ HOSTS["COMPACTION_COORDINATORS"]=$(grep "compaction.coordinators" $file |
awk -F':' '{ print $2 }')
+ HOSTS["COMPACTION_QUEUES"]=$(grep "compaction.compactors.queues" $file |
awk -F':' '{ print $2 }')
Review comment:
Probably better to use `cut -f2- -d:` rather than the heavyweight `awk
-F':' '{ print $2 }'`
##########
File path: assemble/bin/accumulo-cluster
##########
@@ -340,57 +458,74 @@ function main() {
accumulo_cmd="${bin}/accumulo"
SSH='ssh -qnf -o ConnectTimeout=2'
- manager_file="managers"
- if [[ ! -f "$conf/$manager_file" && -f "$conf/masters" ]]; then
- echo "WARN : Use of 'masters' file is deprecated; use 'managers' file
instead."
- manager_file="masters"
- fi
-
case "$1" in
create-config)
- echo "localhost" > "$conf/gc"
- echo "localhost" > "$conf/managers"
- echo "localhost" > "$conf/monitor"
- echo "localhost" > "$conf/tracers"
- echo "localhost" > "$conf/tservers"
+ cat <<EOF > $conf/cluster.yml
+managers:
+ - localhost
+
+monitors:
+ - localhost
+
+tracer:
+ - localhost
+
+gc:
+ - localhost
+
+tservers:
+ - localhost
+
+#compaction:
+# coordinators:
+# - localhost
+# compactors:
+# - queues:
+# - q1
+# - q2
+# - q1:
+# - localhost
+# - q2:
+# - localhost
+ EOF
;;
restart)
- verify_config
Review comment:
Nice how we already had a method that could be swapped out to do
something more useful.
##########
File path: assemble/bin/accumulo-cluster
##########
@@ -43,52 +43,74 @@ function invalid_args {
exit 1
}
-function verify_config {
+function parse_config {
if [[ -f ${conf}/slaves ]]; then
echo "ERROR: A 'slaves' file was found in ${conf}/"
- echo "Accumulo now reads tablet server hosts from 'tservers' and requires
that the 'slaves' file not be present to reduce confusion."
+ echo "Accumulo now uses cluster host configuration information from
'cluster.yml' and requires that the 'slaves' file not be present to reduce
confusion."
echo "Please rename the 'slaves' file to 'tservers' or remove it if both
exist."
exit 1
fi
- if [[ ! -f ${conf}/tservers ]]; then
- echo "ERROR: A 'tservers' file was not found at ${conf}/tservers"
- echo "Please make sure it exists and is configured with tablet server
hosts."
+ if [[ ! -f ${conf}/cluster.yml ]]; then
+ echo "ERROR: A 'cluster.yml' file was not found at ${conf}/cluster.yml"
+ echo "Please make sure it exists and is configured with the host
information. Run 'create-config' to create an example configuration."
exit 1
fi
- unset manager1
- if [[ -f "${conf}/$manager_file" ]]; then
- manager1=$(grep -Ev '(^#|^\s*$)' "${conf}/$manager_file" | head -1)
+ config=$(${accumulo_cmd}
org.apache.accumulo.core.conf.cluster.ClusterConfigParser ${conf}/cluster.yml)
+
+ declare -A HOSTS
+
+ HOSTS["MANAGERS"]=$(grep managers $config | awk -F':' '{ print $2 }')
+ HOSTS["MONITORS"]=$(grep monitors $config | awk -F':' '{ print $2 }')
+ HOSTS["GC"]=$(grep gc $config | awk -F':' '{ print $2 }')
+ HOSTS["TRACER"]=$(grep tracer $config | awk -F':' '{ print $2 }')
+ HOSTS["TSERVERS"]=$(grep tservers $config | awk -F':' '{ print $2 }')
+
+ grep "compaction" $config $2>1 > /dev/null
+ compaction_in_config=$?
+
+ if [[ $compaction_in_config -eq 0 ]]; then
+ HOSTS["COMPACTION_COORDINATORS"]=$(grep "compaction.coordinators" $file |
awk -F':' '{ print $2 }')
+ HOSTS["COMPACTION_QUEUES"]=$(grep "compaction.compactors.queues" $file |
awk -F':' '{ print $2 }')
+
+ IFS_ORIG=$IFS
+ IFS=","
+ for queue in ${HOSTS["COMPACTION_QUEUES"]}; do
+ HOSTS["COMPACTORS_$queue"]=$(grep "compaction.compactors.$queue" $file |
awk -F':' '{ print $2 }')
+ done
+ IFS=$IFS_ORIG
fi
- if [[ -z "${monitor}" ]] ; then
- monitor=$manager1
- if [[ -f "${conf}/monitor" ]]; then
- monitor=$(grep -Ev '(^#|^\s*$)' "${conf}/monitor" | head -1)
- fi
- if [[ -z "${monitor}" ]] ; then
- echo "Could not infer a Monitor role. You need to either define
\"${conf}/monitor\","
- echo "or make sure \"${conf}/$manager_file\" is non-empty."
- exit 1
- fi
+ if [[ ${#HOSTS["MANAGERS"]} -eq 0 ]]; then
+ echo "ERROR: managers not found in ${conf}/cluster.yml"
+ exit 1
fi
- if [[ ! -f "${conf}/tracers" ]]; then
- if [[ -z "${manager1}" ]] ; then
- echo "Could not find a manager node to use as a default for the tracer
role."
- echo "Either set up \"${conf}/tracers\" or make sure
\"${conf}/$manager_file\" is non-empty."
- exit 1
- else
- echo "$manager1" > "${conf}/tracers"
- fi
+
+ if [[ ${#HOSTS["TSERVERS"]} -eq 0 ]]; then
+ echo "ERROR: tservers not found in ${conf}/cluster.yml"
+ exit 1
fi
- if [[ ! -f "${conf}/gc" ]]; then
- if [[ -z "${manager1}" ]] ; then
- echo "Could not infer a GC role. You need to either set up
\"${conf}/gc\" or make sure \"${conf}/$manager_file\" is non-empty."
- exit 1
- else
- echo "$manager1" > "${conf}/gc"
- fi
+
+ unset manager1
+ manager1=$(echo "${HOSTS["MANAGERS"]}" | cut -d"," -f1)
+
+ if [[ ${#HOSTS["MONITORS"]} -eq 0 ]]; then
+ echo "WARN: monitors not found in ${conf}/cluster.yml, using first manager
host $manager1"
+ HOSTS["MONITORS"]=$manager1
+ exit 1
+ fi
Review comment:
You probably don't want to exit with an error if it's just a warning and
you're mitigating it by assigning the first manager. Also, the Java code could
probably do that logic, so that its output is always valid, and there's less
checking for whether a server type is set or falls back to the default.
##########
File path: assemble/bin/accumulo-cluster
##########
@@ -43,52 +43,74 @@ function invalid_args {
exit 1
}
-function verify_config {
+function parse_config {
if [[ -f ${conf}/slaves ]]; then
echo "ERROR: A 'slaves' file was found in ${conf}/"
- echo "Accumulo now reads tablet server hosts from 'tservers' and requires
that the 'slaves' file not be present to reduce confusion."
+ echo "Accumulo now uses cluster host configuration information from
'cluster.yml' and requires that the 'slaves' file not be present to reduce
confusion."
echo "Please rename the 'slaves' file to 'tservers' or remove it if both
exist."
exit 1
fi
- if [[ ! -f ${conf}/tservers ]]; then
- echo "ERROR: A 'tservers' file was not found at ${conf}/tservers"
- echo "Please make sure it exists and is configured with tablet server
hosts."
+ if [[ ! -f ${conf}/cluster.yml ]]; then
+ echo "ERROR: A 'cluster.yml' file was not found at ${conf}/cluster.yml"
+ echo "Please make sure it exists and is configured with the host
information. Run 'create-config' to create an example configuration."
exit 1
fi
- unset manager1
- if [[ -f "${conf}/$manager_file" ]]; then
- manager1=$(grep -Ev '(^#|^\s*$)' "${conf}/$manager_file" | head -1)
+ config=$(${accumulo_cmd}
org.apache.accumulo.core.conf.cluster.ClusterConfigParser ${conf}/cluster.yml)
Review comment:
Might want to catch a bad return code from Java here, as in:
```suggestion
config=$(${accumulo_cmd}
org.apache.accumulo.core.conf.cluster.ClusterConfigParser ${conf}/cluster.yml)
|| call_some_die_method_here
```
##########
File path:
core/src/test/java/org/apache/accumulo/core/conf/cluster/ClusterConfigParserTest.java
##########
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.accumulo.core.conf.cluster;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertFalse;
+import static org.junit.Assert.assertNotNull;
+import static org.junit.Assert.assertTrue;
+
+import java.io.File;
+import java.net.URL;
+import java.util.Map;
+
+import org.junit.Test;
+
+import edu.umd.cs.findbugs.annotations.SuppressFBWarnings;
+
+@SuppressFBWarnings(value = "PATH_TRAVERSAL_IN", justification = "paths
provided by test")
+public class ClusterConfigParserTest {
+
+ @Test
+ public void testParse() throws Exception {
+ URL configFile = ClusterConfigParserTest.class.getResource("/cluster.yml");
+ assertNotNull(configFile);
+
+ Map<String,String> contents =
+ ClusterConfigParser.parseConfiguration(new
File(configFile.toURI()).getAbsolutePath());
+ assertEquals(5, contents.size());
+ assertTrue(contents.containsKey("managers"));
+ assertEquals("localhost1,localhost2", contents.get("managers"));
+ assertTrue(contents.containsKey("monitors"));
+ assertEquals("localhost1,localhost2", contents.get("monitors"));
+ assertTrue(contents.containsKey("tracer"));
+ assertEquals("localhost", contents.get("tracer"));
+ assertTrue(contents.containsKey("gc"));
+ assertEquals("localhost", contents.get("gc"));
+ assertTrue(contents.containsKey("tservers"));
+ assertEquals("localhost1,localhost2,localhost3,localhost4",
contents.get("tservers"));
+ assertFalse(contents.containsKey("compaction"));
+ assertFalse(contents.containsKey("compaction.coordinators"));
+ assertFalse(contents.containsKey("compaction.compactors"));
+ assertFalse(contents.containsKey("compaction.compactors.queues"));
+ assertFalse(contents.containsKey("compaction.compactors.q1"));
+ assertFalse(contents.containsKey("compaction.compactors.q2"));
+ }
+
+ @Test
+ public void testParseWithExternalCompactions() throws Exception {
+ URL configFile =
+
ClusterConfigParserTest.class.getResource("/cluster-with-external-compactions.yml");
Review comment:
We have a lot of top-level resources. Resources can go in package
folders like classes. This makes their location on the class path less likely
to collide. Granted, this file name isn't likely to collide, but I still think
it's best practice to put resources in a package. Could put this file in
`src/test/resources/org/apache/accumulo/core/conf/cluster`, and then I think
you could drop the leading `/` when loading and just load it by file name (but
I'm not 100% sure I remember rules for locating resources).
##########
File path: assemble/bin/accumulo-cluster
##########
@@ -43,52 +43,74 @@ function invalid_args {
exit 1
}
-function verify_config {
+function parse_config {
if [[ -f ${conf}/slaves ]]; then
echo "ERROR: A 'slaves' file was found in ${conf}/"
- echo "Accumulo now reads tablet server hosts from 'tservers' and requires
that the 'slaves' file not be present to reduce confusion."
+ echo "Accumulo now uses cluster host configuration information from
'cluster.yml' and requires that the 'slaves' file not be present to reduce
confusion."
echo "Please rename the 'slaves' file to 'tservers' or remove it if both
exist."
exit 1
fi
- if [[ ! -f ${conf}/tservers ]]; then
- echo "ERROR: A 'tservers' file was not found at ${conf}/tservers"
- echo "Please make sure it exists and is configured with tablet server
hosts."
+ if [[ ! -f ${conf}/cluster.yml ]]; then
+ echo "ERROR: A 'cluster.yml' file was not found at ${conf}/cluster.yml"
+ echo "Please make sure it exists and is configured with the host
information. Run 'create-config' to create an example configuration."
exit 1
fi
- unset manager1
- if [[ -f "${conf}/$manager_file" ]]; then
- manager1=$(grep -Ev '(^#|^\s*$)' "${conf}/$manager_file" | head -1)
+ config=$(${accumulo_cmd}
org.apache.accumulo.core.conf.cluster.ClusterConfigParser ${conf}/cluster.yml)
+
+ declare -A HOSTS
+
+ HOSTS["MANAGERS"]=$(grep managers $config | awk -F':' '{ print $2 }')
+ HOSTS["MONITORS"]=$(grep monitors $config | awk -F':' '{ print $2 }')
+ HOSTS["GC"]=$(grep gc $config | awk -F':' '{ print $2 }')
+ HOSTS["TRACER"]=$(grep tracer $config | awk -F':' '{ print $2 }')
+ HOSTS["TSERVERS"]=$(grep tservers $config | awk -F':' '{ print $2 }')
+
+ grep "compaction" $config $2>1 > /dev/null
+ compaction_in_config=$?
+
+ if [[ $compaction_in_config -eq 0 ]]; then
Review comment:
You don't need to check `$?` when you can put the command directly into
the if statement. Here, `grep -q` is useful for matching without displaying the
output:
```suggestion
if echo "$config" | grep -q compaction; then
```
##########
File path: assemble/bin/accumulo-cluster
##########
@@ -304,16 +417,21 @@ function stop_here() {
if grep -Eq 'localhost|127[.]0[.]0[.]1' "${conf}/tservers"; then
${accumulo_cmd} admin stop localhost
else
+ IFS_ORIG=$IFS
+ IFS=","
for host in "${hosts_to_check[@]}"; do
- if grep -q "$host" "${conf}/tservers"; then
- ${accumulo_cmd} admin stop "$host"
- fi
+ for tserver in ${HOSTS["TSERVERS"]}; do
+ if grep -q "$host" "$tserver"; then
+ ${accumulo_cmd} admin stop "$host"
+ fi
+ done
done
+ IFS=$IFS_ORIG
Review comment:
This looks like the correct use of IFS... but messing with IFS can have
unexpected consequences, especially when a future dev comes in later and
changes something without understanding the implications.
Instead of having these comma-separated, just make them space separated or
use an array to begin with, as in the outer loop. It'd be much more reliable.
##########
File path: assemble/bin/accumulo-cluster
##########
@@ -340,57 +458,74 @@ function main() {
accumulo_cmd="${bin}/accumulo"
SSH='ssh -qnf -o ConnectTimeout=2'
- manager_file="managers"
- if [[ ! -f "$conf/$manager_file" && -f "$conf/masters" ]]; then
- echo "WARN : Use of 'masters' file is deprecated; use 'managers' file
instead."
- manager_file="masters"
- fi
-
case "$1" in
create-config)
- echo "localhost" > "$conf/gc"
- echo "localhost" > "$conf/managers"
- echo "localhost" > "$conf/monitor"
- echo "localhost" > "$conf/tracers"
- echo "localhost" > "$conf/tservers"
+ cat <<EOF > $conf/cluster.yml
+managers:
+ - localhost
+
+monitors:
+ - localhost
+
+tracer:
+ - localhost
+
+gc:
+ - localhost
+
+tservers:
+ - localhost
+
+#compaction:
+# coordinators:
+# - localhost
+# compactors:
+# - queues:
+# - q1
+# - q2
+# - q1:
+# - localhost
+# - q2:
+# - localhost
+ EOF
Review comment:
Nice. But, the EOF needs to be all the way to the left for this to work
properly. It needs to be the only thing on the line.
```suggestion
cat <<EOF > $conf/cluster.yml
managers:
- localhost
monitors:
- localhost
tracer:
- localhost
gc:
- localhost
tservers:
- localhost
#compaction:
# coordinators:
# - localhost
# compactors:
# - queues:
# - q1
# - q2
# - q1:
# - localhost
# - q2:
# - localhost
EOF
```
##########
File path: assemble/bin/accumulo-cluster
##########
@@ -43,52 +43,74 @@ function invalid_args {
exit 1
}
-function verify_config {
+function parse_config {
if [[ -f ${conf}/slaves ]]; then
echo "ERROR: A 'slaves' file was found in ${conf}/"
- echo "Accumulo now reads tablet server hosts from 'tservers' and requires
that the 'slaves' file not be present to reduce confusion."
+ echo "Accumulo now uses cluster host configuration information from
'cluster.yml' and requires that the 'slaves' file not be present to reduce
confusion."
echo "Please rename the 'slaves' file to 'tservers' or remove it if both
exist."
exit 1
fi
- if [[ ! -f ${conf}/tservers ]]; then
- echo "ERROR: A 'tservers' file was not found at ${conf}/tservers"
- echo "Please make sure it exists and is configured with tablet server
hosts."
+ if [[ ! -f ${conf}/cluster.yml ]]; then
+ echo "ERROR: A 'cluster.yml' file was not found at ${conf}/cluster.yml"
+ echo "Please make sure it exists and is configured with the host
information. Run 'create-config' to create an example configuration."
exit 1
fi
- unset manager1
- if [[ -f "${conf}/$manager_file" ]]; then
- manager1=$(grep -Ev '(^#|^\s*$)' "${conf}/$manager_file" | head -1)
+ config=$(${accumulo_cmd}
org.apache.accumulo.core.conf.cluster.ClusterConfigParser ${conf}/cluster.yml)
+
+ declare -A HOSTS
+
+ HOSTS["MANAGERS"]=$(grep managers $config | awk -F':' '{ print $2 }')
+ HOSTS["MONITORS"]=$(grep monitors $config | awk -F':' '{ print $2 }')
+ HOSTS["GC"]=$(grep gc $config | awk -F':' '{ print $2 }')
+ HOSTS["TRACER"]=$(grep tracer $config | awk -F':' '{ print $2 }')
+ HOSTS["TSERVERS"]=$(grep tservers $config | awk -F':' '{ print $2 }')
Review comment:
Don't rely on bash 4.0 associative arrays. They won't work on a Mac.
It's also unnecessary, because you can just as easily do `MANAGER_HOSTS=...` or
similar. You don't actually need all hosts to be stored in the same variable.
I'm also a bit confused by the grepping over a variable instead of a file
handle. I don't think that will work.
If the output was something like:
```
manager:host1,host2,host3
gc:host4
```
Then `$config` would look like (a space separated array):
```
manager:host1,host2,host3 gc:host4
```
then you could either loop through the array until you found an item
starting with `manager:`, or just grep for the manager chunk at word boundaries:
```bash
MANAGER_HOSTS=$(echo "$config" | grep '\<managers:[^ ]*' | cut -f2- -d: | tr
',' ' ')
```
Also be careful about using colon as a delimiter for hosts, because hosts
could be specified as IPv6, or we may wish to add port numbers at some point.
In my example above, it's okay, because it only removes the first colon with
the trailing dash in the field specifier, `-f2-` (interpret as fields 2 and
later; the delimiter was specified as a colon with `-d:`).
##########
File path: assemble/bin/accumulo-cluster
##########
@@ -43,52 +43,74 @@ function invalid_args {
exit 1
}
-function verify_config {
+function parse_config {
if [[ -f ${conf}/slaves ]]; then
echo "ERROR: A 'slaves' file was found in ${conf}/"
- echo "Accumulo now reads tablet server hosts from 'tservers' and requires
that the 'slaves' file not be present to reduce confusion."
+ echo "Accumulo now uses cluster host configuration information from
'cluster.yml' and requires that the 'slaves' file not be present to reduce
confusion."
echo "Please rename the 'slaves' file to 'tservers' or remove it if both
exist."
exit 1
fi
- if [[ ! -f ${conf}/tservers ]]; then
- echo "ERROR: A 'tservers' file was not found at ${conf}/tservers"
- echo "Please make sure it exists and is configured with tablet server
hosts."
+ if [[ ! -f ${conf}/cluster.yml ]]; then
+ echo "ERROR: A 'cluster.yml' file was not found at ${conf}/cluster.yml"
+ echo "Please make sure it exists and is configured with the host
information. Run 'create-config' to create an example configuration."
exit 1
fi
- unset manager1
- if [[ -f "${conf}/$manager_file" ]]; then
- manager1=$(grep -Ev '(^#|^\s*$)' "${conf}/$manager_file" | head -1)
+ config=$(${accumulo_cmd}
org.apache.accumulo.core.conf.cluster.ClusterConfigParser ${conf}/cluster.yml)
+
+ declare -A HOSTS
+
+ HOSTS["MANAGERS"]=$(grep managers $config | awk -F':' '{ print $2 }')
+ HOSTS["MONITORS"]=$(grep monitors $config | awk -F':' '{ print $2 }')
+ HOSTS["GC"]=$(grep gc $config | awk -F':' '{ print $2 }')
+ HOSTS["TRACER"]=$(grep tracer $config | awk -F':' '{ print $2 }')
+ HOSTS["TSERVERS"]=$(grep tservers $config | awk -F':' '{ print $2 }')
+
+ grep "compaction" $config $2>1 > /dev/null
+ compaction_in_config=$?
+
+ if [[ $compaction_in_config -eq 0 ]]; then
+ HOSTS["COMPACTION_COORDINATORS"]=$(grep "compaction.coordinators" $file |
awk -F':' '{ print $2 }')
+ HOSTS["COMPACTION_QUEUES"]=$(grep "compaction.compactors.queues" $file |
awk -F':' '{ print $2 }')
+
+ IFS_ORIG=$IFS
+ IFS=","
+ for queue in ${HOSTS["COMPACTION_QUEUES"]}; do
+ HOSTS["COMPACTORS_$queue"]=$(grep "compaction.compactors.$queue" $file |
awk -F':' '{ print $2 }')
Review comment:
could do `eval COMPACTORS_$queue=...` instead of an associative array
for these. Or, put the queue name at the front, in a regular bash array, as in:
```bash
COMPACTOR_QUEUES=($queue:'compactors,here')
```
You can use += on bash 3.2 on Macs, I've read. But, another way to append to
an array is:
```bash
COMPACTOR_QUEUES=("${COMPACTOR_QUEUES[@]}" $queue:'compactors,here')
```
##########
File path: assemble/bin/accumulo-cluster
##########
@@ -116,19 +138,21 @@ function control_service() {
if [[ $host == localhost || $host == "$(hostname -s)" || $host ==
"$(hostname -f)" || $host == $(get_ip) ]] ; then
ACCUMULO_SERVICE_INSTANCE="${ACCUMULO_SERVICE_INSTANCE}"
"${bin}/accumulo-service" "$service" "$control_cmd"
else
- $SSH "$host" "bash -c
'ACCUMULO_SERVICE_INSTANCE=${ACCUMULO_SERVICE_INSTANCE} ${bin}/accumulo-service
\"$service\" \"$control_cmd\"'"
+ $SSH "$host" "bash -c
'ACCUMULO_SERVICE_INSTANCE=${ACCUMULO_SERVICE_INSTANCE} ${bin}/accumulo-service
\"$service\" \"$control_cmd\" \"{$@:4}'"
Review comment:
This looks incorrect. There's no trailing escaped double-quote. Also,
the quoting is extremely hard to reason about here... not sure it's a great
idea to pass along additional arbitrary arguments to bash within ssh within
bash when they could have their own quoting issues.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]