[GitHub] keith-turner closed pull request #21: Created performance test framework
keith-turner closed pull request #21: Created performance test framework
URL: https://github.com/apache/accumulo-testing/pull/21

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/README.md b/README.md
index 8826cc3..c1e8d91 100644
--- a/README.md
+++ b/README.md
@@ -53,7 +53,7 @@ The YARN application can be killed at any time using the YARN resource manager o
 ## Continuous Ingest & Query
 
 The Continuous Ingest test runs many ingest clients that continually create linked lists of data
-in Accumulo. During ingest, query applications can be run to continously walk and verify the the
+in Accumulo. During ingest, query applications can be run to continuously walk and verify the
 linked lists and put a query load on Accumulo. At some point, the ingest clients are stopped and
 a MapReduce job is run to ensure that there are no holes in any linked list.
 
@@ -135,6 +135,78 @@ Run the command below stop the agitator:
 
         ./bin/accumulo-testing agitator stop
 
+## Performance Test
+
+To run performance test a `cluster-control.sh` script is needed to assist with starting, stopping,
+wiping, and confguring an Accumulo instance. This script should define the following functions.
+
+```bash
+function get_version {
+  case $1 in
+    ACCUMULO)
+      # TODO echo accumulo version
+      ;;
+    HADOOP)
+      # TODO echo hadoop version
+      ;;
+    ZOOKEEPER)
+      # TODO echo zookeeper version
+      ;;
+    *)
+      return 1
+  esac
+}
+
+function start_cluster {
+  # TODO start Hadoop and Zookeeper if needed
+}
+
+function setup_accumulo {
+  # TODO kill any running Accumulo instance
+  # TODO setup a fresh install of Accumulo w/o starting it
+}
+
+function get_config_file {
+  local file_to_get=$1
+  local dest_dir=$2
+  # TODO copy $file_to_get from Accumulo conf dir to $dest_dir
+}
+
+function put_config_file {
+  local config_file=$1
+  # TODO copy $config_file to Accumulo conf dir
+}
+
+function put_server_code {
+  local jar_file=$1
+  # TODO add $jar_file to Accumulo's server side classpath. Could put it in $ACCUMULO_HOME/lib/ext
+}
+
+function start_accumulo {
+  # TODO start accumulo
+}
+
+function stop_cluster {
+  # TODO kill Accumulo, Hadoop, and Zookeeper
+}
+```
+
+
+
+An example script for [Uno] is provided. To use this doe the following and set
+`UNO_HOME` after copying.
+
+        cp conf/cluster-control.sh.uno conf/cluster-control.sh
+
+After the cluster control script is setup, the following will run performance
+test and produce json result files.
+
+        ./bin/performance-test run
+
+There are some utilities for working with the json result files, run the performance-test script
+with no options to see them.
+
+[Uno]: https://github.com/apache/fluo-uno
 [modules]: core/src/main/resources/randomwalk/modules
 [image]: core/src/main/resources/randomwalk/modules/Image.xml
 [ti]: https://travis-ci.org/apache/accumulo-testing.svg?branch=master
diff --git a/bin/performance-test b/bin/performance-test
new file mode 100755
index 000..93014e5
--- /dev/null
+++ b/bin/performance-test
@@ -0,0 +1,97 @@
+#! /usr/bin/env bash
+
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+bin_dir=$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )
+at_home=$( cd "$( dirname "$bin_dir" )" && pwd )
+at_version=2.0.0-SNAPSHOT
+
+function print_usage() {
+  cat < ()
+
+Possible commands:
+  run Runs performance tests.
+  compare Compares results of two test.
+  csv {files} Converts results to CSV
+EOF
+}
+
+
+function build_shade_jar() {
+  at_shaded_jar="$at_home/core/target/accumulo-testing-core-$at_version-shaded.jar"
+  if [ ! -f "$at_shaded_jar" ]; then
+    echo "Building $at_shaded_jar"
+    cd "$at_home" || exit 1
+    mvn clean package -P create-shade-jar -D skipTests -D accumulo.version=$(get_version "ACCUMULO") -D hadoop.version=$(get_version "HADOOP") -D zookeeper.version=$(get_version "ZOOKEEPER")
+  fi
+}
+
+log4j_config="$at_home/conf/log4j.properties"
+if [ ! -f
[GitHub] mikewalch opened a new issue #23: Add units to performance test results
mikewalch opened a new issue #23: Add units to performance test results
URL: https://github.com/apache/accumulo-testing/issues/23

It would be helpful if the JSON contained units like 'milliseconds' or 'ms'.

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services
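As a rough illustration of what this issue asks for, the Java sketch below attaches a unit string to a result and writes it into the JSON. It is only a sketch under assumptions: the class name `ResultWithUnits`, the `units` field, and the `toJson()` helper are hypothetical and do not appear in pull request #21. Only the `id`, `data`, and `description` fields mirror the `Result` class from that PR.

```java
import java.util.Locale;

// Hypothetical sketch only: "units" and toJson() are assumptions,
// not part of the Result class in pull request #21.
final class ResultWithUnits {
  final String id;
  final Number data;
  final String units; // assumed field, for example "ms"
  final String description;

  ResultWithUnits(String id, Number data, String units, String description) {
    this.id = id;
    this.data = data;
    this.units = units;
    this.description = description;
  }

  // Escapes only backslashes and double quotes; control characters are not handled.
  private static String esc(String s) {
    return s.replace("\\", "\\\\").replace("\"", "\\\"");
  }

  // Hand-rolled serialization keeps the sketch free of third-party JSON libraries.
  String toJson() {
    return String.format(Locale.ROOT,
        "{\"id\":\"%s\",\"data\":%s,\"units\":\"%s\",\"description\":\"%s\"}",
        esc(id), data, esc(units), esc(description));
  }
}
```

Keeping the unit as a separate field, rather than folding it into the description, would let tools that read the result files label values without parsing free text.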
[GitHub] mikewalch commented on a change in pull request #21: Created performance test framework
mikewalch commented on a change in pull request #21: Created performance test framework
URL: https://github.com/apache/accumulo-testing/pull/21#discussion_r204536969

## File path: core/src/main/java/org/apache/accumulo/testing/core/performance/Result.java ##

@@ -0,0 +1,30 @@
+package org.apache.accumulo.testing.core.performance;
+
+public class Result {
+
+  public final String id;
+  public final Number data;
+  public final Stats stats;
+  public final String description;
+  public final Purpose purpose;
+
+  public enum Purpose {
+    INFORMATIONAL, COMPARISON

Review comment:
It might be nice to add some comments describing the difference between informational and comparison.
[GitHub] mikewalch commented on a change in pull request #21: Created performance test framework
mikewalch commented on a change in pull request #21: Created performance test framework
URL: https://github.com/apache/accumulo-testing/pull/21#discussion_r204531262

## File path: core/src/main/java/org/apache/accumulo/testing/core/performance/tests/ScanFewFamiliesPT.java ##

@@ -0,0 +1,97 @@
+package org.apache.accumulo.testing.core.performance.tests;

Review comment:
You are missing license headers.
[GitHub] mikewalch commented on a change in pull request #21: Created performance test framework
mikewalch commented on a change in pull request #21: Created performance test framework
URL: https://github.com/apache/accumulo-testing/pull/21#discussion_r204472745

## File path: README.md ##

@@ -135,6 +135,19 @@ Run the command below stop the agitator:
 
         ./bin/accumulo-testing agitator stop
 
+## Performance Test
+
+To run performance test a `cluster-control.sh` script is needed to assist with starting, stopping,
+wiping, and confguring an Accumulo instance. An example script for [Uno] is provided. After

Review comment:
Could show users how to set this up
```
cp conf/cluster-control.sh.uno conf/cluster-control.sh
```
[GitHub] mikewalch commented on a change in pull request #21: Created performance test framework
mikewalch commented on a change in pull request #21: Created performance test framework
URL: https://github.com/apache/accumulo-testing/pull/21#discussion_r204473266

## File path: README.md ##

@@ -135,6 +135,19 @@ Run the command below stop the agitator:
 
         ./bin/accumulo-testing agitator stop
 
+## Performance Test
+
+To run performance test a `cluster-control.sh` script is needed to assist with starting, stopping,
+wiping, and confguring an Accumulo instance. An example script for [Uno] is provided. After

Review comment:
Could mention that `UNO_HOME` needs to be set in this script
[jira] [Commented] (ACCUMULO-3806) Failing to create a table/namespace because it already exists should not be a warning
[ https://issues.apache.org/jira/browse/ACCUMULO-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553090#comment-16553090 ]

Keith Turner commented on ACCUMULO-3806:
----------------------------------------

Does ACCUMULO-3925 address this issue with the introduction of {{AcceptableThriftTableOperationException.java}}?

> Failing to create a table/namespace because it already exists should not be a warning
> -------------------------------------------------------------------------------------
>
>                 Key: ACCUMULO-3806
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3806
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: fate
>            Reporter: Josh Elser
>            Priority: Major
>              Labels: newbie
>             Fix For: 2.0.0
>
>         Attachments: 0001-ACCUMULO-3806-changed-checkTableDoesNotExist-in-accu.patch
>
>
> This is a really common occurrence when you're running randomwalk:
> {noformat}
> Failed to execute Repo, tid=63d0421f1b17b04a
> ThriftTableOperationException(tableId:null, tableName:nspc_001.ctt_000, op:CREATE, type:EXISTS, description:null)
>         at org.apache.accumulo.master.tableOps.Utils.checkTableDoesNotExist(Utils.java:54)
>         at org.apache.accumulo.master.tableOps.PopulateZookeeper.call(PopulateZookeeper.java:54)
>         at org.apache.accumulo.master.tableOps.PopulateZookeeper.call(PopulateZookeeper.java:30)
>         at org.apache.accumulo.master.tableOps.TraceRepo.call(TraceRepo.java:57)
>         at org.apache.accumulo.fate.Fate$TransactionRunner.run(Fate.java:72)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Concurrent table creations run: only one succeeds and the others fail. This is expected and what FATE was designed to handle. We shouldn't be pushing these up to the monitor -- should probably be an info or debug message.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
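The idea in the issue description, logging an expected "already exists" failure at a lower level than a real error, can be sketched in Java as follows. This is not Accumulo code: `ExpectedFailureLogging`, `FailureType`, `TableOpException`, and `levelFor` are hypothetical names made up for illustration, and the sketch makes no claim about how `AcceptableThriftTableOperationException` from ACCUMULO-3925 actually works. The levels come from `java.util.logging`.

```java
import java.util.logging.Level;

// Hypothetical sketch: none of these types exist in Accumulo.
final class ExpectedFailureLogging {

  // Stand-in for the failure kind carried by a table operation error.
  enum FailureType { EXISTS, OTHER }

  // Stand-in for an exception thrown when a table operation fails.
  static final class TableOpException extends Exception {
    final FailureType type;

    TableOpException(FailureType type, String message) {
      super(message);
      this.type = type;
    }
  }

  // Picks a log level only; the exception itself should still be
  // thrown so the caller learns that its operation did not succeed.
  static Level levelFor(Exception e) {
    if (e instanceof TableOpException
        && ((TableOpException) e).type == FailureType.EXISTS) {
      return Level.FINE; // expected under concurrent creates
    }
    return Level.WARNING; // anything else may need attention
  }
}
```

A caller would log with `logger.log(ExpectedFailureLogging.levelFor(e), "Failed to execute Repo", e)` and then rethrow, so only the severity of the log entry changes.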
[GitHub] keith-turner opened a new issue #22: Hadoop dependencies not properly converged in shaded jar
keith-turner opened a new issue #22: Hadoop dependencies not properly converged in shaded jar
URL: https://github.com/apache/accumulo-testing/issues/22

When building the testing shaded jar using Accumulo 2.0.0-SNAP and Hadoop 2.8.4 the hadoop dependencies are not properly converged in the shaded jar. Seeing warnings like the following.

```
[WARNING] hadoop-client-api-3.0.2.jar, hadoop-hdfs-client-2.8.4.jar define 1642 overlapping classes:
[WARNING]   - org.apache.hadoop.hdfs.protocol.CacheDirectiveInfo$Expiration
[WARNING]   - org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$SetOwnerResponseProto$Builder
[WARNING]   - org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetFileLinkInfoRequestProto$Builder
[WARNING]   - org.apache.hadoop.hdfs.web.URLConnectionFactory$1
[WARNING]   - org.apache.hadoop.hdfs.protocol.proto.XAttrProtos$SetXAttrRequestProto$1
[WARNING]   - org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ModifyCacheDirectiveResponseProto$1
[WARNING]   - org.apache.hadoop.fs.XAttr$1
[WARNING]   - org.apache.hadoop.hdfs.protocol.CachePoolStats$Builder
[WARNING]   - org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ReportBadBlocksResponseProtoOrBuilder
[WARNING]   - org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetBlockLocationsResponseProto
[WARNING]   - 1632 more...
```
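To see the full set of overlapping classes outside the Maven build (the warning above truncates the list), two jars can be compared directly. The Java sketch below is one possible way to do that using only the standard `java.util.jar.JarFile` API. The class name `JarOverlap` and its method names are hypothetical, and the jar paths would be whatever the local build produced.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Hypothetical helper for inspecting class overlap between two jars.
final class JarOverlap {

  // Returns the sorted names of all .class entries in the given jar.
  static Set<String> classEntries(Path jar) throws IOException {
    try (JarFile jarFile = new JarFile(jar.toFile())) {
      return jarFile.stream()
          .map(entry -> entry.getName())
          .filter(name -> name.endsWith(".class"))
          .collect(Collectors.toCollection(TreeSet::new));
    }
  }

  // Returns the entries present in both sets without modifying either input.
  static Set<String> overlap(Set<String> first, Set<String> second) {
    Set<String> common = new TreeSet<>(first);
    common.retainAll(second);
    return common;
  }
}
```

For example, `JarOverlap.overlap(JarOverlap.classEntries(a), JarOverlap.classEntries(b))` with `a` and `b` pointing at the two Hadoop jars named in the warning would list every shared class entry.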
[jira] [Commented] (ACCUMULO-3806) Failing to create a table/namespace because it already exists should not be a warning
[ https://issues.apache.org/jira/browse/ACCUMULO-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553056#comment-16553056 ]

Josh Elser commented on ACCUMULO-3806:
--------------------------------------

{quote}I'm not sure that I agree with this {quote}

Yeah, doesn't strike me as the right way to solve this. Totally right that this isn't an actionable warning/error – and we know that there are situations in which this is completely expected to happen. If you remove the thrown exception, the client is probably not going to get the correct response (e.g. think that its table creation succeeded which is totally wrong). My take on an improvement would be to suppress the monitor warning IFF we know that this exact table was already created.