Merge branch '1.4.5-SNAPSHOT' into 1.5.2-SNAPSHOT

Conflicts:
        test/system/auto/README.md
        test/system/bench/README
        test/system/continuous/README
        test/system/scalability/README.md
        test/system/test1/README
        test/system/test2/README
        test/system/test4/README


Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo
Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/bcb0905c
Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/bcb0905c
Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/bcb0905c

Branch: refs/heads/master
Commit: bcb0905c474fc981ee30834bd0b42fff906e98d1
Parents: 246104c2 4fdefd7
Author: Bill Havanki <bhava...@cloudera.com>
Authored: Fri Mar 21 14:45:51 2014 -0400
Committer: Bill Havanki <bhava...@cloudera.com>
Committed: Fri Mar 21 14:45:51 2014 -0400

----------------------------------------------------------------------
 test/system/auto/README           | 104 -------------------------------
 test/system/auto/README.md        | 109 +++++++++++++++++++++++++++++++++
 test/system/bench/README          |  44 -------------
 test/system/bench/README.md       |  45 ++++++++++++++
 test/system/continuous/README     |  80 ------------------------
 test/system/continuous/README.md  |  81 ++++++++++++++++++++++++
 test/system/randomwalk/README     |  62 -------------------
 test/system/randomwalk/README.md  |  69 +++++++++++++++++++++
 test/system/scalability/README    |  38 ------------
 test/system/scalability/README.md |  40 ++++++++++++
 test/system/test1/README          |  26 --------
 test/system/test1/README.md       |  29 +++++++++
 test/system/test2/README          |   7 ---
 test/system/test2/README.md       |  10 +++
 test/system/test3/README          |   2 -
 test/system/test3/README.md       |   5 ++
 test/system/test4/README          |   6 --
 test/system/test4/README.md       |   9 +++
 18 files changed, 397 insertions(+), 369 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/accumulo/blob/bcb0905c/test/system/auto/README.md
----------------------------------------------------------------------
diff --cc test/system/auto/README.md
index 0000000,0fb9531..655c3dd
mode 000000,100644..100644
--- a/test/system/auto/README.md
+++ b/test/system/auto/README.md
@@@ -1,0 -1,110 +1,109 @@@
+ Apache Accumulo Functional Tests
+ ================================
+ 
+ These scripts run a series of tests against a small local accumulo instance.
+ To run these scripts, you must have Hadoop and Zookeeper installed and running.
+ You will need a functioning C compiler to build a shared library needed for
+ one of the tests.  The test suite is known to run on Linux RedHat Enterprise
+ version 5, and Mac OS X 10.5.
+ 
+ How to Run
+ ----------
+ 
+ The tests are shown as being run from the `ACCUMULO_HOME` directory, but they
+ should run from any directory. Make sure to create "logs" and "walogs"
+ directories in `ACCUMULO_HOME`.  Also, ensure that `accumulo-env.sh` specifies its
+ `ACCUMULO_LOG_DIR` in the following way:
+ 
+ > `$ test -z "$ACCUMULO_LOG_DIR" && export ACCUMULO_LOG_DIR=$ACCUMULO_HOME/logs`
+ 
+ To list all the test names:
+ 
+ > `$ ./test/system/auto/run.py -l`
+ 
+ You can run the suite like this:
+ 
+ > `$ ./test/system/auto/run.py`
+ 
+ You can select tests using a case-insensitive regular expression:
+ 
+ > `$ ./test/system/auto/run.py -t simple`  
+ > `$ ./test/system/auto/run.py -t SunnyDay`
+ 
+ To run tests repeatedly:
+ 
+ > `$ ./test/system/auto/run.py -r 3`
+ 
+ If you are attempting to debug what is causing a test to fail, you can run the
+ tests in "verbose" mode:
+ 
+ > `$ python test/system/auto/run.py -t SunnyDay -v 10`
+ 
+ If a test is failing, and you would like to examine logs from the run, you can
+ run the test in "dirty" mode which will keep the test from cleaning up all the
+ logs at the end of the run:
+ 
+ > `$ ./test/system/auto/run.py -t some.failing.test -d`
+ 
+ If the test suite hangs, and you would like to re-run the tests starting with
+ the last test that failed:
+ 
+ > `$ ./test/system/auto/run.py -s start.over.test`
+ 
+ If tests tend to time out (on slower hardware, for example), you can scale up
+ the timeout values by a multiplier. This example triples timeouts:
+ 
+ > `$ ./test/system/auto/run.py -f 3`
+ 
+ Test results are normally printed to the console, but you can send them to XML
+ files compatible with Jenkins:
+ 
+ > `$ ./test/system/auto/run.py -x`
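The options above can generally be combined in one invocation. A hypothetical combined example (flag composition is assumed, not stated in this README; `run.py` itself is the authoritative reference):

```shell
# Run tests matching "simple", triple all timeouts, and emit Jenkins-compatible XML.
$ ./test/system/auto/run.py -t simple -f 3 -x
```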
+ 
+ Running under MapReduce
+ -----------------------
+ 
+ The full test suite can take nearly an hour.  If you have a larger Hadoop
+ cluster at your disposal, you can run the tests as a MapReduce job:
+ 
+ > `$ python test/system/auto/run.py -l > tests  
+ $ hadoop fs -put tests /user/hadoop/tests  
 -$ ./bin/accumulo org.apache.accumulo.server.test.functional.RunTests \  
 -    /user/hadoop/tests /user/hadoop/results`
++$ ./bin/accumulo org.apache.accumulo.test.functional.RunTests --tests \  
++    /user/hadoop/tests --output /user/hadoop/results`
+ 
+ The example above runs every test. You can trim the tests file to include
+ only the tests you wish to run.
+ 
 -You may specify a 'timeout factor' via an optional integer as a third argument:
++You may specify a 'timeout factor' via an optional integer argument:
+ 
 -> `$ ./bin/accumulo org.apache.accumulo.server.test.functional.RunTests \  
 -/user/hadoop/tests /user/hadoop/results timeout_factor`
++> `$ ./bin/accumulo org.apache.accumulo.test.functional.RunTests --tests \  
++/user/hadoop/tests --output /user/hadoop/results --timeoutFactor timeout_factor`
+ 
+ Where `timeout_factor` indicates how much we should scale up timeouts. It will
+ be used to set both `mapred.task.timeout` and the "-f" flag used by `run.py`. If
+ not given, `timeout_factor` defaults to 1, which corresponds to a
+ `mapred.task.timeout` of 480 seconds.
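For instance, to double all timeouts (an illustrative invocation reusing the HDFS paths from the example above; the factor value is arbitrary):

```shell
# --timeoutFactor 2 doubles mapred.task.timeout and the -f value passed to run.py
$ ./bin/accumulo org.apache.accumulo.test.functional.RunTests \
    --tests /user/hadoop/tests --output /user/hadoop/results --timeoutFactor 2
```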
+ 
+ In some clusters, the user under which MR jobs run is different from the user
+ under which Accumulo is installed, and this can cause failures running the
+ tests. Various configuration and permission changes can be made to help the
+ tests run, including the following:
+ 
+ * Opening up directory and file permissions on each cluster node so that the MR
+   user has the same read/write privileges as the Accumulo user. Adding the MR
+   user to a shared group is one easy way to accomplish this. Access is required
+   to the Accumulo installation, log, write-ahead log, and configuration
+   directories.
+ * Creating a user directory in HDFS, named after and owned by the MR user,
+   e.g., `/user/mruser`.
+ * Setting the `ZOOKEEPER_HOME` and `HADOOP_CONF_DIR` environment variables for the
+   MR user. These can be set using the `mapred.child.env` property in
+   `mapred-site.xml`, e.g.:
+ 
+   `<property>  
+     <name>mapred.child.env</name>  
+     <value>ZOOKEEPER_HOME=/path/to/zookeeper,HADOOP_CONF_DIR=/path/to/hadoop/conf</value>  
+   </property>`
+ 
+ Each functional test is run by a mapper, and so you can check the mapper logs
+ to see any error messages tests produce.
 -

http://git-wip-us.apache.org/repos/asf/accumulo/blob/bcb0905c/test/system/continuous/README.md
----------------------------------------------------------------------
diff --cc test/system/continuous/README.md
index 0000000,6cf5d81..3e7e234
mode 000000,100644..100644
--- a/test/system/continuous/README.md
+++ b/test/system/continuous/README.md
@@@ -1,0 -1,72 +1,81 @@@
+ Continuous Query and Ingest
+ ===========================
+ 
+ This directory contains a suite of scripts for placing continuous query and
+ ingest load on accumulo.  The purpose of these scripts is two-fold. First,
+ place continuous load on accumulo to see if it breaks.  Second, collect
+ statistics in order to understand how accumulo behaves.  To run these scripts,
+ copy all of the `.example` files and modify them.  You can put these scripts in
+ the current directory or define a `CONTINUOUS_CONF_DIR` from which the files will be
+ read. These scripts rely on `pssh`. Before running any script you may need
+ to use `pssh` to create the log directory on each machine (if you want it local).
 -Also, create the table "ci" before running.
++Also, create the table "ci" before running. You can run
++`org.apache.accumulo.test.continuous.GenSplits` to generate split points for a
++continuous ingest table.
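A minimal setup sketch of the steps above (the hosts file name, log path, and credentials are placeholders, not part of this repository):

```shell
# Copy and edit each .example config file
$ cp continuous-env.sh.example continuous-env.sh
# Create the log directory on every node listed in a pssh hosts file
$ pssh -h hosts.txt "mkdir -p /var/tmp/ci-logs"
# Create the "ci" table before starting ingest
$ ./bin/accumulo shell -u root -p secret -e "createtable ci"
```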
+ 
+ The following ingest scripts insert data into accumulo that will form a random
+ graph.
+ 
+ > `$ start-ingest.sh  
+ $ stop-ingest.sh`
+ 
+ The following query scripts randomly walk the graph created by the ingesters.
+ Each walker produces detailed statistics on query/scan times.
+ 
+ > `$ start-walkers.sh  
+ $ stop-walkers.sh`
+ 
+ The following scripts start and stop batch walkers.
+ 
+ > `$ start-batchwalkers.sh  
+ $ stop-batchwalkers.sh`
+ 
+ In addition to placing continuous load, the following scripts start and stop a
+ service that continually collects statistics about accumulo and HDFS.
+ 
+ > `$ start-stats.sh  
+ $ stop-stats.sh`
+ 
 -Optionally, start the agitator to periodically kill random servers.
++Optionally, start the agitator to periodically kill the tabletserver and/or datanode
++process(es) on random nodes. You can run this script as root and it will properly start
++processes as the user you configured in `continuous-env.sh` (`HDFS_USER` for the Datanode and
++`ACCUMULO_USER` for Accumulo processes). If you run it as yourself and the `HDFS_USER` and
++`ACCUMULO_USER` values are the same as your user, the agitator will not change users. In
++the case where you run the agitator as a non-privileged user which isn't the same as `HDFS_USER`
++or `ACCUMULO_USER`, the agitator will attempt to `sudo` to these users, which relies on correct
++configuration of sudo. Also, be sure that your `HDFS_USER` has password-less `ssh` configured.
+ 
+ > `$ start-agitator.sh  
+ $ stop-agitator.sh`
+ 
+ Start all three of these services and let them run for a few hours. Then run
+ `report.pl` to generate a simple HTML report containing plots and histograms
+ showing what has transpired.
+ 
+ A MapReduce job to verify all data created by continuous ingest can be run
+ with the following command.  Before running the command modify the `VERIFY_*`
+ variables in `continuous-env.sh` if needed.  Do not run ingest while running this
+ command, as this will cause erroneous reporting of UNDEFINED nodes. The MapReduce
+ job will scan a reference after it has scanned the definition.
+ 
+ > `$ run-verify.sh`
+ 
+ Each entry, except for the first batch of entries, inserted by continuous
+ ingest references a previously flushed entry.  Since we are referencing flushed
+ entries, they should always exist.  The MapReduce job checks that all
+ referenced entries exist.  If it finds any that do not exist it will increment
+ the UNDEFINED counter and emit the referenced but undefined node.  The MapReduce
+ job produces two other counts: REFERENCED and UNREFERENCED.  It is
+ expected that these two counts are non-zero.  REFERENCED counts nodes that are
+ defined and referenced.  UNREFERENCED counts nodes that are defined and
+ unreferenced; these are the latest nodes inserted.
+ 
+ To stress accumulo, run the following script which starts a MapReduce job
+ that reads and writes to your continuous ingest table.  This MapReduce job
+ will write out an entry for every entry in the table (except for ones created
+ by the MapReduce job itself). Stop ingest before running this MapReduce job.
+ Do not run more than one instance of this MapReduce job concurrently against a
+ table.
+ 
+ > `$ run-moru.sh`
+ 

http://git-wip-us.apache.org/repos/asf/accumulo/blob/bcb0905c/test/system/scalability/README.md
----------------------------------------------------------------------
diff --cc test/system/scalability/README.md
index 0000000,d0a5772..a09b91c
mode 000000,100644..100644
--- a/test/system/scalability/README.md
+++ b/test/system/scalability/README.md
@@@ -1,0 -1,40 +1,40 @@@
+ Apache Accumulo Scalability Tests
+ =================================
+ 
+ The scalability test framework needs to be configured for your Accumulo
+ instance by performing the following steps.
+ 
+ WARNING: Each scalability test rewrites your `conf/slaves` file and reinitializes
+ your Accumulo instance. Do not run these tests on a cluster holding essential
+ data.
+ 
+ 1.  Make sure you have both `ACCUMULO_HOME` and `HADOOP_HOME` set in your
+     `$ACCUMULO_CONF_DIR/accumulo-env.sh`.
+ 
+ 2.  Create a 'site.conf' file in the `conf` directory containing settings
+     needed by test nodes to connect to Accumulo, and to guide the tests.
+ 
+     `$ cp conf/site.conf.example conf/site.conf`
+ 
+ 3.  Create an 'Ingest.conf' file in the `conf` directory containing performance
+     settings for the Ingest test. (This test is currently the only scalability
+     test available.)
+ 
+     `$ cp conf/Ingest.conf.example conf/Ingest.conf`
+ 
+     Each test has a unique ID (e.g., "Ingest") which correlates with its test
+     code in:
+ 
 -    `org.apache.accumulo.server.test.scalability.tests.<ID>`
++    `org.apache.accumulo.test.scalability.tests.<ID>`
+ 
+     This ID correlates with a config file:
+ 
+     `conf/<ID>.conf`
+ 
+ To run the test, specify its ID to the run.py script.
+ 
+ > `$ nohup ./run.py Ingest > test1.log 2>&1 &`
+ 
+ A timestamped directory will be created, and results are placed in it as each
+ test completes.
+ 

http://git-wip-us.apache.org/repos/asf/accumulo/blob/bcb0905c/test/system/test1/README.md
----------------------------------------------------------------------
diff --cc test/system/test1/README.md
index 0000000,ee2de5a..c847e13
mode 000000,100644..100644
--- a/test/system/test1/README.md
+++ b/test/system/test1/README.md
@@@ -1,0 -1,29 +1,29 @@@
+ Command to run from command line
+ 
+ Can run this test with pre-existing splits; use the following command to create the table with
+ 100 pre-existing splits 
+ 
 -> `$ ../../../bin/accumulo 'org.apache.accumulo.server.test.TestIngest$CreateTable' \  
 -0 5000000 100 <user> <pw>`
++> `$ ../../../bin/accumulo 'org.apache.accumulo.test.TestIngest' --createTable \  
++-u root -p secret --splits 100 --rows 0`
+ 
+ Could try running verify commands after stopping and restarting accumulo
+ 
+ When the write-ahead log is implemented, can try killing the tablet server in the middle of ingest
+ 
+ Run 5 parallel ingesters and verify:
+ 
+ > `$ . ingest_test.sh`  
+ (wait)  
+ `$ . verify_test.sh`  
+ (wait)
+ 
+ Overwrite previous ingest:
+ > `$ . ingest_test_2.sh`  
+ (wait)  
+ `$ . verify_test_2.sh`  
+ (wait)
+ 
+ Delete what was previously ingested:
+ > `$ . ingest_test_3.sh`  
+ (wait)
+ 

http://git-wip-us.apache.org/repos/asf/accumulo/blob/bcb0905c/test/system/test2/README.md
----------------------------------------------------------------------
diff --cc test/system/test2/README.md
index 0000000,09cbc7a..4b860f6
mode 000000,100644..100644
--- a/test/system/test2/README.md
+++ b/test/system/test2/README.md
@@@ -1,0 -1,9 +1,10 @@@
+ Test Concurrent Read/Write
+ ==========================
+ 
+ Can run this test with pre-existing splits; use the following command to create the table with
+ 100 pre-existing splits:
+ 
 -> `$ hadoop jar ../../../lib/accumulo.jar \   
 -'org.apache.accumulo.server.test.TestIngest$CreateTable' 0 5000000 100`
++> `$ ../../../bin/accumulo org.apache.accumulo.test.TestIngest --createTable \   
++-u root -p secret --splits 100 --rows 0  
++$ . concurrent.sh`
+ 

http://git-wip-us.apache.org/repos/asf/accumulo/blob/bcb0905c/test/system/test4/README.md
----------------------------------------------------------------------
diff --cc test/system/test4/README.md
index 0000000,3f03fc3..da66cf1
mode 000000,100644..100644
--- a/test/system/test4/README.md
+++ b/test/system/test4/README.md
@@@ -1,0 -1,9 +1,9 @@@
+ Test Bulk Importing Data
+ ========================
+ 
+ Can run this test with pre-existing splits... use the following command to create the table with
+ 100 pre-existing splits 
+ 
 -> `$ hadoop jar ../../../lib/accumulo.jar \   
 -'org.apache.accumulo.server.test.TestIngest$CreateTable' 0 5000000 100`
++> `../../../bin/accumulo org.apache.accumulo.test.TestIngest --createTable \  
++-u root -p secret --rows 0 --splits 100`
+ 
