Author: ecn
Date: Wed Jan 11 14:29:28 2012
New Revision: 1230064
URL: http://svn.apache.org/viewvc?rev=1230064&view=rev
Log:
ACCUMULO-285 finely tune instructions, turn off speculative execution
Modified:
incubator/accumulo/branches/1.4/src/wikisearch/README
incubator/accumulo/branches/1.4/src/wikisearch/ingest/src/main/java/org/apache/accumulo/wikisearch/ingest/WikipediaIngester.java
Modified: incubator/accumulo/branches/1.4/src/wikisearch/README
URL:
http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/src/wikisearch/README?rev=1230064&r1=1230063&r2=1230064&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/src/wikisearch/README (original)
+++ incubator/accumulo/branches/1.4/src/wikisearch/README Wed Jan 11 14:29:28 2012
@@ -15,9 +15,9 @@
INSTRUCTIONS
------------
- 1. Copy the conf/wikipedia.xml.example to conf/wikipedia.xml and change it to specify Accumulo information.
- 2. Copy the lib/wikisearch-*.jar and lib/protobuf*.jar to $ACCUMULO_HOME/lib/ext
- 3. Then run bin/ingest.sh with one argument (the name of the directory in HDFS where the wikipedia XML
+ 1. Copy the ingest/conf/wikipedia.xml.example to ingest/conf/wikipedia.xml and change it to specify Accumulo information.
+ 2. Copy the ingest/lib/wikisearch-*.jar and ingest/lib/protobuf*.jar to $ACCUMULO_HOME/lib/ext
+ 3. Then run ingest/bin/ingest.sh with one argument (the name of the directory in HDFS where the wikipedia XML
files reside) and this will kick off a MapReduce job to ingest the data into Accumulo.
Query
@@ -34,7 +34,7 @@
-------------
1. Modify the query/src/main/resources/META-INF/ejb-jar.xml file with the same information that you put into the wikipedia.xml
file from the Ingest step above.
- 2. Re-build the query distribution by running 'mvn assembly:single' in the top-level directory.
+ 2. Re-build the query distribution by running 'mvn package assembly:single' in the top-level directory.
3. Untar the resulting file in the $JBOSS_HOME/server/default directory.
$ cd $JBOSS_HOME/server/default
Modified: incubator/accumulo/branches/1.4/src/wikisearch/ingest/src/main/java/org/apache/accumulo/wikisearch/ingest/WikipediaIngester.java
URL:
http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/src/wikisearch/ingest/src/main/java/org/apache/accumulo/wikisearch/ingest/WikipediaIngester.java?rev=1230064&r1=1230063&r2=1230064&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/src/wikisearch/ingest/src/main/java/org/apache/accumulo/wikisearch/ingest/WikipediaIngester.java (original)
+++ incubator/accumulo/branches/1.4/src/wikisearch/ingest/src/main/java/org/apache/accumulo/wikisearch/ingest/WikipediaIngester.java Wed Jan 11 14:29:28 2012
@@ -135,7 +135,8 @@ public class WikipediaIngester extends C
public int run(String[] args) throws Exception {
Job job = new Job(getConf(), "Ingest Wikipedia");
Configuration conf = job.getConfiguration();
-
+ conf.set("mapred.map.tasks.speculative.execution", "false");
+
String tablename = WikipediaConfiguration.getTableName(conf);
String zookeepers = WikipediaConfiguration.getZookeepers(conf);
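
Note on the speculative execution change: the hunk above sets mapred.map.tasks.speculative.execution to "false" on the job configuration. The commit does not state the reason; one plausible reading is that the ingest tasks write to Accumulo directly, so a duplicate speculative attempt could repeat those writes. The sketch below is only an illustration of the property names involved. The SpeculationSettings class is hypothetical and not part of the wikisearch code, and the reduce-side property is an addition that this commit does not set. A plain Map stands in for the Hadoop Configuration so the sketch runs without Hadoop on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the wikisearch code base.
final class SpeculationSettings {
  // Property set by the commit above (Hadoop 1.x-era name).
  static final String MAP_SPECULATION = "mapred.map.tasks.speculative.execution";
  // Reduce-side counterpart; an assumption added here, not set by the commit.
  static final String REDUCE_SPECULATION = "mapred.reduce.tasks.speculative.execution";

  private SpeculationSettings() {}

  // Returns the key/value pairs that turn speculative execution off.
  static Map<String,String> disabled() {
    Map<String,String> props = new LinkedHashMap<String,String>();
    props.put(MAP_SPECULATION, "false");
    props.put(REDUCE_SPECULATION, "false");
    return props;
  }

  public static void main(String[] args) {
    for (Map.Entry<String,String> e : disabled().entrySet()) {
      // In a real job each pair would be applied with conf.set(key, value).
      System.out.println(e.getKey() + "=" + e.getValue());
    }
  }
}
```

In WikipediaIngester.run() the equivalent of the first entry is the single conf.set(...) call added by this revision, applied before the job is submitted.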