Author: gdfm
Date: Sun Apr 3 08:17:59 2016
New Revision: 1737551
URL: http://svn.apache.org/viewvc?rev=1737551&view=rev
Log:
Updated Markdown processor to kramdown
Modified:
incubator/samoa/site/documentation/Adaptive-Model-Rules-Regressor.html
incubator/samoa/site/documentation/Bagging-and-Boosting.html
incubator/samoa/site/documentation/Building-SAMOA.html
incubator/samoa/site/documentation/Bylaws.html
incubator/samoa/site/documentation/Content-Event.html
incubator/samoa/site/documentation/Developing-New-Tasks-in-SAMOA.html
incubator/samoa/site/documentation/Distributed-Stream-Clustering.html
incubator/samoa/site/documentation/Distributed-Stream-Frequent-Itemset-Mining.html
incubator/samoa/site/documentation/Executing-SAMOA-with-Apache-Avro-Files.html
incubator/samoa/site/documentation/Executing-SAMOA-with-Apache-S4.html
incubator/samoa/site/documentation/Executing-SAMOA-with-Apache-Samza.html
incubator/samoa/site/documentation/Executing-SAMOA-with-Apache-Storm.html
incubator/samoa/site/documentation/Getting-Started.html
incubator/samoa/site/documentation/Home.html
incubator/samoa/site/documentation/Learner.html
incubator/samoa/site/documentation/Prequential-Evaluation-Task.html
incubator/samoa/site/documentation/Processing-Item.html
incubator/samoa/site/documentation/Processor.html
incubator/samoa/site/documentation/SAMOA-Topology.html
incubator/samoa/site/documentation/SAMOA-and-Machine-Learning.html
incubator/samoa/site/documentation/SAMOA-for-MOA-users.html
incubator/samoa/site/documentation/Scalable-Advanced-Massive-Online-Analysis.html
incubator/samoa/site/documentation/Stream.html
incubator/samoa/site/documentation/Task.html
incubator/samoa/site/documentation/Team.html
incubator/samoa/site/documentation/Topology-Builder.html
incubator/samoa/site/documentation/Vertical-Hoeffding-Tree-Classifier.html
Modified: incubator/samoa/site/documentation/Adaptive-Model-Rules-Regressor.html
URL:
http://svn.apache.org/viewvc/incubator/samoa/site/documentation/Adaptive-Model-Rules-Regressor.html?rev=1737551&r1=1737550&r2=1737551&view=diff
==============================================================================
--- incubator/samoa/site/documentation/Adaptive-Model-Rules-Regressor.html
(original)
+++ incubator/samoa/site/documentation/Adaptive-Model-Rules-Regressor.html Sun
Apr 3 08:17:59 2016
@@ -74,37 +74,41 @@
<article class="post-content">
<h3 id="adaptive-model-rules-regressor">Adaptive Model Rules Regressor</h3>
-
-<p><a
href="http://www.ecmlpkdd2013.org/wp-content/uploads/2013/07/251.pdf">Adaptive
Model Rules (AMRules)</a> is an innovative algorithm for learning regression
rules with streaming data. In AMRules, the rule model consists of a set of
normal rules and a default rule (a rule with no features). Hoeffding bound is
used to define a confidence interval to decide whether to expand a rule. If the
ratio of the 2 largest standard deviation reduction (SDR) measure among all
potential features of a rule is is within this interval, the feature with the
largest SDR will be added to the rule to expand it. If the default rule is
expanded, it will become a normal rule and will be added to the model's
rule set. A new default rule is initialized to replace the expanded one. A rule
in the set might also be removed if the Page-Hinckley test indicates that its
cumulative error exceed a threshold.</p>
+<p><a
href="http://www.ecmlpkdd2013.org/wp-content/uploads/2013/07/251.pdf">Adaptive
Model Rules (AMRules)</a> is an innovative algorithm for learning regression
rules with streaming data. In AMRules, the rule model consists of a set of
normal rules and a default rule (a rule with no features). Hoeffding bound is
used to define a confidence interval to decide whether to expand a rule. If the
ratio of the 2 largest standard deviation reduction (SDR) measure among all
potential features of a rule is is within this interval, the feature with the
largest SDR will be added to the rule to expand it. If the default rule is
expanded, it will become a normal rule and will be added to the model’s rule
set. A new default rule is initialized to replace the expanded one. A rule in
the set might also be removed if the Page-Hinckley test indicates that its
cumulative error exceed a threshold.</p>
<h3 id="vertical-adaptive-model-rules-regressor">Vertical Adaptive Model Rules
Regressor</h3>
-
<p>Vertical Adaptive Model Rules Regressor (VAMR) is the vertical parallel
implementation of AMRules in SAMOA. The diagram below shows the components of
the implementation.
-<img src="images/vamr.png" alt="Vertical AMRules"></p>
+<img src="images/vamr.png" alt="Vertical AMRules" /></p>
<p>The <em>Source PI</em> and <em>Evaluator PI</em> are components of the <a
href="Prequential-Evaluation-Task.html">Prequential Evaluation task</a>. The
<em>Source PI</em> produces the incoming instances while <em>Evaluator PI</em>
reads prediction results from VAMR and reports their accuracy and
throughput.</p>
-<p>The core of VAMR implementation consists of one <em>Model Aggregator
PI</em> and multiple <em>Learner PIs</em>. Each <em>Learner PI</em> is
responsible for training a subset of rules. The <em>Model Aggregator PI</em>
manages the rule model (rule set and default rule) to compute the prediction
results for incoming instances. It is also responsible for the training the
default rule and creation of new rules. </p>
+<p>The core of VAMR implementation consists of one <em>Model Aggregator
PI</em> and multiple <em>Learner PIs</em>. Each <em>Learner PI</em> is
responsible for training a subset of rules. The <em>Model Aggregator PI</em>
manages the rule model (rule set and default rule) to compute the prediction
results for incoming instances. It is also responsible for the training the
default rule and creation of new rules.</p>
<p>For each incoming instance from <em>Source PI</em>, <em>Model Aggregator
PI</em> appies the current rule set to compute the prediction. The instance is
also forwarded from <em>Model Aggregator PI</em> to the <em>Learner PI(s)</em>
to train those rules that cover this instance. If an instance is not covered by
any rule in the set, the default rule will be used for prediction and will also
be trained with this instance. When the default rule expands and create a new
rule, the new rule will be sent from <em>Model aggregator PI</em> to one of the
<em>Learner PIs</em>. When the <em>Learner PIs</em> expand or remove a rule, an
update message is also sent back to the <em>Model Aggregator PI</em>.</p>
-<p>The number of <em>Learner PIs</em> can be set with the <code>-p</code>
option:</p>
-<div class="highlight"><pre><code class="language-"
data-lang="">PrequentialEvaluationTask -l
(org.apache.samoa.learners.classifiers.rules.VerticalAMRulesRegressor -p 4)
-</code></pre></div>
-<h3 id="horizontal-adaptive-model-rules-regressor">Horizontal Adaptive Model
Rules Regressor</h3>
+<p>The number of <em>Learner PIs</em> can be set with the <code
class="highlighter-rouge">-p</code> option:</p>
+
+<p><code class="highlighter-rouge">
+PrequentialEvaluationTask -l
(org.apache.samoa.learners.classifiers.rules.VerticalAMRulesRegressor -p 4)
+</code></p>
+<h3 id="horizontal-adaptive-model-rules-regressor">Horizontal Adaptive Model
Rules Regressor</h3>
<p>Horizontal Adaptive Model Rules Regressor (HAMR) is an extended
implementation of VAMR. The components of a [[Prequential Evaluation
task|Prequential Evaluation Task]] with HAMR are shown in the diagram below.
-<img src="images/hamr.png" alt="Horizontal AMRules"></p>
+<img src="images/hamr.png" alt="Horizontal AMRules" /></p>
-<p>In HAMR, the <em>Model Aggregator PI</em> is replicated, each processes
only a partition of the incoming stream from <em>Source PI</em>. The default
rule is moved from the <em>Model Aggregator PI</em> to a special <em>Learner
PI</em>, called <em>Default Rule Learner PI</em>. This new PI is reposible for
both the training and predicting steps for default rule. </p>
+<p>In HAMR, the <em>Model Aggregator PI</em> is replicated, each processes
only a partition of the incoming stream from <em>Source PI</em>. The default
rule is moved from the <em>Model Aggregator PI</em> to a special <em>Learner
PI</em>, called <em>Default Rule Learner PI</em>. This new PI is reposible for
both the training and predicting steps for default rule.</p>
-<p>For each incoming instance from <em>Source PI</em>, <em>Model Aggregator
PIs</em> apply the current rule set to compute the prediction. If the instance
is covered by a rule in the set, its prediction is computed by the <em>Model
Aggregator PI</em> and, then, it is forwarded to the <em>Learner PI(s)</em> for
training. Otherwise, the instance is forwarded to <em>Default Rule Learner
PI</em> for both prediction and training. </p>
+<p>For each incoming instance from <em>Source PI</em>, <em>Model Aggregator
PIs</em> apply the current rule set to compute the prediction. If the instance
is covered by a rule in the set, its prediction is computed by the <em>Model
Aggregator PI</em> and, then, it is forwarded to the <em>Learner PI(s)</em> for
training. Otherwise, the instance is forwarded to <em>Default Rule Learner
PI</em> for both prediction and training.</p>
<p>Newly created rules are sent from <em>Default Rule Learner PI</em> to all
<em>Model Aggregator PIs</em> and one of the <em>Learner PIs</em>. Update
messages are also sent from <em>Learner PIs</em> to all <em>Model Aggregator
PIs</em> when a rule is expanded or removed.</p>
-<p>The number of <em>Learner PIs</em> can be set with the <code>-p</code>
option and the number of <em>Model Aggregator PIs</em> can be set with the
<code>-r</code> option:</p>
-<div class="highlight"><pre><code class="language-"
data-lang="">PrequentialEvaluationTask -l
(org.apache.samoa.learners.classifiers.rules.HorizontalAMRulesRegressor -r 4 -p
2)
-</code></pre></div>
+<p>The number of <em>Learner PIs</em> can be set with the <code
class="highlighter-rouge">-p</code> option and the number of <em>Model
Aggregator PIs</em> can be set with the <code
class="highlighter-rouge">-r</code> option:</p>
+
+<p><code class="highlighter-rouge">
+PrequentialEvaluationTask -l
(org.apache.samoa.learners.classifiers.rules.HorizontalAMRulesRegressor -r 4 -p
2)
+</code></p>
+
+
</article>
<!-- </div> -->
Modified: incubator/samoa/site/documentation/Bagging-and-Boosting.html
URL:
http://svn.apache.org/viewvc/incubator/samoa/site/documentation/Bagging-and-Boosting.html?rev=1737551&r1=1737550&r2=1737551&view=diff
==============================================================================
--- incubator/samoa/site/documentation/Bagging-and-Boosting.html (original)
+++ incubator/samoa/site/documentation/Bagging-and-Boosting.html Sun Apr 3
08:17:59 2016
@@ -79,34 +79,28 @@
It is possible to use the classifiers available in <a
href="http://moa.cms.waikato.ac.nz">MOA</a> by using the <a
href="https://github.com/samoa-moa/samoa-moa">SAMOA-MOA</a> adapter.</p>
<h3 id="bagging">Bagging</h3>
+<p>You can use Bagging as a SAMOA learner, specifying the number of learners
to use with parameter <code class="highlighter-rouge">-s</code> and the base
learner to use with parameter <code class="highlighter-rouge">-l</code></p>
-<p>You can use Bagging as a SAMOA learner, specifying the number of learners
to use with parameter <code>-s</code> and the base learner to use with
parameter <code>-l</code></p>
-
-<p><code>(classifiers.ensemble.Bagging -s 10 -l
(classifiers.trees.VerticalHoeffdingTree))</code></p>
+<p><code class="highlighter-rouge">(classifiers.ensemble.Bagging -s 10 -l
(classifiers.trees.VerticalHoeffdingTree))</code></p>
<h6 id="only-with-samoa-moa-adapter">Only with SAMOA-MOA adapter</h6>
-
-<p><code>(classifiers.ensemble.Bagging -s 10 -l (classifiers.SingleClassifier
-l (MOAClassifierAdapter -l moa.classifiers.trees.HoeffdingTree)))</code></p>
+<p><code class="highlighter-rouge">(classifiers.ensemble.Bagging -s 10 -l
(classifiers.SingleClassifier -l (MOAClassifierAdapter -l
moa.classifiers.trees.HoeffdingTree)))</code></p>
<h3 id="adaptive-bagging">Adaptive Bagging</h3>
-
<p>If data is evolving, it is better to use an adaptive version of bagging,
where each base learner has a change detector that monitors its accuracy. When
the accuracy of a base learner decreases, a new base learner is built to
replace it.</p>
-<p><code>(classifiers.ensemble.AdaptiveBagging -s 10 -l
(classifiers.trees.VerticalHoeffdingTree))</code></p>
-
-<h6 id="only-with-samoa-moa-adapter">Only with SAMOA-MOA adapter</h6>
+<p><code class="highlighter-rouge">(classifiers.ensemble.AdaptiveBagging -s 10
-l (classifiers.trees.VerticalHoeffdingTree))</code></p>
-<p><code>(classifiers.ensemble.AdaptiveBagging -s 10 -l
(classifiers.SingleClassifier -l
(org.apache.samoa.learners.classifiers.MOAClassifierAdapter -l
moa.classifiers.trees.HoeffdingTree)))</code></p>
+<h6 id="only-with-samoa-moa-adapter-1">Only with SAMOA-MOA adapter</h6>
+<p><code class="highlighter-rouge">(classifiers.ensemble.AdaptiveBagging -s 10
-l (classifiers.SingleClassifier -l
(org.apache.samoa.learners.classifiers.MOAClassifierAdapter -l
moa.classifiers.trees.HoeffdingTree)))</code></p>
<h3 id="boosting">Boosting</h3>
-
<p>Boosting is a well known ensemble method, that has a very good performance
in non-streaming setting. SAMOA implements the version of Oza and Russel
(<em>Nikunj C. Oza, Stuart J. Russell: Experimental comparisons of online and
batch versions of bagging and boosting. KDD 2001:359-364</em>)</p>
-<p><code>(classifiers.ensemble.Boosting -s 10 -l
(classifiers.trees.VerticalHoeffdingTree))</code></p>
-
-<h6 id="only-with-samoa-moa-adapter">Only with SAMOA-MOA adapter</h6>
+<p><code class="highlighter-rouge">(classifiers.ensemble.Boosting -s 10 -l
(classifiers.trees.VerticalHoeffdingTree))</code></p>
-<p><code>(classifiers.ensemble.Boosting -s 10 -l (classifiers.SingleClassifier
-l (MOAClassifierAdapter -l moa.classifiers.trees.HoeffdingTree)))</code></p>
+<h6 id="only-with-samoa-moa-adapter-2">Only with SAMOA-MOA adapter</h6>
+<p><code class="highlighter-rouge">(classifiers.ensemble.Boosting -s 10 -l
(classifiers.SingleClassifier -l (MOAClassifierAdapter -l
moa.classifiers.trees.HoeffdingTree)))</code></p>
</article>
Modified: incubator/samoa/site/documentation/Building-SAMOA.html
URL:
http://svn.apache.org/viewvc/incubator/samoa/site/documentation/Building-SAMOA.html?rev=1737551&r1=1737550&r2=1737551&view=diff
==============================================================================
--- incubator/samoa/site/documentation/Building-SAMOA.html (original)
+++ incubator/samoa/site/documentation/Building-SAMOA.html Sun Apr 3 08:17:59
2016
@@ -74,23 +74,27 @@
<article class="post-content">
<p>To build SAMOA to run on local mode, on your own computer without a
cluster, is simple as cloning the repository and installing it.</p>
-<div class="highlight"><pre><code class="language-bash" data-lang="bash">git
clone http://git.apache.org/incubator-samoa.git
-<span class="nb">cd </span>incubator-samoa
+
+<p><code class="highlighter-rouge">bash
+git clone http://git.apache.org/incubator-samoa.git
+cd incubator-samoa
mvn package
-</code></pre></div>
-<p>The deployable jar for SAMOA will be in
<code>target/SAMOA-Local-0.3.0-SNAPSHOT.jar</code>.</p>
+</code>
+The deployable jar for SAMOA will be in <code
class="highlighter-rouge">target/SAMOA-Local-0.3.0-SNAPSHOT.jar</code>.</p>
<h3 id="storm">Storm</h3>
-
<p>Simply clone the repository and install SAMOA.</p>
-<div class="highlight"><pre><code class="language-bash" data-lang="bash">git
clone http://git.apache.org/incubator-samoa.git
-<span class="nb">cd </span>incubator-samoa
+
+<p><code class="highlighter-rouge">bash
+git clone http://git.apache.org/incubator-samoa.git
+cd incubator-samoa
mvn -Pstorm package
-</code></pre></div>
-<p>The deployable jar for SAMOA will be in
<code>target/SAMOA-Storm-0.3.0-SNAPSHOT.jar</code>.</p>
+</code></p>
+
+<p>The deployable jar for SAMOA will be in <code
class="highlighter-rouge">target/SAMOA-Storm-0.3.0-SNAPSHOT.jar</code>.</p>
<ul>
-<li><a href="Executing-SAMOA-with-Apache-Storm.html">1.1 Executing SAMOA with
Apache Storm</a></li>
+ <li><a href="Executing-SAMOA-with-Apache-Storm.html">1.1 Executing SAMOA
with Apache Storm</a></li>
</ul>
<h3 id="s4">S4</h3>
@@ -98,16 +102,19 @@ mvn -Pstorm package
<p>If you want to compile SAMOA for Apache S4, you will need to install the S4
dependencies manually as explained in <a
href="Executing-SAMOA-with-Apache-S4.html">Executing SAMOA with Apache
S4</a>.</p>
<p>Once the dependencies are installed, you can simply clone the repository
and install SAMOA.</p>
-<div class="highlight"><pre><code class="language-bash" data-lang="bash">git
clone http://git.apache.org/incubator-samoa.git
-<span class="nb">cd </span>incubator-samoa
-mvn -P<variant> package <span class="c"># where variant is "storm" or
"s4"</span>
-
-mvn -Pstorm,s4 package <span class="c"># e.g., to get both versions</span>
-</code></pre></div>
-<p>The deployable jars for SAMOA will be in
<code>target/SAMOA-<variant>-<version>-SNAPSHOT.jar</code>. For
example, for S4 <code>target/SAMOA-S4-0.3.0-SNAPSHOT.jar</code>.</p>
+
+<p>```bash
+git clone http://git.apache.org/incubator-samoa.git
+cd incubator-samoa
+mvn -P<variant> package # where variant is "storm" or "s4"</variant></p>
+
+<p>mvn -Pstorm,s4 package # e.g., to get both versions
+```</p>
+
+<p>The deployable jars for SAMOA will be in <code
class="highlighter-rouge">target/SAMOA-<variant>-<version>-SNAPSHOT.jar</code>.
For example, for S4 <code
class="highlighter-rouge">target/SAMOA-S4-0.3.0-SNAPSHOT.jar</code>.</p>
<ul>
-<li><a href="Executing-SAMOA-with-Apache-S4.html">1.2 Executing SAMOA with
Apache S4</a></li>
+ <li><a href="Executing-SAMOA-with-Apache-S4.html">1.2 Executing SAMOA with
Apache S4</a></li>
</ul>
</article>
Modified: incubator/samoa/site/documentation/Bylaws.html
URL:
http://svn.apache.org/viewvc/incubator/samoa/site/documentation/Bylaws.html?rev=1737551&r1=1737550&r2=1737551&view=diff
==============================================================================
--- incubator/samoa/site/documentation/Bylaws.html (original)
+++ incubator/samoa/site/documentation/Bylaws.html Sun Apr 3 08:17:59 2016
@@ -87,7 +87,7 @@
<h3 id="users">Users:</h3>
-<p>The most important participants in the project are people who use our
software. The majority of our developers start out as users and guide their
development efforts from the user's perspective.</p>
+<p>The most important participants in the project are people who use our
software. The majority of our developers start out as users and guide their
development efforts from the user’s perspective.</p>
<p>Users contribute to Apache projects by providing feedback to developers in
the form of bug reports and feature suggestions. In addition, users participate
in the Apache community by helping other users on mailing lists and user
support forums.</p>
@@ -97,7 +97,7 @@
<h3 id="committers">Committers:</h3>
-<p>The project's Committers are responsible for the project's
technical management. Committers have access to all project source
repositories. Committers may cast binding votes on any technical discussion
regarding SAMOA.</p>
+<p>The project’s Committers are responsible for the project’s technical
management. Committers have access to all project source repositories.
Committers may cast binding votes on any technical discussion regarding
SAMOA.</p>
<p>Committer access is by invitation only and must be approved by lazy
consensus of the active PMC members. A Committer is considered emeritus by
their own declaration or by not contributing in any form to the project for
over six months. An emeritus Committer may request reinstatement of commit
access from the PMC. Such reinstatement is subject to lazy consensus approval
of active PMC members.</p>
@@ -110,15 +110,15 @@
<p>The PMC is responsible to the board and the ASF for the management and
oversight of the Apache SAMOA codebase. The responsibilities of the PMC
include:</p>
<ul>
-<li>Deciding what is distributed as products of the Apache SAMOA project, in
particular all releases must be approved by the PMC;</li>
-<li>Maintaining the project's shared resources, including the codebase
repository, mailing lists, websites;</li>
-<li>Speaking on behalf of the project;</li>
-<li>Resolving license disputes regarding products of the project;</li>
-<li>Nominating new PMC members and Committers;</li>
-<li>Maintaining these bylaws and other guidelines of the project.</li>
+ <li>Deciding what is distributed as products of the Apache SAMOA project, in
particular all releases must be approved by the PMC;</li>
+ <li>Maintaining the project’s shared resources, including the codebase
repository, mailing lists, websites;</li>
+ <li>Speaking on behalf of the project;</li>
+ <li>Resolving license disputes regarding products of the project;</li>
+ <li>Nominating new PMC members and Committers;</li>
+ <li>Maintaining these bylaws and other guidelines of the project.</li>
</ul>
-<p>Membership of the PMC is by invitation only and must be approved by
consensus of the active PMC members. A PMC member is considered
"emeritus" by their own declaration or by not contributing in any
form to the project for over six months. An emeritus member may request
reinstatement to the PMC. Such reinstatement is subject to consensus approval
of the active PMC members.</p>
+<p>Membership of the PMC is by invitation only and must be approved by
consensus of the active PMC members. A PMC member is considered “emeritus”
by their own declaration or by not contributing in any form to the project for
over six months. An emeritus member may request reinstatement to the PMC. Such
reinstatement is subject to consensus approval of the active PMC members.</p>
<p>The chair of the PMC is appointed by the ASF board. The chair is an office
holder of the Apache Software Foundation (Vice President, Apache SAMOA) and has
primary responsibility to the board for the management of the projects within
the scope of the SAMOA PMC. The chair reports to the board quarterly on
developments within the SAMOA project.</p>
@@ -126,27 +126,30 @@
<h2 id="voting">Voting</h2>
-<p>Decisions regarding the project are made by votes on the primary project
development mailing list (<a
href="mailto:[email protected]">[email protected]</a>).
Where necessary, PMC voting may take place on the private SAMOA PMC mailing
list. Votes are clearly indicated by subject line starting with [VOTE]. Votes
may contain multiple items for approval and these should be clearly separated.
Voting is carried out by replying to the vote mail. Voting may take four
flavors.</p>
+<p>Decisions regarding the project are made by votes on the primary project
development mailing list ([email protected]). Where necessary,
PMC voting may take place on the private SAMOA PMC mailing list. Votes are
clearly indicated by subject line starting with [VOTE]. Votes may contain
multiple items for approval and these should be clearly separated. Voting is
carried out by replying to the vote mail. Voting may take four flavors.</p>
-<table><thead>
-<tr>
-<th>Vote</th>
-<th>Meaning</th>
-</tr>
-</thead><tbody>
-<tr>
-<td>+1</td>
-<td>'Yes', 'Agree', or 'the action should be
performed'.</td>
-</tr>
-<tr>
-<td>+0</td>
-<td>Neutral about the proposed action (or mildly negative but not enough so to
want to block it).</td>
-</tr>
-<tr>
-<td>-1</td>
-<td>This is a negative vote. On issues where consensus is required, this vote
counts as a veto. All vetoes must contain an explanation of why the veto is
appropriate. Vetoes with no explanation are void. It may also be appropriate
for a -1 vote to include an alternative course of action.</td>
-</tr>
-</tbody></table>
+<table>
+ <thead>
+ <tr>
+ <th>Vote</th>
+ <th>Meaning</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <td>+1</td>
+ <td>‘Yes’, ‘Agree’, or ‘the action should be performed’.</td>
+ </tr>
+ <tr>
+ <td>+0</td>
+ <td>Neutral about the proposed action (or mildly negative but not enough
so to want to block it).</td>
+ </tr>
+ <tr>
+ <td>-1</td>
+ <td>This is a negative vote. On issues where consensus is required, this
vote counts as a veto. All vetoes must contain an explanation of why the veto
is appropriate. Vetoes with no explanation are void. It may also be appropriate
for a -1 vote to include an alternative course of action.</td>
+ </tr>
+ </tbody>
+</table>
<p>All participants in the SAMOA project are encouraged to show their
agreement with or against a particular action by voting. For technical
decisions, only the votes of active Committers are binding. Non-binding votes
are still useful for those with binding votes to understand the perception of
an action in the wider SAMOA community. For PMC decisions, only the votes of
active PMC members are binding.</p>
@@ -158,33 +161,36 @@
<p>These are the types of approval that can be sought. Different actions
require different types of approval.</p>
-<table><thead>
-<tr>
-<th>Approval</th>
-<th>Requirements</th>
-</tr>
-</thead><tbody>
-<tr>
-<td>Consensus</td>
-<td>requires all binding-vote holders to cast +1 votes and no binding -1
vetoes (consensus votes are rarely required due to the impracticality of
getting all eligible voters to cast a vote).</td>
-</tr>
-<tr>
-<td>2/3 Majority</td>
-<td>requires at least 2/3 of binding-vote holders to cast +1 votes. (2/3
majority is typically used for actions that affect the foundation of the
project, e.g., adopting a new codebase to replace an existing product).</td>
-</tr>
-<tr>
-<td>Lazy Consensus</td>
-<td>requires 2 binding +1 votes and no -1 votes ('silence gives
assent').</td>
-</tr>
-<tr>
-<td>Lazy Majority</td>
-<td>requires 3 binding +1 votes and more binding +1 votes than -1 vetoes.</td>
-</tr>
-<tr>
-<td>Lazy 2/3 Majority</td>
-<td>requires at least 3 votes and twice as many +1 votes as -1 vetoes.</td>
-</tr>
-</tbody></table>
+<table>
+ <thead>
+ <tr>
+ <th>Approval</th>
+ <th>Requirements</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <td>Consensus</td>
+ <td>requires all binding-vote holders to cast +1 votes and no binding -1
vetoes (consensus votes are rarely required due to the impracticality of
getting all eligible voters to cast a vote).</td>
+ </tr>
+ <tr>
+ <td>2/3 Majority</td>
+ <td>requires at least 2/3 of binding-vote holders to cast +1 votes. (2/3
majority is typically used for actions that affect the foundation of the
project, e.g., adopting a new codebase to replace an existing product).</td>
+ </tr>
+ <tr>
+ <td>Lazy Consensus</td>
+ <td>requires 2 binding +1 votes and no -1 votes (‘silence gives
assent’).</td>
+ </tr>
+ <tr>
+ <td>Lazy Majority</td>
+ <td>requires 3 binding +1 votes and more binding +1 votes than -1
vetoes.</td>
+ </tr>
+ <tr>
+ <td>Lazy 2/3 Majority</td>
+ <td>requires at least 3 votes and twice as many +1 votes as -1
vetoes.</td>
+ </tr>
+ </tbody>
+</table>
<h3 id="vetoes">Vetoes</h3>
@@ -196,105 +202,108 @@
<p>This section describes the various actions which are undertaken within the
project, the corresponding approval required for that action and those who have
binding votes over the action.</p>
-<table><thead>
-<tr>
-<th>Action</th>
-<th>Description</th>
-<th>Approval</th>
-<th>Binding Votes</th>
-<th>Minimum Length</th>
-<th>Mailing List</th>
-</tr>
-</thead><tbody>
-<tr>
-<td>Code Change</td>
-<td>A change made to a codebase of the project and committed by a committer.
This includes source code, documentation, and website content.</td>
-<td>Lazy Consensus (with at least one +1 vote from someone who has not
authored the patch). The code can be committed as soon as the required number
of binding votes is reached.</td>
-<td>Active Committers</td>
-<td>1 day</td>
-<td>JIRA or GitHub pull request (with notification sent to dev@)</td>
-</tr>
-<tr>
-<td>Release Plan</td>
-<td>Defines the timetable and actions for a release. The plan also nominates a
Release Manager.</td>
-<td>Lazy Majority</td>
-<td>Active Committers</td>
-<td>3 days</td>
-<td>dev@</td>
-</tr>
-<tr>
-<td>Product Release</td>
-<td>Accepting the official release of a product of the project.</td>
-<td>Lazy Majority</td>
-<td>Active PMC members</td>
-<td>3 days</td>
-<td>dev@</td>
-</tr>
-<tr>
-<td>Adoption of New Codebase</td>
-<td>Replacing the codebase for an existing, released product with an
alternative codebase. If such a vote fails to gain approval, the existing code
base will continue. This action also covers the creation of new sub-projects
and sub-modules within the project.</td>
-<td>Lazy 2/3 Majority</td>
-<td>Active PMC members</td>
-<td>7 days</td>
-<td>dev@</td>
-</tr>
-<tr>
-<td>New Committer</td>
-<td>Electing a new Committer for the project.</td>
-<td>Lazy Consensus</td>
-<td>Active PMC members</td>
-<td>7 days</td>
-<td>private@</td>
-</tr>
-<tr>
-<td>New PMC Member</td>
-<td>Promoting a Committer to the PMC of the project.</td>
-<td>Consensus</td>
-<td>Active PMC members</td>
-<td>7 days</td>
-<td>private@</td>
-</tr>
-<tr>
-<td>Emeritus PMC Member re-instatement</td>
-<td>When an emeritus PMC member requests to be re-instated as an active PMC
member.</td>
-<td>Consensus</td>
-<td>Active PMC members</td>
-<td>7 days</td>
-<td>private@</td>
-</tr>
-<tr>
-<td>Emeritus Committer re-instatement</td>
-<td>When an emeritus Committer requests to be re-instated as an active
committer.</td>
-<td>Consensus</td>
-<td>Active PMC members</td>
-<td>7 days</td>
-<td>private@</td>
-</tr>
-<tr>
-<td>Committer Removal</td>
-<td>When removal of commit privileges is sought. Note: Such actions will also
be referred to the ASF board by the PMC chair.</td>
-<td>Consensus</td>
-<td>Active PMC members (excluding the committer in question if member of the
PMC)</td>
-<td>7 Days</td>
-<td>private@</td>
-</tr>
-<tr>
-<td>PMC Member Removal</td>
-<td>When removal of a PMC member is sought. Note: Such actions will also be
referred to the ASF board by the PMC chair.</td>
-<td>Consensus</td>
-<td>Active PMC members (excluding the member in question)</td>
-<td>7 Days</td>
-<td>private@</td>
-</tr>
-<tr>
-<td>Modifying Bylaws</td>
-<td>Modifying this document.</td>
-<td>2/3 Majority</td>
-<td>Active PMC members</td>
-<td>7 Days</td>
-<td>dev@</td>
-</tr>
-</tbody></table>
+<table>
+ <thead>
+ <tr>
+ <th>Action</th>
+ <th>Description</th>
+ <th>Approval</th>
+ <th>Binding Votes</th>
+ <th>Minimum Length</th>
+ <th>Mailing List</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <td>Code Change</td>
+ <td>A change made to a codebase of the project and committed by a
committer. This includes source code, documentation, and website content.</td>
+ <td>Lazy Consensus (with at least one +1 vote from someone who has not
authored the patch). The code can be committed as soon as the required number
of binding votes is reached.</td>
+ <td>Active Committers</td>
+ <td>1 day</td>
+ <td>JIRA or GitHub pull request (with notification sent to dev@)</td>
+ </tr>
+ <tr>
+ <td>Release Plan</td>
+ <td>Defines the timetable and actions for a release. The plan also
nominates a Release Manager.</td>
+ <td>Lazy Majority</td>
+ <td>Active Committers</td>
+ <td>3 days</td>
+ <td>dev@</td>
+ </tr>
+ <tr>
+ <td>Product Release</td>
+ <td>Accepting the official release of a product of the project.</td>
+ <td>Lazy Majority</td>
+ <td>Active PMC members</td>
+ <td>3 days</td>
+ <td>dev@</td>
+ </tr>
+ <tr>
+ <td>Adoption of New Codebase</td>
+ <td>Replacing the codebase for an existing, released product with an
alternative codebase. If such a vote fails to gain approval, the existing code
base will continue. This action also covers the creation of new sub-projects
and sub-modules within the project.</td>
+ <td>Lazy 2/3 Majority</td>
+ <td>Active PMC members</td>
+ <td>7 days</td>
+ <td>dev@</td>
+ </tr>
+ <tr>
+ <td>New Committer</td>
+ <td>Electing a new Committer for the project.</td>
+ <td>Lazy Consensus</td>
+ <td>Active PMC members</td>
+ <td>7 days</td>
+ <td>private@</td>
+ </tr>
+ <tr>
+ <td>New PMC Member</td>
+ <td>Promoting a Committer to the PMC of the project.</td>
+ <td>Consensus</td>
+ <td>Active PMC members</td>
+ <td>7 days</td>
+ <td>private@</td>
+ </tr>
+ <tr>
+ <td>Emeritus PMC Member re-instatement</td>
+ <td>When an emeritus PMC member requests to be re-instated as an active
PMC member.</td>
+ <td>Consensus</td>
+ <td>Active PMC members</td>
+ <td>7 days</td>
+ <td>private@</td>
+ </tr>
+ <tr>
+ <td>Emeritus Committer re-instatement</td>
+ <td>When an emeritus Committer requests to be re-instated as an active
committer.</td>
+ <td>Consensus</td>
+ <td>Active PMC members</td>
+ <td>7 days</td>
+ <td>private@</td>
+ </tr>
+ <tr>
+ <td>Committer Removal</td>
+ <td>When removal of commit privileges is sought. Note: Such actions will
also be referred to the ASF board by the PMC chair.</td>
+ <td>Consensus</td>
+ <td>Active PMC members (excluding the committer in question if member of
the PMC)</td>
+ <td>7 Days</td>
+ <td>private@</td>
+ </tr>
+ <tr>
+ <td>PMC Member Removal</td>
+ <td>When removal of a PMC member is sought. Note: Such actions will also
be referred to the ASF board by the PMC chair.</td>
+ <td>Consensus</td>
+ <td>Active PMC members (excluding the member in question)</td>
+ <td>7 Days</td>
+ <td>private@</td>
+ </tr>
+ <tr>
+ <td>Modifying Bylaws</td>
+ <td>Modifying this document.</td>
+ <td>2/3 Majority</td>
+ <td>Active PMC members</td>
+ <td>7 Days</td>
+ <td>dev@</td>
+ </tr>
+ </tbody>
+</table>
</article>
Modified: incubator/samoa/site/documentation/Content-Event.html
URL:
http://svn.apache.org/viewvc/incubator/samoa/site/documentation/Content-Event.html?rev=1737551&r1=1737550&r2=1737551&view=diff
==============================================================================
--- incubator/samoa/site/documentation/Content-Event.html (original)
+++ incubator/samoa/site/documentation/Content-Event.html Sun Apr 3 08:17:59
2016
@@ -75,121 +75,120 @@
<article class="post-content">
<p>A message or an event is called Content Event in SAMOA. As the name
suggests, it is an event which contains content which needs to be processed by
the processors.</p>
-<h3 id="1-implementation">1. Implementation</h3>
+<h3 id="implementation">1. Implementation</h3>
+<p>ContentEvent is implemented as an interface in SAMOA. Users need to
implement the <code class="highlighter-rouge">ContentEvent</code> interface to
create their custom message classes. As the following code shows, a key is a
required part of every message.</p>
-<p>ContentEvent is implemented as an interface in SAMOA. Users need to
implement the <code>ContentEvent</code> interface to create their custom message
classes. As the following code shows, a key is a required part of every
message.</p>
-<div class="highlight"><pre><code class="language-" data-lang="">package
org.apache.samoa.core;
+<p>```
+package org.apache.samoa.core;</p>
-public interface ContentEvent extends java.io.Serializable {
+<p>public interface ContentEvent extends java.io.Serializable {</p>
- public String getKey();
+<div class="highlighter-rouge"><pre class="highlight"><code>public String
getKey();
- public void setKey(String str);
+public void setKey(String str);
- public boolean isLastEvent();
-}
-</code></pre></div>
-<h3 id="2-methods">2. Methods</h3>
-
-<p>Following is a brief description of methods.</p>
-
-<h5 id="2-1-string-getkey">2.1 <code>String getKey()</code></h5>
+public boolean isLastEvent(); } ``` ###2. Methods Following is a brief
description of methods.
+</code></pre>
+</div>
+<h5 id="string-getkey">2.1 <code class="highlighter-rouge">String
getKey()</code></h5>
<p>Each message is identified by a key in SAMOA. All user-defined message
classes should have a key state variable. Each instance of the custom message
should be assigned a key. This method should return the key of the respective
message.</p>
-<h5 id="2-2-void-setkey-string-str">2.2 <code>void setKey(String
str)</code></h5>
-
+<h5 id="void-setkeystring-str">2.2 <code class="highlighter-rouge">void
setKey(String str)</code></h5>
<p>This method is used to assign a key to the message.</p>
-<h5 id="2-3-boolean-islastevent">2.3 <code>boolean isLastEvent()</code></h5>
-
+<h5 id="boolean-islastevent">2.3 <code class="highlighter-rouge">boolean
isLastEvent()</code></h5>
<p>This method lets SAMOA know that this message is the last message.</p>
-<h3 id="3-example">3. Example</h3>
+<h3 id="example">3. Example</h3>
+<p>The following is an example of a <code
class="highlighter-rouge">Message</code> class which implements the <code
class="highlighter-rouge">ContentEvent</code> interface. As <code
class="highlighter-rouge">ContentEvent</code> is an interface, it cannot hold
variables. A user-defined message class should have its own data variables and
their getter methods. In the following example, a <code
class="highlighter-rouge">value</code> variable of type <code
class="highlighter-rouge">Object</code> is added to the class. Using the generic
type <code class="highlighter-rouge">Object</code> is beneficial in the sense
that any object can be passed to it and later cast back to the
original type. The following example also adds a <code
class="highlighter-rouge">streamId</code> variable which stores the <code
class="highlighter-rouge">id</code> of the stream the message belongs to. This
is not a requirement but can be beneficial in certain applications.</p>
-<p>The following is an example of a <code>Message</code> class which implements
the <code>ContentEvent</code> interface. As <code>ContentEvent</code> is an
interface, it cannot hold variables. A user-defined message class should have
its own data variables and their getter methods. In the following example,
a <code>value</code> variable of type <code>Object</code> is added to the class.
Using the generic type <code>Object</code> is beneficial in the sense that any
object can be passed to it and later cast back to the original
type. The following example also adds a <code>streamId</code> variable which
stores the <code>id</code> of the stream the message belongs to. This is not a
requirement but can be beneficial in certain applications.</p>
-<div class="highlight"><pre><code class="language-" data-lang="">import
org.apache.samoa.core.ContentEvent;
+<p>```
+import org.apache.samoa.core.ContentEvent;</p>
-/**
+<p>/**
* A general key-value message class which adds a stream id to the class
variables.
* Stream id information helps in determining which stream the message
belongs to.
*/
-public class Message implements ContentEvent {
+public class Message implements ContentEvent {</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>/**
+ * To tell if the message is the last message of the stream. This may be
required in some applications where
+ * a stream can cease to exist
+ */
+private boolean last=false;
+/**
+ * Id of the stream to which the message belongs
+ */
+private String streamId;
+/**
+ * The key of the message. Can be any string value. Duplicates are allowed.
+ */
+private String key;
+/**
+ * The value of the message. Can be any object. Casting may be necessary to
the desired type.
+ */
+private Object value;
- /**
- * To tell if the message is the last message of the stream. This may be
required in some applications where
- * a stream can cease to exist
- */
- private boolean last=false;
- /**
- * Id of the stream to which the message belongs
- */
- private String streamId;
- /**
- * The key of the message. Can be any string value. Duplicates are allowed.
- */
- private String key;
- /**
- * The value of the message. Can be any object. Casting may be necessary
to the desired type.
- */
- private Object value;
-
- public Message()
- {}
-
- /**
- * @param key
- * @param value
- * @param isLastEvent
- * @param streamId
- */
- public Message(String key, Object value, boolean isLastEvent, String
streamId)
- {
- this.key=key;
- this.value = value;
- this.last = isLastEvent;
- this.streamId=streamId;
- }
-
- @Override
- public String getKey() {
- return key;
- }
-
- @Override
- public void setKey(String str) {
- this.key = str;
- }
-
- @Override
- public boolean isLastEvent() {
- return last;
- }
-
- /**
- * @return value of the message
- */
- public String getValue()
- {
- return value.toString();
- }
-
- /**
- * @return id of the stream to which the message belongs
- */
- public String getStreamId() {
- return streamId;
- }
- /**
- * @param streamId
- */
- public void setStreamId(String streamId) {
- this.streamId = streamId;
- }
+public Message()
+{}
+
+/**
+ * @param key
+ * @param value
+ * @param isLastEvent
+ * @param streamId
+ */
+public Message(String key, Object value, boolean isLastEvent, String streamId)
+{
+ this.key=key;
+ this.value = value;
+ this.last = isLastEvent;
+ this.streamId=streamId;
+}
+@Override
+public String getKey() {
+ return key;
}
-</code></pre></div>
+@Override
+public void setKey(String str) {
+ this.key = str;
+}
+
+@Override
+public boolean isLastEvent() {
+ return last;
+}
+
+/**
+ * @return value of the message
+ */
+public String getValue()
+{
+ return value.toString();
+}
+
+/**
+ * @return id of the stream to which the message belongs
+ */
+public String getStreamId() {
+ return streamId;
+}
+/**
+ * @param streamId
+ */
+public void setStreamId(String streamId) {
+ this.streamId = streamId;
+}
+</code></pre>
+</div>
+
+<p>}</p>
+
+<p>```</p>
+
</article>
<!-- </div> -->
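
The Message example in the page above stores its payload as Object, which means callers have to cast when they read it back. As a minimal sketch of an alternative, the payload type can be made a type parameter. Everything below is an illustration, not SAMOA code: TypedMessage and TypedMessageDemo are hypothetical names, and the ContentEvent interface is re-declared locally (mirroring the three methods shown above) only so that the sketch compiles without SAMOA on the classpath.

```java
import java.io.Serializable;

// Local stand-in mirroring the ContentEvent interface shown above
// (assumption: the real one lives in org.apache.samoa.core).
interface ContentEvent extends Serializable {
    String getKey();
    void setKey(String str);
    boolean isLastEvent();
}

// Hypothetical variant of Message: the payload type is a type parameter,
// so the value comes back with its original type and no cast is needed.
class TypedMessage<T extends Serializable> implements ContentEvent {
    private static final long serialVersionUID = 1L;

    private String key;
    private final T value;
    private final boolean last;

    TypedMessage(String key, T value, boolean isLastEvent) {
        this.key = key;
        this.value = value;
        this.last = isLastEvent;
    }

    @Override
    public String getKey() {
        return key;
    }

    @Override
    public void setKey(String str) {
        this.key = str;
    }

    @Override
    public boolean isLastEvent() {
        return last;
    }

    // Returns the payload with its declared type.
    T getValue() {
        return value;
    }
}

public class TypedMessageDemo {
    public static void main(String[] args) {
        TypedMessage<Integer> msg = new TypedMessage<>("sensor-1", 42, false);
        int doubled = msg.getValue() * 2; // no cast from Object required
        System.out.println(msg.getKey() + " -> " + doubled);
        msg.setKey("sensor-2");
        System.out.println(msg.getKey() + " last=" + msg.isLastEvent());
    }
}
```

The trade-off is that a processor receiving a plain ContentEvent still has to know which TypedMessage instantiation it is dealing with, so the Object-based version in the page may remain the simpler choice when several payload types share one stream.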
Modified: incubator/samoa/site/documentation/Developing-New-Tasks-in-SAMOA.html
URL:
http://svn.apache.org/viewvc/incubator/samoa/site/documentation/Developing-New-Tasks-in-SAMOA.html?rev=1737551&r1=1737550&r2=1737551&view=diff
==============================================================================
--- incubator/samoa/site/documentation/Developing-New-Tasks-in-SAMOA.html
(original)
+++ incubator/samoa/site/documentation/Developing-New-Tasks-in-SAMOA.html Sun
Apr 3 08:17:59 2016
@@ -73,147 +73,165 @@
</header>
<article class="post-content">
- <p>A <em>task</em> is a machine learning related activity such as a
specific evaluation for a classifier. For instance the <em>prequential
evaluation</em> task is a task that uses each instance first for testing and
then for training a model built using a specific classification algorithm. A
task corresponds to a topology in SAMOA. </p>
+ <p>A <em>task</em> is a machine learning related activity such as a
specific evaluation for a classifier. For instance the <em>prequential
evaluation</em> task is a task that uses each instance first for testing and
then for training a model built using a specific classification algorithm. A
task corresponds to a topology in SAMOA.</p>
<p>In this tutorial, we will develop a simple Hello World task.</p>
<h3 id="hello-world-task">Hello World Task</h3>
-
<p>The Hello World task consists of a source processor, a destination
processor with a parallelism hint setting, and a stream that connects the two.
The source processor will generate a random integer which will be sent to the
destination processor. The figure below shows the layout of Hello World
task.</p>
-<p><img src="images/HelloWorldTask.png" alt="Hello World Task"></p>
+<p><img src="images/HelloWorldTask.png" alt="Hello World Task" /></p>
-<p>To develop the task, we create a new class that implements the interface
<code>org.apache.samoa.tasks.Task</code>. For convenience we also implement
<code>com.github.javacliparser.Configurable</code>, which allows command-line
options to be parsed.</p>
+<p>To develop the task, we create a new class that implements the interface
<code class="highlighter-rouge">org.apache.samoa.tasks.Task</code>. For
convenience we also implement <code
class="highlighter-rouge">com.github.javacliparser.Configurable</code>, which
allows command-line options to be parsed.</p>
-<p>The <code>init</code> method builds the topology by instantiating the
necessary <code>Processors</code>, <code>Streams</code> and connecting the
source processor with the destination processor.</p>
+<p>The <code class="highlighter-rouge">init</code> method builds the topology
by instantiating the necessary <code
class="highlighter-rouge">Processors</code>, <code
class="highlighter-rouge">Streams</code> and connecting the source processor
with the destination processor.</p>
<h3 id="hello-world-source-processor">Hello World Source Processor</h3>
+<p>We need a source processor which is an instance of <code
class="highlighter-rouge">EntranceProcessor</code> to start a task in SAMOA. In
this tutorial, the source processor is <code
class="highlighter-rouge">HelloWorldSourceProcessor</code>.</p>
-<p>We need a source processor which is an instance of
<code>EntranceProcessor</code> to start a task in SAMOA. In this tutorial, the
source processor is <code>HelloWorldSourceProcessor</code>. </p>
+<p>The SAMOA runtime invokes the <code
class="highlighter-rouge">nextEvent</code> method of <code
class="highlighter-rouge">EntranceProcessor</code> until its <code
class="highlighter-rouge">hasNext</code> method returns false. Each call to
<code class="highlighter-rouge">nextEvent</code> should return the next <code
class="highlighter-rouge">ContentEvent</code> to be sent to the topology. In
this tutorial, <code class="highlighter-rouge">HelloWorldSourceProcessor</code>
sends events of type <code
class="highlighter-rouge">HelloWorldContentEvent</code>.</p>
-<p>The SAMOA runtime invokes the <code>nextEvent</code> method of
<code>EntranceProcessor</code> until its <code>hasNext</code> method returns
false. Each call to <code>nextEvent</code> should return the next
<code>ContentEvent</code> to be sent to the topology. In this tutorial,
<code>HelloWorldSourceProcessor</code> sends events of type
<code>HelloWorldContentEvent</code>.</p>
+<p>Here is the relevant code in <code
class="highlighter-rouge">HelloWorldSourceProcessor</code>:</p>
-<p>Here is the relevant code in <code>HelloWorldSourceProcessor</code>:</p>
-<div class="highlight"><pre><code class="language-" data-lang=""> private
Random rnd;
+<p>```
+ private Random rnd;
private final long maxInst;
- private long count;
-
- @Override
- public boolean hasNext() {
- return count < maxInst;
- }
+ private long count;</p>
- @Override
- public ContentEvent nextEvent() {
- count++;
- return new HelloWorldContentEvent(rnd.nextInt(), false);
- }
-</code></pre></div>
-<p>We also need to create a new type of <code>ContentEvent</code> to hold our
data. In this tutorial we call it <code>HelloWorldContentEvent</code> and its
content is simply an integer.</p>
-<div class="highlight"><pre><code class="language-" data-lang="">public class
HelloWorldContentEvent implements ContentEvent {
-
- private static final long serialVersionUID = -2406968925730298156L;
- private final boolean isLastEvent;
- private final int helloWorldData;
-
- public HelloWorldContentEvent(int helloWorldData, boolean isLastEvent) {
- this.isLastEvent = isLastEvent;
- this.helloWorldData = helloWorldData;
- }
+<div class="highlighter-rouge"><pre class="highlight"><code>@Override
+public boolean hasNext() {
+ return count < maxInst;
+}
- @Override
- public String getKey() {
- return null;
- }
+@Override
+public ContentEvent nextEvent() {
+ count++;
+ return new HelloWorldContentEvent(rnd.nextInt(), false);
+} ```
+</code></pre>
+</div>
+
+<p>We also need to create a new type of <code
class="highlighter-rouge">ContentEvent</code> to hold our data. In this
tutorial we call it <code
class="highlighter-rouge">HelloWorldContentEvent</code> and its content is
simply an integer.</p>
+
+<p>```
+public class HelloWorldContentEvent implements ContentEvent {</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>private static
final long serialVersionUID = -2406968925730298156L;
+private final boolean isLastEvent;
+private final int helloWorldData;
+
+public HelloWorldContentEvent(int helloWorldData, boolean isLastEvent) {
+ this.isLastEvent = isLastEvent;
+ this.helloWorldData = helloWorldData;
+}
- @Override
- public void setKey(String str) {
- // do nothing, it's key-less content event
- }
+@Override
+public String getKey() {
+ return null;
+}
- @Override
- public boolean isLastEvent() {
- return isLastEvent;
- }
+@Override
+public void setKey(String str) {
+ // do nothing, it's key-less content event
+}
- public int getHelloWorldData() {
- return helloWorldData;
- }
+@Override
+public boolean isLastEvent() {
+ return isLastEvent;
+}
- @Override
- public String toString() {
- return "HelloWorldContentEvent [helloWorldData=" + helloWorldData +
"]";
- }
+public int getHelloWorldData() {
+ return helloWorldData;
}
-</code></pre></div>
-<h3 id="hello-world-destination-processor">Hello World Destination
Processor</h3>
+@Override
+public String toString() {
+ return "HelloWorldContentEvent [helloWorldData=" + helloWorldData + "]";
+} } ```
+</code></pre>
+</div>
+
+<h3 id="hello-world-destination-processor">Hello World Destination
Processor</h3>
<p>The destination processor for SAMOA is pretty straightforward and it will
print the data from the event.</p>
-<div class="highlight"><pre><code class="language-" data-lang="">public class
HelloWorldDestinationProcessor implements Processor {
- private static final long serialVersionUID = -6042613438148776446L;
- private int processorId;
+<p>```
+public class HelloWorldDestinationProcessor implements Processor {</p>
- @Override
- public boolean process(ContentEvent event) {
- System.out.println(processorId + ": " + event);
- return true;
- }
+<div class="highlighter-rouge"><pre class="highlight"><code>private static
final long serialVersionUID = -6042613438148776446L;
+private int processorId;
- @Override
- public void onCreate(int id) {
- this.processorId = id;
- }
+@Override
+public boolean process(ContentEvent event) {
+ System.out.println(processorId + ": " + event);
+ return true;
+}
- @Override
- public Processor newProcessor(Processor p) {
- return new HelloWorldDestinationProcessor();
- }
+@Override
+public void onCreate(int id) {
+ this.processorId = id;
}
-</code></pre></div>
+
+@Override
+public Processor newProcessor(Processor p) {
+ return new HelloWorldDestinationProcessor();
+} } ```
+</code></pre>
+</div>
+
<h3 id="putting-it-all-together">Putting It All Together</h3>
+<p>To put all the components together, we need to go back to class <code
class="highlighter-rouge">HelloWorldTask</code>. First, we need to implement
the code for setting up the <code
class="highlighter-rouge">TopologyBuilder</code>. This code is necessary to be
able to run on multiple platforms.</p>
-<p>To put all the components together, we need to go back to class
<code>HelloWorldTask</code>. First, we need to implement the code for setting
up the <code>TopologyBuilder</code>. This code is necessary to be able to run
on multiple platforms.</p>
-<div class="highlight"><pre><code class="language-" data-lang=""> @Override
+<p><code class="highlighter-rouge">
+ @Override
public void setFactory(ComponentFactory factory) {
builder = new TopologyBuilder(factory);
logger.debug("Successfully instantiating TopologyBuilder");
builder.initTopology(evaluationNameOption.getValue());
logger.debug("Successfully initializing SAMOA topology with name {}",
evaluationNameOption.getValue());
}
-</code></pre></div>
-<p>After this method is called we have a functioning builder to get components
for our topology. Next, the <code>init</code> method is called by SAMOA to
start the task.
-First we instantiate the source <code>EntranceProcessor</code>.
-After adding the entrance processor to the topology, we create a stream
originating from it. We use the create stream method of
<code>TopologyBuilder</code>.
+</code></p>
+
+<p>After this method is called we have a functioning builder to get components
for our topology. Next, the <code class="highlighter-rouge">init</code> method
is called by SAMOA to start the task.
+First we instantiate the source <code
class="highlighter-rouge">EntranceProcessor</code>.
+After adding the entrance processor to the topology, we create a stream
originating from it. We use the create stream method of <code
class="highlighter-rouge">TopologyBuilder</code>.
Next we create the destination processor and connect it to the stream by using
shuffle grouping.
Once we have created all the components, we use the builder to build the
topology.</p>
-<div class="highlight"><pre><code class="language-" data-lang=""> @Override
+
+<p>```
+ @Override
public void init() {
// create source EntranceProcessor
sourceProcessor = new
HelloWorldSourceProcessor(instanceLimitOption.getValue());
- builder.addEntranceProcessor(sourceProcessor);
+ builder.addEntranceProcessor(sourceProcessor);</p>
- // create Stream
- Stream stream = builder.createStream(sourceProcessor);
+<div class="highlighter-rouge"><pre class="highlight"><code> // create
Stream
+ Stream stream = builder.createStream(sourceProcessor);
- // create destination Processor
- destProcessor = new HelloWorldDestinationProcessor();
- builder.addProcessor(destProcessor,
helloWorldParallelismOption.getValue());
- builder.connectInputShuffleStream(stream, destProcessor);
-
- // build the topology
- helloWorldTopology = builder.build();
- logger.debug("Successfully built the topology");
- }
-</code></pre></div>
-<h3 id="running-it">Running It</h3>
+ // create destination Processor
+ destProcessor = new HelloWorldDestinationProcessor();
+ builder.addProcessor(destProcessor,
helloWorldParallelismOption.getValue());
+ builder.connectInputShuffleStream(stream, destProcessor);
+
+ // build the topology
+ helloWorldTopology = builder.build();
+ logger.debug("Successfully built the topology");
+} ```
+</code></pre>
+</div>
+<h3 id="running-it">Running It</h3>
<p>To run the example in local mode:</p>
-<div class="highlight"><pre><code class="language-" data-lang="">bin/samoa
local target/SAMOA-Local-0.0.1-SNAPSHOT.jar
"org.apache.samoa.examples.HelloWorldTask -p 4 -i 100"
-</code></pre></div>
+
+<p><code class="highlighter-rouge">
+bin/samoa local target/SAMOA-Local-0.0.1-SNAPSHOT.jar
"org.apache.samoa.examples.HelloWorldTask -p 4 -i 100"
+</code></p>
+
<p>To run the example in Storm local mode:</p>
-<div class="highlight"><pre><code class="language-" data-lang="">java -cp
$STORM_HOME/lib/*:$STORM_HOME/storm-0.8.2.jar:target/SAMOA-Storm-0.0.1-SNAPSHOT.jar
org.apache.samoa.LocalStormDoTask "org.apache.samoa.examples.HelloWorldTask -p
4 -i 1000"
-</code></pre></div>
+
+<p><code class="highlighter-rouge">
+java -cp
$STORM_HOME/lib/*:$STORM_HOME/storm-0.8.2.jar:target/SAMOA-Storm-0.0.1-SNAPSHOT.jar
org.apache.samoa.LocalStormDoTask "org.apache.samoa.examples.HelloWorldTask -p
4 -i 1000"
+</code></p>
+
<p>All the code for the HelloWorldTask and its components can be found <a
href="https://github.com/yahoo/samoa/tree/master/samoa-api/src/main/java/org/apache/samoa/examples">here</a>.</p>
</article>
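
The page above says that the runtime keeps calling nextEvent until hasNext returns false. A minimal sketch of that contract follows, assuming nothing about the real SAMOA runtime: IntEvent, RandomIntSource and SourceLoopDemo are hypothetical stand-ins, and the drain method only imitates what an engine would do with an entrance processor. The source is seeded so that runs are repeatable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for HelloWorldContentEvent: carries one integer.
class IntEvent {
    final int data;

    IntEvent(int data) {
        this.data = data;
    }
}

// Hypothetical stand-in for the source processor: emits at most maxInst events.
class RandomIntSource {
    private final Random rnd;
    private final long maxInst;
    private long count;

    RandomIntSource(long maxInst, long seed) {
        this.maxInst = maxInst;
        this.rnd = new Random(seed);
    }

    // True while the instance limit has not been reached.
    boolean hasNext() {
        return count < maxInst;
    }

    // Produces the next event and advances the counter.
    IntEvent nextEvent() {
        count++;
        return new IntEvent(rnd.nextInt());
    }
}

public class SourceLoopDemo {
    // Imitates the runtime: pull events until the source reports it is done.
    static List<IntEvent> drain(RandomIntSource source) {
        List<IntEvent> events = new ArrayList<>();
        while (source.hasNext()) {
            events.add(source.nextEvent());
        }
        return events;
    }

    public static void main(String[] args) {
        List<IntEvent> events = drain(new RandomIntSource(3, 7L));
        System.out.println("events emitted: " + events.size());
    }
}
```

Because hasNext compares the counter against the limit before each pull, a limit of zero yields no events at all, which matches how the -i option bounds the number of instances in the commands above.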
Modified: incubator/samoa/site/documentation/Distributed-Stream-Clustering.html
URL:
http://svn.apache.org/viewvc/incubator/samoa/site/documentation/Distributed-Stream-Clustering.html?rev=1737551&r1=1737550&r2=1737551&view=diff
==============================================================================
--- incubator/samoa/site/documentation/Distributed-Stream-Clustering.html
(original)
+++ incubator/samoa/site/documentation/Distributed-Stream-Clustering.html Sun
Apr 3 08:17:59 2016
@@ -74,22 +74,24 @@
<article class="post-content">
<h2 id="apache-samoa-clustering-algorithm">Apache SAMOA Clustering
Algorithm</h2>
+<p>The SAMOA Clustering Algorithm is invoked by using the <code
class="highlighter-rouge">ClusteringEvaluation</code> task. The clustering task
can be executed with default values just by running:</p>
+
+<p><code class="highlighter-rouge">
+bin/samoa storm target/SAMOA-Storm-0.0.1-SNAPSHOT.jar "ClusteringEvaluation"
+</code></p>
-<p>The SAMOA Clustering Algorithm is invoked by using the
<code>ClusteringEvaluation</code> task. The clustering task can be executed
with default values just by running:</p>
-<div class="highlight"><pre><code class="language-" data-lang="">bin/samoa
storm target/SAMOA-Storm-0.0.1-SNAPSHOT.jar "ClusteringEvaluation"
-</code></pre></div>
<p>Parameters:</p>
<ul>
-<li><code>-l</code>: clusterer to train</li>
-<li><code>-s</code>: stream to learn from</li>
-<li><code>-i</code>: maximum number of instances to test/train on (-1 = no
limit)</li>
-<li><code>-f</code>: how many instances between samples of the learning
performance</li>
-<li><code>-n</code>: evaluation name (default:
ClusteringEvaluation_TimeStamp)</li>
-<li><code>-d</code>: file to append intermediate csv results to</li>
+ <li><code class="highlighter-rouge">-l</code>: clusterer to train</li>
+ <li><code class="highlighter-rouge">-s</code>: stream to learn from</li>
+ <li><code class="highlighter-rouge">-i</code>: maximum number of instances
to test/train on (-1 = no limit)</li>
+ <li><code class="highlighter-rouge">-f</code>: how many instances between
samples of the learning performance</li>
+ <li><code class="highlighter-rouge">-n</code>: evaluation name (default:
ClusteringEvaluation_TimeStamp)</li>
+ <li><code class="highlighter-rouge">-d</code>: file to append intermediate
csv results to</li>
</ul>
-<p>In terms of the SAMOA API, Clustering Evaluation consists of a
<code>source</code> processor, a <code>clusterer</code>, and an
<code>evaluator</code> processor. The <code>source</code> processor sends the
instances to the clusterer using the <code>source</code> stream. The clusterer
sends the clustering results to the <code>evaluator</code> processor via the
<code>result</code> stream. The <code>source Processor</code> corresponds to
the <code>-s</code> option of Clustering Evaluation, and the clusterer
corresponds to the <code>-l</code> option.</p>
+<p>In terms of the SAMOA API, Clustering Evaluation consists of a <code
class="highlighter-rouge">source</code> processor, a <code
class="highlighter-rouge">clusterer</code>, and an <code
class="highlighter-rouge">evaluator</code> processor. The <code
class="highlighter-rouge">source</code> processor sends the instances to the
clusterer using the <code class="highlighter-rouge">source</code> stream. The
clusterer sends the clustering results to the <code
class="highlighter-rouge">evaluator</code> processor via the <code
class="highlighter-rouge">result</code> stream. The <code
class="highlighter-rouge">source Processor</code> corresponds to the <code
class="highlighter-rouge">-s</code> option of Clustering Evaluation, and the
clusterer corresponds to the <code class="highlighter-rouge">-l</code>
option.</p>
</article>
Modified:
incubator/samoa/site/documentation/Distributed-Stream-Frequent-Itemset-Mining.html
URL:
http://svn.apache.org/viewvc/incubator/samoa/site/documentation/Distributed-Stream-Frequent-Itemset-Mining.html?rev=1737551&r1=1737550&r2=1737551&view=diff
==============================================================================
---
incubator/samoa/site/documentation/Distributed-Stream-Frequent-Itemset-Mining.html
(original)
+++
incubator/samoa/site/documentation/Distributed-Stream-Frequent-Itemset-Mining.html
Sun Apr 3 08:17:59 2016
@@ -73,69 +73,73 @@
</header>
<article class="post-content">
- <h2 id="1-introduction">1. Introduction</h2>
-
-<p>SAMOA takes a micro-batching approach to frequent itemset mining (FIM). It
uses <a href="https://dl.acm.org/citation.cfm?id=2396776">PARMA</a> as a base
algorithm for distributed sample-based frequent itemset mining. PARMA provides
the guarantee that all the frequent itemsets will be present in the result that
it returns. It may also return some false positives. The problem with FIM in
streams is that the stream has an evolving nature. The itemsets that were
frequent last year may not be frequent this year. To handle this, SAMOA
implements the <a href="https://dl.acm.org/citation.cfm?id=1164180">Time Biased
Sampling</a> approach. This sampling method depends on a parameter
<em>lambda</em> which determines the size of the reservoir sample. This also
tells us how strongly the sample is biased towards newer itemsets. As PARMA
has its own way of determining sample sizes, SAMOA does not allow users to
choose <em>lambda</em> and determines its value using the sample size
determined by PARMA
using the approximation <code>lambda = 1/sampleSize</code>. </p>
-
-<h2 id="2-concepts">2. Concepts</h2>
-
-<p>SAMOA implements FIM for streams in three processors i.e.
StreamSourceProcessor, SamplerProcessor and AggregatorProcessor. The tasks of
each of these are explained below.</p>
+ <h2 id="introduction">1. Introduction</h2>
+<p>SAMOA takes a micro-batching approach to frequent itemset mining (FIM). It
uses <a href="https://dl.acm.org/citation.cfm?id=2396776">PARMA</a> as a base
algorithm for distributed sample-based frequent itemset mining. PARMA provides
the guarantee that all the frequent itemsets will be present in the result that
it returns. It may also return some false positives. The problem with FIM in
streams is that the stream has an evolving nature. The itemsets that were
frequent last year may not be frequent this year. To handle this, SAMOA
implements the <a href="https://dl.acm.org/citation.cfm?id=1164180">Time Biased
Sampling</a> approach. This sampling method depends on a parameter
<em>lambda</em> which determines the size of the reservoir sample. This also
tells us how strongly the sample is biased towards newer itemsets. As PARMA
has its own way of determining sample sizes, SAMOA does not allow users to
choose <em>lambda</em> and determines its value using the sample size
determined by PARMA
using the approximation <code class="highlighter-rouge">lambda =
1/sampleSize</code>.
+## 2. Concepts
+SAMOA implements FIM for streams in three processors i.e.
StreamSourceProcessor, SamplerProcessor and AggregatorProcessor. The tasks of
each of these are explained below.</p>
<ol>
-<li><p>StreamSourceProcessor takes the transaction file as input.
StreamSourceProcessor (Entrance PI) starts sending the transactions randomly to
SamplerProcessor instances. The number of SamplerProcessors to instantiate is
taken as an argument from the user but is verified by PARMA. PARMA determines
this number based on the <code>epsilon</code> and <code>phi</code> parameters
provided by the user. StreamSourceProcessor sends an FPM='yes' command
to all the instances of SamplerProcessor after 2M transactions where
M=numSamples*sampleSize. After the first FPM='yes' command, all later
FPM='yes' commands are sent after <code>fpmGap</code> transactions,
which is one of the parameters the SAMOA FIM task takes as input.</p></li>
-<li><p>All the instances of SamplerProcessor start building a Time Biased
Reservoir Sample in which newer transactions have more weight. Time biased
sampling is the default approach, but users can provide their own sampler by
implementing <code>samoa.samplers.SamplerInterface</code>. When a
SamplerProcessor receives an FPM='yes' command, it starts FIM/FPM on the
reservoir irrespective of whether the reservoir is full or not. When it
completes, it sends the result item-sets to the AggregatorProcessor with the
epoch/batch id. At the end of the result, each SamplerProcessor sends the
("epoch_end",<epochNum>) message to the AggregatorProcessor.</p></li>
-<li><p>AggregatorProcessor receives the result item-sets from all
SamplerProcessors. It maintains different queues for different batch ids and
also maintains a count of the number of SamplerProcessors which have finished
sending their results for a corresponding batch/epoch. Whenever the
<code>epoch_end</code> message count becomes equal to the number of instances
of SamplerProcessor, AggregatorProcessor aggregates the results and stores them in
the file system using the output path specified by the user.</p></li>
+ <li>
+ <p>StreamSourceP takes as input the input transaction file.
StreamSourceProcessor (Entrance PI) starts sending the transactions randomly to
SamplerProcessor instances. The number of SamplerProcessors to instantiate is
taken as an argument from the user but is verified by PARMA. PARMA determines
this number based on the <code class="highlighter-rouge">epsilon</code> and
<code class="highlighter-rouge">phi</code> parameters provided by the user.
StreamSourceProcessor sends an FPM=‘yes’ command to all the instances of
SamplerProcessor after 2M transactions where M=numSamples*sampleSize. After
the first FPM=‘yes’ command, all later FPM=‘yes’ commands are sent after
<code class="highlighter-rouge">fpmGap</code> transactions, which is one of the
parameters the SAMOA FIM task takes as input.</p>
+ </li>
+ <li>
+ <p>All the instances of SamplerProcessor start building a Time Biased
Reservoir Sample in which newer transactions have more weight. Time biased
sampling is the default approach, but users can provide their own sampler by
implementing <code
class="highlighter-rouge">samoa.samplers.SamplerInterface</code>. When a
SamplerProcessor receives an FPM=‘yes’ command, it starts FIM/FPM on the
reservoir irrespective of whether the reservoir is full or not. When it
completes, it sends the result item-sets to the AggregatorProcessor with the
epoch/batch id. At the end of the result, each SamplerProcessor sends the
(“epoch_end”,&lt;epochNum&gt;) message to the AggregatorProcessor.</p>
+ </li>
+ <li>
+ <p>AggregatorProcessor receives the result item-sets from all
SamplerProcessors. It maintains different queues for different batch ids and
also maintains a count of the number of SamplerProcessors which have finished
sending their results for a corresponding batch/epoch. Whenever the <code
class="highlighter-rouge">epoch_end</code> message count becomes equal to the
number of instances of SamplerProcessor, AggregatorProcessor aggregates the
results and stores them in the file system using the output path specified by the
user.</p>
+ </li>
</ol>
-<p>In this way, epochs never overlap. If <code>fpmGap</code> is small and the
StreamSourceProcessor dispatches an FPM='yes' command before the
slowest SamplerProcessor finishes FIM on the last epoch, the speed of the
global FIM will be equal to the local FIM speed of the slowest SamplerProcessor
(or of the AggregatorProcessor, if it is slower than the slowest SamplerProcessor).</p>
-
-<p><img src="images/SAMOA%20FIM.jpg" alt="SAMOA FIM"></p>
+<p>In this way, epochs never overlap. If <code
class="highlighter-rouge">fpmGap</code> is small and the StreamSourceProcessor
dispatches an FPM=‘yes’ command before the slowest SamplerProcessor
finishes FIM on the last epoch, the speed of the global FIM will be equal to
the local FIM speed of the slowest SamplerProcessor (or of the
AggregatorProcessor, if it is slower than the slowest SamplerProcessor).</p>
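<p>The epoch bookkeeping described for the AggregatorProcessor can be
illustrated with a minimal Java sketch. It is not the SAMOA implementation:
the class name <code>EpochAggregatorSketch</code> and its methods are
hypothetical, item-sets are represented as plain strings, and "aggregation" is
reduced to collecting the item-sets of an epoch, whereas a real aggregator
would presumably also merge their frequencies.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of per-epoch bookkeeping; not SAMOA's AggregatorProcessor.
class EpochAggregatorSketch {
    private final int numSamplers;
    // One result queue per epoch/batch id.
    private final Map<Integer, List<String>> resultsByEpoch = new HashMap<>();
    // Number of epoch_end messages seen so far for each epoch.
    private final Map<Integer, Integer> endCountByEpoch = new HashMap<>();

    EpochAggregatorSketch(int numSamplers) {
        this.numSamplers = numSamplers;
    }

    // Called for every result item-set sent by a sampler.
    void onItemset(int epoch, String itemset) {
        resultsByEpoch.computeIfAbsent(epoch, e -> new ArrayList<>()).add(itemset);
    }

    // Called for every epoch_end message. Returns the collected item-sets once
    // all samplers have reported for this epoch, and null before that.
    List<String> onEpochEnd(int epoch) {
        int seen = endCountByEpoch.merge(epoch, 1, Integer::sum);
        if (seen < numSamplers) {
            return null;
        }
        endCountByEpoch.remove(epoch);
        List<String> done = resultsByEpoch.remove(epoch);
        return done != null ? done : new ArrayList<>();
    }

    public static void main(String[] args) {
        EpochAggregatorSketch agg = new EpochAggregatorSketch(2);
        agg.onItemset(1, "{a, b}");
        System.out.println(agg.onEpochEnd(1)); // first sampler finished
        agg.onItemset(1, "{a, c}");
        System.out.println(agg.onEpochEnd(1)); // second sampler finished
    }
}
```

<p>Keeping a separate queue and counter per epoch id is what lets results from
a fast sampler for a later epoch arrive without being mixed into an epoch that
a slower sampler has not yet closed.</p>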
-<h2 id="3-how-to-run">3. How to run</h2>
+<p><img src="images/SAMOA FIM.jpg" alt="SAMOA FIM" /></p>
+<h2 id="how-to-run">3. How to run</h2>
<p>The following is an example of the command used to run the SAMOA FIM task.</p>
-<div class="highlight"><pre><code class="language-" data-lang="">bin/samoa
storm target/SAMOA-Storm-0.0.1-SNAPSHOT.jar "FpmTask -t Myfpmtopology -r
(org.apache.samoa.fpm.processors.FileReaderProcessor -i
/datasets/freqDataCombined.txt) -m
(org.apache.samoa.fpm.processors.ParmaStreamFpmMiner -e .1 -d .1 -f 10 -t 20 -n
23 -p 0.08 -b 100000 -s
org.apache.samoa.samplers.reservoir.TimeBiasedReservoirSampler) -w
(org.apache.samoa.fpm.processors.FileWriterProcessor -o /output/outPARMA) "
-</code></pre></div>
+
+<p><code class="highlighter-rouge">
+bin/samoa storm target/SAMOA-Storm-0.0.1-SNAPSHOT.jar "FpmTask -t
Myfpmtopology -r (org.apache.samoa.fpm.processors.FileReaderProcessor -i
/datasets/freqDataCombined.txt) -m
(org.apache.samoa.fpm.processors.ParmaStreamFpmMiner -e .1 -d .1 -f 10 -t 20 -n
23 -p 0.08 -b 100000 -s
org.apache.samoa.samplers.reservoir.TimeBiasedReservoirSampler) -w
(org.apache.samoa.fpm.processors.FileWriterProcessor -o /output/outPARMA) "
+</code></p>
+
<p>Parameters:
To run an FIM task, four parameters are required:</p>
<ul>
-<li><code>-t</code>: Topology name (Can be any name)</li>
-<li><code>-r</code>: The reader class</li>
-<li><code>-m</code>: The miner class</li>
-<li><code>-w</code>: The writer class</li>
+ <li><code class="highlighter-rouge">-t</code>: Topology name (Can be any
name)</li>
+ <li><code class="highlighter-rouge">-r</code>: The reader class</li>
+ <li><code class="highlighter-rouge">-m</code>: The miner class</li>
+ <li><code class="highlighter-rouge">-w</code>: The writer class</li>
</ul>
-<p>In the example above, <code>FileReaderProcessor</code> is used as a reader
class. It takes only one parameter:</p>
+<p>In the example above, <code
class="highlighter-rouge">FileReaderProcessor</code> is used as a reader class.
It takes only one parameter:</p>
<ul>
-<li><code>-i</code>: Path to input file</li>
+ <li><code class="highlighter-rouge">-i</code>: Path to input file</li>
</ul>
-<p>Similarly, <code>FileWriterProcessor</code> is used as a writer class. It
takes only one parameter:</p>
+<p>Similarly, <code class="highlighter-rouge">FileWriterProcessor</code> is
used as a writer class. It takes only one parameter:</p>
<ul>
-<li><code>-o</code>: Path to output file</li>
+ <li><code class="highlighter-rouge">-o</code>: Path to output file</li>
</ul>
-<p>SAMOA comes with a built-in distributed frequent mining algorithm PARMA as
described above, but users can plug in their custom miners by implementing the
<code>FpmMinerInterface</code>. The built-in PARMA miner can be used with the
following parameters:</p>
+<p>SAMOA comes with a built-in distributed frequent mining algorithm PARMA as
described above, but users can plug in their custom miners by implementing the
<code class="highlighter-rouge">FpmMinerInterface</code>. The built-in PARMA
miner can be used with the following parameters:</p>
<ul>
-<li><code>-e</code>: epsilon parameter for <a
href="https://dl.acm.org/citation.cfm?id=2396776">PARMA</a></li>
-<li><code>-d</code>: delta parameter for <a
href="https://dl.acm.org/citation.cfm?id=2396776">PARMA</a></li>
-<li><code>-f</code>: minimum frequency (percentage) of a frequent itemset</li>
-<li><code>-t</code>: maximum length of a transaction</li>
-<li><code>-n</code>: number of samples to maintain</li>
-<li><code>-a</code>: number of aggregators to initiate</li>
-<li><code>-p</code>: phi parameter for <a
href="https://dl.acm.org/citation.cfm?id=2396776">PARMA</a></li>
-<li><code>-i</code>: path to input file</li>
-<li><code>-o</code>: path to output file</li>
-<li><code>-b</code>: batch size or fpmGap (Number of transactions after which
FIM should be performed)</li>
-<li><code>-s</code>: Sampler Class to be used for sampling at each node</li>
+ <li><code class="highlighter-rouge">-e</code>: epsilon parameter for <a
href="https://dl.acm.org/citation.cfm?id=2396776">PARMA</a></li>
+ <li><code class="highlighter-rouge">-d</code>: delta parameter for <a
href="https://dl.acm.org/citation.cfm?id=2396776">PARMA</a></li>
+ <li><code class="highlighter-rouge">-f</code>: minimum frequency
(percentage) of a frequent itemset</li>
+ <li><code class="highlighter-rouge">-t</code>: maximum length of a
transaction</li>
+ <li><code class="highlighter-rouge">-n</code>: number of samples to
maintain</li>
+ <li><code class="highlighter-rouge">-a</code>: number of aggregators to
initiate</li>
+ <li><code class="highlighter-rouge">-p</code>: phi parameter for <a
href="https://dl.acm.org/citation.cfm?id=2396776">PARMA</a></li>
+ <li><code class="highlighter-rouge">-i</code>: path to input file</li>
+ <li><code class="highlighter-rouge">-o</code>: path to output file</li>
+ <li><code class="highlighter-rouge">-b</code>: batch size or fpmGap (Number
of transactions after which FIM should be performed)</li>
+ <li><code class="highlighter-rouge">-s</code>: Sampler Class to be used for
sampling at each node</li>
</ul>
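<p>To see how <code>-n</code> and <code>-b</code> relate to the timing of the
FPM commands, the following Java sketch computes the transaction counts at
which commands would be issued. It assumes that the first command is sent
after 2M transactions with M=numSamples*sampleSize and that each later command
follows <code>fpmGap</code> transactions after the previous one. The class and
method names are hypothetical, and the <code>sampleSize</code> of 1000 is an
arbitrary illustrative value, not a SAMOA default.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; illustrates the schedule only, not SAMOA code.
class FpmScheduleSketch {
    // Returns the transaction counts at which the first `count` FPM commands
    // are sent: the first at 2 * numSamples * sampleSize, then every fpmGap.
    static List<Long> commandPoints(long numSamples, long sampleSize,
                                    long fpmGap, int count) {
        List<Long> points = new ArrayList<>();
        long next = 2 * numSamples * sampleSize;
        for (int i = 0; i < count; i++) {
            points.add(next);
            next += fpmGap;
        }
        return points;
    }

    public static void main(String[] args) {
        // -n 23 and -b 100000 as in the example command; sampleSize is assumed.
        System.out.println(commandPoints(23, 1000, 100000, 3));
    }
}
```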
<h2 id="note">Note</h2>
-
<p>This method is currently unavailable in the master branch of SAMOA due to
a licensing restriction.</p>
</article>