Modified: incubator/samoa/site/documentation/Processor.html
URL: 
http://svn.apache.org/viewvc/incubator/samoa/site/documentation/Processor.html?rev=1737551&r1=1737550&r2=1737551&view=diff
==============================================================================
--- incubator/samoa/site/documentation/Processor.html (original)
+++ incubator/samoa/site/documentation/Processor.html Sun Apr  3 08:17:59 2016
@@ -74,79 +74,71 @@
 
   <article class="post-content">
     <p>Processor is the basic logical processing unit. All logic is written in 
the processor. In SAMOA, a Processor is an interface. Users can implement this 
interface to build their own processors.
-<img src="images/Topology.png" alt="Topology"></p>
-
-<h3 id="adding-a-processor-to-the-topology">Adding a Processor to the 
topology</h3>
+<img src="images/Topology.png" alt="Topology" />
+### Adding a Processor to the topology</p>
 
 <p>There are two ways to add a processor to the topology.</p>
 
-<h4 id="1-processor">1. Processor</h4>
-
-<p>All physical topology units are created with the help of a 
<code>TopologyBuilder</code>. Following code snippet shows how to add a 
Processor to the topology.
-<code>
+<h4 id="processor">1. Processor</h4>
+<p>All physical topology units are created with the help of a <code 
class="highlighter-rouge">TopologyBuilder</code>. Following code snippet shows 
how to add a Processor to the topology.
+<code class="highlighter-rouge">
 Processor processor = new ExampleProcessor();
 builder.addProcessor(processor, parallelism);
 </code>
-<code>addProcessor()</code> method of <code>TopologyBuilder</code> is used to 
add the processor. Its first argument is the instance of a Processor which 
needs to be added. Its second argument is the parallelism hint. It tells the 
underlying platforms how many parallel instances of this processor should be 
created on different nodes.</p>
-
-<h4 id="2-entrance-processor">2. Entrance Processor</h4>
+<code class="highlighter-rouge">addProcessor()</code> method of <code 
class="highlighter-rouge">TopologyBuilder</code> is used to add the processor. 
Its first argument is the instance of a Processor which needs to be added. Its 
second argument is the parallelism hint. It tells the underlying platforms how 
many parallel instances of this processor should be created on different 
nodes.</p>
 
+<h4 id="entrance-processor">2. Entrance Processor</h4>
 <p>Some processors generate their own streams, and they are used as the 
source of a topology. They connect to external sources, pull data and provide 
it to the topology in the form of streams.
-All physical topology units are created with the help of a 
<code>TopologyBuilder</code>. The following code snippet shows how to add an 
entrance processor to the topology and create a stream from it.
-<code>
+All physical topology units are created with the help of a <code 
class="highlighter-rouge">TopologyBuilder</code>. The following code snippet 
shows how to add an entrance processor to the topology and create a stream from 
it.
+<code class="highlighter-rouge">
 EntranceProcessor entranceProcessor = new EntranceProcessor();
 builder.addEntranceProcessor(entranceProcessor);
 Stream source = builder.createStream(entranceProcessor);
 </code></p>
 
 <h3 id="preview-of-processor">Preview of Processor</h3>
-<div class="highlight"><pre><code class="language-" data-lang="">package 
samoa.core;
+<p><code class="highlighter-rouge">
+package samoa.core;
 public interface Processor extends java.io.Serializable{
-    boolean process(ContentEvent event);
-    void onCreate(int id);
-    Processor newProcessor(Processor p);
+       boolean process(ContentEvent event);
+       void onCreate(int id);
+       Processor newProcessor(Processor p);
 }
-</code></pre></div>
-<h3 id="methods">Methods</h3>
-
-<h4 id="1-boolean-process-contentevent-event">1. <code>boolean 
process(ContentEvent event)</code></h4>
-
-<p>Users should implement the three methods shown above. 
<code>process(ContentEvent event)</code> is the method in which all processing 
logic should be implemented. <code>ContentEvent</code> is a type (interface) 
which contains the event. This method will be called each time a new event is 
received. It should return <code>true</code> if the event has been correctly 
processed, <code>false</code> otherwise.</p>
-
-<h4 id="2-void-oncreate-int-id">2. <code>void onCreate(int id)</code></h4>
+</code>
+### Methods</p>
 
-<p>is the method in which all initialization code should be written. Multiple 
copies/instances of the Processor are created based on the parallelism hint 
specified by the user. SAMOA assigns each instance a unique id which is passed 
as the parameter <code>id</code> to the <code>onCreate(int id)</code> method of each 
instance.</p>
+<h4 id="boolean-processcontentevent-event">1. <code 
class="highlighter-rouge">boolean process(ContentEvent event)</code></h4>
+<p>Users should implement the three methods shown above. <code 
class="highlighter-rouge">process(ContentEvent event)</code> is the method in 
which all processing logic should be implemented. <code 
class="highlighter-rouge">ContentEvent</code> is a type (interface) which 
contains the event. This method will be called each time a new event is 
received. It should return <code class="highlighter-rouge">true</code> if the 
event has been correctly processed, <code 
class="highlighter-rouge">false</code> otherwise.</p>
 
-<h4 id="3-processor-newprocessor-processor-p">3. <code>Processor 
newProcessor(Processor p)</code></h4>
+<h4 id="void-oncreateint-id">2. <code class="highlighter-rouge">void 
onCreate(int id)</code></h4>
+<p>is the method in which all initialization code should be written. Multiple 
copies/instances of the Processor are created based on the parallelism hint 
specified by the user. SAMOA assigns each instance a unique id which is passed 
as the parameter <code class="highlighter-rouge">id</code> to the <code 
class="highlighter-rouge">onCreate(int id)</code> method of each instance.</p>
 
-<p>is very simple to implement. This method is just a technical overhead that 
has no logical use except that it helps SAMOA in some of its internals. Users 
should just return a new copy of the instance of this class which implements 
this Processor interface. </p>
+<h4 id="processor-newprocessorprocessor-p">3. <code 
class="highlighter-rouge">Processor newProcessor(Processor p)</code></h4>
+<p>is very simple to implement. This method is just a technical overhead that 
has no logical use except that it helps SAMOA in some of its internals. Users 
should just return a new copy of the instance of this class which implements 
this Processor interface.</p>
 
 <h3 id="preview-of-entranceprocessor">Preview of EntranceProcessor</h3>
-<div class="highlight"><pre><code class="language-" data-lang="">package 
org.apache.samoa.core;
+<p>```
+package org.apache.samoa.core;</p>
 
-public interface EntranceProcessor extends Processor {
+<p>public interface EntranceProcessor extends Processor {
     public boolean isFinished();
     public boolean hasNext();
     public ContentEvent nextEvent();
 }
-</code></pre></div>
-<h3 id="methods">Methods</h3>
-
-<h4 id="1-boolean-isfinished">1. <code>boolean isFinished()</code></h4>
-
-<p>returns whether to expect more events coming from the entrance processor. 
If the source is a live stream, this method should always return 
<code>false</code>. If the source is a file, the method should return 
<code>true</code> once the file has been fully processed.</p>
+```
+### Methods</p>
 
-<h4 id="2-boolean-hasnext">2. <code>boolean hasNext()</code></h4>
+<h4 id="boolean-isfinished">1. <code class="highlighter-rouge">boolean 
isFinished()</code></h4>
+<p>returns whether to expect more events coming from the entrance processor. 
If the source is a live stream, this method should always return <code 
class="highlighter-rouge">false</code>. If the source is a file, the method 
should return <code class="highlighter-rouge">true</code> once the file has 
been fully processed.</p>
 
-<p>returns whether the next event is ready for consumption. If the method 
returns <code>true</code> a subsequent call to <code>nextEvent</code> should 
yield the next event to be processed. If the method returns <code>false</code> 
the engine can use this information to avoid continuously polling the entrance 
processor.</p>
+<h4 id="boolean-hasnext">2. <code class="highlighter-rouge">boolean 
hasNext()</code></h4>
+<p>returns whether the next event is ready for consumption. If the method 
returns <code class="highlighter-rouge">true</code> a subsequent call to <code 
class="highlighter-rouge">nextEvent</code> should yield the next event to be 
processed. If the method returns <code class="highlighter-rouge">false</code> 
the engine can use this information to avoid continuously polling the entrance 
processor.</p>
 
-<h4 id="3-contentevent-nextevent">3. <code>ContentEvent nextEvent()</code></h4>
-
-<p>is the main method for the entrance processor as it returns the next event 
to be processed by the topology. It should be called only if 
<code>isFinished()</code> returned <code>false</code> and 
<code>hasNext()</code> returned <code>true</code>.</p>
+<h4 id="contentevent-nextevent">3. <code 
class="highlighter-rouge">ContentEvent nextEvent()</code></h4>
+<p>is the main method for the entrance processor as it returns the next event 
to be processed by the topology. It should be called only if <code 
class="highlighter-rouge">isFinished()</code> returned <code 
class="highlighter-rouge">false</code> and <code 
class="highlighter-rouge">hasNext()</code> returned <code 
class="highlighter-rouge">true</code>.</p>
 
 <h3 id="note">Note</h3>
-
-<p>All state variables of the class implementing this interface must be 
serializable. It can be done by implementing the <code>Serializable</code> 
interface. The simple way to skip this requirement is to declare those 
variables as <code>transient</code> and initialize them in the 
<code>onCreate()</code> method. Remember, all initializations of such transient 
variables done in the constructor will be lost.</p>
+<p>All state variables of the class implementing this interface must be 
serializable. It can be done by implementing the <code 
class="highlighter-rouge">Serializable</code> interface. The simple way to skip 
this requirement is to declare those variables as <code 
class="highlighter-rouge">transient</code> and initialize them in the <code 
class="highlighter-rouge">onCreate()</code> method. Remember, all 
initializations of such transient variables done in the constructor will be 
lost.</p>
 
   </article>
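
The Note in Processor.html about `transient` state can be illustrated with a small, self-contained sketch. It is not SAMOA code: `ContentEvent` and `Processor` below are local stand-ins that mirror the interface preview, and `KeyEvent`, `CountingProcessor`, `ProcessorSketch` and the `getKey()` accessor are hypothetical names introduced only for illustration. The round trip through Java serialization is an assumption about how an instance might be shipped to a node; it shows that a transient field is empty afterwards until `onCreate()` rebuilds it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Stand-in for SAMOA's ContentEvent; only a key accessor is assumed here.
interface ContentEvent extends Serializable {
    String getKey();
}

// Same three methods as the Processor preview in Processor.html.
interface Processor extends Serializable {
    boolean process(ContentEvent event);
    void onCreate(int id);
    Processor newProcessor(Processor p);
}

// Hypothetical event type that carries nothing but its key.
class KeyEvent implements ContentEvent {
    private static final long serialVersionUID = 1L;
    private final String key;
    KeyEvent(String key) { this.key = key; }
    @Override public String getKey() { return key; }
}

// Hypothetical processor that remembers the keys it has seen.
class CountingProcessor implements Processor {
    private static final long serialVersionUID = 1L;
    private int id = -1;
    // Transient, so it is skipped by serialization and must be built in onCreate().
    private transient List<String> seenKeys;

    @Override public void onCreate(int id) {
        this.id = id;
        this.seenKeys = new ArrayList<>();
    }

    @Override public boolean process(ContentEvent event) {
        if (event == null || seenKeys == null) {
            return false; // nothing to process, or onCreate() was never called
        }
        seenKeys.add(event.getKey());
        return true;
    }

    @Override public Processor newProcessor(Processor p) {
        return new CountingProcessor(); // a fresh copy, as the docs describe
    }

    boolean isInitialized() { return seenKeys != null; }
    int seenCount() { return seenKeys == null ? 0 : seenKeys.size(); }
    int getId() { return id; }
}

class ProcessorSketch {
    // Serializes and deserializes the processor, as shipping it to another node might.
    static CountingProcessor roundTrip(CountingProcessor p) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(p);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (CountingProcessor) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        CountingProcessor template = new CountingProcessor();
        template.onCreate(0);
        CountingProcessor copy = roundTrip(template);
        System.out.println("initialized after round trip: " + copy.isInitialized());
        copy.onCreate(7);
        copy.process(new KeyEvent("a"));
        System.out.println("events seen by instance " + copy.getId() + ": " + copy.seenCount());
    }
}
```

In this sketch `process()` returns `false` when the transient list has not been rebuilt, which makes a forgotten `onCreate()` call visible instead of failing with a `NullPointerException`.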
 

Modified: incubator/samoa/site/documentation/SAMOA-Topology.html
URL: 
http://svn.apache.org/viewvc/incubator/samoa/site/documentation/SAMOA-Topology.html?rev=1737551&r1=1737550&r2=1737551&view=diff
==============================================================================
--- incubator/samoa/site/documentation/SAMOA-Topology.html (original)
+++ incubator/samoa/site/documentation/SAMOA-Topology.html Sun Apr  3 08:17:59 
2016
@@ -76,18 +76,18 @@
     <p>Apache SAMOA allows users to write their stream processing algorithms 
in an easy and platform independent way. SAMOA defines its own topology which 
is very intuitive and simple to use. Currently SAMOA has the following basic 
topology elements.</p>
 
 <ol>
-<li><a href="Processor.html">Processor</a></li>
-<li><a href="Content-Event.html">Content Event</a></li>
-<li><a href="Stream.html">Stream</a></li>
-<li><a href="Task.html">Task</a></li>
-<li><a href="Topology-Builder.html">Topology Builder</a></li>
-<li><a href="Learner.html">Learner</a></li>
-<li><strong>Advanced topic</strong>: <a href="Processing-Item.html">Processing 
Item</a></li>
+  <li><a href="Processor.html">Processor</a></li>
+  <li><a href="Content-Event.html">Content Event</a></li>
+  <li><a href="Stream.html">Stream</a></li>
+  <li><a href="Task.html">Task</a></li>
+  <li><a href="Topology-Builder.html">Topology Builder</a></li>
+  <li><a href="Learner.html">Learner</a></li>
+  <li><strong>Advanced topic</strong>: <a 
href="Processing-Item.html">Processing Item</a></li>
 </ol>
 
 <p>Processor and Content Event are the logical units to build your algorithm, 
Stream and Task are the physical units to wire the various pieces of your 
algorithm, whereas Topology Builder is an administrative unit that provides 
bookkeeping services. Learner is the base interface for learning algorithms. 
Processing Items are internal wrappers for Processors used inside SAMOA.</p>
 
-<p><img src="images/Topology.png" alt="Topology"></p>
+<p><img src="images/Topology.png" alt="Topology" /></p>
 
   </article>
 

Modified: incubator/samoa/site/documentation/SAMOA-and-Machine-Learning.html
URL: 
http://svn.apache.org/viewvc/incubator/samoa/site/documentation/SAMOA-and-Machine-Learning.html?rev=1737551&r1=1737550&r2=1737551&view=diff
==============================================================================
--- incubator/samoa/site/documentation/SAMOA-and-Machine-Learning.html 
(original)
+++ incubator/samoa/site/documentation/SAMOA-and-Machine-Learning.html Sun Apr  
3 08:17:59 2016
@@ -73,15 +73,15 @@
   </header>
 
   <article class="post-content">
-    <p>SAMOA&#39;s main goal is to help developers easily create machine 
learning algorithms on top of any distributed stream processing engine. Here we 
present the available machine learning algorithms implemented in SAMOA and how 
to use them. </p>
+    <p>SAMOA’s main goal is to help developers easily create machine 
learning algorithms on top of any distributed stream processing engine. Here we 
present the available machine learning algorithms implemented in SAMOA and how 
to use them.</p>
 
 <ul>
-<li><a href="Prequential-Evaluation-Task.html">2.1 Prequential Evaluation 
Task</a></li>
-<li><a href="Vertical-Hoeffding-Tree-Classifier.html">2.2 Vertical Hoeffding 
Tree Classifier</a></li>
-<li><a href="Adaptive-Model-Rules-Regressor.html">2.3 Adaptive Model Rules 
Regressor</a></li>
-<li><a href="Bagging-and-Boosting.html">2.4 Bagging and Boosting</a></li>
-<li><a href="Distributed-Stream-Clustering.html">2.5 Distributed Stream 
Clustering</a></li>
-<li><a href="Distributed-Stream-Frequent-Itemset-Mining.html">2.6 Distributed 
Stream Frequent Itemset Mining</a></li>
+  <li><a href="Prequential-Evaluation-Task.html">2.1 Prequential Evaluation 
Task</a></li>
+  <li><a href="Vertical-Hoeffding-Tree-Classifier.html">2.2 Vertical Hoeffding 
Tree Classifier</a></li>
+  <li><a href="Adaptive-Model-Rules-Regressor.html">2.3 Adaptive Model Rules 
Regressor</a></li>
+  <li><a href="Bagging-and-Boosting.html">2.4 Bagging and Boosting</a></li>
+  <li><a href="Distributed-Stream-Clustering.html">2.5 Distributed Stream 
Clustering</a></li>
+  <li><a href="Distributed-Stream-Frequent-Itemset-Mining.html">2.6 
Distributed Stream Frequent Itemset Mining</a></li>
 </ul>
 
   </article>

Modified: incubator/samoa/site/documentation/SAMOA-for-MOA-users.html
URL: 
http://svn.apache.org/viewvc/incubator/samoa/site/documentation/SAMOA-for-MOA-users.html?rev=1737551&r1=1737550&r2=1737551&view=diff
==============================================================================
--- incubator/samoa/site/documentation/SAMOA-for-MOA-users.html (original)
+++ incubator/samoa/site/documentation/SAMOA-for-MOA-users.html Sun Apr  3 
08:17:59 2016
@@ -73,23 +73,23 @@
   </header>
 
   <article class="post-content">
-    <p>If you&#39;re an advanced user of <a 
href="http://moa.cms.waikato.ac.nz/";>MOA</a>, you&#39;ll find easy to run 
SAMOA. You need to note the following:</p>
+    <p>If you’re an advanced user of <a 
href="http://moa.cms.waikato.ac.nz/">MOA</a>, you’ll find it easy to run SAMOA. 
You need to note the following:</p>
 
 <ul>
-<li>There is no GUI interface in SAMOA</li>
-<li>You can run SAMOA in the following modes:
-
-<ol>
-<li>Simulation Environment. Use <code>org.apache.samoa.DoTask</code> instead 
of <code>moa.DoTask</code><br></li>
-<li>Storm Local Mode. Use <code>org.apache.samoa.LocalStormDoTask</code> 
instead of <code>moa.DoTask</code></li>
-<li>Storm Cluster Mode. You need to use the <code>samoa</code> script as it is 
explained in <a href="Executing%20SAMOA%20with%20Apache%20Storm">Executing 
SAMOA with Apache Storm</a>.</li>
-<li>S4. You need to use the <code>samoa</code> script as it is explained in <a 
href="Executing%20SAMOA%20with%20Apache%20S4">Executing SAMOA with Apache 
S4</a></li>
-</ol></li>
+  <li>There is no GUI interface in SAMOA</li>
+  <li>You can run SAMOA in the following modes:
+    <ol>
+      <li>Simulation Environment. Use <code 
class="highlighter-rouge">org.apache.samoa.DoTask</code> instead of <code 
class="highlighter-rouge">moa.DoTask</code></li>
+      <li>Storm Local Mode. Use <code 
class="highlighter-rouge">org.apache.samoa.LocalStormDoTask</code> instead of 
<code class="highlighter-rouge">moa.DoTask</code></li>
+      <li>Storm Cluster Mode. You need to use the <code 
class="highlighter-rouge">samoa</code> script as it is explained in <a 
href="Executing SAMOA with Apache Storm">Executing SAMOA with Apache 
Storm</a>.</li>
+      <li>S4. You need to use the <code class="highlighter-rouge">samoa</code> 
script as it is explained in <a href="Executing SAMOA with Apache S4">Executing 
SAMOA with Apache S4</a></li>
+    </ol>
+  </li>
 </ul>
 
-<p>To start with SAMOA, you can start with a simple example using the 
CoverType dataset as it is discussed in <a href="Getting%20Started">Getting 
Started</a>.  </p>
+<p>To start with SAMOA, you can start with a simple example using the 
CoverType dataset as it is discussed in <a href="Getting Started">Getting 
Started</a>.</p>
 
-<p>To use MOA algorithms inside SAMOA, take a look at <a 
href="https://github.com/samoa-moa/samoa-moa">https://github.com/samoa-moa/samoa-moa</a>.
 </p>
+<p>To use MOA algorithms inside SAMOA, take a look at <a 
href="https://github.com/samoa-moa/samoa-moa">https://github.com/samoa-moa/samoa-moa</a>.</p>
 
   </article>
 

Modified: 
incubator/samoa/site/documentation/Scalable-Advanced-Massive-Online-Analysis.html
URL: 
http://svn.apache.org/viewvc/incubator/samoa/site/documentation/Scalable-Advanced-Massive-Online-Analysis.html?rev=1737551&r1=1737550&r2=1737551&view=diff
==============================================================================
--- 
incubator/samoa/site/documentation/Scalable-Advanced-Massive-Online-Analysis.html
 (original)
+++ 
incubator/samoa/site/documentation/Scalable-Advanced-Massive-Online-Analysis.html
 Sun Apr  3 08:17:59 2016
@@ -75,14 +75,14 @@
   <article class="post-content">
     <p>Scalable Advanced Massive Online Analysis (SAMOA) contains various 
algorithms for machine learning and data mining on data streams, and allows users to 
run them on different distributed stream processing engines (DSPEs) such as 
Storm and S4. Currently, SAMOA contains methods for classification via Vertical 
Hoeffding Trees, bagging and boosting and clustering via CluStream.</p>
 
-<p>In these pages, we explain how to build and execute SAMOA for the different 
distributed stream processing engines (DSPEs): </p>
+<p>In these pages, we explain how to build and execute SAMOA for the different 
distributed stream processing engines (DSPEs):</p>
 
 <ul>
-<li><a href="Building-SAMOA.html">Building SAMOA</a></li>
-<li><a href="Executing-SAMOA-with-Apache-Storm.html">Executing SAMOA with 
Apache Storm</a></li>
-<li><a href="Executing-SAMOA-with-Apache-S4.html">Executing SAMOA with Apache 
S4</a></li>
-<li><a href="Executing-SAMOA-with-Apache-Samza.html">Executing SAMOA with 
Apache Samza</a></li>
-<li><a href="Executing-SAMOA-with-Apache-Avro-Files.html">Executing SAMOA with 
Apache Avro Files</a></li>
+  <li><a href="Building-SAMOA.html">Building SAMOA</a></li>
+  <li><a href="Executing-SAMOA-with-Apache-Storm.html">Executing SAMOA with 
Apache Storm</a></li>
+  <li><a href="Executing-SAMOA-with-Apache-S4.html">Executing SAMOA with 
Apache S4</a></li>
+  <li><a href="Executing-SAMOA-with-Apache-Samza.html">Executing SAMOA with 
Apache Samza</a></li>
+  <li><a href="Executing-SAMOA-with-Apache-Avro-Files.html">Executing SAMOA 
with Apache Avro Files</a></li>
 </ul>
 
   </article>

Modified: incubator/samoa/site/documentation/Stream.html
URL: 
http://svn.apache.org/viewvc/incubator/samoa/site/documentation/Stream.html?rev=1737551&r1=1737550&r2=1737551&view=diff
==============================================================================
--- incubator/samoa/site/documentation/Stream.html (original)
+++ incubator/samoa/site/documentation/Stream.html Sun Apr  3 08:17:59 2016
@@ -73,47 +73,51 @@
   </header>
 
   <article class="post-content">
-    <p>A stream is a physical unit of SAMOA topology which connects different 
Processors with each other. Stream is also created by a 
<code>TopologyBuilder</code> just like a Processor. A stream can have a single 
source but many destinations. A Processor which is the source of a stream, owns 
the stream.</p>
-
-<h3 id="1-creating-a-stream">1. Creating a Stream</h3>
+    <p>A stream is a physical unit of SAMOA topology which connects different 
Processors with each other. Stream is also created by a <code 
class="highlighter-rouge">TopologyBuilder</code> just like a Processor. A 
stream can have a single source but many destinations. A Processor which is the 
source of a stream, owns the stream.</p>
 
+<h3 id="creating-a-stream">1. Creating a Stream</h3>
 <p>The following code snippet shows how a Stream is created:</p>
-<div class="highlight"><pre><code class="language-" 
data-lang="">builder.initTopology("MyTopology");
+
+<p><code class="highlighter-rouge">
+builder.initTopology("MyTopology");
 Processor sourceProcessor = new Sampler();
 builder.addProcessor(sourceProcessor, 3);
 Stream sourceDataStream = builder.createStream(sourceProcessor);
-</code></pre></div>
-<h3 id="2-connecting-a-stream">2. Connecting a Stream</h3>
+</code></p>
 
+<h3 id="connecting-a-stream">2. Connecting a Stream</h3>
 <p>As described above, a Stream can have many destinations. In the following 
figure, a single stream from sourceProcessor is connected to three different 
destination Processors each having three instances.</p>
 
-<p><img src="images/SAMOA%20Message%20Shuffling.png" alt="SAMOA Message 
Shuffling"></p>
-
-<p>SAMOA supports three different ways of distribution of messages to multiple 
instances of a Processor.</p>
+<p><img src="images/SAMOA Message Shuffling.png" alt="SAMOA Message Shuffling" 
/></p>
 
-<h4 id="2-1-shuffle">2.1 Shuffle</h4>
-
-<p>In this way of message distribution, messages/events are distributed 
randomly among various instances of a Processor. 
+<p>SAMOA supports three different ways of distribution of messages to multiple 
instances of a Processor.
+####2.1 Shuffle
+In this way of message distribution, messages/events are distributed randomly 
among various instances of a Processor. 
 Following figure shows how the messages are distributed.
-<img src="images/SAMOA%20Explain%20Shuffling.png" alt="SAMOA Explain 
Shuffling">
+<img src="images/SAMOA Explain Shuffling.png" alt="SAMOA Explain Shuffling" />
 Following code snippet shows how to connect a stream to a destination using 
random shuffling.</p>
-<div class="highlight"><pre><code class="language-" 
data-lang="">builder.connectInputShuffleStream(sourceDataStream, 
destinationProcessor);
-</code></pre></div>
-<h4 id="2-2-key">2.2 Key</h4>
 
-<p>In this way of message distribution, messages with same key are sent to 
same instance of a Processor.
+<p><code class="highlighter-rouge">
+builder.connectInputShuffleStream(sourceDataStream, destinationProcessor);
+</code>
+####2.2 Key
+In this way of message distribution, messages with same key are sent to same 
instance of a Processor.
 Following figure illustrates key-based distribution.
-<img src="images/SAMOA%20Explain%20Key%20Shuffling.png" alt="SAMOA Explain Key 
Shuffling">
+<img src="images/SAMOA Explain Key Shuffling.png" alt="SAMOA Explain Key 
Shuffling" />
 Following code snippet shows how to connect a stream to a destination using 
key-based distribution.</p>
-<div class="highlight"><pre><code class="language-" 
data-lang="">builder.connectInputKeyStream(sourceDataStream, 
destinationProcessor);
-</code></pre></div>
-<h4 id="2-3-all">2.3 All</h4>
 
-<p>In this way of message distribution, all messages of a stream are sent to 
all instances of a destination Processor. Following figure illustrates this 
distribution process.
-<img src="images/SAMOA%20Explain%20All%20Shuffling.png" alt="SAMOA Explain All 
Shuffling">
+<p><code class="highlighter-rouge">
+builder.connectInputKeyStream(sourceDataStream, destinationProcessor);
+</code>
+####2.3 All
+In this way of message distribution, all messages of a stream are sent to all 
instances of a destination Processor. Following figure illustrates this 
distribution process.
+<img src="images/SAMOA Explain All Shuffling.png" alt="SAMOA Explain All 
Shuffling" />
 Following code snippet shows how to connect a stream to a destination using 
All-based distribution.</p>
-<div class="highlight"><pre><code class="language-" 
data-lang="">builder.connectInputAllStream(sourceDataStream, 
destinationProcessor);
-</code></pre></div>
+
+<p><code class="highlighter-rouge">
+builder.connectInputAllStream(sourceDataStream, destinationProcessor);
+</code></p>
+
   </article>
 
 <!-- </div> -->
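
The three distribution modes described in Stream.html (shuffle, key, all) can be modelled with a toy routing function. This is only a sketch under stated assumptions, not SAMOA's implementation: `DistributionSketch` and its methods are hypothetical, and hashing the key modulo the number of instances is just one plausible way an engine could keep messages with the same key on the same instance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class DistributionSketch {
    // Shuffle: any instance may receive the message, chosen at random.
    static int shuffleTarget(Random rng, int parallelism) {
        return rng.nextInt(parallelism);
    }

    // Key: the same key always maps to the same instance (assumed hash-based routing).
    static int keyTarget(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    // All: every instance of the destination Processor receives the message.
    static List<Integer> allTargets(int parallelism) {
        List<Integer> targets = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            targets.add(i);
        }
        return targets;
    }

    public static void main(String[] args) {
        int parallelism = 3; // three instances, as in the figure described above
        Random rng = new Random();
        System.out.println("shuffle -> instance " + shuffleTarget(rng, parallelism));
        System.out.println("key     -> instance " + keyTarget("user-1", parallelism));
        System.out.println("all     -> instances " + allTargets(parallelism));
    }
}
```

`Math.floorMod` is used rather than `%` so that a negative `hashCode()` still yields a valid instance index.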

Modified: incubator/samoa/site/documentation/Task.html
URL: 
http://svn.apache.org/viewvc/incubator/samoa/site/documentation/Task.html?rev=1737551&r1=1737550&r2=1737551&view=diff
==============================================================================
--- incubator/samoa/site/documentation/Task.html (original)
+++ incubator/samoa/site/documentation/Task.html Sun Apr  3 08:17:59 2016
@@ -73,56 +73,55 @@
   </header>
 
   <article class="post-content">
-    <p>Task is similar to a job in Hadoop. Task is an execution entity. A 
topology must be defined inside a task. SAMOA can only execute classes that 
implement the <code>Task</code> interface.</p>
+    <p>Task is similar to a job in Hadoop. Task is an execution entity. A 
topology must be defined inside a task. SAMOA can only execute classes that 
implement the <code class="highlighter-rouge">Task</code> interface.</p>
 
-<h3 id="1-implementation">1. Implementation</h3>
-<div class="highlight"><pre><code class="language-" data-lang="">package 
org.apache.samoa.tasks;
+<h3 id="implementation">1. Implementation</h3>
+<p>```
+package org.apache.samoa.tasks;</p>
 
-import org.apache.samoa.topology.ComponentFactory;
-import org.apache.samoa.topology.Topology;
+<p>import org.apache.samoa.topology.ComponentFactory;
+import org.apache.samoa.topology.Topology;</p>
 
-/**
+<p>/**
  * Task interface, the mother of all SAMOA tasks!
  */
-public interface Task {
+public interface Task {</p>
 
-    /**
-     * Initialize this SAMOA task, 
-     * i.e. create and connect Processors and Streams
-     * and initialize the topology
-     */
-    public void init(); 
-
-    /**
-     * Return the final topology object to be executed in the cluster
-     * @return topology object to be submitted to be executed in the cluster
-     */
-    public Topology getTopology();
-
-    /**
-     * Sets the factory.
-     * TODO: propose to hide factory from task, 
-     * i.e. Task will only see TopologyBuilder, 
-     * and factory creation will be handled by TopologyBuilder
-     *
-     * @param factory the new factory
-     */
-    public void setFactory(ComponentFactory factory) ;
-}
-</code></pre></div>
-<h3 id="2-methods">2. Methods</h3>
-
-<h5 id="2-1-void-init">2.1 <code>void init()</code></h5>
-
-<p>This method should build the desired topology by creating Processors and 
Streams and connecting them to each other.</p>
+<div class="highlighter-rouge"><pre class="highlight"><code>/**
+ * Initialize this SAMOA task, 
+ * i.e. create and connect Processors and Streams
+ * and initialize the topology
+ */
+public void init();    
 
-<h5 id="2-2-topology-gettopology">2.2 <code>Topology getTopology()</code></h5>
+/**
+ * Return the final topology object to be executed in the cluster
+ * @return topology object to be submitted to be executed in the cluster
+ */
+public Topology getTopology();
+
+/**
+ * Sets the factory.
+ * TODO: propose to hide factory from task, 
+ * i.e. Task will only see TopologyBuilder, 
+ * and factory creation will be handled by TopologyBuilder
+ *
+ * @param factory the new factory
+ */
+public void setFactory(ComponentFactory factory) ; } ```
+</code></pre>
+</div>
+
+<h3 id="methods">2. Methods</h3>
+<p>#####2.1 <code class="highlighter-rouge">void init()</code>
+This method should build the desired topology by creating Processors and 
Streams and connecting them to each other.</p>
 
-<p>This method should return the topology built by <code>init</code> to the 
engine for execution.</p>
+<h5 id="topology-gettopology">2.2 <code class="highlighter-rouge">Topology 
getTopology()</code></h5>
+<p>This method should return the topology built by <code 
class="highlighter-rouge">init</code> to the engine for execution.</p>
 
-<h5 id="2-3-void-setfactory-componentfactory-factory">2.3 <code>void 
setFactory(ComponentFactory factory)</code></h5>
+<h5 id="void-setfactorycomponentfactory-factory">2.3 <code 
class="highlighter-rouge">void setFactory(ComponentFactory factory)</code></h5>
+<p>Utility method to accept a <code 
class="highlighter-rouge">ComponentFactory</code> to use in building the 
topology.</p>
 
-<p>Utility method to accept a <code>ComponentFactory</code> to use in building 
the topology.</p>
 
   </article>
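
The three `Task` methods documented in Task.html can be seen working together in a minimal sketch. `Task` below repeats the interface from the page, while `ComponentFactory` and `Topology` are empty local stand-ins for the SAMOA types, and `ExampleTask` and `TaskSketch` are hypothetical names. The call order (factory first, then `init()`, then `getTopology()`) is an assumption about how an engine would drive a task; the page itself does not state it.

```java
// Stand-in for org.apache.samoa.topology.ComponentFactory.
interface ComponentFactory { }

// Stand-in for org.apache.samoa.topology.Topology; only a name is kept.
class Topology {
    private final String name;
    Topology(String name) { this.name = name; }
    String getName() { return name; }
}

// Same three methods as the Task interface shown in Task.html.
interface Task {
    void init();
    Topology getTopology();
    void setFactory(ComponentFactory factory);
}

// Hypothetical task that builds an empty, named topology.
class ExampleTask implements Task {
    private ComponentFactory factory;
    private Topology topology;

    @Override public void setFactory(ComponentFactory factory) {
        this.factory = factory;
    }

    @Override public void init() {
        if (factory == null) {
            // Assumed ordering: the factory is supplied before the topology is built.
            throw new IllegalStateException("setFactory() must be called before init()");
        }
        // A real task would create and connect Processors and Streams here.
        topology = new Topology("MyTopology");
    }

    @Override public Topology getTopology() {
        return topology;
    }
}

class TaskSketch {
    static String run() {
        Task task = new ExampleTask();
        task.setFactory(new ComponentFactory() { });
        task.init();
        return task.getTopology().getName();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```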
 

Modified: incubator/samoa/site/documentation/Team.html
URL: 
http://svn.apache.org/viewvc/incubator/samoa/site/documentation/Team.html?rev=1737551&r1=1737550&r2=1737551&view=diff
==============================================================================
--- incubator/samoa/site/documentation/Team.html (original)
+++ incubator/samoa/site/documentation/Team.html Sun Apr  3 08:17:59 2016
@@ -76,52 +76,51 @@
     <h2 id="team">Team</h2>
 
 <table class="table table-striped">
-    <thead>
-        <th class="text-center"></th>
-        <th class="text-center">Name</th>
-        <th class="text-center">Role</th>
-        <th class="text-center">Apache ID</th>
-    </thead>
-    <tr>
-        <td class="text-center"></td>
-        <td class="text-center"><a href="http://gdfm.me/">Gianmarco De 
Francisci Morales</a></td>
-        <td class="text-center">PPMC</td>
-        <td class="text-center">gdfm</td>
-    </tr>
-    <tr>
-        <td class="text-center"></td>
-        <td class="text-center"><a href="http://www.albertbifet.com">Albert 
Bifet</a></td>
-        <td class="text-center">PPMC</td>
-        <td class="text-center">abifet</td>
-    </tr>   
-    <tr>
-        <td class="text-center"></td>
-        <td class="text-center">Nicolas Kourtellis</td>
-        <td class="text-center">PPMC</td>
-        <td class="text-center">nkourtellis</td>
-    </tr>
-    <tr>
-        <td class="text-center"></td>
-        <td class="text-center"><a href="http://www.otnira.com">Arinto 
Murdopo</a></td>
-        <td class="text-center">PPMC</td>
-        <td class="text-center">arinto</td>
-    </tr>
-    <tr>
-        <td class="text-center"></td>
-        <td class="text-center">Matthieu Morel</td>
-        <td class="text-center">PPMC</td>
-        <td class="text-center">mmorel</td>
-    </tr>
-    <tr>
-        <td class="text-center"></td>
-        <td class="text-center"><a href="http://www.van-laere.net">Olivier Van 
Laere</a></td>
-        <td class="text-center">PPMC</td>
-        <td class="text-center">ovlaere</td>
-    </tr>
+       <thead>
+               <th class="text-center"></th>
+               <th class="text-center">Name</th>
+               <th class="text-center">Role</th>
+               <th class="text-center">Apache ID</th>
+       </thead>
+       <tr>
+               <td class="text-center"></td>
+               <td class="text-center"><a href="http://gdfm.me/">Gianmarco De Francisci Morales</a></td>
+               <td class="text-center">PPMC</td>
+               <td class="text-center">gdfm</td>
+       </tr>
+       <tr>
+               <td class="text-center"></td>
+               <td class="text-center"><a href="http://www.albertbifet.com">Albert Bifet</a></td>
+               <td class="text-center">PPMC</td>
+               <td class="text-center">abifet</td>
+       </tr>   
+       <tr>
+               <td class="text-center"></td>
+               <td class="text-center">Nicolas Kourtellis</td>
+               <td class="text-center">PPMC</td>
+               <td class="text-center">nkourtellis</td>
+       </tr>
+       <tr>
+               <td class="text-center"></td>
+               <td class="text-center"><a href="http://www.otnira.com">Arinto Murdopo</a></td>
+               <td class="text-center">PPMC</td>
+               <td class="text-center">arinto</td>
+       </tr>
+       <tr>
+               <td class="text-center"></td>
+               <td class="text-center">Matthieu Morel</td>
+               <td class="text-center">PPMC</td>
+               <td class="text-center">mmorel</td>
+       </tr>
+       <tr>
+               <td class="text-center"></td>
+               <td class="text-center"><a href="http://www.van-laere.net">Olivier Van Laere</a></td>
+               <td class="text-center">PPMC</td>
+               <td class="text-center">ovlaere</td>
+       </tr>
 </table>
 
 <h3 id="contributors">Contributors</h3>
-
 <ul>
 <li><a href="http://www.lsi.upc.edu/~marias/">Marta Arias</a></li>
 <li>Foteini Beligianni</li>

Modified: incubator/samoa/site/documentation/Topology-Builder.html
URL: http://svn.apache.org/viewvc/incubator/samoa/site/documentation/Topology-Builder.html?rev=1737551&r1=1737550&r2=1737551&view=diff
==============================================================================
--- incubator/samoa/site/documentation/Topology-Builder.html (original)
+++ incubator/samoa/site/documentation/Topology-Builder.html Sun Apr  3 08:17:59 2016
@@ -73,32 +73,35 @@
   </header>
 
   <article class="post-content">
-    <p><code>TopologyBuilder</code> is a builder class which builds physical 
units of the topology and assemble them together. Each topology has a name. 
Following code snippet shows all the steps of creating a topology with one 
<code>EntrancePI</code>, two PIs and a few streams.</p>
-<div class="highlight"><pre><code class="language-" data-lang="">TopologyBuilder builder = new TopologyBuilder(factory) // ComponentFactory factory
-builder.initTopology("Parma Topology"); //initiates an empty topology with a name
-//********************************Topology building***********************************
+    <p><code class="highlighter-rouge">TopologyBuilder</code> is a builder 
class which builds physical units of the topology and assemble them together. 
Each topology has a name. Following code snippet shows all the steps of 
creating a topology with one <code class="highlighter-rouge">EntrancePI</code>, 
two PIs and a few streams.</p>
+
+<p>```
+TopologyBuilder builder = new TopologyBuilder(factory) // ComponentFactory factory
+builder.initTopology(“Parma Topology”); //initiates an empty topology with a name
+//<strong>**</strong><strong>**</strong><strong>**</strong><strong>**</strong><strong>**</strong><strong>Topology building</strong><strong>**</strong><strong>**</strong><strong>**</strong><strong>**</strong><strong>**</strong>***
 StreamSource sourceProcessor = new StreamSource(inputPath,d,sampleSize,fpmGap,epsilon,phi,numSamples);
 builder.addEntranceProcessor(sourceProcessor);
 Stream sourceDataStream = builder.createStream(sourceProcessor);
 sourceProcessor.setDataStream(sourceDataStream);
 Stream sourceControlStream = builder.createStream(sourceProcessor);
-sourceProcessor.setControlStream(sourceControlStream);
+sourceProcessor.setControlStream(sourceControlStream);</p>
 
-Sampler sampler = new Sampler(minFreqPercent,sampleSize,(float)epsilon,outputPath,sampler);
+<p>Sampler sampler = new Sampler(minFreqPercent,sampleSize,(float)epsilon,outputPath,sampler);
 builder.addProcessor(sampler, numSamples);
 builder.connectInputAllStream(sourceControlStream, sampler);
-builder.connectInputShuffleStream(sourceDataStream, sampler);
+builder.connectInputShuffleStream(sourceDataStream, sampler);</p>
 
-Stream samplerDataStream = builder.createStream(sampler);
+<p>Stream samplerDataStream = builder.createStream(sampler);
 samplerP.setSamplerDataStream(samplerDataStream);
 Stream samplerControlStream = builder.createStream(sampler);
-samplerP.setSamplerControlStream(samplerControlStream);
+samplerP.setSamplerControlStream(samplerControlStream);</p>
 
-Aggregator aggregatorProcessor = new Aggregator(outputPath,(long)numSamples,(long)sampleSize,(long)reqApproxNum,(float)epsilon);
+<p>Aggregator aggregatorProcessor = new Aggregator(outputPath,(long)numSamples,(long)sampleSize,(long)reqApproxNum,(float)epsilon);
 builder.addProcessor(aggregatorProcessor, numAggregators);
 builder.connectInputKeyStream(samplerDataStream, aggregatorProcessor);
 builder.connectInputAllStream(samplerControlStream, aggregatorProcessor);
-</code></pre></div>
+```</p>
+
   </article>
 
 <!-- </div> -->

Modified: incubator/samoa/site/documentation/Vertical-Hoeffding-Tree-Classifier.html
URL: http://svn.apache.org/viewvc/incubator/samoa/site/documentation/Vertical-Hoeffding-Tree-Classifier.html?rev=1737551&r1=1737550&r2=1737551&view=diff
==============================================================================
--- incubator/samoa/site/documentation/Vertical-Hoeffding-Tree-Classifier.html (original)
+++ incubator/samoa/site/documentation/Vertical-Hoeffding-Tree-Classifier.html Sun Apr  3 08:17:59 2016
@@ -76,27 +76,24 @@
     <p>Vertical Hoeffding Tree (VHT) classifier is a distributed classifier 
that utilizes vertical parallelism on top of the Very Fast Decision Tree (VFDT) 
or Hoeffding Tree classifier.</p>
 
 <h3 id="very-fast-decision-tree-vfdt-classifier">Very Fast Decision Tree 
(VFDT) classifier</h3>
-
 <p><a href="http://doi.acm.org/10.1145/347090.347107">Hoeffding Tree or 
VFDT</a> is the standard decision tree algorithm for data stream 
classification. VFDT uses the Hoeffding bound to decide the minimum number of 
arriving instances to achieve certain level of confidence in splitting the 
node. This confidence level determines how close the statistics between the 
attribute chosen by VFDT and the attribute chosen by decision tree for batch 
learning.</p>
 
 <p>For a more comprehensive summary of VFDT, read chapter 3 of <a 
href="http://heanet.dl.sourceforge.net/project/moa-datastream/documentation/StreamMining.pdf">Data Stream Mining: A Practical Approach</a>.</p>
 
 <h3 id="vertical-parallelism">Vertical Parallelism</h3>
+<p>Vertical Parallelism is a parallelism approach which partitions the 
instances in term of attribute for parallel processing. 
Vertical-parallelism-based decision tree induction processes the partitioned 
instances (which consists of subset of attribute) to calculate the 
information-theoretic criteria in parallel. For example, if we have instances 
with 100 attributes and we partition the instances into 5 portions, we will 
have 20 attributes per portion. The algorithm processes the 20 attributes in 
parallel to determine the “local” best attribute to split and combine the 
parallel computation results to determine the “global” best attribute to 
split and grow the tree.</p>
 
-<p>Vertical Parallelism is a parallelism approach which partitions the 
instances in term of attribute for parallel processing. 
Vertical-parallelism-based decision tree induction processes the partitioned 
instances (which consists of subset of attribute) to calculate the 
information-theoretic criteria in parallel. For example, if we have instances 
with 100 attributes and we partition the instances into 5 portions, we will 
have 20 attributes per portion. The algorithm processes the 20 attributes in 
parallel to determine the &quot;local&quot; best attribute to split and combine 
the parallel computation results to determine the &quot;global&quot; best 
attribute to split and grow the tree. </p>
-
-<p>For more explanation about available parallelism types for decision tree 
induction, you can read chapter 4 of <a 
href="../SAMOA-Developers-Guide-0-0-1.pdf">Distributed Decision Tree Learning 
for Mining Big Data Streams</a>, the Developer&#39;s Guide of SAMOA.  </p>
+<p>For more explanation about available parallelism types for decision tree 
induction, you can read chapter 4 of <a 
href="../SAMOA-Developers-Guide-0-0-1.pdf">Distributed Decision Tree Learning 
for Mining Big Data Streams</a>, the Developer’s Guide of SAMOA.</p>
 
 <h3 id="vertical-hoeffding-tree-vht-classifier">Vertical Hoeffding Tree (VHT) 
classifier</h3>
-
 <p>VHT is implemented using the SAMOA API. The diagram below shows the 
implementation:
-<img src="images/VHT.png" alt="Vertical Hoeffding Tree"></p>
+<img src="images/VHT.png" alt="Vertical Hoeffding Tree" /></p>
 
 <p>The <em>source Processor</em> and the <em>evaluator Processor</em> are 
components of the <a href="Prequential-Evaluation-Task">prequential evaluation 
task</a> in SAMOA. The <em>model-aggregator Processor</em> contains  the 
decision tree model. It connects to <em>local-statistic Processor</em> via 
<em>attribute</em> stream and <em>control</em> stream. The <em>model-aggregator 
Processor</em> splits instances based on attribute and each <em>local-statistic 
Processor</em> contains local statistic for attributes that assigned to it. The 
<em>model-aggregator Processor</em> sends the split instances via attribute 
stream and it sends control messages to ask <em>local-statistic Processor</em> 
to perform computation via <em>control</em> stream. Users configure <em>n</em>, 
which is the parallelism level of the algorithm. The parallelism level is 
translated into the number of local-statistic Processors in the algorithm.</p>
 
 <p>The <em>model-aggregator Processor</em> sends the classification result via 
<em>result</em> stream to the <em>evaluator Processor</em> for the 
corresponding evaluation task or other destination Processor. The <em>evaluator 
Processor</em> performs an evaluation of the algorithm showing accuracy and 
throughput. Incoming instances to the <em>model-aggregator Processor</em> 
arrive via <em>source</em> stream. The calculation results from local statistic 
arrive to the <em>model-aggregator Processor</em> via 
<em>computation-result</em> stream.</p>
 
-<p>For more details about the algorithms (i.e. pseudocode), go to section 4.2 
of <a href="../SAMOA-Developers-Guide-0-0-1.pdf">Distributed Decision Tree 
Learning for Mining Big Data Streams</a>, the Developer&#39;s Guide of SAMOA.  
</p>
+<p>For more details about the algorithms (i.e. pseudocode), go to section 4.2 
of <a href="../SAMOA-Developers-Guide-0-0-1.pdf">Distributed Decision Tree 
Learning for Mining Big Data Streams</a>, the Developer’s Guide of SAMOA.</p>
 
   </article>
 
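
Note on the "Vertical Parallelism" paragraph in the Vertical-Hoeffding-Tree-Classifier.html diff above: the 100-attributes-into-5-portions example can be sketched in plain Java as below. This is only an illustration of the idea, not SAMOA code. The class name, the method names and the precomputed per-attribute `scores` array (standing in for an information-theoretic criterion such as information gain) are all assumptions made for the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical illustration class; nothing here is part of the SAMOA API.
final class VerticalParallelismSketch {

    /** Splits attribute indices 0..numAttributes-1 into contiguous portions. */
    static List<int[]> partitionAttributes(int numAttributes, int numPortions) {
        if (numPortions < 1 || numPortions > numAttributes) {
            throw new IllegalArgumentException("need 1 <= numPortions <= numAttributes");
        }
        List<int[]> portions = new ArrayList<>();
        int base = numAttributes / numPortions;
        int remainder = numAttributes % numPortions;
        int start = 0;
        for (int p = 0; p < numPortions; p++) {
            // The first `remainder` portions take one extra attribute.
            int size = base + (p < remainder ? 1 : 0);
            portions.add(IntStream.range(start, start + size).toArray());
            start += size;
        }
        return portions;
    }

    /** Local step: index of the highest-scoring attribute within one portion. */
    static int localBest(int[] portion, double[] scores) {
        int best = portion[0];
        for (int attr : portion) {
            if (scores[attr] > scores[best]) {
                best = attr;
            }
        }
        return best;
    }

    /** Global step: evaluate portions in parallel, then combine the local winners. */
    static int globalBest(List<int[]> portions, double[] scores) {
        List<Integer> localWinners = portions.parallelStream()
                .map(portion -> localBest(portion, scores))
                .collect(Collectors.toList());
        int best = localWinners.get(0);
        for (int candidate : localWinners) {
            if (scores[candidate] > scores[best]) {
                best = candidate;
            }
        }
        return best;
    }
}
```

With 100 attributes and 5 portions, `partitionAttributes(100, 5)` yields five arrays of 20 indices each. `globalBest` then plays the role the page assigns to the model-aggregator, combining the "local" winners reported by each portion.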


