http://git-wip-us.apache.org/repos/asf/incubator-hivemall-site/blob/68241a08/userguide/regression/kddcup12tr2_lr_amplify.html
----------------------------------------------------------------------
diff --git a/userguide/regression/kddcup12tr2_lr_amplify.html 
b/userguide/regression/kddcup12tr2_lr_amplify.html
index ef32b5d..b40d2e1 100644
--- a/userguide/regression/kddcup12tr2_lr_amplify.html
+++ b/userguide/regression/kddcup12tr2_lr_amplify.html
@@ -999,6 +999,21 @@
             
         </li>
     
+        <li class="chapter " data-level="5.6" 
data-path="../binaryclass/titanic_rf.html">
+            
+                <a href="../binaryclass/titanic_rf.html">
+            
+                    
+                        <b>5.6.</b>
+                    
+                    Kaggle Titanic Tutorial
+            
+                </a>
+            
+
+            
+        </li>
+    
 
     
         
@@ -1651,7 +1666,7 @@
 -->
 <p>This article explains <em>amplify</em> technique that is useful for 
improving prediction score.</p>
 <p>Iterations are mandatory in machine learning (e.g., in <a 
href="http://en.wikipedia.org/wiki/Stochastic_gradient_descent" 
target="_blank">stochastic gradient descent</a>) to get good prediction models. 
However, MapReduce is known to be not suited for iterative algorithms because 
IN/OUT of each MapReduce job is through HDFS.</p>
-<p>In this example, we show how Hivemall deals with this problem. We use <a 
href="https://github.com/myui/hivemall/wiki/KDDCup-2012-track-2-CTR-prediction-dataset"
 target="_blank">KDD Cup 2012, Track 2 Task</a> as an example.</p>
+<p>In this example, we show how Hivemall deals with this problem. We use <a 
href="kddcup12tr2_dataset.html">KDD Cup 2012, Track 2 Task</a> as an 
example.</p>
 <p><strong>WARNING</strong>: rand_amplify() is supported in v0.2-beta1 and 
later.</p>
 <hr>
 <h1 
id="amplify-training-examples-in-map-phase-and-shuffle-them-in-reduce-phase">Amplify
 training examples in Map phase and shuffle them in Reduce phase</h1>
@@ -1690,7 +1705,7 @@ So, we recommend users to use an amplified view for 
training as follows:</p>
 </code></pre>
 <p>The above query is executed by 2 MapReduce jobs as shown below:</p>
 <p><img src="../resources/images/amplify.png" alt="amplifier"></p>
-<p>Using <em>trainning_x3</em>  instead of the plain training table results in 
higher and better AUC (0.746214) in <a 
href="https://github.com/myui/hivemall/wiki/KDDCup-2012-track-2-CTR-prediction-(regression\"
 target="_blank">this</a>) example.</p>
+<p>Using <em>trainning_x3</em>  instead of the plain training table results in 
higher and better AUC (0.746214) in <a 
href="kddcup12tr2_lr.html#evaluation">this example</a>.</p>
 <p>A problem in amplify() is that the shuffle (copy) and merge phase of the 
stage 1 could become a bottleneck.
 When the training table is so large that involves 100 Map tasks, the merge 
operator needs to merge at least 100 files by (external) merge sort! </p>
 <p>Note that the actual bottleneck is not M/R iterations but shuffling 
training instance. Iteration without shuffling (as in <a 
href="http://spark.incubator.apache.org/examples.html" target="_blank">the 
Spark example</a>) causes very slow convergence and results in requiring more 
iterations. Shuffling cannot be avoided even in iterative MapReduce 
variants.</p>
@@ -1713,7 +1728,7 @@ The rand_amplify UDTF outputs rows in a random order when 
the local buffer speci
 <p><img src="../resources/images/randamplify.png" alt="Random amplify"></p>
 <p>The map-local multiplication and shuffling has no bottleneck in the merge 
phase and the query is efficiently executed within a single MapReduce job.</p>
 <p><img src="../resources/images/randamplify_elapsed.png" alt="rand_amplify 
elapsed"></p>
-<p>Using <em>rand_amplify</em> results in a better AUC (0.743392) in <a 
href="https://github.com/myui/hivemall/wiki/KDDCup-2012-track-2-CTR-prediction-(regression\"
 target="_blank">this</a>) example.</p>
+<p>Using <em>rand_amplify</em> results in a better AUC (0.743392) in <a 
href="kddcup12tr2_lr.html#evaluation">this example</a>.</p>
 <hr>
 <h1 id="conclusion">Conclusion</h1>
 <p>We recommend users to use <em>amplify()</em> for small training inputs and 
to use <em>rand_amplify()</em> for large training inputs to get a better 
accuracy in a reasonable training time.</p>
@@ -1743,7 +1758,25 @@ The rand_amplify UDTF outputs rows in a random order 
when the local buffer speci
 </tr>
 </tbody>
 </table>
-<p><div id="page-footer"><hr><p><sub><font color="gray">
+<p><div id="page-footer"><hr><!--
+  Licensed to the Apache Software Foundation (ASF) under one
+  or more contributor license agreements.  See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership.  The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing,
+  software distributed under the License is distributed on an
+  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  KIND, either express or implied.  See the License for the
+  specific language governing permissions and limitations
+  under the License.
+-->
+<p><sub><font color="gray">
 Apache Hivemall is an effort undergoing incubation at The Apache Software 
Foundation (ASF), sponsored by the Apache Incubator.
 </font></sub></p>
 </div></p>
@@ -1780,7 +1813,7 @@ Apache Hivemall is an effort undergoing incubation at The 
Apache Software Founda
     <script>
         var gitbook = gitbook || [];
         gitbook.push(function() {
-            gitbook.page.hasChanged({"page":{"title":"Logistic Regression with 
Amplifier","level":"7.2.3","depth":2,"next":{"title":"AdaGrad, 
AdaDelta","level":"7.2.4","depth":2,"path":"regression/kddcup12tr2_adagrad.md","ref":"regression/kddcup12tr2_adagrad.md","articles":[]},"previous":{"title":"Logistic
 Regression, Passive 
Aggressive","level":"7.2.2","depth":2,"path":"regression/kddcup12tr2_lr.md","ref":"regression/kddcup12tr2_lr.md","articles":[]},"dir":"ltr"},"config":{"plugins":["theme-api","edit-link","github","splitter","sitemap","etoc","callouts","toggle-chapters","anchorjs","codeblock-filename","expandable-chapters","multipart","codeblock-filename","katex","emphasize","localized-footer"],"styles":{"website":"styles/website.css","pdf":"styles/pdf.css","epub":"styles/epub.css","mobi":"styles/mobi.css","ebook":"styles/ebook.css","print":"styles/print.css"},"pluginsConfig":{"emphasize":{},"callouts":{},"etoc":{"maxdepth":3,"mindepth":1,"notoc":true},"github":{"url":"https://gi
 
thub.com/apache/incubator-hivemall/"},"splitter":{},"search":{},"downloadpdf":{"base":"https://github.com/apache/incubator-hivemall/docs/gitbook","label":"PDF","multilingual":false},"multipart":{},"localized-footer":{"filename":"FOOTER.md"},"lunr":{"maxIndexSize":1000000,"ignoreSpecialCharacters":false},"katex":{},"fontsettings":{"theme":"white","family":"sans","size":2,"font":"sans"},"highlight":{},"codeblock-filename":{},"sitemap":{"hostname":"http://hivemall.incubator.apache.org/"},"theme-api":{"languages":[],"split":false,"theme":"dark"},"sharing":{"facebook":true,"twitter":true,"google":false,"weibo":false,"instapaper":false,"vk":false,"all":["facebook","google","twitter","weibo","instapaper"]},"edit-link":{"label":"Edit","base":"https://github.com/apache/incubator-hivemall/docs/gitbook"},"theme-default":{"styles":{"website":"styles/website.css","pdf":"styles/pdf.css","epub":"styles/epub.css","mobi":"styles/mobi.css","ebook":"styles/ebook.css","print":"styles/print.css"},"showL
 evel":true},"anchorjs":{"selector":"h1,h2,h3,*:not(.callout) > 
h4,h5"},"toggle-chapters":{},"expandable-chapters":{}},"theme":"default","pdf":{"pageNumbers":true,"fontSize":12,"fontFamily":"Arial","paperSize":"a4","chapterMark":"pagebreak","pageBreaksBefore":"/","margin":{"right":62,"left":62,"top":56,"bottom":56}},"structure":{"langs":"LANGS.md","readme":"README.md","glossary":"GLOSSARY.md","summary":"SUMMARY.md"},"variables":{},"title":"Hivemall
 User Manual","links":{"sidebar":{"<i class=\"fa fa-home\"></i> 
Home":"http://hivemall.incubator.apache.org/"}},"gitbook":"3.x.x","description":"User
 Manual for Apache 
Hivemall"},"file":{"path":"regression/kddcup12tr2_lr_amplify.md","mtime":"2016-11-14T09:52:36.000Z","type":"markdown"},"gitbook":{"version":"3.2.2","time":"2016-11-14T10:40:22.987Z"},"basePath":"..","book":{"language":""}});
+            gitbook.page.hasChanged({"page":{"title":"Logistic Regression with 
Amplifier","level":"7.2.3","depth":2,"next":{"title":"AdaGrad, 
AdaDelta","level":"7.2.4","depth":2,"path":"regression/kddcup12tr2_adagrad.md","ref":"regression/kddcup12tr2_adagrad.md","articles":[]},"previous":{"title":"Logistic
 Regression, Passive 
Aggressive","level":"7.2.2","depth":2,"path":"regression/kddcup12tr2_lr.md","ref":"regression/kddcup12tr2_lr.md","articles":[]},"dir":"ltr"},"config":{"plugins":["theme-api","edit-link","github","splitter","sitemap","etoc","callouts","toggle-chapters","anchorjs","codeblock-filename","expandable-chapters","multipart","codeblock-filename","katex","emphasize","localized-footer"],"styles":{"website":"styles/website.css","pdf":"styles/pdf.css","epub":"styles/epub.css","mobi":"styles/mobi.css","ebook":"styles/ebook.css","print":"styles/print.css"},"pluginsConfig":{"emphasize":{},"callouts":{},"etoc":{"maxdepth":3,"mindepth":1,"notoc":true},"github":{"url":"https://gi
 
thub.com/apache/incubator-hivemall/"},"splitter":{},"search":{},"downloadpdf":{"base":"https://github.com/apache/incubator-hivemall/docs/gitbook","label":"PDF","multilingual":false},"multipart":{},"localized-footer":{"filename":"FOOTER.md"},"lunr":{"maxIndexSize":1000000,"ignoreSpecialCharacters":false},"katex":{},"fontsettings":{"theme":"white","family":"sans","size":2,"font":"sans"},"highlight":{},"codeblock-filename":{},"sitemap":{"hostname":"http://hivemall.incubator.apache.org/"},"theme-api":{"languages":[],"split":false,"theme":"dark"},"sharing":{"facebook":true,"twitter":true,"google":false,"weibo":false,"instapaper":false,"vk":false,"all":["facebook","google","twitter","weibo","instapaper"]},"edit-link":{"label":"Edit","base":"https://github.com/apache/incubator-hivemall/docs/gitbook"},"theme-default":{"styles":{"website":"styles/website.css","pdf":"styles/pdf.css","epub":"styles/epub.css","mobi":"styles/mobi.css","ebook":"styles/ebook.css","print":"styles/print.css"},"showL
 evel":true},"anchorjs":{"selector":"h1,h2,h3,*:not(.callout) > 
h4,h5"},"toggle-chapters":{},"expandable-chapters":{}},"theme":"default","pdf":{"pageNumbers":true,"fontSize":12,"fontFamily":"Arial","paperSize":"a4","chapterMark":"pagebreak","pageBreaksBefore":"/","margin":{"right":62,"left":62,"top":56,"bottom":56}},"structure":{"langs":"LANGS.md","readme":"README.md","glossary":"GLOSSARY.md","summary":"SUMMARY.md"},"variables":{},"title":"Hivemall
 User Manual","links":{"sidebar":{"<i class=\"fa fa-home\"></i> 
Home":"http://hivemall.incubator.apache.org/"}},"gitbook":"3.x.x","description":"User
 Manual for Apache 
Hivemall"},"file":{"path":"regression/kddcup12tr2_lr_amplify.md","mtime":"2016-11-17T11:40:35.000Z","type":"markdown"},"gitbook":{"version":"3.2.2","time":"2016-11-17T12:16:14.647Z"},"basePath":"..","book":{"language":""}});
         });
     </script>
 </div>

http://git-wip-us.apache.org/repos/asf/incubator-hivemall-site/blob/68241a08/userguide/resources/images/kddtrack2tables.png
----------------------------------------------------------------------
diff --git a/userguide/resources/images/kddtrack2tables.png 
b/userguide/resources/images/kddtrack2tables.png
new file mode 100644
index 0000000..90012db
Binary files /dev/null and b/userguide/resources/images/kddtrack2tables.png 
differ
