GitHub user takuti commented on a diff in the pull request:
https://github.com/apache/incubator-hivemall/pull/158#discussion_r213891864
--- Diff: docs/gitbook/getting_started/tutorial.md ---
@@ -0,0 +1,493 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+
+# Step-by-Step Tutorial on Supervised Learning with Apache Hivemall
+
+<!-- toc -->
+
+## What is Hivemall?
+
+[Apache Hive](https://hive.apache.org/) is a data warehousing solution that enables us to process large-scale data in the form of SQL easily. Assume that you have a table named `purchase_history` which can be artificially created as:
+
+```sql
+create table if not exists purchase_history
+(id bigint, day_of_week string, gender string, price int, category string, label int)
+;
+```
+
+
+```sql
+insert overwrite table purchase_history
+select 1 as id, "Saturday" as day_of_week, "male" as gender, 600 as price, "book" as category, 1 as label
+union all
+select 2 as id, "Friday" as day_of_week, "female" as gender, 4800 as price, "sports" as category, 0 as label
+union all
+select 3 as id, "Friday" as day_of_week, "other" as gender, 18000 as price, "entertainment" as category, 0 as label
+union all
+select 4 as id, "Thursday" as day_of_week, "male" as gender, 200 as price, "food" as category, 0 as label
+union all
+select 5 as id, "Wednesday" as day_of_week, "female" as gender, 1000 as price, "electronics" as category, 1 as label
+;
+```
+
+The syntax of Hive queries, namely **HiveQL**, is very similar to SQL:
+
+```sql
+select count(1) from purchase_history
+```
+
+> 5
+
+[Apache Hivemall](https://github.com/apache/incubator-hivemall) is a collection of user-defined functions (UDFs) for HiveQL which is strongly optimized for machine learning (ML) and data science. To give an example, you can efficiently build a logistic regression model with the stochastic gradient descent (SGD) optimization by issuing the following ~10 lines of query:
+
+```sql
+SELECT
+ train_classifier(
+ features,
+ label,
+ '-loss_function logloss -optimizer SGD'
+ ) as (feature, weight)
+FROM
+ training
+;
+```
+
+
+On the TD console, Hivemall function [`hivemall_version()`](http://hivemall.incubator.apache.org/userguide/misc/funcs.html#others) shows current Hivemall version that is available on TD, for example:
+
+```sql
+select hivemall_version()
+```
+
+> "0.5.1-20180703-SNAPSHOT-31924dc" (as of July 23, 2018)
--- End diff --
This complex version string is specific to one particular Hivemall user, [Treasure
Data](https://www.treasuredata.com/) (TD). Let's make it more generic:
> "0.5.1-incubating-SNAPSHOT"
---