Repository: incubator-hivemall Updated Branches: refs/heads/master 2915a78c7 -> c06378a81
Close #88: [HIVEMALL-50] Add a description about Feature Pairing Project: http://git-wip-us.apache.org/repos/asf/incubator-hivemall/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-hivemall/commit/c06378a8 Tree: http://git-wip-us.apache.org/repos/asf/incubator-hivemall/tree/c06378a8 Diff: http://git-wip-us.apache.org/repos/asf/incubator-hivemall/diff/c06378a8 Branch: refs/heads/master Commit: c06378a81723e3998f90c08ec7444ead5b6f2263 Parents: 2915a78 Author: Makoto Yui <[email protected]> Authored: Fri Jun 23 18:56:57 2017 +0900 Committer: Makoto Yui <[email protected]> Committed: Fri Jun 23 18:56:57 2017 +0900 ---------------------------------------------------------------------- docs/gitbook/SUMMARY.md | 38 ++++++++------ docs/gitbook/binaryclass/general.md | 6 ++- docs/gitbook/clustering/lda.md | 48 +++++++++-------- docs/gitbook/clustering/plsa.md | 48 ++++++++++------- docs/gitbook/eval/auc.md | 9 +++- docs/gitbook/ft_engineering/pairing.md | 19 +++++++ docs/gitbook/ft_engineering/polynomial.md | 73 ++++++++++++++++++++++++++ docs/gitbook/misc/prediction.md | 36 +++++++------ 8 files changed, 202 insertions(+), 75 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/c06378a8/docs/gitbook/SUMMARY.md ---------------------------------------------------------------------- diff --git a/docs/gitbook/SUMMARY.md b/docs/gitbook/SUMMARY.md index 638b77b..32b0150 100644 --- a/docs/gitbook/SUMMARY.md +++ b/docs/gitbook/SUMMARY.md @@ -57,10 +57,12 @@ * [Feature Hashing](ft_engineering/hashing.md) * [Feature Selection](ft_engineering/selection.md) * [Feature Binning](ft_engineering/binning.md) -* [TF-IDF Calculation](ft_engineering/tfidf.md) +* [FEATURE PAIRING](ft_engineering/pairing.md) + * [Polynomial Features](ft_engineering/polynomial.md) * [FEATURE TRANSFORMATION](ft_engineering/ft_trans.md) * [Feature Vectorization](ft_engineering/vectorization.md) * [Quantify 
non-number features](ft_engineering/quantify.md) +* [TF-IDF Calculation](ft_engineering/tfidf.md) ## Part IV - Evaluation @@ -72,43 +74,43 @@ * [Data Generation](eval/datagen.md) * [Logistic Regression data generation](eval/lr_datagen.md) -## Part V - Prediction +## Part V - Supervised Learning * [How Prediction Works](misc/prediction.md) -* [Regression](regression/general.md) -* [Binary Classification](binaryclass/general.md) -## Part VI - Binary classification tutorials +## Part VI - Binary classification -* [a9a](binaryclass/a9a.md) +* [Binary Classification](binaryclass/general.md) + +* [a9a tutorial](binaryclass/a9a.md) * [Data preparation](binaryclass/a9a_dataset.md) * [Logistic Regression](binaryclass/a9a_lr.md) * [Mini-batch Gradient Descent](binaryclass/a9a_minibatch.md) -* [News20](binaryclass/news20.md) +* [News20 tutorial](binaryclass/news20.md) * [Data preparation](binaryclass/news20_dataset.md) * [Perceptron, Passive Aggressive](binaryclass/news20_pa.md) * [CW, AROW, SCW](binaryclass/news20_scw.md) * [AdaGradRDA, AdaGrad, AdaDelta](binaryclass/news20_adagrad.md) -* [KDD2010a](binaryclass/kdd2010a.md) +* [KDD2010a tutorial](binaryclass/kdd2010a.md) * [Data preparation](binaryclass/kdd2010a_dataset.md) * [PA, CW, AROW, SCW](binaryclass/kdd2010a_scw.md) -* [KDD2010b](binaryclass/kdd2010b.md) +* [KDD2010b tutorial](binaryclass/kdd2010b.md) * [Data preparation](binaryclass/kdd2010b_dataset.md) * [AROW](binaryclass/kdd2010b_arow.md) -* [Webspam](binaryclass/webspam.md) +* [Webspam tutorial](binaryclass/webspam.md) * [Data preparation](binaryclass/webspam_dataset.md) * [PA1, AROW, SCW](binaryclass/webspam_scw.md) -* [Kaggle Titanic](binaryclass/titanic_rf.md) +* [Kaggle Titanic tutorial](binaryclass/titanic_rf.md) -## Part VII - Multiclass classification tutorials +## Part VII - Multiclass classification -* [News20 Multiclass](multiclass/news20.md) +* [News20 Multiclass tutorial](multiclass/news20.md) * [Data preparation](multiclass/news20_dataset.md) * 
[Data preparation for one-vs-the-rest classifiers](multiclass/news20_one-vs-the-rest_dataset.md) * [PA](multiclass/news20_pa.md) @@ -116,18 +118,20 @@ * [Ensemble learning](multiclass/news20_ensemble.md) * [one-vs-the-rest classifier](multiclass/news20_one-vs-the-rest.md) -* [Iris](multiclass/iris.md) +* [Iris tutorial](multiclass/iris.md) * [Data preparation](multiclass/iris_dataset.md) * [SCW](multiclass/iris_scw.md) * [RandomForest](multiclass/iris_randomforest.md) -## Part VIII - Regression tutorials +## Part VIII - Regression + +* [Regression](regression/general.md) -* [E2006-tfidf regression](regression/e2006.md) +* [E2006-tfidf regression tutorial](regression/e2006.md) * [Data preparation](regression/e2006_dataset.md) * [Passive Aggressive, AROW](regression/e2006_arow.md) -* [KDDCup 2012 track 2 CTR prediction](regression/kddcup12tr2.md) +* [KDDCup 2012 track 2 CTR prediction tutorial](regression/kddcup12tr2.md) * [Data preparation](regression/kddcup12tr2_dataset.md) * [Logistic Regression, Passive Aggressive](regression/kddcup12tr2_lr.md) * [Logistic Regression with Amplifier](regression/kddcup12tr2_lr_amplify.md) http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/c06378a8/docs/gitbook/binaryclass/general.md ---------------------------------------------------------------------- diff --git a/docs/gitbook/binaryclass/general.md b/docs/gitbook/binaryclass/general.md index 50ea688..931cc58 100644 --- a/docs/gitbook/binaryclass/general.md +++ b/docs/gitbook/binaryclass/general.md @@ -56,6 +56,10 @@ from group by feature; ``` +> #### Note +> +> `-total_steps` option is an optional parameter and training works without it. 
+ # Prediction & evaluation ```sql @@ -72,7 +76,7 @@ predict as ( select t.rowid, sigmoid(sum(m.weight * t.value)) as prob, - CAST((case when sigmoid(sum(m.weight * t.value)) >= 0.5 then 1.0 else 0.0 end) as FLOAT) as label + (case when sigmoid(sum(m.weight * t.value)) >= 0.5 then 1.0 else 0.0 end)as label from test_exploded t LEFT OUTER JOIN classification_model m ON (t.feature = m.feature) http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/c06378a8/docs/gitbook/clustering/lda.md ---------------------------------------------------------------------- diff --git a/docs/gitbook/clustering/lda.md b/docs/gitbook/clustering/lda.md index 8b8e5f5..b433472 100644 --- a/docs/gitbook/clustering/lda.md +++ b/docs/gitbook/clustering/lda.md @@ -46,19 +46,21 @@ with word_counts as ( select docid, feature(word, count(word)) as word_count - from docs t1 LATERAL VIEW explode(tokenize(doc, true)) t2 as word + from + docs t1 + LATERAL VIEW explode(tokenize(doc, true)) t2 as word where not is_stopword(word) group by docid, word ) -select docid, collect_set(word_count) as feature +select docid, collect_list(word_count) as features from word_counts group by docid ; ``` -| docid | feature | +| docid | features | |:---:|:---| |1 | ["fruits:1","healthy:1","vegetables:1"] | |2 | ["apples:1","avocados:1","colds:1","flu:1","like:2","oranges:1"] | @@ -80,15 +82,16 @@ with word_counts as ( not is_stopword(word) group by docid, word -) -select - train_lda(feature, "-topics 2 -iter 20") as (label, word, lambda) -from ( - select docid, collect_set(word_count) as feature +), +input as ( + select docid, collect_list(word_count) as features from word_counts group by docid - order by docid -) t +) +select + train_lda(features, '-topics 2 -iter 20') as (label, word, lambda) +from + input ; ``` @@ -99,20 +102,22 @@ Notice that `order by docid` ensures building a LDA model precisely in a single ```sql with word_counts as ( -- same as above +), +input as ( + select docid, collect_list(f) as 
features + from word_counts + group by docid ) select label, word, avg(lambda) as lambda from ( select - train_lda(feature, "-topics 2 -iter 20") as (label, word, lambda) - from ( - select docid, collect_set(f) as feature - from word_counts - group by docid - ) t1 + train_lda(features, '-topics 2 -iter 20') as (label, word, lambda) + from + input ) t2 group by label, word -order by lambda desc +-- order by lambda desc -- ordering is optional ; ``` @@ -155,7 +160,9 @@ with test as ( docid, word, count(word) as value - from docs t1 LATERAL VIEW explode(tokenize(doc, true)) t2 as word + from + docs t1 + LATERAL VIEW explode(tokenize(doc, true)) t2 as word where not is_stopword(word) group by @@ -163,7 +170,7 @@ with test as ( ) select t.docid, - lda_predict(t.word, t.value, m.label, m.lambda, "-topics 2") as probabilities + lda_predict(t.word, t.value, m.label, m.lambda, '-topics 2') as probabilities from test t JOIN lda_model m ON (t.word = m.word) @@ -183,8 +190,7 @@ Since the probabilities are sorted in descending order, a label of the most prom ```sql select docid, probabilities[0].label -from topic -; +from topic; ``` | docid | label | http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/c06378a8/docs/gitbook/clustering/plsa.md ---------------------------------------------------------------------- diff --git a/docs/gitbook/clustering/plsa.md b/docs/gitbook/clustering/plsa.md index 7cd3a9d..31cc08d 100644 --- a/docs/gitbook/clustering/plsa.md +++ b/docs/gitbook/clustering/plsa.md @@ -52,19 +52,23 @@ with word_counts as ( select docid, feature(word, count(word)) as f - from docs t1 lateral view explode(tokenize(doc, true)) t2 as word + from + docs t1 + lateral view explode(tokenize(doc, true)) t2 as word where not is_stopword(word) group by docid, word -) -select - train_plsa(feature, "-topics 2 -eps 0.00001 -iter 2048 -alpha 0.01") as (label, word, prob) -from ( - select docid, collect_set(f) as feature +), +input as ( + select docid, collect_list(f) as 
features from word_counts group by docid -) t +) +select + train_plsa(features, '-topics 2 -eps 0.00001 -iter 2048 -alpha 0.01') as (label, word, prob) +from + input ; ``` @@ -90,7 +94,6 @@ from ( |1| colds | 0.001978546| - And prediction can be done as: ```sql @@ -99,7 +102,9 @@ test as ( docid, word, count(word) as value - from docs t1 LATERAL VIEW explode(tokenize(doc, true)) t2 as word + from + docs t1 + LATERAL VIEW explode(tokenize(doc, true)) t2 as word where not is_stopword(word) group by @@ -108,20 +113,25 @@ test as ( topic as ( select t.docid, - plsa_predict(t.word, t.value, m.label, m.prob, "-topics 2") as probabilities + plsa_predict(t.word, t.value, m.label, m.prob, '-topics 2') as probabilities from test t JOIN plsa_model m ON (t.word = m.word) group by t.docid ) -select docid, probabilities, probabilities[0].label, m.words -- topic each document should be assigned -from topic t -join ( - select label, collect_set(feature(word, prob)) as words - from plsa_model - group by label -) m on t.probabilities[0].label = m.label +select + docid, + probabilities, + probabilities[0].label, + m.words -- topic each document should be assigned +from + topic t + JOIN ( + select label, collect_list(feature(word, prob)) as words + from plsa_model + group by label + ) m on t.probabilities[0].label = m.label ; ``` @@ -144,7 +154,7 @@ For the reasons that we mentioned above, we recommend you to first use LDA. Afte For training pLSA, we set a hyper-parameter `alpha` in the above example: ```sql -SELECT train_plsa(feature, "-topics 2 -eps 0.00001 -iter 2048 -alpha 0.01") +SELECT train_plsa(feature, '-topics 2 -eps 0.00001 -iter 2048 -alpha 0.01') ``` This value controls **how much iterative model update is affected by the old results**. 
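For intuition, the effect of such a smoothing weight can be illustrated in a few lines of Python. This is a generic pseudo-count sketch, not Hivemall's actual pLSA update rule; the function name and data shapes are made up for illustration:

```python
def smoothed_topic_word(counts, p_old, alpha):
    # Treat the previous topic-word probabilities as `alpha` pseudo-counts and
    # renormalize: a larger `alpha` keeps the updated distribution closer to
    # the old results, while a smaller one lets the new counts dominate.
    total = sum(counts[w] + alpha * p_old[w] for w in counts)
    return {w: (counts[w] + alpha * p_old[w]) / total for w in counts}
```

With `alpha = 0` the update follows the new counts alone; as `alpha` grows, the result stays near the previous estimate, which is consistent with larger corpora and mini-batches calling for larger values.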
@@ -162,7 +172,7 @@ In that case, you need to try different hyper-parameters to avoid overfitting as For instance, the [20 newsgroups dataset](http://qwone.com/~jason/20Newsgroups/), which consists of 10,906 real-world documents, empirically requires the following options: ```sql -SELECT train_plsa(features, "-topics 20 -iter 10 -s 128 -delta 0.01 -alpha 512 -eps 0.1") +SELECT train_plsa(features, '-topics 20 -iter 10 -s 128 -delta 0.01 -alpha 512 -eps 0.1') ``` Clearly, `alpha` is much larger than `0.01` which was used for the dummy data above. Keep in mind that an appropriate value of `alpha` highly depends on the number of documents and mini-batch size. \ No newline at end of file http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/c06378a8/docs/gitbook/eval/auc.md ---------------------------------------------------------------------- diff --git a/docs/gitbook/eval/auc.md b/docs/gitbook/eval/auc.md index 3fba0bb..8cad8f6 100644 --- a/docs/gitbook/eval/auc.md +++ b/docs/gitbook/eval/auc.md @@ -41,7 +41,9 @@ Once the rows are sorted by the probabilities in a descending order, AUC gives a In Hivemall, a function `auc(double score, int label)` provides a way to compute AUC for pairs of probability and truth label. -For instance, following query computes AUC of the table which was shown above: +## Sequential AUC computation on a single node + +For instance, the following query computes AUC of the table which was shown above: ```sql with data as ( @@ -68,6 +70,8 @@ This query returns `0.83333` as AUC. Since AUC is a metric based on ranked probability-label pairs as mentioned above, input data (rows) needs to be ordered by scores in a descending order. 
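As a cross-check of the metric itself (a sketch of the definition, not Hivemall's UDAF implementation), AUC equals the fraction of positive/negative row pairs ranked correctly, counting score ties as one half:

```python
def auc(rows):
    # rows: (probability, truth_label) pairs, as in the example table
    pos = [p for p, y in rows if y == 1]
    neg = [p for p, y in rows if y == 0]
    # probability that a random positive row outranks a random negative row
    hits = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return hits / (len(pos) * len(neg))

# the five probability/label rows from the example
rows = [(0.5, 0), (0.3, 1), (0.2, 0), (0.8, 1), (0.7, 1)]
print(round(auc(rows), 5))  # 0.83333, matching the query result
```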
+## Parallel approximate AUC computation + Meanwhile, Hive's `distribute by` clause allows you to compute AUC in parallel: ```sql @@ -82,7 +86,8 @@ with data as ( union all select 0.7 as prob, 1 as label ) -select auc(prob, label) as auc +select + auc(prob, label) as auc from ( select prob, label from data http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/c06378a8/docs/gitbook/ft_engineering/pairing.md ---------------------------------------------------------------------- diff --git a/docs/gitbook/ft_engineering/pairing.md b/docs/gitbook/ft_engineering/pairing.md new file mode 100644 index 0000000..2959148 --- /dev/null +++ b/docs/gitbook/ft_engineering/pairing.md @@ -0,0 +1,19 @@ +<!-- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. 
+--> + http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/c06378a8/docs/gitbook/ft_engineering/polynomial.md ---------------------------------------------------------------------- diff --git a/docs/gitbook/ft_engineering/polynomial.md b/docs/gitbook/ft_engineering/polynomial.md new file mode 100644 index 0000000..8f3d8cf --- /dev/null +++ b/docs/gitbook/ft_engineering/polynomial.md @@ -0,0 +1,73 @@ +<!-- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements.  See the NOTICE file + distributed with this work for additional information + regarding copyright ownership.  The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License.  You may obtain a copy of the License at + +   http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied.  See the License for the + specific language governing permissions and limitations + under the License. +--> + +<!-- toc --> + +[Polynomial features](http://en.wikipedia.org/wiki/Polynomial_kernel) allow you to do [non-linear regression](https://class.coursera.org/ml-005/lecture/23)/classification with a linear model. + +> #### Caution +> +> Polynomial features assume normalized inputs because `x**n` easily becomes INF/-INF where `n` is large. + +# Polynomial Features + +Similar to [scikit-learn's `PolynomialFeatures`](http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.PolynomialFeatures.html), `polynomial_features(array<String> features, int degree [, boolean interactionOnly=false, boolean truncate=true])` is a function to generate polynomial and interaction features. 
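The documented behavior can be sketched in plain Python as a cross-check (an illustration of the semantics, not the UDF's implementation; the last decimals differ slightly because the UDF emits single-precision floats). The SQL examples below show the actual UDF output:

```python
from itertools import combinations, combinations_with_replacement

def polynomial_features(features, degree, interaction_only=False, truncate=True):
    # Parse "name:value" strings into (name, value) pairs.
    parsed = []
    for f in features:
        name, value = f.rsplit(":", 1)
        parsed.append((name, float(value)))
    out = dict(parsed)  # degree-1 terms are always kept
    # truncate=True skips products involving features valued exactly 1.0,
    # since multiplying by 1.0 only duplicates an existing feature.
    base = [(n, v) for n, v in parsed if not (truncate and v == 1.0)]
    combine = combinations if interaction_only else combinations_with_replacement
    for d in range(2, degree + 1):
        for combo in combine(sorted(base), d):
            name = "^".join(n for n, _ in combo)
            value = 1.0
            for _, v in combo:
                value *= v
            out[name] = value
    return ["%s:%g" % (n, v) for n, v in sorted(out.items())]

print(polynomial_features(["a:0.5", "b:0.2"], 2))
# ['a:0.5', 'a^a:0.25', 'a^b:0.1', 'b:0.2', 'b^b:0.04']
```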
+ +```sql +select polynomial_features(array("a:0.5","b:0.2"), 2); +> ["a:0.5","a^a:0.25","a^b:0.1","b:0.2","b^b:0.040000003"] + +select polynomial_features(array("a:0.5","b:0.2"), 3); +> ["a:0.5","a^a:0.25","a^a^a:0.125","a^a^b:0.05","a^b:0.1","a^b^b:0.020000001","b:0.2","b^b:0.040000003","b^b^b:0.008"] + +-- interaction only +select polynomial_features(array("a:0.5","b:0.2"), 3, true); +> ["a:0.5","a^b:0.1","b:0.2"] + +select polynomial_features(array("a:0.5","b:0.2","c:0.3"), 3, true); +> ["a:0.5","a^b:0.1","a^b^c:0.030000001","a^c:0.15","b:0.2","b^c:0.060000002","c:0.3"] + +-- interaction only + no truncate +select polynomial_features(array("a:0.5","b:1.0", "c:0.3"), 3, true, false); +> ["a:0.5","a^b:0.5","a^b^c:0.15","a^c:0.15","b:1.0","b^c:0.3","c:0.3"] + +-- interaction only + truncate +select polynomial_features(array("a:0.5","b:1.0","c:0.3"), 3, true, true); +> ["a:0.5","a^c:0.15","b:1.0","c:0.3"] + +-- truncate +select polynomial_features(array("a:0.5","b:1.0", "c:0.3"), 3, false, true); +> ["a:0.5","a^a:0.25","a^a^a:0.125","a^a^c:0.075","a^c:0.15","a^c^c:0.045","b:1.0","c:0.3","c^c:0.09","c^c^c:0.027000003"] + +-- do not truncate +select polynomial_features(array("a:0.5","b:1.0", "c:0.3"), 3, false, false); +> ["a:0.5","a^a:0.25","a^a^a:0.125","a^a^b:0.25","a^a^c:0.075","a^b:0.5","a^b^b:0.5","a^b^c:0.15","a^c:0.15","a^c^c:0.045","b:1.0","b^b:1.0","b^b^b:1.0","b^b^c:0.3","b^c:0.3","b^c^c:0.09","c:0.3","c^c:0.09","c^c^c:0.027000003"] +> +``` + +_Note: `truncate` is used to eliminate unnecessary combinations._ + +# Powered Features + +The `powered_features(array<String> features, int degree [, boolean truncate=true] )` is a function to generate polynomial features. 
+ +```sql +select powered_features(array("a:0.5","b:0.2"), 3); +> ["a:0.5","a^2:0.25","a^3:0.125","b:0.2","b^2:0.040000003","b^3:0.008"] +``` \ No newline at end of file http://git-wip-us.apache.org/repos/asf/incubator-hivemall/blob/c06378a8/docs/gitbook/misc/prediction.md ---------------------------------------------------------------------- diff --git a/docs/gitbook/misc/prediction.md b/docs/gitbook/misc/prediction.md index 8c17ec6..317d688 100644 --- a/docs/gitbook/misc/prediction.md +++ b/docs/gitbook/misc/prediction.md @@ -107,21 +107,23 @@ Below we list possible options for `train_regression` and `train_classifier`, an - Loss function: `-loss`, `-loss_function` - For `train_regression` -     - SquaredLoss -     - QuantileLoss -     - EpsilonInsensitiveLoss -     - SquaredEpsilonInsensitiveLoss -     - HuberLoss +     - SquaredLoss (synonym: squared) +     - QuantileLoss (synonym: quantile) +     - EpsilonInsensitiveLoss (synonym: epsilon_insensitive) +     - SquaredEpsilonInsensitiveLoss (synonym: squared_epsilon_insensitive) +     - HuberLoss (synonym: huber) - For `train_classifier` -     - HingeLoss -     - LogLoss -     - SquaredHingeLoss -     - ModifiedHuberLoss -     - SquaredLoss -     - QuantileLoss -     - EpsilonInsensitiveLoss -     - SquaredEpsilonInsensitiveLoss -     - HuberLoss +     - HingeLoss (synonym: hinge) +     - LogLoss (synonym: log, logistic) +     - SquaredHingeLoss (synonym: squared_hinge) +     - ModifiedHuberLoss (synonym: modified_huber) +     - The following losses are mainly designed for regression but can sometimes be useful in classification as well: +       - SquaredLoss (synonym: squared) +       - QuantileLoss (synonym: quantile) +       - EpsilonInsensitiveLoss (synonym: epsilon_insensitive) +       - SquaredEpsilonInsensitiveLoss (synonym: squared_epsilon_insensitive) +       - HuberLoss (synonym: huber) + - Regularization function: `-reg`, `-regularization` - L1 - L2 @@ -135,5 +137,9 @@ Additionally, there are several variants of the SGD technique, and it is also co - AdaGrad - AdaDelta - Adam - + +> #### Note +> +> Option values are case 
insensitive; for example, `sgd`, `rda`, and `huberloss` are all accepted. + In practice, you can try different combinations of the options to achieve higher prediction accuracy. \ No newline at end of file
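For intuition, the prediction side of these linear models (the sigmoid-scoring query shown for binary classification above) boils down to a dot product with the learned weight map followed by a link function. A minimal sketch with a made-up two-feature model, not Hivemall's implementation:

```python
import math

def predict(feature_values, weights):
    # Mirror of the SQL: LEFT OUTER JOIN the exploded (feature, value) pairs
    # against the model's weights (missing features contribute 0), sum the
    # products, squash with a sigmoid, and threshold at 0.5 for the label.
    margin = sum(weights.get(f, 0.0) * v for f, v in feature_values)
    prob = 1.0 / (1.0 + math.exp(-margin))
    label = 1.0 if prob >= 0.5 else 0.0
    return prob, label

# hypothetical weights as if produced by train_classifier
weights = {"a": 2.0, "b": -1.0}
print(predict([("a", 1.0), ("b", 1.0)], weights))  # (0.7310585786300049, 1.0)
```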
