Github user takuti commented on the issue:
https://github.com/apache/incubator-hivemall/pull/149
### With linear terms
#### Hivemall
```sql
INSERT OVERWRITE TABLE criteo.ffm_model
SELECT
  train_ffm(
    features, label,
    '-init_v random -max_init_value 0.5 -classification -iterations 15 -factors 4 -eta 0.2 -l2norm -optimizer adagrad -lambda 0.00002 -cv_rate 0.0'
  )
FROM (
  SELECT
    features, label
  FROM
    criteo.train_vectorized
  CLUSTER BY rand(1)
) t
;
```
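For context, what `train_ffm` fits here is (in my notation; I'm glossing over the exact bias handling) the usual FFM predictor of Juan et al. plus a linear part, which is what the with/without comparison below toggles:

$$
\hat{y}(\mathbf{x}) = \sigma\Bigl(\sum_{i} w_i x_i + \sum_{i} \sum_{j>i} \langle \mathbf{v}_{i,f_j}, \mathbf{v}_{j,f_i} \rangle\, x_i x_j\Bigr)
$$

where $f_j$ is the field of feature $j$ and each $\mathbf{v}_{i,f}$ is a $k$-dimensional latent vector ($k = 4$ via `-factors 4` here and `-k 4` for LIBFFM).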
```
Iteration #2 | average loss=0.474651712453725, current cumulative loss=753.2722676640616, previous cumulative loss=990.2550021169766, change rate=0.23931485722999737, #trainingExamples=1587
Iteration #3 | average loss=0.4499051385165006, current cumulative loss=713.9994548256865, previous cumulative loss=753.2722676640616, change rate=0.05213627863954456, #trainingExamples=1587
Iteration #4 | average loss=0.4342257595710771, current cumulative loss=689.1162804392994, previous cumulative loss=713.9994548256865, change rate=0.03485041090467212, #trainingExamples=1587
Iteration #5 | average loss=0.4225120903723549, current cumulative loss=670.5266874209271, previous cumulative loss=689.1162804392994, change rate=0.026975988735198287, #trainingExamples=1587
Iteration #6 | average loss=0.41300825971798527, current cumulative loss=655.4441081724426, previous cumulative loss=670.5266874209271, change rate=0.022493630054453533, #trainingExamples=1587
Iteration #7 | average loss=0.40491514701335013, current cumulative loss=642.6003383101867, previous cumulative loss=655.4441081724426, change rate=0.019595522641995967, #trainingExamples=1587
Iteration #8 | average loss=0.3978014571916465, current cumulative loss=631.310912563143, previous cumulative loss=642.6003383101867, change rate=0.017568347033135524, #trainingExamples=1587
Iteration #9 | average loss=0.3914067263636397, current cumulative loss=621.1624747390962, previous cumulative loss=631.310912563143, change rate=0.016075182009517044, #trainingExamples=1587
Iteration #10 | average loss=0.3855609819906249, current cumulative loss=611.8852784191217, previous cumulative loss=621.1624747390962, change rate=0.014935216947661086, #trainingExamples=1587
Iteration #11 | average loss=0.3801467153362753, current cumulative loss=603.2928372386689, previous cumulative loss=611.8852784191217, change rate=0.01404256889894858, #trainingExamples=1587
Iteration #12 | average loss=0.3750791243746283, current cumulative loss=595.2505703825351, previous cumulative loss=603.2928372386689, change rate=0.01333061883005943, #trainingExamples=1587
Iteration #13 | average loss=0.37029474458756273, current cumulative loss=587.657759660462, previous cumulative loss=595.2505703825351, change rate=0.012755654676976761, #trainingExamples=1587
Iteration #14 | average loss=0.36574472099268607, current cumulative loss=580.4368722153928, previous cumulative loss=587.657759660462, change rate=0.012287572700888608, #trainingExamples=1587
Iteration #15 | average loss=0.3613904840032808, current cumulative loss=573.5266981132066, previous cumulative loss=580.4368722153928, change rate=0.011905126005885216, #trainingExamples=1587
Performed 15 iterations of 1,587 training examples on memory (thus 23,805 training updates in total)
```
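As a quick sanity check on the log above, the reported columns are internally consistent, assuming average loss is the cumulative loss divided by the number of examples and change rate is the relative drop in cumulative loss. For iteration #3:

$$
\frac{713.9994548256865}{1587} \approx 0.44991, \qquad
\frac{753.2722676640616 - 713.9994548256865}{753.2722676640616} \approx 0.05214
$$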
> LogLoss: 0.4771035166468042
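(The LogLoss figures quoted here and below should be the usual binary cross-entropy on the held-out set, i.e.

$$
\mathrm{LogLoss} = -\frac{1}{N}\sum_{i=1}^{N}\bigl(y_i \log p_i + (1 - y_i)\log(1 - p_i)\bigr), \qquad y_i \in \{0, 1\},
$$

with $p_i$ the predicted CTR.)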
#### LIBFFM
```
$ ./ffm-train -k 4 -t 15 -l 0.00002 -r 0.2 -s 1 ../tr.sp model
First check if the text file has already been converted to binary format (0.0 seconds)
Binary file NOT found. Convert text file to binary file (0.0 seconds)
iter tr_logloss tr_time
1 0.62043 0.0
2 0.47533 0.1
3 0.44968 0.1
4 0.43548 0.2
5 0.42261 0.2
6 0.41322 0.3
7 0.40489 0.3
8 0.39687 0.4
9 0.39085 0.4
10 0.38530 0.4
11 0.37965 0.5
12 0.37450 0.5
13 0.36937 0.6
14 0.36444 0.6
15 0.36031 0.7
$ ./ffm-predict ../va.sp model submission.csv
logloss = 0.47818
```
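In case anyone wants to reproduce the 0.47818 above from the files themselves, here is a minimal sketch (assuming the first token of each line in `va.sp` is the 0/1 label and that `ffm-predict` writes one predicted probability per line to `submission.csv`; the helper name is mine):

```python
import math

def log_loss(labels_path, preds_path, eps=1e-15):
    # Assumption: the first token of each libffm-format line is the 0/1 label.
    with open(labels_path) as f:
        y = [int(line.split()[0]) for line in f if line.strip()]
    # Assumption: ffm-predict writes one predicted probability per line.
    with open(preds_path) as f:
        p = [min(max(float(line), eps), 1.0 - eps) for line in f if line.strip()]
    assert len(y) == len(p), "label/prediction count mismatch"
    return -sum(yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
                for yi, pi in zip(y, p)) / len(y)

print(log_loss("../va.sp", "submission.csv"))
```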
### Without linear terms (i.e., adding the `-disable_wi` option)
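With the linear terms disabled, only the field-aware interaction part of the predictor sketched above remains, i.e. $\hat{y}(\mathbf{x}) = \sigma\bigl(\sum_{i}\sum_{j>i} \langle \mathbf{v}_{i,f_j}, \mathbf{v}_{j,f_i} \rangle\, x_i x_j\bigr)$.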
#### Hivemall
```
Iteration #2 | average loss=0.539961924393562, current cumulative loss=856.919574012583, previous cumulative loss=1651.6985545424677, change rate=0.48118888179934516, #trainingExamples=1587
Iteration #3 | average loss=0.5106114115327627, current cumulative loss=810.3403101024943, previous cumulative loss=856.919574012583, change rate=0.05435663430113771, #trainingExamples=1587
Iteration #4 | average loss=0.4906722901321148, current cumulative loss=778.6969244396662, previous cumulative loss=810.3403101024943, change rate=0.03904950212686045, #trainingExamples=1587
Iteration #5 | average loss=0.4754916462118607, current cumulative loss=754.6052425382229, previous cumulative loss=778.6969244396662, change rate=0.030938457755922362, #trainingExamples=1587
Iteration #6 | average loss=0.46330291728471334, current cumulative loss=735.2617297308401, previous cumulative loss=754.6052425382229, change rate=0.025633949669257704, #trainingExamples=1587
Iteration #7 | average loss=0.453140805287918, current cumulative loss=719.1344579919258, previous cumulative loss=735.2617297308401, change rate=0.021934055706691043, #trainingExamples=1587
Iteration #8 | average loss=0.44439540937886607, current cumulative loss=705.2555146842604, previous cumulative loss=719.1344579919258, change rate=0.019299510895946, #trainingExamples=1587
Iteration #9 | average loss=0.4366611986545602, current cumulative loss=692.9813222647871, previous cumulative loss=705.2555146842604, change rate=0.017403894282157387, #trainingExamples=1587
Iteration #11 | average loss=0.42321511843877446, current cumulative loss=671.6423929623351, previous cumulative loss=681.8770641514493, change rate=0.015009554840872389, #trainingExamples=1587
Iteration #12 | average loss=0.4171781468097722, current cumulative loss=662.0617189871085, previous cumulative loss=671.6423929623351, change rate=0.01426454624606136, #trainingExamples=1587
Iteration #13 | average loss=0.411451696404218, current cumulative loss=652.973842193494, previous cumulative loss=662.0617189871085, change rate=0.013726630815504848, #trainingExamples=1587
Iteration #14 | average loss=0.40595767772793845, current cumulative loss=644.2548345542383, previous cumulative loss=652.973842193494, change rate=0.013352767103145282, #trainingExamples=1587
Iteration #15 | average loss=0.4006353270154049, current cumulative loss=635.8082639734475, previous cumulative loss=644.2548345542383, change rate=0.013110604884532947, #trainingExamples=1587
Performed 15 iterations of 1,587 training examples on memory (thus 23,805 training updates in total)
```
> LogLoss: 0.4757278678816663
#### LIBFFM
```
$ ./ffm-train -k 4 -t 15 -l 0.00002 -r 0.2 -s 1 --disable-wi ../tr.sp model
First check if the text file has already been converted to binary format (0.0 seconds)
Binary file found. Skip converting text to binary
iter tr_logloss tr_time
1 1.03199 0.1
2 0.53894 0.1
3 0.51018 0.1
4 0.49096 0.2
5 0.47549 0.2
6 0.46334 0.3
7 0.45313 0.3
8 0.44405 0.3
9 0.43662 0.4
10 0.42985 0.4
11 0.42337 0.5
12 0.41732 0.5
13 0.41140 0.6
14 0.40583 0.6
15 0.40049 0.6
$ ./ffm-predict ../va.sp model submission.csv
logloss = 0.47284
```
FFM without linear terms works slightly better on the validation set in both implementations: 0.4757 vs. 0.4771 for Hivemall, and 0.47284 vs. 0.47818 for LIBFFM.