[GitHub] incubator-hivemall issue #149: [WIP][HIVEMALL-201] Evaluate, fix and documen...

2018-05-28 Thread myui
Github user myui commented on the issue: https://github.com/apache/incubator-hivemall/pull/149 This kind of behavior can happen often, and LIBFFM's early stopping strategy is too aggressive. ``` 7 0.43239 0.46952 8 0.42362 0.46999

[GitHub] incubator-hivemall issue #149: [WIP][HIVEMALL-201] Evaluate, fix and documen...

2018-05-28 Thread takuti
Github user takuti commented on the issue: https://github.com/apache/incubator-hivemall/pull/149 Makes sense as a compromise in terms of memory consumption. I'll add a note to the documentation clarifying that our `-early_stopping` option does not return the best of the best
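
To illustrate the trade-off discussed here, the following is a minimal Python sketch of a patience-based stopping rule. It is not Hivemall's or LIBFFM's actual implementation; the function name `early_stop`, the `patience` parameter, and the loss values are all hypothetical. With `patience=1` the rule halts on the first rise in validation loss, which is the aggressive behavior; the sketch also reports which iteration had the lowest loss, to show that the model at stop time is not the best one unless a snapshot of the best weights is kept in memory.

```python
from typing import Optional, Sequence, Tuple


def early_stop(va_losses: Sequence[float], patience: int = 1) -> Tuple[Optional[int], int]:
    """Return (stop_iteration, best_iteration), both 1-based.

    stop_iteration is None when the rule never fires.
    Hypothetical helper for illustration only.
    """
    best_loss = float("inf")
    best_iter = 0
    bad_rounds = 0  # consecutive iterations without improvement
    for i, loss in enumerate(va_losses, start=1):
        if loss < best_loss:
            best_loss, best_iter = loss, i
            bad_rounds = 0
        else:
            bad_rounds += 1
            if bad_rounds >= patience:
                return i, best_iter
    return None, best_iter


# Hypothetical validation losses, NOT taken from the thread.
losses = [0.490, 0.480, 0.475, 0.476, 0.470, 0.469, 0.471, 0.472]

aggressive = early_stop(losses, patience=1)  # halts on the first rise
tolerant = early_stop(losses, patience=2)    # allows one bad iteration
```

In both cases the stop iteration comes after the best iteration, so returning the weights as they are at stop time gives a slightly worse model than the best one seen; avoiding that would require holding a second copy of the model.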

[GitHub] incubator-hivemall issue #149: [WIP][HIVEMALL-201] Evaluate, fix and documen...

2018-05-28 Thread myui
Github user myui commented on the issue: https://github.com/apache/incubator-hivemall/pull/149 ``` iter tr_logloss va_logloss 1 0.49738 0.48776 2 0.47383 0.47995 3 0.46366 0.47480 4 0.45561

[GitHub] incubator-hivemall issue #149: [WIP][HIVEMALL-201] Evaluate, fix and documen...

2018-05-23 Thread myui
Github user myui commented on the issue: https://github.com/apache/incubator-hivemall/pull/149 It might be better to reconsider `eta0` when enabling `l2norm` by default and enlarging `max_init_size`. In my experience with FM, the random initialization range should be small when the avg feature

[GitHub] incubator-hivemall issue #149: [WIP][HIVEMALL-201] Evaluate, fix and documen...

2018-05-22 Thread myui
Github user myui commented on the issue: https://github.com/apache/incubator-hivemall/pull/149 @takuti so then, it is better to enable l2_norm by default and provide `-disable_l2norm` to disable L2 normalization. My concern is that L2 normalization performed worse for small datasets with

[GitHub] incubator-hivemall issue #149: [WIP][HIVEMALL-201] Evaluate, fix and documen...

2018-05-22 Thread takuti
Github user takuti commented on the issue: https://github.com/apache/incubator-hivemall/pull/149 I'll change the default options and consider implementing an early stopping option as you suggested. > What happens without `-l2norm` ? Once we drop instance-wise L2
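
For readers unfamiliar with the term, instance-wise L2 normalization rescales each training example so that its feature vector has unit Euclidean length. A minimal Python sketch follows, assuming features are held as a name-to-value mapping; the function name `l2_normalize` and the feature keys are hypothetical and do not reflect how Hivemall stores features internally.

```python
import math
from typing import Dict


def l2_normalize(features: Dict[str, float]) -> Dict[str, float]:
    """Scale one instance's feature values to unit L2 norm.

    An all-zero instance is returned unchanged to avoid division by zero.
    """
    norm = math.sqrt(sum(v * v for v in features.values()))
    if norm == 0.0:
        return dict(features)
    return {name: v / norm for name, v in features.items()}


# Hypothetical instance: norm is sqrt(3^2 + 4^2) = 5
x = l2_normalize({"f1": 3.0, "f2": 4.0})
```

Because every instance ends up with the same norm, the scale of the gradient no longer depends on how many active features an instance has, which is one reason the learning rate may need revisiting when this option is toggled.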

[GitHub] incubator-hivemall issue #149: [WIP][HIVEMALL-201] Evaluate, fix and documen...

2018-05-22 Thread myui
Github user myui commented on the issue: https://github.com/apache/incubator-hivemall/pull/149 Also, it's better to change the default `-iters` from 1 to 10 (at least 10 iterations with early stopping).

[GitHub] incubator-hivemall issue #149: [WIP][HIVEMALL-201] Evaluate, fix and documen...

2018-05-22 Thread myui
Github user myui commented on the issue: https://github.com/apache/incubator-hivemall/pull/149 BTW, it might be better to implement `early stopping` using validation data. https://github.com/guestwalk/libffm We can use an approach similar to `_validationRatio` used in
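
One way to obtain validation data for early stopping is to hold out a fixed fraction of the training examples. The Python sketch below shows a ratio-based random holdout. It only illustrates the general idea; how `_validationRatio` is actually used in Hivemall is not shown in this thread, and the function name `split_by_ratio` and its parameters are assumptions.

```python
import random
from typing import Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def split_by_ratio(
    examples: Iterable[T], validation_ratio: float, seed: int = 42
) -> Tuple[List[T], List[T]]:
    """Assign each example to the validation set with probability validation_ratio.

    Works in a single pass, so it also suits streaming input.
    """
    if not 0.0 <= validation_ratio < 1.0:
        raise ValueError("validation_ratio must be in [0, 1)")
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train: List[T] = []
    valid: List[T] = []
    for ex in examples:
        (valid if rng.random() < validation_ratio else train).append(ex)
    return train, valid


# Hold out roughly 10% of 1000 hypothetical examples.
train, valid = split_by_ratio(range(1000), validation_ratio=0.1)
```

The validation loss computed on the held-out part after each iteration is what an early stopping rule would monitor.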

[GitHub] incubator-hivemall issue #149: [WIP][HIVEMALL-201] Evaluate, fix and documen...

2018-05-22 Thread myui
Github user myui commented on the issue: https://github.com/apache/incubator-hivemall/pull/149 @takuti Thank you for the detailed verification. Let's disable the linear term by default. Remove `-disable_wi` and add `-enable_wi` (alias `-linear_term`) to enable the linear term.

[GitHub] incubator-hivemall issue #149: [WIP][HIVEMALL-201] Evaluate, fix and documen...

2018-05-22 Thread takuti
Github user takuti commented on the issue: https://github.com/apache/incubator-hivemall/pull/149 ### With linear terms Hivemall ```sql INSERT OVERWRITE TABLE criteo.ffm_model SELECT train_ffm(features, label, '-init_v random -max_init_value 0.5

[GitHub] incubator-hivemall issue #149: [WIP][HIVEMALL-201] Evaluate, fix and documen...

2018-05-17 Thread myui
Github user myui commented on the issue: https://github.com/apache/incubator-hivemall/pull/149 @takuti I advise checking 2-3 updates to investigate how the gradient updates differ.

[GitHub] incubator-hivemall issue #149: [WIP][HIVEMALL-201] Evaluate, fix and documen...

2018-05-17 Thread takuti
Github user takuti commented on the issue: https://github.com/apache/incubator-hivemall/pull/149 Note: I've extended LIBFFM code so it uses linear terms: https://github.com/takuti/criteo-ffm/commit/9aca61d93ed8f583025729206ed0dbfd54806a44 However, I cannot observe significant

[GitHub] incubator-hivemall issue #149: [WIP][HIVEMALL-201] Evaluate, fix and documen...

2018-05-17 Thread takuti
Github user takuti commented on the issue: https://github.com/apache/incubator-hivemall/pull/149 The evaluation was conducted at: [takuti/criteo-ffm](https://github.com/takuti/criteo-ffm). See the repository for details. As an example, I have used tiny data provided at