[GitHub] incubator-hivemall issue #66: [HIVEMALL-91] Implement Online LDA

2017-04-20 Thread myui
Github user myui commented on the issue:

https://github.com/apache/incubator-hivemall/pull/66
  
@takuti Merged w/ some refactoring. Great work! Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-hivemall issue #66: [HIVEMALL-91] Implement Online LDA

2017-04-20 Thread takuti
Github user takuti commented on the issue:

https://github.com/apache/incubator-hivemall/pull/66
  
**Note on the performance**

For the [news20-multiclass](https://github.com/apache/incubator-hivemall/tree/master/core/src/test/resources/hivemall/classifier)
data, I have translated [our Java test case](https://github.com/takuti/incubator-hivemall/blob/709848d5626f0df7e7361511224e0e9284b3484d/core/src/test/java/hivemall/topicmodel/OnlineLDAModelTest.java#L147-L223)
into a [Python scikit-learn implementation](https://github.com/takuti-sandbox/tmp/blob/57f740a3d0283e5586cc2cd170a8dd15b9cf96ac/python/lda/news20.py)
with (almost) the same settings.

In our Java code, the unit test finishes in **8 sec** with approximately 30 
iterations. By contrast, the Python implementation takes around **15 sec** for 
30 iterations. Thus, it is to be expected that `train_lda()` takes a very long 
time on large-scale data. Hopefully, a larger `-delta`, a smaller `-iteration`, 
or a smaller `-eps` option could reduce the running time (and end up with 
poorer results).

* The Python code actually creates and handles a huge, sparse 20-by-62061 
matrix, so the comparison might be unfair. On the other hand, the Java code 
has many inefficient Map and array accesses.
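
As a rough back-of-the-envelope reading of the timings quoted above, the per-iteration cost can be worked out as below. This is only a sketch: it assumes both runs really did about 30 iterations and ignores start-up overhead, and the helper name `seconds_per_iteration` is illustrative, not part of Hivemall or scikit-learn.

```python
# Per-iteration cost from the approximate timings quoted in this comment.
# Assumption: both runs performed roughly 30 iterations.


def seconds_per_iteration(total_seconds: float, iterations: int) -> float:
    """Average wall-clock time spent on one iteration."""
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    return total_seconds / iterations


java_per_iter = seconds_per_iteration(8.0, 30)     # Java unit test
python_per_iter = seconds_per_iteration(15.0, 30)  # scikit-learn script

# Ratio of the two per-iteration costs (Python relative to Java).
slowdown = python_per_iter / java_per_iter

print(f"Java:   {java_per_iter:.3f} s/iteration")
print(f"Python: {python_per_iter:.3f} s/iteration")
print(f"Python / Java ratio: {slowdown:.1f}")
```

Because the iteration counts are equal, the ratio is simply 15 / 8; the per-iteration figures mainly help when estimating how a different `-iteration` setting would scale the running time.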




[GitHub] incubator-hivemall issue #66: [HIVEMALL-91] Implement Online LDA

2017-04-20 Thread coveralls
Github user coveralls commented on the issue:

https://github.com/apache/incubator-hivemall/pull/66
  

[![Coverage 
Status](https://coveralls.io/builds/11159512/badge)](https://coveralls.io/builds/11159512)

Coverage increased (+1.04%) to 38.063% when pulling 
**97adc5ce3d22e10e485c4f190b0a488db69d99e5 on takuti:lda** into 
**bba252ac10fccda022b630e3137460dd8d2f9302 on apache:master**.





[GitHub] incubator-hivemall issue #66: [HIVEMALL-91] Implement Online LDA

2017-04-20 Thread coveralls
Github user coveralls commented on the issue:

https://github.com/apache/incubator-hivemall/pull/66
  

[![Coverage 
Status](https://coveralls.io/builds/11159290/badge)](https://coveralls.io/builds/11159290)

Coverage increased (+1.3%) to 38.364% when pulling 
**d781b6602538577202fcb571b12b4ffd3e5ab92d on takuti:lda** into 
**bba252ac10fccda022b630e3137460dd8d2f9302 on apache:master**.





[GitHub] incubator-hivemall issue #66: [HIVEMALL-91] Implement Online LDA

2017-04-18 Thread coveralls
Github user coveralls commented on the issue:

https://github.com/apache/incubator-hivemall/pull/66
  

[![Coverage 
Status](https://coveralls.io/builds/11139515/badge)](https://coveralls.io/builds/11139515)

Coverage increased (+0.9%) to 37.962% when pulling 
**3a78282afa0faedc678da237351c63105328b6d6 on takuti:lda** into 
**bba252ac10fccda022b630e3137460dd8d2f9302 on apache:master**.





[GitHub] incubator-hivemall issue #66: [HIVEMALL-91] Implement Online LDA

2017-04-18 Thread coveralls
Github user coveralls commented on the issue:

https://github.com/apache/incubator-hivemall/pull/66
  

[![Coverage 
Status](https://coveralls.io/builds/11139395/badge)](https://coveralls.io/builds/11139395)

Coverage increased (+0.9%) to 37.948% when pulling 
**3a78282afa0faedc678da237351c63105328b6d6 on takuti:lda** into 
**bba252ac10fccda022b630e3137460dd8d2f9302 on apache:master**.





[GitHub] incubator-hivemall issue #66: [HIVEMALL-91] Implement Online LDA

2017-04-18 Thread coveralls
Github user coveralls commented on the issue:

https://github.com/apache/incubator-hivemall/pull/66
  

[![Coverage 
Status](https://coveralls.io/builds/11122400/badge)](https://coveralls.io/builds/11122400)

Coverage increased (+1.2%) to 38.211% when pulling 
**9cf6a79dbaf7dbdf38176cf39023f4800f6d2b6a on takuti:lda** into 
**8aae974fc39cd16080acdf7e493152d7167aa9e7 on apache:master**.





[GitHub] incubator-hivemall issue #66: [HIVEMALL-91] Implement Online LDA

2017-04-18 Thread takuti
Github user takuti commented on the issue:

https://github.com/apache/incubator-hivemall/pull/66
  
Updating docs and testing on EMR...




[GitHub] incubator-hivemall issue #66: [HIVEMALL-91] Implement Online LDA

2017-04-18 Thread coveralls
Github user coveralls commented on the issue:

https://github.com/apache/incubator-hivemall/pull/66
  

[![Coverage 
Status](https://coveralls.io/builds/11121280/badge)](https://coveralls.io/builds/11121280)

Coverage increased (+0.9%) to 37.901% when pulling 
**7028a0f21a801242149328ecf359a6c023f8d7f9 on takuti:lda** into 
**8aae974fc39cd16080acdf7e493152d7167aa9e7 on apache:master**.







[GitHub] incubator-hivemall issue #66: [HIVEMALL-91] Implement Online LDA

2017-04-17 Thread coveralls
Github user coveralls commented on the issue:

https://github.com/apache/incubator-hivemall/pull/66
  

[![Coverage 
Status](https://coveralls.io/builds/11108695/badge)](https://coveralls.io/builds/11108695)

Coverage increased (+1.2%) to 38.219% when pulling 
**e31fc1a6bb2d4b64fab49d8ec1c2c48304356655 on takuti:lda** into 
**8aae974fc39cd16080acdf7e493152d7167aa9e7 on apache:master**.





[GitHub] incubator-hivemall issue #66: [HIVEMALL-91] Implement Online LDA

2017-04-14 Thread coveralls
Github user coveralls commented on the issue:

https://github.com/apache/incubator-hivemall/pull/66
  

[![Coverage 
Status](https://coveralls.io/builds/11084626/badge)](https://coveralls.io/builds/11084626)

Coverage increased (+0.9%) to 37.916% when pulling 
**72edf7fc75e3dd2dcb54dc0f743518f79e8112b7 on takuti:lda** into 
**8aae974fc39cd16080acdf7e493152d7167aa9e7 on apache:master**.





[GitHub] incubator-hivemall issue #66: [HIVEMALL-91] Implement Online LDA

2017-04-13 Thread takuti
Github user takuti commented on the issue:

https://github.com/apache/incubator-hivemall/pull/66
  
@myui Updated the package name, and, since #63 has been merged, applied 
some modifications. Again, this PR is ready for review 👍 




[GitHub] incubator-hivemall issue #66: [HIVEMALL-91] Implement Online LDA

2017-04-12 Thread myui
Github user myui commented on the issue:

https://github.com/apache/incubator-hivemall/pull/66
  
@takuti Could you change the package from `hivemall.clustering` to 
`hivemall.topicmodel`?




[GitHub] incubator-hivemall issue #66: [HIVEMALL-91] Implement Online LDA

2017-04-12 Thread coveralls
Github user coveralls commented on the issue:

https://github.com/apache/incubator-hivemall/pull/66
  

[![Coverage 
Status](https://coveralls.io/builds/11066675/badge)](https://coveralls.io/builds/11066675)

Coverage increased (+1.1%) to 37.855% when pulling 
**75773bedffe1d21ad98f71ef54e062e6b42c81e2 on takuti:lda** into 
**ac1e2e8dba1073ca7b52f22faf17b9c13ffad4bd on apache:master**.





[GitHub] incubator-hivemall issue #66: [HIVEMALL-91] Implement Online LDA

2017-04-12 Thread coveralls
Github user coveralls commented on the issue:

https://github.com/apache/incubator-hivemall/pull/66
  

[![Coverage 
Status](https://coveralls.io/builds/11050144/badge)](https://coveralls.io/builds/11050144)

Coverage increased (+1.4%) to 38.139% when pulling 
**c3807c080e09e6ded09b568d59b92e13989bf2f0 on takuti:lda** into 
**ac1e2e8dba1073ca7b52f22faf17b9c13ffad4bd on apache:master**.







[GitHub] incubator-hivemall issue #66: [HIVEMALL-91] Implement Online LDA

2017-04-11 Thread takuti
Github user takuti commented on the issue:

https://github.com/apache/incubator-hivemall/pull/66
  
Since 
[castLazyBinaryObject()](https://github.com/takuti/incubator-hivemall/blob/810f5409eba8dff131e5d3b44069fb1182fa46cc/core/src/main/java/hivemall/utils/hadoop/HiveUtils.java#L932-L939)
 is helpful to improve readability, plz merge #63 before this PR if possible.




[GitHub] incubator-hivemall issue #66: [HIVEMALL-91] Implement Online LDA

2017-04-10 Thread coveralls
Github user coveralls commented on the issue:

https://github.com/apache/incubator-hivemall/pull/66
  

[![Coverage 
Status](https://coveralls.io/builds/11013730/badge)](https://coveralls.io/builds/11013730)

Coverage increased (+1.1%) to 37.846% when pulling 
**05a13dd25cd7b7da109ca5978abaf0c5cc9a4058 on takuti:lda** into 
**ac1e2e8dba1073ca7b52f22faf17b9c13ffad4bd on apache:master**.





[GitHub] incubator-hivemall issue #66: [HIVEMALL-91] Implement Online LDA

2017-04-10 Thread takuti
Github user takuti commented on the issue:

https://github.com/apache/incubator-hivemall/pull/66
  
@myui Ready for review. 
[Usage](https://gist.github.com/takuti/d24324e76d4b2ec7dc4b1d50a4d192d8) has 
been updated.

