Re: Document Categorizer based on Glove + LSTM (powered by DL4J)

Rodrigo Agerri Wed, 05 Jul 2017 00:26:56 -0700

+1 to Tommaso's comment. This would be very nice to have in the project.

R


On Wed, Jul 5, 2017 at 9:19 AM, Tommaso Teofili
<tommaso.teof...@gmail.com> wrote:
> thanks Thamme for bringing this to the list!
>
>
> On Wed, Jul 5, 2017 at 03:49, Thamme Gowda <tgow...@gmail.com> wrote:
>
>> Hello OpenNLP Devs,
>>
>> I am working on text classification using word embeddings such as
>> GloVe/Word2Vec and LSTM networks.
>> It would be interesting to see if we can use this as a document categorizer,
>> especially for sentiment analysis, in OpenNLP.
>>
>> I have already raised a PR to the sandbox repo -
>> https://github.com/apache/opennlp-sandbox/pull/3
>>
>> This is the first version, and I expect to receive feedback from the Dev
>> community to make it work for everyone.
>>
>> Here are the design choices I have made for the initial version:
>>
>>    - Using pre-trained GloVe vectors - I felt the GloVe vector format is
>>    clean and easily customizable in terms of dimensions and vocabulary
>>    size (also, I have been reading a lot about them from the Stanford NLP
>>    group).
>>       - Training GloVe vectors isn't hard either; we can do it using the
>>       original C library as well as DL4J.
>>    - Using DL4J's multi-layer networks with LSTM instead of reinventing
>>    this stuff again on the JVM for OpenNLP
>>
>>
>> Please share your feedback here or on the github page
>> https://github.com/apache/opennlp-sandbox/pull/3 .
>>
>>
> I think the approach outlined here sounds good; we could incorporate the
> PR as soon as it implements the Doccat API.
> Then we may see whether and how it makes sense to adjust it to use other
> types of embeddings (e.g. paragraph vectors) and / or different network
> setups (e.g. more hidden layers, bidirectional LSTM, etc.).
>
> Looking forward to seeing this move forward,
> Regards,
> Tommaso
>
>
>>
>> Thanks,
>> TG
>>
>>
>> --
>> *Thamme Gowda *
>> @thammegowda <https://twitter.com/thammegowda> |
>> http://scf.usc.edu/~tnarayan/
>> ~Sent via somebody's Webmail server
>>
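
For readers unfamiliar with the GloVe vector format mentioned in the design
choices above, here is a minimal sketch of reading it in plain Java. It assumes
the usual plain-text layout of the pre-trained files, one word per line
followed by its whitespace-separated vector components. The class name
GloveSketch and the method load are hypothetical and are not taken from the
PR; the sketch does not use DL4J or the OpenNLP Doccat API.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the PR or of OpenNLP.
class GloveSketch {

    /**
     * Reads lines of the form "word v1 v2 ... vd" into a word -> vector map.
     * Assumes the plain-text GloVe layout; blank lines are skipped.
     */
    static Map<String, float[]> load(Reader source) throws IOException {
        Map<String, float[]> vectors = new LinkedHashMap<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue;
                }
                String[] parts = line.split("\\s+");
                // First token is the word, the rest are vector components.
                float[] vec = new float[parts.length - 1];
                for (int i = 1; i < parts.length; i++) {
                    vec[i - 1] = Float.parseFloat(parts[i]);
                }
                vectors.put(parts[0], vec);
            }
        }
        return vectors;
    }

    public static void main(String[] args) throws IOException {
        // Tiny made-up 3-dimensional sample standing in for a real GloVe file.
        String sample = "good 0.1 0.2 0.3\nbad -0.1 -0.2 -0.3\n";
        Map<String, float[]> vectors = load(new StringReader(sample));
        System.out.println("words: " + vectors.size());
        System.out.println("dimensions: " + vectors.get("good").length);
    }
}
```

In a categorizer along the lines described in the thread, each token of a
document would be looked up in such a map and the resulting sequence of
vectors fed to the LSTM network; how unknown words are handled is a separate
design decision that this sketch leaves open.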
