Hello again,

@Thamme, out of curiosity, do you have evaluation numbers on the
Stanford Large Movie Review dataset?

Best,

Rodrigo

On Wed, Jul 5, 2017 at 9:25 AM, Rodrigo Agerri <[email protected]> wrote:
> +1 to Tommaso's comment. This would be very nice to have in the project.
>
> R
>
> On Wed, Jul 5, 2017 at 9:19 AM, Tommaso Teofili
> <[email protected]> wrote:
>> thanks Thamme for bringing this to the list!
>>
>>
>> On Wed, Jul 5, 2017 at 03:49, Thamme Gowda <[email protected]> wrote:
>>
>>> Hello OpenNLP Devs,
>>>
>>> I am working on text classification using word embeddings like
>>> GloVe/Word2Vec and LSTM networks.
>>> It would be interesting to see whether we can use this as a document
>>> categorizer, especially for sentiment analysis, in OpenNLP.
>>>
>>> I have already raised a PR to the sandbox repo -
>>> https://github.com/apache/opennlp-sandbox/pull/3
>>>
>>> This is a first version, and I hope to receive feedback from the dev community
>>> to make it work for everyone.
>>>
>>> Here are the design choices I have made for the initial version:
>>>
>>>    - Using pre-trained GloVe vectors - I felt the GloVe vector format is
>>>    clean and easily customizable in terms of dimensions and vocabulary
>>>    size (also, I have been reading a lot about them from the Stanford NLP
>>>    group).
>>>       - Training GloVe vectors isn't hard either; we can do it using the
>>>       original C library as well as DL4J.
>>>    - Using DL4J's multi-layer networks with LSTM instead of reinventing
>>>    this stuff again on the JVM for OpenNLP.
>>>
>>>
>>> Please share your feedback here or on the GitHub page
>>> https://github.com/apache/opennlp-sandbox/pull/3 .
>>>
>>>
>> I think the approach outlined here sounds good; we could incorporate
>> the PR as soon as it implements the Doccat API.
>> Then we can see whether and how it makes sense to adjust it to use other
>> types of embeddings (e.g. paragraph vectors) and/or different network
>> setups (e.g. more hidden layers, bidirectional LSTM, etc.).
>>
>> Looking forward to seeing this move forward,
>> Regards,
>> Tommaso
>>
>>
>>>
>>> Thanks,
>>> TG
>>>
>>>
>>> --
>>> *Thamme Gowda *
>>> @thammegowda <https://twitter.com/thammegowda> |
>>> http://scf.usc.edu/~tnarayan/
>>> ~Sent via somebody's Webmail server
>>>
