Dear All,

We are happy to announce the release of our new toolkit “MultiVec” for
computing continuous representations of text at different granularity
levels (single words or sequences of words). MultiVec includes the word2vec
features of Mikolov et al. [2013b], the paragraph vector of Le and Mikolov
[2014] (batch and online), and the model of Luong et al. [2015] for bilingual
distributed representations. MultiVec also provides several distance measures
between words and between sequences of words. The toolkit is written in C++
and aims to be fast (within the same order of magnitude as word2vec), easy to
use, and easy to extend. It has been evaluated on several NLP tasks:
analogical reasoning, sentiment analysis, and cross-lingual document
classification. The toolkit also includes C++ and Python libraries that can
be used to query bilingual and monolingual models.
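
To give a flavour of the Python library, here is a minimal sketch of how a
monolingual model might be queried. The class and method names below
(MonolingualModel, load, word_vec, similarity) and the model file name are
assumptions for illustration only; please refer to the README on the project
webpage for the actual API.

    # Minimal illustrative sketch; the names below are assumptions,
    # not the definitive MultiVec Python API.
    from multivec import MonolingualModel    # assumed module and class name

    model = MonolingualModel()                # assumed constructor
    model.load('news.en.bin')                 # assumed: load a trained monolingual model
    vec = model.word_vec('computer')          # assumed: embedding of a single word
    sim = model.similarity('car', 'truck')    # assumed: similarity between two words
    print(len(vec), sim)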

The project is open to contributions. The code is available on the project
webpage (https://github.com/eske/multivec), along with installation
instructions and command-line usage examples.

When you use this toolkit, please cite:

@InProceedings{MultiVecLREC2016,
  Title     = {{MultiVec: a Multilingual and MultiLevel Representation Learning Toolkit for NLP}},
  Author    = {Alexandre Bérard and Christophe Servan and Olivier Pietquin and Laurent Besacier},
  Booktitle = {The 10th edition of the Language Resources and Evaluation Conference (LREC 2016)},
  Year      = {2016},
  Month     = {May}
}

The paper is available here:
https://github.com/eske/multivec/raw/master/docs/Berard_and_al-MultiVec_a_Multilingual_and_Multilevel_Representation_Learning_Toolkit_for_NLP-LREC2016.pdf

Best regards,

Alexandre Bérard, Christophe Servan, Olivier Pietquin and Laurent Besacier

(With apologies for cross-posting)

