Author: tommaso
Date: Wed Oct 14 13:37:08 2015
New Revision: 1708601

URL: http://svn.apache.org/viewvc?rev=1708601&view=rev
Log:
minor improvements
Modified:
    labs/yay/trunk/core/src/main/java/org/apache/yay/core/BackPropagationLearningStrategy.java
    labs/yay/trunk/core/src/test/java/org/apache/yay/core/WordVectorsTest.java
    labs/yay/trunk/core/src/test/resources/word2vec/sentences.txt

Modified: labs/yay/trunk/core/src/main/java/org/apache/yay/core/BackPropagationLearningStrategy.java
URL: http://svn.apache.org/viewvc/labs/yay/trunk/core/src/main/java/org/apache/yay/core/BackPropagationLearningStrategy.java?rev=1708601&r1=1708600&r2=1708601&view=diff
==============================================================================
--- labs/yay/trunk/core/src/main/java/org/apache/yay/core/BackPropagationLearningStrategy.java (original)
+++ labs/yay/trunk/core/src/main/java/org/apache/yay/core/BackPropagationLearningStrategy.java Wed Oct 14 13:37:08 2015
@@ -147,7 +147,11 @@ public class BackPropagationLearningStra
           }
         }
       }
-      updatedParameters[l] = new Array2DRowRealMatrix(updatedWeights);
+      if (updatedParameters[l] != null) {
+        updatedParameters[l].setSubMatrix(updatedWeights, 0, 0);
+      } else {
+        updatedParameters[l] = new Array2DRowRealMatrix(updatedWeights);
+      }
     }
     return updatedParameters;
   }

Modified: labs/yay/trunk/core/src/test/java/org/apache/yay/core/WordVectorsTest.java
URL: http://svn.apache.org/viewvc/labs/yay/trunk/core/src/test/java/org/apache/yay/core/WordVectorsTest.java?rev=1708601&r1=1708600&r2=1708601&view=diff
==============================================================================
--- labs/yay/trunk/core/src/test/java/org/apache/yay/core/WordVectorsTest.java (original)
+++ labs/yay/trunk/core/src/test/java/org/apache/yay/core/WordVectorsTest.java Wed Oct 14 13:37:08 2015
@@ -146,7 +146,6 @@ public class WordVectorsTest {
 
     ObjectOutputStream os = new ObjectOutputStream(new FileOutputStream(new File("target/sg-vectors.bin")));
     MatrixUtils.serializeRealMatrix(vectorsMatrix, os);
-
   }
 
   private String hotDecode(Double[] doubles, List<String> vocabulary) {
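The BackPropagationLearningStrategy change above replaces a fresh Array2DRowRealMatrix allocation on every weight update with an in-place setSubMatrix write into the matrix that is already held for that layer. Below is a minimal, self-contained sketch of that reuse-or-allocate pattern, assuming Apache Commons Math 3 is on the classpath; WeightUpdateSketch and updateLayerWeights are illustrative names only, not part of the yay codebase.

import org.apache.commons.math3.linear.Array2DRowRealMatrix;
import org.apache.commons.math3.linear.RealMatrix;

public class WeightUpdateSketch {

  // Writes the freshly computed weights for layer l back into the parameter array,
  // reusing the existing matrix when one has already been allocated.
  static void updateLayerWeights(RealMatrix[] updatedParameters, int l, double[][] updatedWeights) {
    if (updatedParameters[l] != null) {
      // Overwrite the existing matrix in place starting at (0, 0); this assumes the
      // new weight block has the same dimensions as the matrix being reused.
      updatedParameters[l].setSubMatrix(updatedWeights, 0, 0);
    } else {
      // First pass: no matrix yet, so allocate one around the computed weights.
      updatedParameters[l] = new Array2DRowRealMatrix(updatedWeights);
    }
  }

  public static void main(String[] args) {
    RealMatrix[] params = new RealMatrix[1];
    double[][] w1 = {{0.1, 0.2}, {0.3, 0.4}};
    updateLayerWeights(params, 0, w1);   // allocates on the first call
    double[][] w2 = {{0.5, 0.6}, {0.7, 0.8}};
    updateLayerWeights(params, 0, w2);   // reuses the same matrix afterwards
    System.out.println(params[0]);
  }
}

In-place reuse is only valid when the new weight block has the same dimensions as the matrix being overwritten, which is the case when each layer's weight matrix keeps a fixed shape across training iterations.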
Modified: labs/yay/trunk/core/src/test/resources/word2vec/sentences.txt
URL: http://svn.apache.org/viewvc/labs/yay/trunk/core/src/test/resources/word2vec/sentences.txt?rev=1708601&r1=1708600&r2=1708601&view=diff
==============================================================================
--- labs/yay/trunk/core/src/test/resources/word2vec/sentences.txt (original)
+++ labs/yay/trunk/core/src/test/resources/word2vec/sentences.txt Wed Oct 14 13:37:08 2015
@@ -1,8 +1,8 @@
 The word2vec software of Tomas Mikolov and colleagues has gained a lot of traction lately and provides state-of-the-art word embeddings
 The learning models behind the software are described in two research papers
 We found the description of the models in these papers to be somewhat cryptic and hard to follow
-While the motivations and presentation may be obvious to the neural-networks language-modeling crowd we had to struggle quite a bit to figure out the rationale behind the equations
-This note is an attempt to explain the negative sampling equation in “Distributed Representations of Words and Phrases and their Compositionality” by Tomas Mikolov Ilya Sutskever Kai Chen Greg Corrado and Jeffrey Dean
+While the motivations and presentation may be obvious to the neural-networks language-mofdeling crowd we had to struggle quite a bit to figure out the rationale behind the equations
+This note is an attempt to explain the negative sampling equation in Distributed Representations of Words and Phrases and their Compositionality by Tomas Mikolov Ilya Sutskever Kai Chen Greg Corrado and Jeffrey Dean
 The departure point of the paper is the skip-gram model
 In this model we are given a corpus of words w and their contexts c
 We consider the conditional probabilities p(c|w) and given a corpus Text the goal is to set the parameters θ of p(c|w;θ) so as to maximize the corpus probability
@@ -11,7 +11,7 @@ In this paper we present several extensi
 By subsampling of the frequent words we obtain significant speedup and also learn more regular word representations
 We also describe a simple alternative to the hierarchical softmax called negative sampling
 An inherent limitation of word representations is their indifference to word order and their inability to represent idiomatic phrases
-For example the meanings of “Canada” and “Air” cannot be easily combined to obtain “Air Canada”
+For example the meanings of Canada and Air cannot be easily combined to obtain Air Canada
 Motivated by this example we present a simple method for finding phrases in text and show that learning good vector representations for millions of phrases is possible
 The similarity metrics used for nearest neighbor evaluations produce a single scalar that quantifies the relatedness of two words
 This simplicity can be problematic since two given words almost always exhibit more intricate relationships than can be captured by a single number
@@ -23,4 +23,15 @@ Unsupervised word representations are ve
 However most of these models are built with only local context and one representation per word
 This is problematic because words are often polysemous and global context can also provide useful information for learning word meanings
 We present a new neural network architecture which 1) learns word embeddings that better capture the semantics of words by incorporating both local and global document context and 2) accounts for homonymy and polysemy by learning multiple embeddings per word
-We introduce a new dataset with human judgments on pairs of words in sentential context and evaluate our model on it showing that our model outperforms competitive baselines and other neural language models
\ No newline at end of file
+We introduce a new dataset with human judgments on pairs of words in sentential context and evaluate our model on it showing that our model outperforms competitive baselines and other neural language models
+Information Retrieval (IR) models need to deal with two difficult issues vocabulary mismatch and term dependencies
+Vocabulary mismatch corresponds to the difficulty of retrieving relevant documents that do not contain exact query terms but semantically related terms
+Term dependencies refers to the need of considering the relationship between the words of the query when estimating the relevance of a document
+A multitude of solutions has been proposed to solve each of these two problems but no principled model solve both
+In parallel in the last few years language models based on neural networks have been used to cope with complex natural language processing tasks like emotion and paraphrase detection
+Although they present good abilities to cope with both term dependencies and vocabulary mismatch problems thanks to the distributed representation of words they are based upon such models could not be used readily in IR where the estimation of one language model per document (or query) is required
+This is both computationally unfeasible and prone to over-fitting
+Based on a recent work that proposed to learn a generic language model that can be modified through a set of document-specific parameters we explore use of new neural network models that are adapted to ad-hoc IR tasks
+Within the language model IR framework we propose and study the use of a generic language model as well as a document-specific language model
+Both can be used as a smoothing component but the latter is more adapted to the document at hand and has the potential of being used as a full document language model
+We experiment with such models and analyze their results on TREC-1 to 8 datasets
\ No newline at end of file

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@labs.apache.org
For additional commands, e-mail: commits-h...@labs.apache.org