Author: tommaso
Date: Wed Oct 14 13:37:08 2015
New Revision: 1708601

URL: http://svn.apache.org/viewvc?rev=1708601&view=rev
Log:
minor improvements

Modified:
    labs/yay/trunk/core/src/main/java/org/apache/yay/core/BackPropagationLearningStrategy.java
    labs/yay/trunk/core/src/test/java/org/apache/yay/core/WordVectorsTest.java
    labs/yay/trunk/core/src/test/resources/word2vec/sentences.txt

Modified: labs/yay/trunk/core/src/main/java/org/apache/yay/core/BackPropagationLearningStrategy.java
URL: http://svn.apache.org/viewvc/labs/yay/trunk/core/src/main/java/org/apache/yay/core/BackPropagationLearningStrategy.java?rev=1708601&r1=1708600&r2=1708601&view=diff
==============================================================================
--- labs/yay/trunk/core/src/main/java/org/apache/yay/core/BackPropagationLearningStrategy.java (original)
+++ labs/yay/trunk/core/src/main/java/org/apache/yay/core/BackPropagationLearningStrategy.java Wed Oct 14 13:37:08 2015
@@ -147,7 +147,11 @@ public class BackPropagationLearningStra
           }
         }
       }
-      updatedParameters[l] = new Array2DRowRealMatrix(updatedWeights);
+      if (updatedParameters[l] != null) {
+        updatedParameters[l].setSubMatrix(updatedWeights, 0, 0);
+      } else {
+        updatedParameters[l] = new Array2DRowRealMatrix(updatedWeights);
+      }
     }
     return updatedParameters;
   }
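
A note on the hunk above: when updatedParameters[l] already holds a matrix, the new weights are written into it with setSubMatrix, and a fresh Array2DRowRealMatrix is only allocated when there is nothing to reuse. Below is a minimal, self-contained sketch of that reuse pattern with Apache Commons Math; the class and method names are made up for illustration (only the array name updatedWeights echoes the diff), and the sketch assumes the replacement array has the same dimensions as the existing matrix, since setSubMatrix rejects arrays that do not fit.

import org.apache.commons.math3.linear.Array2DRowRealMatrix;
import org.apache.commons.math3.linear.RealMatrix;

// Illustrative only: stores freshly computed weights, reusing an existing
// matrix when one is available instead of allocating a new one each time.
public class MatrixReuseSketch {

  static RealMatrix storeWeights(RealMatrix target, double[][] updatedWeights) {
    if (target != null) {
      // in-place update: overwrite the existing entries starting at (0, 0);
      // assumes updatedWeights has the same dimensions as target
      target.setSubMatrix(updatedWeights, 0, 0);
      return target;
    } else {
      // first pass: nothing to reuse yet, so allocate a new matrix
      return new Array2DRowRealMatrix(updatedWeights);
    }
  }

  public static void main(String[] args) {
    double[][] first = {{1d, 2d}, {3d, 4d}};
    double[][] second = {{5d, 6d}, {7d, 8d}};

    RealMatrix m = storeWeights(null, first);   // allocates a new matrix
    m = storeWeights(m, second);                // reuses the same instance
    System.out.println(m);
  }
}

The gain is simply avoiding a per-iteration allocation of the weight matrix; in both branches the stored values end up the same.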

Modified: labs/yay/trunk/core/src/test/java/org/apache/yay/core/WordVectorsTest.java
URL: http://svn.apache.org/viewvc/labs/yay/trunk/core/src/test/java/org/apache/yay/core/WordVectorsTest.java?rev=1708601&r1=1708600&r2=1708601&view=diff
==============================================================================
--- labs/yay/trunk/core/src/test/java/org/apache/yay/core/WordVectorsTest.java (original)
+++ labs/yay/trunk/core/src/test/java/org/apache/yay/core/WordVectorsTest.java Wed Oct 14 13:37:08 2015
@@ -146,7 +146,6 @@ public class WordVectorsTest {
 
     ObjectOutputStream os = new ObjectOutputStream(new FileOutputStream(new File("target/sg-vectors.bin")));
     MatrixUtils.serializeRealMatrix(vectorsMatrix, os);
-
   }
 
   private String hotDecode(Double[] doubles, List<String> vocabulary) {

Modified: labs/yay/trunk/core/src/test/resources/word2vec/sentences.txt
URL: http://svn.apache.org/viewvc/labs/yay/trunk/core/src/test/resources/word2vec/sentences.txt?rev=1708601&r1=1708600&r2=1708601&view=diff
==============================================================================
--- labs/yay/trunk/core/src/test/resources/word2vec/sentences.txt (original)
+++ labs/yay/trunk/core/src/test/resources/word2vec/sentences.txt Wed Oct 14 13:37:08 2015
@@ -1,8 +1,8 @@
 The word2vec software of Tomas Mikolov and colleagues has gained a lot of traction lately and provides state-of-the-art word embeddings
 The learning models behind the software are described in two research papers
 We found the description of the models in these papers to be somewhat cryptic and hard to follow
-While the motivations and presentation may be obvious to the neural-networks language-modeling crowd we had to struggle quite a bit to figure out the rationale behind the equations
-This note is an attempt to explain the negative sampling equation in “Distributed Representations of Words and Phrases and their Compositionality” by Tomas Mikolov Ilya Sutskever Kai Chen Greg Corrado and Jeffrey Dean
+While the motivations and presentation may be obvious to the neural-networks language-modeling crowd we had to struggle quite a bit to figure out the rationale behind the equations
+This note is an attempt to explain the negative sampling equation in Distributed Representations of Words and Phrases and their Compositionality by Tomas Mikolov Ilya Sutskever Kai Chen Greg Corrado and Jeffrey Dean
 The departure point of the paper is the skip-gram model
 In this model we are given a corpus of words w and their contexts c
 We consider the conditional probabilities p(c|w) and given a corpus Text the goal is to set the parameters θ of p(c|w;θ) so as to maximize the corpus probability
@@ -11,7 +11,7 @@ In this paper we present several extensi
 By subsampling of the frequent words we obtain significant speedup and also learn more regular word representations
 We also describe a simple alternative to the hierarchical softmax called negative sampling
 An inherent limitation of word representations is their indifference to word order and their inability to represent idiomatic phrases
-For example the meanings of “Canada” and “Air” cannot be easily combined to obtain “Air Canada”
+For example the meanings of Canada and Air cannot be easily combined to obtain Air Canada
 Motivated by this example we present a simple method for finding phrases in text and show that learning good vector representations for millions of phrases is possible
 The similarity metrics used for nearest neighbor evaluations produce a single scalar that quantifies the relatedness of two words
 This simplicity can be problematic since two given words almost always exhibit more intricate relationships than can be captured by a single number
@@ -23,4 +23,15 @@ Unsupervised word representations are ve
 However most of these models are built with only local context and one representation per word
 This is problematic because words are often polysemous and global context can also provide useful information for learning word meanings
 We present a new neural network architecture which 1) learns word embeddings that better capture the semantics of words by incorporating both local and global document context and 2) accounts for homonymy and polysemy by learning multiple embeddings per word
-We introduce a new dataset with human judgments on pairs of words in sentential context and evaluate our model on it showing that our model outperforms competitive baselines and other neural language models
\ No newline at end of file
+We introduce a new dataset with human judgments on pairs of words in sentential context and evaluate our model on it showing that our model outperforms competitive baselines and other neural language models
+Information Retrieval (IR) models need to deal with two difficult issues vocabulary mismatch and term dependencies
+Vocabulary mismatch corresponds to the difficulty of retrieving relevant documents that do not contain exact query terms but semantically related terms
+Term dependencies refers to the need of considering the relationship between the words of the query when estimating the relevance of a document
+A multitude of solutions has been proposed to solve each of these two problems but no principled model solve both
+In parallel in the last few years language models based on neural networks have been used to cope with complex natural language processing tasks like emotion and paraphrase detection
+Although they present good abilities to cope with both term dependencies and vocabulary mismatch problems thanks to the distributed representation of words they are based upon such models could not be used readily in IR where the estimation of one language model per document (or query) is required
+This is both computationally unfeasible and prone to over-fitting
+Based on a recent work that proposed to learn a generic language model that can be modified through a set of document-specific parameters we explore use of new neural network models that are adapted to ad-hoc IR tasks
+Within the language model IR framework we propose and study the use of a generic language model as well as a document-specific language model
+Both can be used as a smoothing component but the latter is more adapted to the document at hand and has the potential of being used as a full document language model
+We experiment with such models and analyze their results on TREC-1 to 8 datasets
\ No newline at end of file
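
A side note on the skip-gram sentences added above: the goal they describe, setting the parameters θ of p(c|w;θ) so as to maximize the corpus probability, is conventionally written as the product below. This is the usual skip-gram formulation; the notation C(w) for the set of contexts observed for a word w in Text is assumed here, since the sentences themselves only name w, c, Text and θ.

% Skip-gram corpus objective, as conventionally stated; C(w) is assumed notation
\[
  \arg\max_{\theta} \prod_{w \in \text{Text}} \; \prod_{c \in C(w)} p(c \mid w; \theta)
\]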


