Hi Daniel,

The explanation was crisp and to the point. Thanks a lot.
-Arun

On Tue, Feb 7, 2017 at 12:04 AM, Russ, Daniel (NIH/CIT) [E] <[email protected]> wrote:

> I would like to answer your questions in reverse order…
>
> 5. How does Maximum Entropy work?
> See "A Maximum Entropy Approach to Natural Language Processing", Berger, Della Pietra, and Della Pietra, in Computational Linguistics 22:1 (just google it…). In a nutshell, if you have no information, all outcomes are equally likely. Every training case (Berger calls these constraints) changes the probability of an outcome.
>
> 4. What happens during training?
> (Assuming GIS training) Each predicate (feature, word) is assigned a weight for each outcome it co-occurs with. The weights are assigned so as to maximize the likelihood of correctly classifying a case.
>
> 3. How is a test case classified?
> For each predicate/outcome pair there is a weight. For the predicates in your test case, the outcome with the highest product of the weights is selected. Note that the output is normalized so that the sum over all outcomes is one.
>
> 2. I am guessing something like a running sum of the log of (the product of the weights of the predicates for the training case's outcome) / (the product of all the weights). You should check the code.
>
> 1. What is happening during each iteration [of the training]?
> The weights are initialized to a value of 0. Kind of useless, eh? So each iteration improves the values of the weights based on your training data. For more info, Manning's Foundations of Statistical Natural Language Processing has a good description of GIS.
>
> Hope that helps.
>
> On 2/6/17, 12:38 PM, "Manoj B. Narayanan" <[email protected]> wrote:
>
> Hi,
>
> I have been using OpenNLP for a while now. I have been training models with custom data, along with predefined features as well as custom features. Could someone explain to me, or guide me to some documentation of, what is happening internally?
>
> The things I am particularly interested in are:
>
> 1. What is happening during each iteration?
> 2. How are the log-likelihood and probability calculated at each step?
> 3. How is a test case classified?
> 4. What happens during training?
> 5. How does Maximum Entropy work?
>
> Someone please guide me.
>
> Thanks.
> Manoj
>
> --
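[Editor's note: Daniel's answers 1, 3, and 4 can be sketched in a few lines of Python. This is a toy illustration of GIS training and maxent classification under the assumptions Daniel states, not OpenNLP's actual implementation; the predicates, outcomes, and training cases below are invented for the example.]

```python
import math
from collections import defaultdict

# Toy training cases: (set of predicates, outcome). Predicate and outcome
# names are made up; in OpenNLP they would come from your feature generators.
TRAIN = [
    ({"sunny", "warm"}, "play"),
    ({"sunny", "cold"}, "play"),
    ({"rainy", "cold"}, "stay"),
    ({"rainy", "warm"}, "stay"),
]

OUTCOMES = sorted({y for _, y in TRAIN})

# GIS constant C: the maximum number of active predicates in any case.
C = max(len(preds) for preds, _ in TRAIN)

def distribution(weights, preds):
    # Answer 3: an outcome's score is the product of the weights e^lambda
    # of its active predicate/outcome pairs, normalized so the outputs sum to one.
    scores = {y: math.exp(sum(weights[(p, y)] for p in preds)) for y in OUTCOMES}
    z = sum(scores.values())
    return {y: s / z for y, s in scores.items()}

def train_gis(iterations=100):
    # Answer 1: weights start at 0 (a uniform, "useless" model) and each
    # iteration nudges them toward the training data.
    weights = defaultdict(float)
    # Answer 4: only predicate/outcome pairs that co-occur in the training
    # data get a weight.
    observed = defaultdict(float)
    for preds, y in TRAIN:
        for p in preds:
            observed[(p, y)] += 1.0
    for _ in range(iterations):
        # Expected counts of each predicate/outcome pair under the current model.
        expected = defaultdict(float)
        for preds, _ in TRAIN:
            for y, pr in distribution(weights, preds).items():
                for p in preds:
                    expected[(p, y)] += pr
        # GIS update: lambda += (1/C) * log(observed / expected).
        for key, obs in observed.items():
            weights[key] += math.log(obs / expected[key]) / C
    return weights

weights = train_gis()
dist = distribution(weights, {"sunny", "warm"})  # "play" should dominate
```

In this toy data "sunny" only ever co-occurs with "play", so after training the model assigns most of the probability mass to "play" for a sunny test case, while "warm" (seen with both outcomes) contributes equally to both and cancels in the normalization.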
