[ https://issues.apache.org/jira/browse/NUTCH-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15223107#comment-15223107 ]

ASF GitHub Bot commented on NUTCH-2245:
---------------------------------------

GitHub user bhavyasanghavi commented on a diff in the pull request:

    https://github.com/apache/nutch/pull/101#discussion_r58303346
  
    --- Diff: src/plugin/scoring-similarity/src/java/org/apache/nutch/scoring/similarity/cosine/Model.java ---
    @@ -115,6 +126,7 @@ public static DocVector createDocVector(String content) {
           tStream.reset();
           while(tStream.incrementToken()) {
             String term = charTermAttribute.toString();
    +        LOG.info(term);
    --- End diff --
    
    Updated it. Thanks.


> Developed the NGram Model on the existing Unigram Cosine Similarity Model
> -------------------------------------------------------------------------
>
>                 Key: NUTCH-2245
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2245
>             Project: Nutch
>          Issue Type: New Feature
>          Components: plugin, scoring
>            Reporter: Bhavya Sanghavi
>            Assignee: Sujen Shah
>            Priority: Minor
>              Labels: memex
>             Fix For: 1.12
>
>
> Extends the existing unigram cosine similarity model with an n-gram model, so
> that the user can choose the n-gram window size used when scoring the
> similarity between web pages and the gold standard.
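
For illustration only, here is a minimal sketch of the idea in the issue description: build term-frequency vectors over word n-grams for a chosen range of window sizes, then score a page against the gold standard with cosine similarity. This is not the plugin's code. The class name NGramCosineSketch, the methods ngramCounts and cosine, and the parameters minN and maxN are all hypothetical, and the sketch assumes plain whitespace tokenization rather than the Lucene token stream that Model.java iterates over in the diff above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: names and tokenization are assumptions, not Nutch APIs.
public class NGramCosineSketch {

  // Count word n-grams for every window size from minN to maxN (inclusive).
  static Map<String, Integer> ngramCounts(String text, int minN, int maxN) {
    Map<String, Integer> counts = new HashMap<>();
    String trimmed = text.toLowerCase().trim();
    if (trimmed.isEmpty()) {
      return counts; // no tokens, so the vector is empty
    }
    String[] tokens = trimmed.split("\\s+");
    for (int n = minN; n <= maxN; n++) {
      for (int i = 0; i + n <= tokens.length; i++) {
        String gram = String.join(" ", Arrays.copyOfRange(tokens, i, i + n));
        counts.merge(gram, 1, Integer::sum);
      }
    }
    return counts;
  }

  // Cosine similarity between two sparse term-frequency vectors.
  static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
    double dot = 0.0;
    double normA = 0.0;
    double normB = 0.0;
    for (Map.Entry<String, Integer> e : a.entrySet()) {
      int va = e.getValue();
      normA += (double) va * va;
      Integer vb = b.get(e.getKey());
      if (vb != null) {
        dot += (double) va * vb;
      }
    }
    for (int vb : b.values()) {
      normB += (double) vb * vb;
    }
    if (normA == 0.0 || normB == 0.0) {
      return 0.0; // an empty vector is treated as having no similarity
    }
    return dot / Math.sqrt(normA * normB);
  }

  public static void main(String[] args) {
    String gold = "deep web crawl of weapons forums";
    String page = "crawl of weapons forums on the deep web";
    // Unigrams only versus unigrams plus bigrams.
    System.out.println(cosine(ngramCounts(gold, 1, 1), ngramCounts(page, 1, 1)));
    System.out.println(cosine(ngramCounts(gold, 1, 2), ngramCounts(page, 1, 2)));
  }
}
```

With a window of 1 the score depends only on shared words; widening the window to include bigrams also rewards shared word order, which is the flexibility the description refers to.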



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
