[ 
https://issues.apache.org/jira/browse/SOLR-651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641784#action_12641784
 ] 

Grant Ingersoll commented on SOLR-651:
--------------------------------------

{quote}
a) if one set qt=tvrh tv=true does not make sense as the user is anyway 
requesting term vectors.
{quote}

qt=tvrh selects the request handler; tv=true turns on the 
TermVectorComponent (TVC).  Those are two separate actions, because the TVC 
can be added to any request handler.
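
To illustrate the distinction, here is a minimal sketch of a client building 
such a request.  The host, port, path, and query are hypothetical 
placeholders; only the qt and tv parameters come from the discussion above.

```python
from urllib.parse import urlencode

# Hypothetical base URL; adjust to the actual Solr instance.
BASE_URL = "http://localhost:8983/solr/select"


def build_term_vector_url(query, handler="tvrh", term_vectors=True):
    """Build a request URL that picks a handler (qt) and toggles the TVC (tv)."""
    params = {
        "q": query,
        # qt chooses which request handler serves the request.
        "qt": handler,
        # tv switches the TermVectorComponent on or off for this request.
        "tv": "true" if term_vectors else "false",
    }
    return BASE_URL + "?" + urlencode(params)


print(build_term_vector_url("id:firstdoc"))
```

The two parameters are set independently, which is why specifying both is not 
redundant.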

{quote}
b) Repeating uniqueKeyFieldName for every record does not add any value.
{quote}

Good point.  I will move that out.

{quote}
c) Ideally what I would have liked is something below.
<response>
<lst name="termVectors">
<lst name="firstdoc">
<str name="term1">term1tf-idf</str>
<str name="term2">term2tf-idf</str>
....
{quote}

Can you make a case for that?  I have a couple of issues with it.  First, the 
value is incorrectly typed: it is returned as a <str>, but TF-IDF is 
presumably a double.  Second, it requires Solr to compute TF*IDF for every 
term, and since not everyone will want that, it would often be a wasted 
calculation.  I suppose it could be an option to have Solr do it, though.
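
For clients that do want the product, it can also be computed on their side 
from the raw values.  A minimal sketch, assuming the client already has each 
term's frequency in the document (tf), its document frequency (df), and the 
total document count.  The function name, the sample numbers, and the 
tf * log(N / df) weighting are illustrative assumptions, not what the patch 
does.

```python
import math


def tf_idf(tf, df, num_docs):
    """Return one common TF-IDF weighting, tf * log(num_docs / df), as a float."""
    if df <= 0 or num_docs <= 0:
        raise ValueError("df and num_docs must be positive")
    return tf * math.log(num_docs / df)


# Hypothetical per-term statistics for one document: term -> (tf, df).
term_stats = {"term1": (3, 10), "term2": (1, 500)}
NUM_DOCS = 1000  # hypothetical index size

weights = {
    term: tf_idf(tf, df, NUM_DOCS) for term, (tf, df) in term_stats.items()
}

for term, weight in weights.items():
    print(term, weight)
```

The results are floating-point numbers, which is the reason a <str> element 
is a poor fit for them.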

Are you proposing not to return the other info, or is this just for the case 
where tf = true and idf = true?

> A SearchComponent for fetching TF-IDF values
> --------------------------------------------
>
>                 Key: SOLR-651
>                 URL: https://issues.apache.org/jira/browse/SOLR-651
>             Project: Solr
>          Issue Type: New Feature
>    Affects Versions: 1.3
>            Reporter: Noble Paul
>            Assignee: Grant Ingersoll
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: SOLR-651.patch, SOLR-651.patch, SOLR-651.patch
>
>
> A SearchComponent that can return TF-IDF vector for any given document in the 
> SOLR index
> Query : A Document Number / a query identifying a Document
> Response : A Map of term vs. TF-IDF value for every term in the selected
> Document
> Why ?
> Most Machine Learning algorithms work on a TF-IDF representation of
> documents, hence adding a Request Handler providing the TF-IDF representation
> will pave the way for incorporating Learning Paradigms into the SOLR framework.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
