From your patch I see TermFreqVector which provides the information I want.
I also found FieldInvertState.getLength() which seems to be exactly what I
want. I'm after the word count (sum of tf for every term in the doc). I'm
just not sure whether FieldInvertState.getLength() returns just the
Hi Gabriele,
I'm not sure I understand your problem, but the TermVectorComponent may fit
your needs?
http://wiki.apache.org/solr/TermVectorComponent
http://wiki.apache.org/solr/TermVectorComponentExampleEnabled
Ludovic.
-
Jouve
France.
I had looked at term vectors but don't understand how they would solve my problem.
Consider the following index entries:
t0, doc0, doc1
t1, doc0
From the 2nd entry we know that t1 is only present in doc0.
Now, my problem: given doc0, how can I know which terms occur in it (t0 and
t1) (without storing
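
The question above amounts to un-inverting the index. A minimal sketch in plain Python, not the Lucene or Solr API: the dictionary layout and the function name terms_by_document are hypothetical, chosen only to mirror the two index entries listed in the message.

```python
from collections import defaultdict

# Toy inverted index from the message: term -> documents that contain it.
inverted_index = {
    "t0": ["doc0", "doc1"],
    "t1": ["doc0"],
}


def terms_by_document(index):
    """Turn a term -> docs mapping into a doc -> sorted terms mapping."""
    forward = defaultdict(set)
    # Every posting list has to be visited, because any term may mention any doc.
    for term, docs in index.items():
        for doc in docs:
            forward[doc].add(term)
    return {doc: sorted(terms) for doc, terms in forward.items()}


forward_index = terms_by_document(inverted_index)
print(forward_index["doc0"])  # ['t0', 't1']
```

Note that this walks every term in the index to answer a question about a single document, so in this sketch the cost grows with the size of the whole index rather than with the size of the document.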
Sounds like the Luke request handler will get what you're after:
http://wiki.apache.org/solr/LukeRequestHandler
http://wiki.apache.org/solr/LukeRequestHandler#id
cheers,
rob
On Tue, Jul 5, 2011 at 3:59 PM, Gabriele Kahlout
gabri...@mysimpatico.com wrote:
Hello,
With an inverted
You can do this, kind of, but it's a lossy process. Consider indexing
"the cat in the hat strikes back", with "the" and "in" being stopwords and
"strikes" getting stemmed to "strike". At very best, you can reconstruct
that the original doc contained "cat", "hat", "strike", "back". Is
that sufficient?
And it's a very
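
To illustrate why the reconstruction is lossy, here is a rough sketch of the example sentence going through analysis. It is plain Python, not Solr's analyzer chain: the stopword set is taken from the example, and toy_stem is a hypothetical stand-in for a real stemmer that only strips a trailing "s".

```python
# Stopwords as assumed in the example above.
STOPWORDS = {"the", "in"}


def toy_stem(token):
    """Hypothetical stemmer stand-in: drop one trailing 's' from longer tokens."""
    if token.endswith("s") and len(token) > 3:
        return token[:-1]
    return token


def analyze(text):
    """Lowercase, split on whitespace, drop stopwords, then stem what is left."""
    return [toy_stem(t) for t in text.lower().split() if t not in STOPWORDS]


indexed_terms = analyze("the cat in the hat strikes back")
print(indexed_terms)  # ['cat', 'hat', 'strike', 'back']
```

Only the terms in indexed_terms reach the index, so the stopwords and the original word form "strikes" cannot be recovered from the index alone.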
Gabriele,
I created a patch that does this about a year ago. See
https://issues.apache.org/jira/browse/SOLR-1837. It was written for Solr
1.4 and is based upon the Document Reconstructor in Luke. The patch adds a
link on the main Solr admin page to a docinspector page which will
reconstruct