Hi,
I am fairly new to Lucene and am currently going through its source
code. I am trying to determine how Lucene calculates the frequency
of a term in each document found.
I encountered a method named readVInt() in the IndexInput class. It seems
that every time this method is called it will
and variable length manner.
>
> Read here: http://lucene.apache.org/java/2_3_2/fileformats.html#VInt
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: [EMAIL PROTECTED]
>
>> -----Original Message-
>> F
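For readers following the VInt pointer above, here is a minimal sketch of the variable-length integer idea described in the linked file-formats page: each byte carries 7 bits of the value, low-order bits first, and the high bit is set when more bytes follow. The class and method names (VIntSketch, encode, decode) are made up for illustration and are not Lucene's IndexInput or IndexOutput code; the sketch also assumes the whole encoded value is already in a byte array.

```java
public class VIntSketch {

    // Encodes an int using 7 data bits per byte, low-order bits first.
    // The high bit (0x80) of a byte is set when at least one more byte follows.
    public static byte[] encode(int value) {
        byte[] buffer = new byte[5];          // an int needs at most 5 such bytes
        int length = 0;
        while ((value & ~0x7F) != 0) {
            buffer[length++] = (byte) ((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        buffer[length++] = (byte) value;      // final byte, high bit clear
        return java.util.Arrays.copyOf(buffer, length);
    }

    // Decodes one value from the start of the array.
    // Throws if the array ends before a byte with a clear high bit is seen.
    public static int decode(byte[] bytes) {
        int value = 0;
        int shift = 0;
        for (byte b : bytes) {
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
            shift += 7;
        }
        throw new IllegalArgumentException("truncated VInt");
    }

    public static void main(String[] args) {
        int[] samples = {0, 127, 128, 300, 16384};
        for (int sample : samples) {
            byte[] encoded = encode(sample);
            System.out.println(sample + " -> " + encoded.length
                    + " byte(s), decodes back to " + decode(encoded));
        }
    }
}
```

Small numbers, such as typical term frequencies and document-number deltas, fit in a single byte this way, which is presumably why the method shows up so often when stepping through index reads.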
The frequency is tracked at index time. It's simply a read at query
> time. See TermDocs.
> If you really want to understand more about the code internals of
> Lucene, I'd suggest stepping through more example queries with a
> debugger.
>
> -Yonik
>
> On Wed, Jul
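To make "tracked at index time, read at query time" concrete, here is a toy inverted index. It is only an illustration under simplifying assumptions: ToyInvertedIndex and its methods are hypothetical names, the tokenizer is a plain split on non-word characters, and everything lives in memory, whereas Lucene stores postings in its own on-disk format and exposes them through TermDocs.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

public class ToyInvertedIndex {

    // term -> (document id -> number of times the term occurs in that document)
    private final Map<String, Map<Integer, Integer>> postings = new TreeMap<>();

    // "Index time": count each term occurrence once, while the document is added.
    public void addDocument(int docId, String text) {
        for (String token : text.toLowerCase().split("\\W+")) {
            if (token.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            postings.computeIfAbsent(token, t -> new TreeMap<>())
                    .merge(docId, 1, Integer::sum);
        }
    }

    // "Query time": look up the term and read back the stored counts.
    public Map<Integer, Integer> postingsFor(String term) {
        return postings.getOrDefault(term, Collections.emptyMap());
    }

    public int freq(String term, int docId) {
        return postingsFor(term).getOrDefault(docId, 0);
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.addDocument(0, "lucene is a search library");
        index.addDocument(1, "search, search and more search");

        // No document text is scanned here; the counts were stored at add time.
        for (Map.Entry<Integer, Integer> entry : index.postingsFor("search").entrySet()) {
            System.out.println("doc=" + entry.getKey() + " freq=" + entry.getValue());
        }
    }
}
```

The lookup never compares the query against document text. It finds the term in the map and walks the documents already listed under it, which is the role the inverted index plays in the replies in this thread.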
Hi,
I am currently using Lucene to build a search engine and am trying to
understand it better, so I am going through its source code. I traced it all the
way from beginning to end, and have managed to locate all the classes
that calculate the score and return the results.
What I am missing is t
On Wed, Jul 2, 2008 at 10:30 PM, blazingwolf7 <[EMAIL PROTECTED]>
> wrote:
>> What I am missing is that I fail to locate the class that performs the
>> actual comparison to determine if a query matches any term in a document.
>
> You need to understand the inverted
h leads to the TermScorer.
>
> -Grant
>
>
> On Jul 2, 2008, at 10:04 PM, blazingwolf7 wrote:
>
>>
>> Hmmm, I don't think I get it. How is it tracked during index time? I
>> indexed my file earlier. Later I will open the index and perform a
>> se
I thought maybe I could
store some extra values in the .frq file so that I would have no need to
continuously use the reader. Can anyone suggest another approach? Thanks
Yonik Seeley wrote:
>
> On Thu, Jul 3, 2008 at 4:03 AM, blazingwolf7 <[EMAIL PROTECTED]>
> wrote:
>> Ah, thanks!
Hi,
I am currently working on retrieving the url and contentLength of each document
found during the search. I want to retrieve them during the calculation of the
score so that I can influence the score in some other way.
I used the methods from TermDocs and TermEnum to get the information.
However, the u
you
> do
> doc.add(new Field("url", "http://www.cnn.com", Store.NO,
> Index.UN_TOKENIZED), it would create a token like "url:http://www.cnn.com"
> without breaking it into its parts. Is that what you're looking for?
>
> Shai
>
> On Fri,
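The difference Shai describes between an untokenized field and an analyzed one can be sketched without Lucene at all. In the toy below, TokenizationSketch, tokenized and untokenized are hypothetical names, and the split rule (lowercase, break on anything that is not a letter or digit) is an assumption made for illustration; it does not reproduce the behaviour of any particular Lucene analyzer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class TokenizationSketch {

    // Analyzed field (simplified): the value is broken into several terms.
    public static List<String> tokenized(String field, String value) {
        List<String> terms = new ArrayList<>();
        for (String part : value.toLowerCase().split("[^a-z0-9]+")) {
            if (!part.isEmpty()) {
                terms.add(field + ":" + part);
            }
        }
        return terms;
    }

    // Untokenized field: the whole value becomes exactly one term.
    public static List<String> untokenized(String field, String value) {
        return Collections.singletonList(field + ":" + value);
    }

    public static void main(String[] args) {
        String url = "http://www.cnn.com";
        System.out.println("tokenized:   " + tokenized("url", url));
        System.out.println("untokenized: " + untokenized("url", url));
    }
}
```

With the single-term form, a term lookup for the exact URL matches the document, and enumerating the terms of the field returns the URL in one piece instead of fragments.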
Perhaps I don't understand the entire scenario. When do you need to fetch
> the contentLength and URL? To what purpose?
>
> On Sun, Jul 6, 2008 at 4:26 AM, blazingwolf7 <[EMAIL PROTECTED]>
> wrote:
>
>>
>> No, I didn't store the contentLength. Just adding it
> But this does not work with your TermEnum solution.
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: [EMAIL PROTECTED]
>
>> -Original Message-
>> From: blazingwolf7 [mailto:[EMAIL PROTECTED]
>> Sent: Mond
ser%20List).
> This list is for developers of Lucene itself, not for users asking for
> help on how to implement something specific with Lucene.
>
> Uwe
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: [EMAIL PROTE
Hi,
I want to use a Reader to read a document every time a matching document is
found during search time. So basically, every time the score for a document
is calculated, I will use the reader to retrieve some
information from the index. Will this lower the search performance?
I m