readVInt, what is it for?

2008-07-02 Thread blazingwolf7
Hi, I am fairly new to Lucene and am currently going through its source code. I am trying to determine how Lucene calculates the frequency of a term in each document it finds. I encountered a method named readVInt() in the IndexInput class. It seems every time this method is called it will

RE: readVInt, what is it for?

2008-07-02 Thread blazingwolf7
and variable length manner. > > Read here: http://lucene.apache.org/java/2_3_2/fileformats.html#VInt > > - > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: [EMAIL PROTECTED] > >> -----Original Message- >> F
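For readers following the VInt link above, here is a minimal sketch of the variable-length integer layout described in the Lucene file formats page: each byte carries 7 bits of data, low-order bits first, and the high bit is set when more bytes follow. The class and method names (VIntSketch, encode, decode) are made up for illustration. This is not the IndexInput code itself, which reads from the index stream rather than from a byte array.

```java
import java.io.ByteArrayOutputStream;

// Illustrative only: a standalone encoder/decoder for the VInt layout.
// VIntSketch, encode and decode are hypothetical names, not Lucene classes.
class VIntSketch {

    // Write a non-negative int as 1 to 5 bytes, 7 data bits per byte.
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            // More bits remain: emit the low 7 bits with the high bit set.
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value); // last byte, high bit clear
        return out.toByteArray();
    }

    // Read bytes until one arrives with the high bit clear.
    static int decode(byte[] bytes) {
        int pos = 0;
        byte b = bytes[pos++];
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = bytes[pos++];
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    public static void main(String[] args) {
        byte[] encoded = encode(300);
        System.out.println(encoded.length + " bytes, decoded: " + decode(encoded));
    }
}
```

Small values such as term frequencies and document-number gaps usually fit in a single byte this way, which is why the reader is called so often while postings are being read.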

Re: readVInt, what is it for?

2008-07-02 Thread blazingwolf7
The frequency is tracked at index time. It's simply a read at query > time. See TermDocs. > If you really want to understand more about the code internals of > Lucene, I'd suggest stepping through more example queries with a > debugger. > > -Yonik > > On Wed, Jul
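To illustrate the point that frequency is tracked at index time and only read at query time, here is a toy sketch. It is not how Lucene stores postings: PostingsSketch, addDocument and freq are hypothetical names, the whitespace tokenizer is an assumption, and the real index keeps this data on disk and exposes it through TermDocs.

```java
import java.util.HashMap;
import java.util.Map;

// Toy in-memory postings: term -> (document id -> term frequency).
// All names here are hypothetical; this only mirrors the idea, not Lucene's format.
class PostingsSketch {
    private final Map<String, Map<Integer, Integer>> postings = new HashMap<>();

    // "Index time": count each term occurrence once, when the document is added.
    void addDocument(int docId, String text) {
        for (String token : text.toLowerCase().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            postings.computeIfAbsent(token, t -> new HashMap<>())
                    .merge(docId, 1, Integer::sum);
        }
    }

    // "Query time": a plain lookup of the stored count, no recounting.
    int freq(String term, int docId) {
        Map<Integer, Integer> docs = postings.get(term);
        return docs == null ? 0 : docs.getOrDefault(docId, 0);
    }
}
```

After calling addDocument(0, "lucene is fast lucene"), a call to freq("lucene", 0) reads back the count stored when the document was added.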

Class in Lucene that Perform Search

2008-07-02 Thread blazingwolf7
Hi, I am currently using Lucene to build a search engine and am trying to understand it better, so I am going through its source code. I tracked it all the way from beginning to end, and have managed to locate all the classes that calculate the score and return the results. What I am missing is t

Re: Class in Lucene that Perform Search

2008-07-03 Thread blazingwolf7
On Wed, Jul 2, 2008 at 10:30 PM, blazingwolf7 <[EMAIL PROTECTED]> > wrote: >> What I am missing is that I fail to locate the class that perform the >> actual >> comparison to determine if a query match any term in a document. > > You need to understand the inverted

RE: readVInt, what is it for?

2008-07-03 Thread blazingwolf7
h leads to the TermScorer. > > -Grant > > > On Jul 2, 2008, at 10:04 PM, blazingwolf7 wrote: > >> >> Hmmm, I don't think I get it. How is it tracked during index time? I >> index my file earlier. Later I will open the index and perform a >> se

Re: Class in Lucene that Perform Search

2008-07-03 Thread blazingwolf7
I thought maybe I can store some extra value into the .frq file then I will have no need to continuously use the reader. Anyone can provide other suggestion? Thanks Yonik Seeley wrote: > > On Thu, Jul 3, 2008 at 4:03 AM, blazingwolf7 <[EMAIL PROTECTED]> > wrote: >> Ah, thanks!

Untokenized URL

2008-07-04 Thread blazingwolf7
Hi, I am currently working on retrieving the URL and contentLength of each document found during the search. I want to retrieve them during the calculation of the score so that I can influence the score in some other way. I used the methods from TermDocs and TermEnum to get the information. However, the u

Re: Untokenized URL

2008-07-05 Thread blazingwolf7
you > do > doc.add(new Field("url", "http://www.cnn.com", Store.NO, > Index.UN_TOKENIZED), it would create a token like "url:http://www.cnn.com" > without breaking it to its parts. Is that what you're looking for? > > Shai > > On Fri,

Re: Untokenized URL

2008-07-06 Thread blazingwolf7
erhaps I don't understand the entire scenario. When do you need to fetch > the contentLength and URL? To what purpose? > > On Sun, Jul 6, 2008 at 4:26 AM, blazingwolf7 <[EMAIL PROTECTED]> > wrote: > >> >> No, I didn't store the contentLength. Just adding it

RE: Untokenized URL

2008-07-07 Thread blazingwolf7
; But this does not work with your TermEnum solution. > > - > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: [EMAIL PROTECTED] > >> -Original Message- >> From: blazingwolf7 [mailto:[EMAIL PROTECTED] >> Sent: Mond

RE: Untokenized URL

2008-07-07 Thread blazingwolf7
ser%20List). > This list is for developers of Lucene itself, not for users asking for > help > how to implement something specific with Lucene. > > Uwe > > - > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: [EMAIL PROTE

How efficient is IndexReader?

2008-07-07 Thread blazingwolf7
Hi, I want to use a Reader to read a document every time a matching document is found at search time. So basically, every time the score for a document is calculated, I will use the reader to retrieve some information from the index. Will this lower the search performance? I m