Danish Qadri wrote:
> I caught this in my PostgreSQL logs:
>
> Sep 29 16:07:30 intranet2.globix.net postgres[18413]: [559702] DEBUG:
> query: INSERT INTO dict4 (url_id,word,intag) VALUES(11566,'',-1413086976)
>
> I'm using multi mode, as you can see from the table name. I was surprised to
> see a negative # for intag; what does it signify? My general impression
> was that intag records how many times a word occurs in a document,
> and with what relevancy (weight).

Well, intag in 3.2.x is a combination of the word's section and its position in the document. It is not a weight: weights come in only at search time, when weight factors are mapped to sections. It is a 32-bit number.

-1413086976 is hex ABC60100:

ABC60100
PPPPSSRR

PPPP is the word position. In this case ABC6, decimal 43974. A big enough document :-)

SS is the word section. Here it is 01, the first section, which means BODY if the default sections are used.

RR is currently reserved. We are going to store additional context in this byte, i.e. things like <H1> or <B>.

The number is negative only because the word position is big: the top bit of the 32-bit value is set, so a signed integer column shows it as negative. We don't use an UNSIGNED INT type for portability reasons. Not all of the supported SQL servers have UNSIGNED data types.
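
For anyone who wants to check such values by hand, here is a minimal Python sketch of the PPPPSSRR layout described above. The function names `decode_intag` and `encode_intag` are made up for illustration and are not part of the indexer; the sketch only assumes the 16/8/8-bit split given in this message.

```python
def decode_intag(intag: int) -> dict:
    """Split a signed 32-bit intag value into its PPPPSSRR fields."""
    # Reinterpret the signed value as unsigned 32-bit so the shifts are clean.
    raw = intag & 0xFFFFFFFF
    return {
        "hex": f"{raw:08X}",
        "position": (raw >> 16) & 0xFFFF,  # PPPP: word position
        "section": (raw >> 8) & 0xFF,      # SS: word section
        "reserved": raw & 0xFF,            # RR: reserved byte
    }


def encode_intag(position: int, section: int, reserved: int = 0) -> int:
    """Pack the fields back into the signed 32-bit value a SQL INT would hold."""
    raw = ((position & 0xFFFF) << 16) | ((section & 0xFF) << 8) | (reserved & 0xFF)
    # Values with the top bit set wrap to negative in a signed 32-bit column.
    return raw - 0x100000000 if raw & 0x80000000 else raw


if __name__ == "__main__":
    print(decode_intag(-1413086976))
    print(encode_intag(43974, 1))
```

Under this layout, any word position of 32768 (hex 8000) or above sets the top bit, so the stored value shows up as negative in a signed column.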
