On Thursday, November 7, 2002, at 10:12 PM, Gilles Detillieux wrote:
> I think there are some cases where that's true, but not necessarily in all
> cases, so I don't know how much you can optimize this. E.g., for certain
> keyword tags we allow the form <meta foo="bar">, but the configurable
> keyword names must be of the form <meta name="foo" content="bar">.
> I don't know that we'd want to fully generalize this, but I'm open to
> suggestions/recommendations from others.

Keep in mind that the form <meta name="foo" content="bar"> is the definitive W3C standard, whereas the other form is an older, deprecated case. I don't see much HTML like this anymore. Whether we want to completely ignore them or not is hard to say.
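
To make the two forms concrete, here is a rough sketch (in Python, for brevity; it is not the ht://Dig parser) of a collector that accepts the standard name/content form for any name, and the shorthand form only for an explicit list of attribute names. The class name, the function name and the "htdig-keywords" attribute in the usage note are made up for illustration.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect meta fields from the name/content form and a shorthand form."""

    def __init__(self, shorthand_names=()):
        super().__init__()
        # Hypothetical whitelist: attribute names accepted as <meta foo="bar">.
        self.shorthand_names = {n.lower() for n in shorthand_names}
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser already lowercases tag and attribute names.
        attr_map = {key: value for key, value in attrs if value is not None}
        if "name" in attr_map and "content" in attr_map:
            # Standard form: <meta name="foo" content="bar">
            self.fields[attr_map["name"].lower()] = attr_map["content"]
            return
        # Older shorthand form: <meta foo="bar">, only for whitelisted names.
        for key, value in attr_map.items():
            if key in self.shorthand_names:
                self.fields[key] = value


def collect_meta(html, shorthand_names=()):
    """Return a dict of meta field name -> value found in the HTML text."""
    parser = MetaCollector(shorthand_names)
    parser.feed(html)
    parser.close()
    return parser.fields
```

Calling collect_meta(page, shorthand_names=["htdig-keywords"]) would pick up both forms; passing an empty list ignores the shorthand form completely, which is the "completely ignore them" option.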

> I'm sure there'd be a fair bit of discussion about this in the htdig-dev
> archives of 2-3 years ago. I don't think it ever got formally documented
> elsewhere (yet). The reason was to allow "scoring on the fly".

As well, it allows restricting word searches based on the "field" or tags that contain the words.
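
For anyone who wasn't around for that discussion, the idea can be sketched roughly like this (Python for brevity). The flag names, bit values and weights below are placeholders I made up for the example, not the actual definitions in the code.

```python
# Hypothetical per-word location flags, one bit per field.
FLAG_TITLE = 1
FLAG_HEADING = 2
FLAG_KEYWORDS = 4
FLAG_DESCRIPTION = 8

# Hypothetical weights; they are applied at search time, not at index time.
DEFAULT_WEIGHTS = {
    FLAG_TITLE: 10.0,
    FLAG_HEADING: 5.0,
    FLAG_KEYWORDS: 8.0,
    FLAG_DESCRIPTION: 6.0,
}
TEXT_WEIGHT = 1.0  # used when no field bit is set (plain body text)


def score(flags, weights=DEFAULT_WEIGHTS, text_weight=TEXT_WEIGHT):
    """Score one word occurrence from its stored flags."""
    total = sum(weight for flag, weight in weights.items() if flags & flag)
    return total if total else text_weight


def restrict(occurrences, field):
    """Keep only (doc_id, flags) pairs where the word appeared in the field."""
    return [(doc_id, flags) for doc_id, flags in occurrences if flags & field]
```

Because only the flags are stored, the weights can be changed per query without re-indexing, and restrict(occurrences, FLAG_TITLE) gives a title-only search from the same data.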

> The decision to put all headings into one factor was to reduce the number
> of bits the flag would take by 5, so the flags can fit in a single byte.
> We're going to have to increase this anyway, to accommodate custom fields,
> so it might make sense to reintroduce the distinction between heading

No, the flags never were supposed to be a single byte. There happen to be 8 bits currently defined, but more than this should actually be stored for custom fields (and ideally to keep the database format identical).
OTOH, there were 6 slots for headings under 3.1, and it seems like a huge waste of bits considering most won't be used--even with 3-bit encoding. Some other document formats also don't make much distinction between heading levels. Do people really think that markup beyond h1, h2 and h3 occurs? A lot of HTML I see these days uses <strong> or <b> or <i> tagging (or worse, <font>).
Keep in mind that every bit we add to the flags adds more space to every word. Right now, I've specified 8 bits, including author and URL text which aren't currently used.
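
To put rough numbers on that, here is a small sketch (Python for brevity) of a 3-bit heading-level field packed above the 8 existing flag bits, plus the raw storage cost of extra bits. The bit position and the function names are assumptions for the example, and the cost figure assumes fixed-width, uncompressed flags, which may not match what the database actually does.

```python
HEADING_SHIFT = 8   # hypothetical: level field sits above the 8 defined bits
HEADING_BITS = 3    # 3 bits hold 0 (not a heading) through 6 (h6)
HEADING_MASK = ((1 << HEADING_BITS) - 1) << HEADING_SHIFT


def set_heading_level(flags, level):
    """Return flags with the heading level (0-6) stored in the 3-bit field."""
    if not 0 <= level <= 6:
        raise ValueError("heading level must be between 0 and 6")
    return (flags & ~HEADING_MASK) | (level << HEADING_SHIFT)


def heading_level(flags):
    """Read the heading level back out of the flags."""
    return (flags & HEADING_MASK) >> HEADING_SHIFT


def extra_bytes(extra_bits, occurrences):
    """Raw added storage for extra flag bits, rounded up to whole bytes."""
    return (extra_bits * occurrences + 7) // 8
```

With a million word occurrences, extra_bytes(3, 1_000_000) comes to 375000 bytes for the 3-bit field, against 625000 for going from one heading bit back to six separate ones (5 extra bits).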
-Geoff
_______________________________________________
htdig-dev mailing list
https://lists.sourceforge.net/lists/listinfo/htdig-dev