Have a look at the source in org.apache.lucene.demo.html.HTMLParser.jj.

It stores the META tags in a Properties object that you can access via the 
getMetaTags() method.

The Document(File f) method of org.apache.lucene.demo.HTMLDocument is what builds 
the Document objects that get stored in the index, and it does not add the meta 
tags. You will need to either modify that method, or create your own Document 
objects and index those, using the HTMLParser class or some other tool to parse 
your HTML files for you.
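
As a rough sketch of the second option: the snippet below only shows the step 
of turning the Properties from getMetaTags() into field name/value pairs. It is 
plain Java with no Lucene dependency, so the Properties object is filled in by 
hand here. The class name MetaTagFields, the toFields helper and the "meta." 
field prefix are my own assumptions for illustration, not anything in the demo 
code.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical helper, not part of the Lucene demo.
final class MetaTagFields {

    // Converts META tags (name -> content) into field name/value pairs.
    // Names are lower-cased and given an assumed "meta." prefix so they
    // cannot collide with other field names; empty values are skipped.
    public static Map<String, String> toFields(Properties metaTags) {
        Map<String, String> fields = new LinkedHashMap<>();
        // TreeSet gives a stable, sorted iteration order over the tag names.
        for (String name : new TreeSet<>(metaTags.stringPropertyNames())) {
            String value = metaTags.getProperty(name);
            if (value == null || value.trim().isEmpty()) {
                continue;
            }
            fields.put("meta." + name.toLowerCase(Locale.ROOT), value.trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        // Stand-in for the Properties returned by the parser's getMetaTags().
        Properties tags = new Properties();
        tags.setProperty("Keywords", "lucene, search");
        tags.setProperty("description", " Demo page ");
        tags.setProperty("robots", "");

        for (Map.Entry<String, String> e : toFields(tags).entrySet()) {
            // In a real indexer each pair would be added to the Document
            // as a field; check the Field API of your Lucene version.
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

In your own version of the Document(File f) method you would loop over the 
result and add one field per entry to the Document before it goes to the index.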

Eric

-----Original Message-----
From: mchaput [mailto:[EMAIL PROTECTED]]
Sent: Monday, December 09, 2002 11:40 AM
To: Lucene Developers List
Subject: Re: HTMLDocument with META tags?


Otis Gospodnetic wrote:
> The HTMLParser.jj should already do that.
> 
> Otis

The version I have doesn't seem to (no "meta" in the source code at 
all), or is there a trick to getting them out? Or is it in a newer 
version of Lucene than I have?

Sorry to bother, but it would solve a lot of problems for me if it 
really is in there.

Cheers,

Matt



-- 
                       |
Matt Chaput           |   A l i a s | W a v e f r o n t
Information Designer  |   210 King St. E. Toronto, ON, Canada M5A 1J7
[EMAIL PROTECTED]    |   (416) 874-8268
                       |
"A goddamned ray of sunshine all the goddamned time" --Sparkle Hayter


--
To unsubscribe, e-mail:   <mailto:[EMAIL PROTECTED]>
For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>

