Hi Otis,

Thank you for your reply. Actually, the parsing is already done: I just use the HTML tag name as the field name. Is that OK for Solr? By the way, can the attributes on those fields be meaningful to Solr?
Vinci

Otis Gospodnetic wrote:
>
> Hi Vinci,
>
> Maybe this answers most of your questions: Solr can't digest HTML - you
> have to do HTML parsing outside of Solr, and feed it a document with
> specific fields that match the schema.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message ----
> From: Vinci <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, March 25, 2008 4:25:10 PM
> Subject: Fields, Facets and Indexing html document
>
>
> Hi all,
>
> I want Solr to index my HTML document collection. After reading a number
> of tutorials and searching Google, I have some questions...
> 1. Can I index HTML documents directly?
> 2. What should I change in the default schema.xml to index HTML documents?
> 3. Can fields be defined by a combination of tag and attribute?
> 4. Is it possible to use the Highlighter to filter the search results?
>    (e.g. if highlighting is done after some marker tag, then the search
>    result will get a lower ranking)
> 5. Can facets compute statistics on the search results?
> 6. Do facets mean the same thing as fields? If not, how are they
>    different?
> 7. Can facets/features be defined in another document?
>
> Thank you,
> Vinci
> --
> View this message in context:
> http://www.nabble.com/Fields%2C-Facets-and-Indexing-html-document-tp16287762p16287762.html
> Sent from the Solr - User mailing list archive at Nabble.com.
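
To illustrate Otis's point that the HTML has to be parsed outside Solr and sent as fields matching the schema, here is a minimal Python sketch. It is only one possible approach, not something from this thread: the field names `id`, `title` and `text` are assumptions and would have to exist in your schema.xml, and the class and function names are made up for the example. It only builds the Solr XML `<add>` message; sending it to your Solr update handler is left out.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TitleAndTextExtractor(HTMLParser):
    """Collects the <title> text and the remaining visible text of a page."""

    SKIPPED_TAGS = {"script", "style"}  # content we do not want to index

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._skip_depth = 0
        self.title_parts = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        data = data.strip()
        if not data or self._skip_depth:
            return
        if self._in_title:
            self.title_parts.append(data)
        else:
            self.text_parts.append(data)


def html_to_solr_add_xml(doc_id, html):
    """Parse HTML outside Solr and return an <add> message as a string.

    The field names used here (id, title, text) are assumptions; they
    must match fields declared in your own schema.xml.
    """
    parser = TitleAndTextExtractor()
    parser.feed(html)
    parser.close()

    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    fields = (
        ("id", doc_id),
        ("title", " ".join(parser.title_parts)),
        ("text", " ".join(parser.text_parts)),
    )
    for name, value in fields:
        field = ET.SubElement(doc, "field", name=name)
        field.text = value  # ElementTree escapes &, < and > on output
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    page = "<html><head><title>Hello</title></head><body><p>Some text</p></body></html>"
    print(html_to_solr_add_xml("doc-1", page))
```

Using the HTML tag name directly as the field name, as described in the reply above, would mean changing the `fields` tuple so that each name corresponds to a field (or dynamic field) your schema actually defines.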