Dear friends,

Sorry, I originally posted this to the Solr list. Any ideas on this question?

Sincerely,
Alex

-----Original Message-----
From: Otis Gospodnetic [mailto:[email protected]]
Sent: 30 August 2012 4:28 PM
To: [email protected]
Subject: Re: Extract footer/header text out of Word docs

Hi Alex,

I think you may get better help on the Tika mailing list - Solr uses Tika to parse rich text docs and extract text from them. I don't know if Tika can figure out what's from a header and a footer...

Otis
----
Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm

----- Original Message -----
> From: Alex Cougarman <[email protected]>
> To: "[email protected]" <[email protected]>
> Cc:
> Sent: Thursday, August 30, 2012 9:25 AM
> Subject: Extract footer/header text out of Word docs
>
> Hi. Is it possible to specifically extract footer/header and body text out of a
> Word document using Solr? In other words, we'd like to index/store those
> items in different Solr fields.
>
> Also, is it possible to search on specific styles within a Word document? Can
> these attributes be indexed? Thanks.
>
> Sincerely,
> Alex
