Sure Karl, no problem. My initial assumption was that when Solr is set up to use Tika (Solr Cell), content would be automatically extracted and indexed in Solr. But it looks like the field mapping needs to be defined in the ManifoldCF job.
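In case it helps anyone following the thread: as far as I understand it, the mapping I added in the job corresponds to Solr Cell's fmap.<source>=<target> request parameter. Below is a minimal Python sketch of what the equivalent manual extract URL might look like. The host, port, document id, and the helper name build_extract_url are assumptions made for illustration; they are not part of ManifoldCF or Solr.

```python
from urllib.parse import urlencode


def build_extract_url(base_url, doc_id, field_map, commit=False):
    """Build a Solr Cell (/update/extract) URL with fmap.* field mappings.

    Hypothetical helper for illustration only.
    """
    # literal.id supplies the unique key for the extracted document.
    params = [("literal.id", doc_id)]
    # Each fmap.<source>=<target> pair renames an extracted field.
    for source, target in sorted(field_map.items()):
        params.append((f"fmap.{source}", target))
    if commit:
        params.append(("commit", "true"))
    return f"{base_url.rstrip('/')}/update/extract?{urlencode(params)}"


# Assumed local Solr instance; maps Tika's "content" field to "attr_content".
url = build_extract_url(
    "http://localhost:8983/solr", "doc1", {"content": "attr_content"}
)
print(url)
```

The file itself would still have to be posted to that URL (for example as a multipart upload); the sketch only shows how the mapping appears as a request parameter.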
The goal of the project I'm working on is to:

1- use Solr with Tika (to extract and index multiple document formats),
2- use ManifoldCF (to pull user information from a domain controller via Active Directory security, and store the ACL for each indexed document),
3- perform secure searches on all the indexed documents based on the logged-in user's credentials.

One caveat here is that the file system I'm using is not a plain vanilla FS. It's StorHouse / RFS from FileTek. So, as I move along, I'll post my findings and ask for suggestions.

I already got your book, and can't wait to read the connector creation chapters!

Thanks,

Kadri

On Thu, Apr 21, 2011 at 5:58 AM, Karl Wright <daddy...@gmail.com> wrote:

> Thanks for doing this.
>
> If you have suggestions as to how to modify the default behavior of
> the Solr output connector given the recent release of Solr 3.1, please
> consider creating a ticket in Apache JIRA that describes what you
> think needs to happen. The output connector was designed to work with
> the example configuration of Solr by default; I believe it would be
> good to retain that ability.
>
> Karl
>
> On Wed, Apr 20, 2011 at 6:49 PM, Kadri Atalay <atalay.ka...@gmail.com> wrote:
> > I added the following field mapping into Manifold Job and now it's indexing
> > the document content also !
> >
> > (fmap.content attr_content)
> >
> > Thanks !
> >
> > On Wed, Apr 20, 2011 at 6:36 PM, Karl Wright <daddy...@gmail.com> wrote:
> >>
> >> The content is posted to the update request handler. It might be
> >> helpful if you turn on some logging in Solr to see exactly what is
> >> happening there.
> >>
> >> Karl
> >>
> >> On Wed, Apr 20, 2011 at 6:18 PM, Kadri Atalay <atalay.ka...@gmail.com> wrote:
> >> > I'm able to use Manifold and SharedDrive connector to index files into Solr.
> >> > But, only information I see in the Solr is Author, Content_type, Name, &
> >> > last_modified.
> >> >
> >> > Can anyone tell me, how to index also the content into Solr ?
> >> >
> >> > Thanks in Advance !
> >> >
> >> > Kadri
> >> >
> >> > PS. I'm using SolrCell (Tika) and manual update/extract is working fine.