I was referring to SOLR-1358. Anyway, SolrCell as an update processor is a good idea.
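
To make the update-processor idea concrete, here is a rough sketch in plain Java. It deliberately does not use Solr's or Tika's actual classes: the Processor interface, the extracting() helper, the field names "raw" and "text", and the stand-in extractor are all hypothetical, chosen only to show the shape of "take a binary field out of the document, extract text from it, and add the text back as a new field".

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for an update processor chain; NOT Solr's actual API.
final class ExtractingProcessorSketch {

    /** One step in the chain: receives a document and may modify it in place. */
    interface Processor {
        void process(Map<String, Object> doc);
    }

    /**
     * Builds a processor that replaces a binary source field with the text
     * extracted from it. The extractor is an assumption: a real setup would
     * hand the bytes to Tika here.
     */
    static Processor extracting(String sourceField, String targetField,
                                Function<byte[], String> extractor) {
        return doc -> {
            Object raw = doc.get(sourceField);
            if (raw instanceof byte[]) {
                doc.remove(sourceField);
                doc.put(targetField, extractor.apply((byte[]) raw));
            }
            // Non-binary or missing source fields are left untouched.
        };
    }

    /** Runs every processor in order over the same document. */
    static Map<String, Object> run(List<Processor> chain, Map<String, Object> doc) {
        for (Processor p : chain) {
            p.process(doc);
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", "doc1");
        doc.put("raw", "hello tika".getBytes(StandardCharsets.UTF_8));

        // Stand-in extractor that just decodes the bytes as UTF-8 text.
        Function<byte[], String> fakeExtractor =
                bytes -> new String(bytes, StandardCharsets.UTF_8);

        List<Processor> chain = new ArrayList<>();
        chain.add(extracting("raw", "text", fakeExtractor));

        System.out.println(run(chain, doc));
    }
}
```

With this shape, the request handler would only have to put the incoming content stream into a binary field, and XML or CSV updates carrying such a field would pass through the same step.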

On Tue, Dec 8, 2009 at 4:47 PM, Grant Ingersoll <gsing...@apache.org> wrote:
>
> On Dec 8, 2009, at 12:22 AM, Noble Paul നോബിള് नोब्ळ् wrote:
>
>> Integrating Extraction w/ DIH is a better option. DIH makes it easier
>> to do the mapping of fields etc.
>
> Which comment is this directed at? I'm lacking context here.
>
>>
>>
>> On Tue, Dec 8, 2009 at 4:59 AM, Grant Ingersoll <gsing...@apache.org> wrote:
>>>
>>> On Dec 7, 2009, at 3:51 PM, Chris Hostetter wrote:
>>>
>>>>
>>>> As someone with very little knowledge of Solr Cell and/or Tika, I find
>>>> myself wondering if ExtractingRequestHandler would make more sense as an
>>>> extractingUpdateProcessor -- where it could be configured to take
>>>> either binary fields (or string fields containing URLs) out of the
>>>> Documents, parse them with Tika, and add the various XPath matching hunks
>>>> of text back into the document as new fields.
>>>>
>>>> Then ExtractingRequestHandler just becomes a handler that slurps up its
>>>> ContentStreams and adds them as binary data fields and adds the other
>>>> literal params as fields.
>>>>
>>>> Wouldn't that make things like SOLR-1358, and using Tika with
>>>> URLs/filepaths in XML and CSV based updates fairly trivial?
>>>
>>> It probably could, but I am not sure how it works in a processor chain.
>>> However, I'm not sure I understand how they work all that much either. I
>>> also plan on adding, BTW, a SolrJ client for Tika that does the extraction
>>> on the client. In many cases, the ExtrReqHandler is really only designed
>>> for lighter-weight extraction cases, as one would simply not want to send
>>> that much rich content over the wire.
>>
>>
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
>
> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using
> Solr/Lucene:
> http://www.lucidimagination.com/search
>
>

--
-----------------------------------------------------
Noble Paul | Systems Architect| AOL | http://aol.com