Here is my proposal.

Regards
On Tue, Jan 5, 2010 at 12:48 PM, Zacarias <zacar...@linebee.com> wrote:

> Hi, I'm developing a directory monitor to add to a Solr implementation.
> Tell me if it would be of interest to you; we would be glad to share it
> with the community. I would also like your opinion on the proposal:
> whether it looks OK to you, and whether you would like to make any
> changes or have any questions. All feedback is very welcome.
>
> Regards
> Zacarias
> www.linebee.com
>
>
> 2009/12/8 Noble Paul നോബിള് नोब्ळ् <noble.p...@corp.aol.com>
>
>> I was referring to SOLR-1358. Anyway, SolrCell as an updateprocessor
>> is a good idea.
>>
>> On Tue, Dec 8, 2009 at 4:47 PM, Grant Ingersoll <gsing...@apache.org>
>> wrote:
>> >
>> > On Dec 8, 2009, at 12:22 AM, Noble Paul നോബിള് नोब्ळ् wrote:
>> >
>> >> Integrating Extraction w/ DIH is a better option. DIH makes it easier
>> >> to do the mapping of fields etc.
>> >
>> > Which comment is this directed at? I'm lacking context here.
>> >
>> >>
>> >> On Tue, Dec 8, 2009 at 4:59 AM, Grant Ingersoll <gsing...@apache.org>
>> >> wrote:
>> >>>
>> >>> On Dec 7, 2009, at 3:51 PM, Chris Hostetter wrote:
>> >>>
>> >>>> As someone with very little knowledge of Solr Cell and/or Tika, I
>> >>>> find myself wondering if ExtractingRequestHandler would make more
>> >>>> sense as an extractingUpdateProcessor -- where it could be
>> >>>> configured to take either binary fields (or string fields
>> >>>> containing URLs) out of the Documents, parse them with Tika, and
>> >>>> add the various XPath-matching hunks of text back into the
>> >>>> document as new fields.
>> >>>>
>> >>>> Then ExtractingRequestHandler just becomes a handler that slurps up
>> >>>> its ContentStreams and adds them as binary data fields and adds
>> >>>> the other literal params as fields.
>> >>>>
>> >>>> Wouldn't that make things like SOLR-1358, and using Tika with
>> >>>> URLs/filepaths in XML and CSV based updates, fairly trivial?
>> >>>
>> >>> It probably could, but I'm not sure how it works in a processor
>> >>> chain. However, I'm not sure I understand how they work all that
>> >>> much either. I also plan on adding, BTW, a SolrJ client for Tika
>> >>> that does the extraction on the client. In many cases, the
>> >>> ExtrReqHandler is really only designed for lighter weight
>> >>> extraction cases, as one would simply not want to send that much
>> >>> rich content over the wire.
>> >>
>> >> --
>> >> -----------------------------------------------------
>> >> Noble Paul | Systems Architect| AOL | http://aol.com
>> >
>> > --------------------------
>> > Grant Ingersoll
>> > http://www.lucidimagination.com/
>> >
>> > Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
>> > using Solr/Lucene:
>> > http://www.lucidimagination.com/search
>>
>> --
>> -----------------------------------------------------
>> Noble Paul | Systems Architect| AOL | http://aol.com