Here is my proposal.

Regards



On Tue, Jan 5, 2010 at 12:48 PM, Zacarias <zacar...@linebee.com> wrote:

> Hi, I'm developing a directory monitor to add to a Solr implementation.
> If it would be interesting to you, we will be glad to share it with the
> community. I would also like your opinion on the proposal: whether it looks
> OK to you, and any changes or questions you have will be very
> welcome.
>
> Regards
> Zacarias
> www.linebee.com
>
>
> 2009/12/8 Noble Paul നോബിള്‍ नोब्ळ् <noble.p...@corp.aol.com>
>
>> I was referring to SOLR-1358. Anyway, SolrCell as an update processor
>> is a good idea.
>>
>> On Tue, Dec 8, 2009 at 4:47 PM, Grant Ingersoll <gsing...@apache.org>
>> wrote:
>> >
>> > On Dec 8, 2009, at 12:22 AM, Noble Paul നോബിള്‍ नोब्ळ् wrote:
>> >
>> >> Integrating Extraction w/ DIH is a better option. DIH makes it easier
>> >> to do the mapping of fields etc.
>> >
>> > Which comment is this directed at?  I'm lacking context here.
>> >
>> >>
>> >>
>> >> On Tue, Dec 8, 2009 at 4:59 AM, Grant Ingersoll <gsing...@apache.org> wrote:
>> >>>
>> >>> On Dec 7, 2009, at 3:51 PM, Chris Hostetter wrote:
>> >>>
>> >>>>
>> >>>> As someone with very little knowledge of Solr Cell and/or Tika, I find
>> >>>> myself wondering if ExtractingRequestHandler would make more sense as an
>> >>>> extractingUpdateProcessor -- where it could be configured to take
>> >>>> either binary fields (or string fields containing URLs) out of the
>> >>>> Documents, parse them with Tika, and add the various XPath-matching hunks
>> >>>> of text back into the document as new fields.
>> >>>>
>> >>>> Then ExtractingRequestHandler just becomes a handler that slurps up its
>> >>>> ContentStreams and adds them as binary data fields, and adds the other
>> >>>> literal params as fields.
>> >>>>
>> >>>> Wouldn't that make things like SOLR-1358, and using Tika with
>> >>>> URLs/filepaths in XML- and CSV-based updates, fairly trivial?
>> >>>
>> >>> It probably could, but I am not sure how it works in a processor chain.
>> >>> Then again, I'm not sure I understand how they work all that much either. I
>> >>> also plan on adding, BTW, a SolrJ client for Tika that does the extraction
>> >>> on the client. In many cases, the ExtrReqHandler is really only designed
>> >>> for lighter-weight extraction cases, as one would simply not want to send
>> >>> that much rich content over the wire.
>> >>
>> >>
>> >>
>> >> --
>> >> -----------------------------------------------------
>> >> Noble Paul | Systems Architect| AOL | http://aol.com
>> >
>> > --------------------------
>> > Grant Ingersoll
>> > http://www.lucidimagination.com/
>> >
>> > Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using
>> > Solr/Lucene:
>> > http://www.lucidimagination.com/search
>> >
>> >
>>
>>
>>
>> --
>> -----------------------------------------------------
>> Noble Paul | Systems Architect| AOL | http://aol.com
>>
>
>
