Re: Text search engine [OT]

Dermot Thu, 21 Aug 2008 05:22:23 -0700

2008/8/21 Marc van Driel <[EMAIL PROTECTED]>:
> Hope you enjoy it! I know the author appreciates feedback :)
>
> Cheers
>
> Raymond Wan schreef:
>>
>> Hi Marc,
>>
>> Yes, it seems we were both right :-).  From Dermot's first post, I guess
>> he was asking about Perl interfacing an IR system and why Perl isn't used to
>> build an IR system.  The MRS system demonstrates the first point, so thank
>> you for pointing it out -- I did not know about it, either!  The second part
>> has to do with Perl being an interpreted and not a compiled language; for
>> that reason, I don't believe Perl could be used as an IR system
>> backend (partly from my own experience from writing text processing in Perl
>> and then giving up and doing it again in C/C++ because it was too slow :-)
>> ).
>>
>> Thanks for the link to the system -- it was of benefit to me, as well!
>>
>> Ray
>>
>> Marc van Driel wrote:
>>>
>>> Hi Ray,
>>>
>>> My interpretation of Dermot's mail was that he was looking for a
>>> text-retrieval system with a Perl interface, but I only read the last mail of
>>> the thread. MRS is written in C++ and originally designed to index and
>>> search the biodatabanks (usually this is semi-structured data), but is not
>>> bio-specific. There is a SOAP interface/webservice/WSDL for e.g. Perl. So,
>>> you can do a query, retrieve 1000 records (out of xxxxx records) and let
>>> Perl do what you want to do with those 1000 records. MRS has a boolean and
>>> ranked search mechanism. For more information visit the website
>>> (mrs.cmbi.ru.nl) or contact the author: [EMAIL PROTECTED] There is also
>>> a paper on the system:
>>> http://nar.oxfordjournals.org/cgi/content/full/33/suppl_2/W766?ijkey=1hM9Po54JADYz0b&keytype=ref
>>>
>>> Best regards,
>>>   Marc
>>>
>>> Raymond Wan schreef:
>>>>
>>>> Hi Marc,
>>>> (mailing list purposely removed)
>>>>
>>>> Thanks for the link!
>>>>
>>>> I think what Dermot was talking about is having a Perl system do the
>>>> underlying work?  But yes, if the underlying system is written in C/C++,
>>>> then Perl would be "fast" since it is merely acting as a gateway to the
>>>> work being done; in any case, it would mean that the text manipulation
>>>> advantages of Perl are still not being used?  Is that the case with MRS?
>>>>
>>>> Ray
>>>>
>>>>
>>>> Marc van Driel wrote:
>>>>>
>>>>> Hi Dermot/Ray,
>>>>>
>>>>> Please check out the MRS system (mrs.cmbi.ru.nl). It has a SOAP
>>>>> interface to Perl and other languages, and is extremely fast in indexing
>>>>> and retrieval. MRS is a generic tool: you can index your own data, but also
>>>>> download indexed bio-databanks. The source code is in C++ and is available
>>>>> as well.
>>>>> Teaching material is available but tailored towards biologists.
>>>>>
>>>>> Best,
>>>>>
>>>>>   Marc


Yes, I was a bit confused because I didn't understand why there wasn't
a pure Perl text search engine. I was aware of numerous Perl
interfaces to other APIs (Lucene, KinoSearch, OpenFTS and Swish-E), but I
wasn't aware of how they fundamentally work. I also note that Postgres
has Tsearch. From the little bit of searching I've done, Lucene seems
to have a great deal of support, and there are a number of modules that
use the Lucene API.  Of course, a SOAP/REST interface would allow any
language access.
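
For anyone else wondering how engines like these fundamentally work: as far
as I understand it, most of them are built around an inverted index, i.e. a
map from each term to the documents that contain it. Below is a rough sketch
of that idea in Python, not how Lucene, MRS or any other system mentioned here
is actually implemented. The function names (tokenize, build_index,
boolean_and, ranked) are made up for this example, and the summed
term-frequency score is a deliberately crude stand-in for real ranking.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index


def boolean_and(index, query):
    """Boolean search: ids of documents containing every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [set(index.get(term, {})) for term in terms]
    return set.intersection(*postings)


def ranked(index, query):
    """Ranked search: order documents by summed term frequency
    (illustrative scoring only), breaking ties by document id."""
    scores = defaultdict(int)
    for term in tokenize(query):
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += tf
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical toy collection, invented for this example.
docs = {
    1: "Perl text search",
    2: "text search engine in C++",
    3: "Perl Perl interface",
}
index = build_index(docs)
print(boolean_and(index, "perl search"))  # {1}
print(ranked(index, "perl search"))       # [1, 3, 2]
```

The expensive part is building and storing the index for a large collection,
which is presumably why that layer tends to be written in C/C++ or Java, with
Perl sitting on top as an interface.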

Again, thanks for the useful sources.
Dp.

-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/
