2008/8/21 Marc van Driel <[EMAIL PROTECTED]>:
> Hope you enjoy it! I know the author appreciates feedback :)
>
> Cheers
>
> Raymond Wan wrote:
>>
>> Hi Marc,
>>
>> Yes, it seems we were both right :-). From Dermot's first post, I guess
>> he was asking about Perl interfacing with an IR system and why Perl isn't
>> used to build an IR system. The MRS system demonstrates the first point,
>> so thank you for pointing it out -- I did not know about it, either! The
>> second part has to do with Perl being an interpreted and not a compiled
>> language; it is for that reason that I don't believe Perl could be used
>> as an IR system backend (partly from my own experience of writing text
>> processing in Perl and then giving up and doing it again in C/C++ because
>> it was too slow :-) ).
>>
>> Thanks for the link to the system -- it was of benefit to me, as well!
>>
>> Ray
>>
>> Marc van Driel wrote:
>>>
>>> Hi Ray,
>>>
>>> My interpretation of Dermot's mail was that he was looking for a
>>> text-retrieval system with a Perl interface, but I only read the last
>>> mail of the thread. MRS is written in C++ and was originally designed to
>>> index and search the biodatabanks (usually this is semi-structured
>>> data), but it is not bio-specific. There is a SOAP
>>> interface/webservice/WSDL for e.g. Perl. So, you can do a query,
>>> retrieve 1000 records (out of xxxxx records) and let Perl do what you
>>> want to do with those 1000 records. MRS has a boolean and ranked search
>>> mechanism. For more information visit the website (mrs.cmbi.ru.nl) or
>>> contact the author: [EMAIL PROTECTED] There is also a paper on the
>>> system:
>>> http://nar.oxfordjournals.org/cgi/content/full/33/suppl_2/W766?ijkey=1hM9Po54JADYz0b&keytype=ref
>>>
>>> Best regards,
>>> Marc
>>>
>>> Raymond Wan wrote:
>>>>
>>>> Hi Marc,
>>>> (mailing list purposely removed)
>>>>
>>>> Thanks for the link!
>>>>
>>>> I think what Dermot was talking about is having a Perl system do the
>>>> underlying work? But yes, if the underlying system is written in
>>>> C/C++, then Perl would be "fast" since it is merely acting as a gateway
>>>> to the work being done; in any case, it would mean that the text
>>>> manipulation advantages of Perl are still not being used? Is that the
>>>> case with MRS?
>>>>
>>>> Ray
>>>>
>>>>
>>>> Marc van Driel wrote:
>>>>>
>>>>> Hi Dermot/Ray,
>>>>>
>>>>> Please check out the MRS system (mrs.cmbi.ru.nl). It has a SOAP
>>>>> interface to Perl and other languages, and is extremely fast in
>>>>> indexing and retrieval. MRS is a generic tool and you can index
>>>>> yourself, but also download indexed bio-databanks. The source code is
>>>>> in C++ and is available as well.
>>>>> Teaching material is available but tailored towards biologists.
>>>>>
>>>>> Best,
>>>>>
>>>>> Marc

Yes, I was a bit confused because I didn't understand why there wasn't a
pure Perl text search engine. I was aware of numerous Perl interfaces to
other APIs (Lucene, KinoSearch, OpenFTS and Swish-E), but I wasn't aware of
how they fundamentally work. I also note that Postgres has Tsearch.

From the little bit of searching I've done, Lucene seems to have a great
deal of support, and there are a number of modules that use the Lucene API.
Of course, a SOAP/REST interface would allow any language access.

Again, thanks for the useful sources.
Dp.