RE: Lucene vs. MySQL Full-Text
I also question whether it could handle extreme volume with such good query speed. Has anyone done numbers with 1+ million documents?

-----Original Message-----
From: Daniel Naber [mailto:[EMAIL PROTECTED]
Sent: Tuesday, July 20, 2004 5:44 PM
To: Lucene Users List
Subject: Re: Lucene vs. MySQL Full-Text

On Tuesday 20 July 2004 21:29, Tim Brennan wrote:
Does anyone out there have anything more concrete they can add?

Stemming is still on the MySQL TODO list: http://dev.mysql.com/doc/mysql/en/Fulltext_TODO.html

Also, for most people it's easier to extend Lucene than MySQL (as MySQL is written in C(++?)) and there are more powerful queries in Lucene, e.g. fuzzy phrase search.

Regards
Daniel

--
http://www.danielnaber.de

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Re: Lucene vs. MySQL Full-Text
I used MySQL full-text search to index about 70K business directory records. It became impossibly slow, and I ended up writing my own text search engine, similar in concept to Lucene but database driven. It worked much faster than the native MySQL full-text search.

Other limitations of the MySQL MATCH syntax:
- only words of 4 letters or more are indexed (if you change this, it searches VERY slowly)
- the MATCH value returned is next to useless (it ranges wildly and is not normalized like Lucene scores are)
- you cannot weight certain fields as more important than others.

Really, it is very limited.

John.

----- Original Message -----
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Friday, July 23, 2004 1:23 AM
Subject: RE: Lucene vs. MySQL Full-Text

I also question whether it could handle extreme volume with such good query speed. Has anyone done numbers with 1+ million documents?
RE: Lucene vs. MySQL Full-Text
Depending on what MySQL full-text search supports, you will probably lose some of the advanced things you get for free from Lucene, such as proximity search, wildcard search, search term and search field boosting, scoring of the documents, etc. After all, it depends on what you need to do.

In our dev team we are currently having a mini debate over whether to use Lucene for our project or write something from scratch that's based on a DB. We need really good performance. I feel Lucene can do our job very well; some of our guys feel a DB-based search can give us greater performance on the type of search we do.

Anson

-----Original Message-----
From: Florian Sauvin [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 21, 2004 8:55 AM
To: Lucene Users List
Subject: Re: Lucene vs. MySQL Full-Text

On Jul 20, 2004, at 12:29 PM, Tim Brennan wrote:

Someone came into my office today and asked me about the project I am trying to use Lucene for -- why aren't you just using a MySQL full-text index to do that -- after thinking about it for a few minutes, I realized I don't have a great answer.

MySQL builds inverted indexes for (in theory) doing the same type of lookup that Lucene does. You'd maybe have to build some kind of a layer on the front to mimic Lucene's analyzers, but that wouldn't be too hard.

My only experience with MySQL fulltext is trivial test apps -- but the MySQL world does have some significant advantages (it's a known quantity from an operations perspective, etc).

Does anyone out there have anything more concrete they can add?

--tim

I'd say that MySQL full text is much slower if you have a lot of data. That is one of the reasons we started using Lucene (we had a MySQL DB doing the search): it's way faster!

--
Florian
Re: Lucene vs. MySQL Full-Text
Interestingly (and ironically) enough, the project I'm currently working on requires full-text searching of Word and PDF resumes. SQL Server is already the required database as well, so we are leveraging the full-text indexing capabilities it has. There is a special trick: drop a BLOB into a table that also has file extension and MIME type columns, and have SQL Server index it with its Index Server capabilities. Lucene was not needed, and we made the pragmatic (simplest that worked well) choice.

My recommendation would be to implement something rather than debate it. If it is good enough, leave it alone; if not, try a different approach :)

Erik

On Jul 21, 2004, at 7:29 AM, Anson Lau wrote:

Depending on what MySQL full-text search supports, you will probably lose some of the advanced things you get for free from Lucene, such as proximity search, wildcard search, search term and search field boosting, scoring of the documents, etc. After all, it depends on what you need to do.

In our dev team we are currently having a mini debate over whether to use Lucene for our project or write something from scratch that's based on a DB. We need really good performance. I feel Lucene can do our job very well; some of our guys feel a DB-based search can give us greater performance on the type of search we do.

Anson
Re: Lucene vs. MySQL Full-Text
On Tuesday 20 July 2004 21:29, Tim Brennan wrote:
Does anyone out there have anything more concrete they can add?

Stemming is still on the MySQL TODO list: http://dev.mysql.com/doc/mysql/en/Fulltext_TODO.html

Also, for most people it's easier to extend Lucene than MySQL (as MySQL is written in C(++?)) and there are more powerful queries in Lucene, e.g. fuzzy phrase search.

Regards
Daniel

--
http://www.danielnaber.de
Re: Lucene vs. MySQL Full-Text
On Jul 20, 2004, at 12:29 PM, Tim Brennan wrote:

Someone came into my office today and asked me about the project I am trying to use Lucene for -- why aren't you just using a MySQL full-text index to do that -- after thinking about it for a few minutes, I realized I don't have a great answer.

MySQL builds inverted indexes for (in theory) doing the same type of lookup that Lucene does. You'd maybe have to build some kind of a layer on the front to mimic Lucene's analyzers, but that wouldn't be too hard.

My only experience with MySQL fulltext is trivial test apps -- but the MySQL world does have some significant advantages (it's a known quantity from an operations perspective, etc).

Does anyone out there have anything more concrete they can add?

--tim

I'd say that MySQL full text is much slower if you have a lot of data. That is one of the reasons we started using Lucene (we had a MySQL DB doing the search): it's way faster!

--
Florian
Re: Lucene and Mysql
Hi,

You should create a Lucene Document for each record in your table. Make each of the columns that contains text a field on the Document object. Also store the primary key of the record as a field.

Here's a very basic article I wrote about using Lucene: http://builder.com.com/5100-6389-5054799.html

Jeff

----- Original Message -----
From: Stefan Trcko [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Tuesday, December 16, 2003 2:30 PM
Subject: Lucene and Mysql

Hello

I'm new to Lucene. I want users to be able to search text that is stored in a MySQL database. Is there a tutorial on how to implement this kind of search feature?

Best regards,
Stefan
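For readers who want to see the shape of that mapping, here is a minimal sketch in Python rather than against the real Lucene API. The `Document` class, the `record_to_document` helper, and the sample rows are all hypothetical stand-ins invented for illustration; the only thing taken from the advice above is the idea of one document per record, one field per text column, plus the stored primary key.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for a Lucene Document (not the Lucene API)."""
    primary_key: int
    fields: dict = field(default_factory=dict)


def record_to_document(record, key_column, text_columns):
    """Build one Document per table row; each non-empty text column becomes a field."""
    return Document(
        primary_key=record[key_column],
        fields={col: record[col] for col in text_columns if record.get(col)},
    )


# Made-up rows, shaped like the dicts a database cursor might return.
rows = [
    {"id": 1, "title": "Lucene basics", "body": "Indexing text with analyzers", "price": 10},
    {"id": 2, "title": "MySQL notes", "body": None, "price": 12},
]

# Only the text columns are turned into fields; "price" is left in the database.
documents = [record_to_document(r, "id", ["title", "body"]) for r in rows]
```

Keeping the primary key on every document is what later lets a search hit be traced back to its row.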
RE: Lucene and Mysql
Hi.

You read all the relevant fields out of MySQL and assign the primary key as the identifier of each Lucene document. During search, you retrieve the identifier from the Lucene searcher and query the database to present the full text.

Best regards,
Gregor

-----Original Message-----
From: Stefan Trcko [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 16, 2003 9:31 PM
To: [EMAIL PROTECTED]
Subject: Lucene and Mysql

Hello

I'm new to Lucene. I want users to be able to search text that is stored in a MySQL database. Is there a tutorial on how to implement this kind of search feature?

Best regards,
Stefan
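The round trip described above (index by primary key, search for identifiers, then fetch the full text from the database) can be sketched as follows. This is an illustration under loud assumptions: `sqlite3` stands in for MySQL so the example runs without a server, a plain dictionary stands in for the Lucene index, and the `articles` table, `tokenize`, and `search` are hypothetical names.

```python
import re
import sqlite3
from collections import defaultdict

# Assumption: sqlite3 replaces MySQL here purely so the sketch is runnable.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, body TEXT)")
db.executemany(
    "INSERT INTO articles (id, body) VALUES (?, ?)",
    [
        (1, "Lucene builds an inverted index"),
        (2, "MySQL stores the full text"),
        (3, "An index in Lucene is searched by term"),
    ],
)


def tokenize(text):
    """Very small analyzer: lowercase, split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


# Indexing: map each term to the primary keys of the records containing it.
index = defaultdict(set)
for pk, body in db.execute("SELECT id, body FROM articles"):
    for term in tokenize(body):
        index[term].add(pk)


def search(query):
    """Find ids matching every query term in the index, then load the rows from the database."""
    terms = tokenize(query)
    if not terms:
        return []
    ids = set.intersection(*(index.get(t, set()) for t in terms))
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    return db.execute(
        f"SELECT id, body FROM articles WHERE id IN ({placeholders}) ORDER BY id",
        sorted(ids),
    ).fetchall()
```

The index never holds the text itself, only identifiers, so the database remains the single place where the full records live.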
RE: Lucene and Mysql
You would just take the items from the MySQL database and create a document for each record. Then index all the documents.

-----Original Message-----
From: Stefan Trcko [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 16, 2003 3:31 PM
To: [EMAIL PROTECTED]
Subject: Lucene and Mysql

Hello

I'm new to Lucene. I want users to be able to search text that is stored in a MySQL database. Is there a tutorial on how to implement this kind of search feature?

Best regards,
Stefan
Re: lucene vs mysql
On Fri 11/10/2002 at 21:03, Cédric Grun wrote:

I'm currently using MySQL to store files that I have previously indexed with Lucene. I've seen that MySQL has a new full-text search feature. I'd like to know which is better: MySQL full-text search or Lucene search.

Before using Lucene, we looked into MySQL's full-text search functions. Extracted from the documentation, http://www.mysql.com/doc/en/Fulltext_Search.html :-)

MySQL uses a very simple parser to split text into words. A ``word'' is any sequence of characters consisting of letters, numbers, `'', and `_'. Any ``word'' that is present in the stopword list or is just too short (3 characters or less) is ignored.

For us, that was the main reason we decided not to use it: being able to tune what gets indexed during the analysis phase is a real plus when you have a large number of documents.

We also tried MySQL as index storage for a third-party indexer (mnoGoSearch). The complex queries we had to run (specific to our case, admittedly) loaded MySQL heavily on the platform we use (IBM AIX on multi-proc). With Lucene, the same service is 100x more robust (and I'm being conservative).

From my point of view, I would use MySQL full-text search when it is combined with other SQL features (joins, aggregate functions). If you are thinking of creating documents from several sources and storing them in MySQL in order to search them with the MySQL MATCH function, I would say it will cost you the same development effort to use Lucene, with the same or better performance and better tuning possibilities.

We also use MySQL heavily for pure database searches and we are happy with the performance/price ratio, I promise :-)

Remy

--
E-mail : [EMAIL PROTECTED]
Kelkoo R&D Director (http://www.kelkoo.com/)
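To make the quoted parsing rule concrete, here is a rough Python sketch of it. It is not MySQL's actual parser: the stopword list is a tiny made-up sample, the lowercasing is an assumption, and `indexable_words` is a hypothetical name. It only mirrors the three things the documentation excerpt states: words are runs of letters, numbers, `'` and `_`, and stopwords or words of 3 characters or less are ignored.

```python
import re

# Assumption: a tiny illustrative stopword list, not MySQL's real one.
STOPWORDS = {"this", "that", "with", "from", "have"}

# "3 characters or less" is ignored, so the shortest indexed word has 4 characters.
MIN_WORD_LENGTH = 4

# A word is any run of letters, numbers, apostrophes and underscores.
WORD_PATTERN = re.compile(r"[A-Za-z0-9'_]+")


def indexable_words(text):
    """Return the words that would survive the rule quoted above (lowercased by assumption)."""
    words = (w.lower() for w in WORD_PATTERN.findall(text))
    return [w for w in words if len(w) >= MIN_WORD_LENGTH and w not in STOPWORDS]


sample = "This is the XML API for user_id lookups, don't panic"
kept = indexable_words(sample)
```

In this sample, short but meaningful terms such as "XML" and "API" are dropped, which is the kind of behavior an analysis phase you control lets you change.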
Re: lucene vs mysql
I have tried/used both. Full-text retrieval in MySQL is simple and works well for simple IR usage. If you want to use more advanced features, use Lucene.

Ronald

----- Original Message -----
From: Cédric Grun [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Friday, October 11, 2002 9:03 PM
Subject: lucene vs mysql

I'm currently using MySQL to store files that I have previously indexed with Lucene. I've seen that MySQL has a new full-text search feature. I'd like to know which is better: MySQL full-text search or Lucene search.

I'm also interested to know how Lucene locks the index. Can I index several files simultaneously using threads, or must I wait until one file is indexed before indexing the second?

thanks
Re: lucene vs mysql
MySQL - not sure. Let us know if you test both.
Locking - sequential.

Otis

--- Cédric Grun [EMAIL PROTECTED] wrote:

I'm currently using MySQL to store files that I have previously indexed with Lucene. I've seen that MySQL has a new full-text search feature. I'd like to know which is better: MySQL full-text search or Lucene search.

I'm also interested to know how Lucene locks the index. Can I index several files simultaneously using threads, or must I wait until one file is indexed before indexing the second?

thanks