Re: [PHP] Searching on AlphaNumeric Content Only
- Original Message > From: Paul M Foster > To: php-general@lists.php.net > Sent: Friday, September 4, 2009 9:15:08 PM > Subject: Re: [PHP] Searching on AlphaNumeric Content Only > > On Fri, Sep 04, 2009 at 08:15:41PM -0400, Robert Cummings wrote: > > > Paul M Foster wrote: > >> On Fri, Sep 04, 2009 at 04:22:18PM -0400, Eddie Drapkin wrote: > >> > >>> On Fri, Sep 4, 2009 at 12:02 PM, Lupus > >>> Michaeliswrote: > >> > >>> if you're on shared hosting (but then again I am of the opinion > >>> that those who choose to run with shared hosting dig their own graves > >>> in more ways than one). > >> > >> Any time you or someone else would like to backstop me in setting up a > >> dedicated server on rackspace or somewhere else, for free or really > >> cheap, you let me know! Otherwise, those of us with less than complete > >> expertise in server setup are stuck with shared hosting. ;-} > > > > For a few hundred bucks a year you can get a VPS (Virtual Private > > Server), which gives you root access. once you get a couple of clients > > on it, it will pay for itself year after year. If you have clients that > > don't use much resources, you can put a few on and even make a profit on > > the hosting alone. > > Oh sure. Well aware of it. The problem is not finding a VPS or working > with root access or anything like that. The problem is expertise. POP3, > SMTP, SSH, HTTP, DNS, firewall security while still allowing access to > outward facing servers, etc. It's more expertise than most people have, > including me. The servers I run are internal and don't have to deal with > the rigors of the internet, and only serve a couple of people. Setting > up multiple domains under an Apache server is black magic to me, for > instance. And then there's backups, and > what-do-I-do-if-the-server-fails, etc. Sheesh. I stand corrected. You paid for peace of mind :D > > Paul > > -- > Paul M. Foster > > -- > PHP General Mailing List (http://www.php.net/) > To unsubscribe, visit: http://www.php.net/unsub.php -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] Searching on AlphaNumeric Content Only
On Fri, Sep 04, 2009 at 08:15:41PM -0400, Robert Cummings wrote: > Paul M Foster wrote: >> On Fri, Sep 04, 2009 at 04:22:18PM -0400, Eddie Drapkin wrote: >> >>> On Fri, Sep 4, 2009 at 12:02 PM, Lupus >>> Michaelis wrote: >> >>> if you're on shared hosting (but then again I am of the opinion >>> that those who choose to run with shared hosting dig their own graves >>> in more ways than one). >> >> Any time you or someone else would like to backstop me in setting up a >> dedicated server on rackspace or somewhere else, for free or really >> cheap, you let me know! Otherwise, those of us with less than complete >> expertise in server setup are stuck with shared hosting. ;-} > > For a few hundred bucks a year you can get a VPS (Virtual Private > Server), which gives you root access. once you get a couple of clients > on it, it will pay for itself year after year. If you have clients that > don't use much resources, you can put a few on and even make a profit on > the hosting alone. Oh sure. Well aware of it. The problem is not finding a VPS or working with root access or anything like that. The problem is expertise. POP3, SMTP, SSH, HTTP, DNS, firewall security while still allowing access to outward facing servers, etc. It's more expertise than most people have, including me. The servers I run are internal and don't have to deal with the rigors of the internet, and only serve a couple of people. Setting up multiple domains under an Apache server is black magic to me, for instance. And then there's backups, and what-do-I-do-if-the-server-fails, etc. Sheesh. Paul -- Paul M. Foster -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] Searching on AlphaNumeric Content Only
Paul M Foster wrote: On Fri, Sep 04, 2009 at 04:22:18PM -0400, Eddie Drapkin wrote: On Fri, Sep 4, 2009 at 12:02 PM, Lupus Michaelis wrote: if you're on shared hosting (but then again I am of the opinion that those who choose to run with shared hosting dig their own graves in more ways than one). Any time you or someone else would like to backstop me in setting up a dedicated server on rackspace or somewhere else, for free or really cheap, you let me know! Otherwise, those of us with less than complete expertise in server setup are stuck with shared hosting. ;-} For a few hundred bucks a year you can get a VPS (Virtual Private Server), which gives you root access. once you get a couple of clients on it, it will pay for itself year after year. If you have clients that don't use much resources, you can put a few on and even make a profit on the hosting alone. Cheers, Rob. -- http://www.interjinn.com Application and Templating Framework for PHP -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] Searching on AlphaNumeric Content Only
On Fri, Sep 04, 2009 at 04:22:18PM -0400, Eddie Drapkin wrote: > On Fri, Sep 4, 2009 at 12:02 PM, Lupus > Michaelis wrote: > if you're on shared hosting (but then again I am of the opinion > that those who choose to run with shared hosting dig their own graves > in more ways than one). Any time you or someone else would like to backstop me in setting up a dedicated server on rackspace or somewhere else, for free or really cheap, you let me know! Otherwise, those of us with less than complete expertise in server setup are stuck with shared hosting. ;-} Paul -- Paul M. Foster -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] Searching on AlphaNumeric Content Only
- Original Message > From: Robert Cummings > To: Eddie Drapkin > Cc: Lupus Michaelis ; php-general@lists.php.net > Sent: Friday, September 4, 2009 1:36:08 PM > Subject: Re: [PHP] Searching on AlphaNumeric Content Only > > Eddie Drapkin wrote: > > On Fri, Sep 4, 2009 at 12:02 PM, Lupus > > Michaeliswrote: > >> Ashley Sheridan a écrit : > >> > >>> You'll have far greater performance issues if you retrieve all those > >>> records and attempt to do the same thing inside of PHP... > >> It's why I speak about « avoiding » and not « bannishing ». Like can be > >> usefull, I used to use it. But it is not the a good answer to all problems. > >> The problem with like operator is it can't use the index (or in a very > >> limited way). So I try to warn about it. > >> > >> So said, I never submit an all-retrieving method. I know it isn't the > >> solution too. > >> > >> -- > >> Mickaël Wolff aka Lupus Michaelis > >> http://lupusmic.org > >> > >> -- > >> PHP General Mailing List (http://www.php.net/) > >> To unsubscribe, visit: http://www.php.net/unsub.php > >> > >> > > > > So far, in this thread, there've been a few solutions: > > 1) LIKE in SQL. > > 2) REGEXP in SQL. > > 3) PCRE in PHP > > 4) Other fetch all methods in PHP. > > > > The one thing that I'm seeing as a consistent agreement is that the > > performance hit of whichever of the aforementioned measures is going > > to be enough to be considering something else. I briefly mentioned - > > I apologize for the brevity of that email because I was in a hurry - > > that a legitimate full text search engine is the right solution to > > this problem. The only problem with deploying a full text search > > engine is going to be the difficulty in the deployment and perhaps > > issues if you're on shared hosting (but then again I am of the opinion > > that those who choose to run with shared hosting dig their own graves > > in more ways than one). > > > > What a full text search engine gives you is flexibility in your > > searches, such that the initial question, when I read it, I thought > > "Oh, someone will tell him to use Sphinx or Solr as both have special > > filters for word seperation and would handle this without any special > > instruction." Instead, this is never even brought up! > > > > Why was using a full text search engine to do this sort of thing - not > > to mention the other benefits that it would bring (responsiveness and > > flexibility in searching, speed, decreased use of MySQL, etc. etc.) - > > rejected so offhandedly? I can't actually think of a better way to do > > this without requiring a whole heap of overhead, either processing or > > programming. > > I've just sort of stopped in on this thread... but why isn't MySQL's > FULLTEXT engine being considered? If I remember correctly, it's only on available on MyISAM table type. I don't think the OP mentioned that he's using MyISAM or MySQL for that matter. > > Cheers, > Rob. > -- > http://www.interjinn.com > Application and Templating Framework for PHP > > -- > PHP General Mailing List (http://www.php.net/) > To unsubscribe, visit: http://www.php.net/unsub.php -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] Searching on AlphaNumeric Content Only
Eddie Drapkin wrote: On Fri, Sep 4, 2009 at 12:02 PM, Lupus Michaelis wrote: Ashley Sheridan a écrit : You'll have far greater performance issues if you retrieve all those records and attempt to do the same thing inside of PHP... It's why I speak about « avoiding » and not « bannishing ». Like can be usefull, I used to use it. But it is not the a good answer to all problems. The problem with like operator is it can't use the index (or in a very limited way). So I try to warn about it. So said, I never submit an all-retrieving method. I know it isn't the solution too. -- Mickaël Wolff aka Lupus Michaelis http://lupusmic.org -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php So far, in this thread, there've been a few solutions: 1) LIKE in SQL. 2) REGEXP in SQL. 3) PCRE in PHP 4) Other fetch all methods in PHP. The one thing that I'm seeing as a consistent agreement is that the performance hit of whichever of the aforementioned measures is going to be enough to be considering something else. I briefly mentioned - I apologize for the brevity of that email because I was in a hurry - that a legitimate full text search engine is the right solution to this problem. The only problem with deploying a full text search engine is going to be the difficulty in the deployment and perhaps issues if you're on shared hosting (but then again I am of the opinion that those who choose to run with shared hosting dig their own graves in more ways than one). What a full text search engine gives you is flexibility in your searches, such that the initial question, when I read it, I thought "Oh, someone will tell him to use Sphinx or Solr as both have special filters for word seperation and would handle this without any special instruction." Instead, this is never even brought up! Why was using a full text search engine to do this sort of thing - not to mention the other benefits that it would bring (responsiveness and flexibility in searching, speed, decreased use of MySQL, etc. etc.) - rejected so offhandedly? I can't actually think of a better way to do this without requiring a whole heap of overhead, either processing or programming. I've just sort of stopped in on this thread... but why isn't MySQL's FULLTEXT engine being considered? Cheers, Rob. -- http://www.interjinn.com Application and Templating Framework for PHP -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] Searching on AlphaNumeric Content Only
On Fri, Sep 4, 2009 at 12:02 PM, Lupus Michaelis wrote: > Ashley Sheridan a écrit : > >> You'll have far greater performance issues if you retrieve all those >> records and attempt to do the same thing inside of PHP... > > It's why I speak about « avoiding » and not « bannishing ». Like can be > usefull, I used to use it. But it is not the a good answer to all problems. > The problem with like operator is it can't use the index (or in a very > limited way). So I try to warn about it. > > So said, I never submit an all-retrieving method. I know it isn't the > solution too. > > -- > Mickaël Wolff aka Lupus Michaelis > http://lupusmic.org > > -- > PHP General Mailing List (http://www.php.net/) > To unsubscribe, visit: http://www.php.net/unsub.php > > So far, in this thread, there've been a few solutions: 1) LIKE in SQL. 2) REGEXP in SQL. 3) PCRE in PHP 4) Other fetch all methods in PHP. The one thing that I'm seeing as a consistent agreement is that the performance hit of whichever of the aforementioned measures is going to be enough to be considering something else. I briefly mentioned - I apologize for the brevity of that email because I was in a hurry - that a legitimate full text search engine is the right solution to this problem. The only problem with deploying a full text search engine is going to be the difficulty in the deployment and perhaps issues if you're on shared hosting (but then again I am of the opinion that those who choose to run with shared hosting dig their own graves in more ways than one). What a full text search engine gives you is flexibility in your searches, such that the initial question, when I read it, I thought "Oh, someone will tell him to use Sphinx or Solr as both have special filters for word seperation and would handle this without any special instruction." Instead, this is never even brought up! Why was using a full text search engine to do this sort of thing - not to mention the other benefits that it would bring (responsiveness and flexibility in searching, speed, decreased use of MySQL, etc. etc.) - rejected so offhandedly? I can't actually think of a better way to do this without requiring a whole heap of overhead, either processing or programming. -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] Searching on AlphaNumeric Content Only
Ashley Sheridan a écrit : You'll have far greater performance issues if you retrieve all those records and attempt to do the same thing inside of PHP... It's why I speak about « avoiding » and not « bannishing ». Like can be usefull, I used to use it. But it is not the a good answer to all problems. The problem with like operator is it can't use the index (or in a very limited way). So I try to warn about it. So said, I never submit an all-retrieving method. I know it isn't the solution too. -- Mickaël Wolff aka Lupus Michaelis http://lupusmic.org -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] Searching on AlphaNumeric Content Only
On Fri, 2009-09-04 at 17:00 +0200, Lupus Michaelis wrote: > Ashley Sheridan a écrit : > > What's wrong with using the wildcards that are built into most SQL > > variants? > >Performance issues. Like is an operator to avoid when possible. > > -- > Mickaël Wolff aka Lupus Michaelis > http://lupusmic.org > You'll have far greater performance issues if you retrieve all those records and attempt to do the same thing inside of PHP... Thanks, Ash http://www.ashleysheridan.co.uk -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] Searching on AlphaNumeric Content Only
Ashley Sheridan a écrit : What's wrong with using the wildcards that are built into most SQL variants? Performance issues. Like is an operator to avoid when possible. -- Mickaël Wolff aka Lupus Michaelis http://lupusmic.org -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] Searching on AlphaNumeric Content Only
> > stripping, stemming, spelling corrections ? > ... uhm, that's probably why they invented regular expressions, isn't it? > > As I said, at the end of the day, this will be a manual slow, potentially > wrong implementation of what we already have and use on daily basis. If you've got a regular-expression-based method in mind that simply nails the OP's problem, please share. I'm still not seeing how "regular expressions" is a sufficient answer to the OP's problem, which is basically fuzzy search. My sense is that regular expressions are for situations where you basically know just what you're searching for, but don't really know where it falls in your search space. The OP, on the other hand, is building a system where he won't know just what he's searching for -- all he'll know is that his search key is "sort of like" the thing he actually needs to find. You might be able to squeeze this problem, or at least some part of it, into a regex-based solution, but I don't think it's a natural fit. Ben -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
RE: [PHP] Searching on AlphaNumeric Content Only
stripping, stemming, spelling corrections ? ... uhm, that's probably why they invented regular expressions, isn't it? As I said, at the end of the day, this will be a manual slow, potentially wrong implementation of what we already have and use on daily basis. But obviously, everybody is free to create his own problems, no doubts about that. Regards > Has anyone considered deploying an actual search engine (Solr, Sphinx, > etc.), as they will take care of the stripping, stemming, spelling > corrections, etc? _ With Windows Live, you can organize, edit, and share your photos. http://www.microsoft.com/middleeast/windows/windowslive/products/photo-gallery-edit.aspx
Re: [PHP] Searching on AlphaNumeric Content Only
On Thu, Sep 03, 2009 at 12:12:40PM -0700, sono...@fannullone.us wrote: > Thanks to everyone who has responded. After reading everyone's > response, I think I have a very simple way to solve my "problem". > > Using my original example, if someone wants to find item # > 4D-2448-7PS, no matter what they type in, I'll take the input, strip > out all non-alphanumeric characters to make it 4D24487PS, add the > wildcard character between each of the remaining characters like so, > 4*D*2*4*4*8*7*P*S, and then do the search. Your expression, if used to directly search in your SQL table, won't work. The '*' character isn't a valid wildcard for SQL. In PostgreSQL, the wildcard for any number of characters is '%', and for a single character is '_'. I don't know that MySQL understands this same convention. And who knows about Oracle. As others have mentioned, it would be ideal (though not very "normalized") to create a new table column which contains the alphanumerics without the punctuation characters ('-'). In nearly any SQL dialect, you could do a simple SELECT using LIKE to find your item, if you're searching on this extra field. If you want do the searching in PHP, then it becomes more complicated. You'll have to strip out the dashes from the user input, and then query all the keys from your table, and test them using a regular expression. As mentioned before, this is time-consuming for a large table. Here's something else to consider: Could there ever be two items which only differ by the placement of their dashes? Like 4D-2448-7PS versus 4D2-44-87PS? If not, then you should store the item number without punctuation, and use that as the primary key on your table. Have an "extra" field which shows the item number with dashes. You can use this extra field in printing inventory labels or whatever (I don't recall the context of your original post). Paul -- Paul M. Foster -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] Searching on AlphaNumeric Content Only
- Original Message > From: "sono...@fannullone.us" > To: PHP General List > Sent: Thursday, September 3, 2009 12:12:40 PM > Subject: Re: [PHP] Searching on AlphaNumeric Content Only > > Thanks to everyone who has responded. After reading everyone's response, > I > think I have a very simple way to solve my "problem". > > Using my original example, if someone wants to find item # 4D-2448-7PS, > no > matter what they type in, I'll take the input, strip out all non-alphanumeric > characters to make it 4D24487PS, add the wildcard character between each of > the > remaining characters like so, 4*D*2*4*4*8*7*P*S, and then do the search. The correct wildcard syntax to work in any DB (Oracle, MySQL, MSSQL, etc), is % and not * if I remember correctly. Searching like this is ok but won't be efficient when you have a lot of rows. As for external file processing txt, csv, etc... I recommend you create a separate mechanism for it since each storage medium is meant for different purposes. txt (both delimited and fix formatted) and csv are usually meant for importing/exporting between various RDBMS types and different companies. They're not mean for fast searching of data. I suggest you think about the amount of the data you have to deal with 1st and how often will the search be done on that data. It's probably easier and faster just to import the ascii into db and do you search on db if you have to work with any ascii. As for adding another field to the db, perhaps your project just started? If so, wouldn't it be better to do it with the future in mind so later you won't have to go back and redesign the db and modify the codes because now you have over 100k rows to search and the search occurs just about every other hits? That time you now have could be used for code optimizing for better performance, add more features/functionalities to the site, etc... :) Trust me, searching the db table with over 200k rows and return the results with multi-table joins based 1 criteria isn't fun. Keep in mind that you shouldn't keep the users waiting more than 5 seconds. Only exception to that rule is data mining where you'll have millions of rows to work with ;) Then it's no longer your problem. It's the DBA :D Regards, Tommy > > Still being new at this, it seems to be the simplest approach, or is my > thinking flawed? This also keeps me from having to add another field in the > db > to search on. > > BTW, this solution needs to work with any db, even ASCII files, so it has > to > happen in PHP. > > Thanks again, > Frank > > --PHP General Mailing List (http://www.php.net/) > To unsubscribe, visit: http://www.php.net/unsub.php -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] Searching on AlphaNumeric Content Only
On Thu, Sep 3, 2009 at 3:17 PM, Ashley Sheridan wrote: > On Thu, 2009-09-03 at 12:12 -0700, sono...@fannullone.us wrote: >> Thanks to everyone who has responded. After reading everyone's >> response, I think I have a very simple way to solve my "problem". >> >> Using my original example, if someone wants to find item # >> 4D-2448-7PS, no matter what they type in, I'll take the input, strip >> out all non-alphanumeric characters to make it 4D24487PS, add the >> wildcard character between each of the remaining characters like so, >> 4*D*2*4*4*8*7*P*S, and then do the search. >> >> Still being new at this, it seems to be the simplest approach, or is >> my thinking flawed? This also keeps me from having to add another >> field in the db to search on. >> >> BTW, this solution needs to work with any db, even ASCII files, so it >> has to happen in PHP. >> >> Thanks again, >> Frank >> > For speed you might want to consider an extra field in the DB in the > future. If the database gets larger, or your query needs to join several > tables together, then things will take a noticeable speed hit. I had a > similar issue myself where I had to search for names based on > mis-spellings of them. In the end I searched with metaphone tags on an > extra field in the DB set up for that purpose, but it was the only way > to do it that didn't affect the speed of the site. > > Thanks, > Ash > http://www.ashleysheridan.co.uk > > > > > -- > PHP General Mailing List (http://www.php.net/) > To unsubscribe, visit: http://www.php.net/unsub.php > > Has anyone considered deploying an actual search engine (Solr, Sphinx, etc.), as they will take care of the stripping, stemming, spelling corrections, etc? -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] Searching on AlphaNumeric Content Only
On Thu, 2009-09-03 at 12:12 -0700, sono...@fannullone.us wrote: > Thanks to everyone who has responded. After reading everyone's > response, I think I have a very simple way to solve my "problem". > > Using my original example, if someone wants to find item # > 4D-2448-7PS, no matter what they type in, I'll take the input, strip > out all non-alphanumeric characters to make it 4D24487PS, add the > wildcard character between each of the remaining characters like so, > 4*D*2*4*4*8*7*P*S, and then do the search. > > Still being new at this, it seems to be the simplest approach, or is > my thinking flawed? This also keeps me from having to add another > field in the db to search on. > > BTW, this solution needs to work with any db, even ASCII files, so it > has to happen in PHP. > > Thanks again, > Frank > For speed you might want to consider an extra field in the DB in the future. If the database gets larger, or your query needs to join several tables together, then things will take a noticeable speed hit. I had a similar issue myself where I had to search for names based on mis-spellings of them. In the end I searched with metaphone tags on an extra field in the DB set up for that purpose, but it was the only way to do it that didn't affect the speed of the site. Thanks, Ash http://www.ashleysheridan.co.uk -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] Searching on AlphaNumeric Content Only
Thanks to everyone who has responded. After reading everyone's response, I think I have a very simple way to solve my "problem". Using my original example, if someone wants to find item # 4D-2448-7PS, no matter what they type in, I'll take the input, strip out all non-alphanumeric characters to make it 4D24487PS, add the wildcard character between each of the remaining characters like so, 4*D*2*4*4*8*7*P*S, and then do the search. Still being new at this, it seems to be the simplest approach, or is my thinking flawed? This also keeps me from having to add another field in the db to search on. BTW, this solution needs to work with any db, even ASCII files, so it has to happen in PHP. Thanks again, Frank -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
RE: [PHP] Searching on AlphaNumeric Content Only
> Even if the REGEXP has to change with every query? Ben, it does not matter, this is not a PHP problem but a DB structure/select/insert/update problem. Whatever REGEXP you use, a REGEXP is what you need to solve this problem, certainly not a PHP loop over each row with operations for each rows. These things are OK if you do not know REGEXP or REGEXP MySQL syntax, but in this case you should ask for the correct REGEXP rather than talk about performances, obviously slower outside MySQL and via a runtime interpreted language as PHP is, or other solutions which aim is to end up with something that just emulate a select with REGEXP. Did you get my point? Finally, when I say "you" I mean generally speaking :) Regards _ With Windows Live, you can organize, edit, and share your photos. http://www.microsoft.com/middleeast/windows/windowslive/products/photo-gallery-edit.aspx
Re: [PHP] Searching on AlphaNumeric Content Only
> What's wrong with using the wildcards that are built into most SQL > variants? > > SELECT * FROM table WHERE item_id LIKE '%#abcdef' > > Will select all records where the item_id field ends in '#abcdef' That works if you know the user is always going to enter the last 7 characters of the product id, but that's not how the OP characterized the problem. The OP talked about search strings where multiple characters had been omitted from different parts of the product id. Ben -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] Searching on AlphaNumeric Content Only
> Excuse me? Somebody suggested a PHP loop to solve a query problem and you are > saying that REGEXP should not be used? > MySQL caches queries and 100 SELECT with a REGEXP will cost zero after the > first one if nothing changed inside the table. Even if the REGEXP has to change with every query? Performance aside, I think REGEXP() could be used here, but not in the way you've suggested. As the OP has described his table, your regex ("^[a-zA-Z0-9]+$") won't match any rows, because all of his product IDs have non-alphanumeric characters in them. Suppose this table: pk | prod_id 1 | 07-ABCD-98 2 | 98-ZCXQ-21 And now suppose the OP's scenario, where a user tries to search on product id, but enters "07ABCD98". If the aim is to use REGEXP() to return row 1, I suppose you could intersperse the search string with ".?" sequences and end up with this query: SELECT * FROM table WHERE prod_id REGEXP '^0.?7.?A.?B.?C.?D.?9.?8$' I think just stripping the alphanumeric characters would end up being more flexible, though. -Ben -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
RE: [PHP] Searching on AlphaNumeric Content Only
> Indeed you could do it via a regexp, but that uses up quite some memory. > Every time you do a SELECT. You can simply add a table column with the > stripped value and let the table update itself (with an ON UPDATE ON > INSERT trigger, which takes the input value for the itemID and strips it > once). > > When doing this on inputting the value into the database, you save > yourself the pain (and performance) of doing it on every SELECT-query. Excuse me? Somebody suggested a PHP loop to solve a query problem and you are saying that REGEXP should not be used? MySQL caches queries and 100 SELECT with a REGEXP will cost zero after the first one if nothing changed inside the table. At the same time an internal REGEXP is faster than everything else has to move out and be parsed after via, probably, the same REGEXP engine. Try some bench. This problem, imho, is a non-problem, at least not a PHP problem. How MySQL optimizes internally REGEXPs is not PHP problem as well. It's like to create a loop to read byte after byte because file_get_contents could be memory greedy (if you do that with 1 Gb of file you are you doing wrong in any case, logs need to be split as example) or avoid MATCH AGAINST in query if we have too many rows because of performances problem (table could be slipt as well to optimize performances) ... and these practices to avoid native solutions are a bit hilarious, imho. Regards _ Drag n’ drop—Get easy photo sharing with Windows Live™ Photos. http://www.microsoft.com/windows/windowslive/products/photos.aspx
RE: [PHP] Searching on AlphaNumeric Content Only
Which DB? If it is MySQL, as example, you can simply use REGEXP syntax "^[a-zA-Z0-9]+$" via SELECT Regards > From: sono...@fannullone.us > To: php-general@lists.php.net > Date: Wed, 2 Sep 2009 20:47:15 -0700 > Subject: [PHP] Searching on AlphaNumeric Content Only > > Is there is a way to search only for the alphanumeric content of > field in a db? I have an itemID field that contains item #'s that > include dashes, forward slashes, etc, and I want people to be able to > search for an item # even if they don't enter the punctuation exactly. > > Here's an example: let's say there is an itemID of 4D-2448-7PS but > someone omits the dashes and searches on 4D24487PS. Is it possible in > PHP to have the find be successful, even if the search criteria > doesn't exactly match what's stored in the field? > > If this is possible, I'd appreciate it if someone could just point me > in the right direction so I can read up on it. > > Thanks, > Frank > > -- > PHP General Mailing List (http://www.php.net/) > To unsubscribe, visit: http://www.php.net/unsub.php > _ Drag n’ drop—Get easy photo sharing with Windows Live™ Photos. http://www.microsoft.com/windows/windowslive/products/photos.aspx
Re: [PHP] Searching on AlphaNumeric Content Only
On Wed, 2009-09-02 at 21:30 -0700, Ben Dunlap wrote: > >Is there is a way to search only for the alphanumeric content of > > field in a db? I have an itemID field that contains item #'s that include > > dashes, forward slashes, etc, and I want people to be able to search for an > > item # even if they don't enter the punctuation exactly. > > Not sure if there's anything specifically PHP-ish that will help you > here, but I would be inclined to start by storing a stripped-down > version of the item ID (alphanumeric characters only) in a separate > column in the database table. > > Then, when a user enters some search data, I would remove > non-alphanumeric characters, if any, from the user's input, and then > search the stripped column with this normalized version of the input. > > If you want even fuzzier matching (inadvertent transpositions or an > omitted character or two OK, for example), you might read about > Levenshtein distance: > > http://en.wikipedia.org/wiki/Levenshtein_distance > > PHP has a levenshtein function but you'll have to figure out a way to > use it efficiently with your data set. Or, if Levenshtein isn't quite > right for your needs, the article above might at least point you in a > useful direction. > > Ben > What's wrong with using the wildcards that are built into most SQL variants? SELECT * FROM table WHERE item_id LIKE '%#abcdef' Will select all records where the item_id field ends in '#abcdef' Thanks, Ash http://www.ashleysheridan.co.uk -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] Searching on AlphaNumeric Content Only
> Is there is a way to search only for the alphanumeric content of > field in a db? I have an itemID field that contains item #'s that include > dashes, forward slashes, etc, and I want people to be able to search for an > item # even if they don't enter the punctuation exactly. Not sure if there's anything specifically PHP-ish that will help you here, but I would be inclined to start by storing a stripped-down version of the item ID (alphanumeric characters only) in a separate column in the database table. Then, when a user enters some search data, I would remove non-alphanumeric characters, if any, from the user's input, and then search the stripped column with this normalized version of the input. If you want even fuzzier matching (inadvertent transpositions or an omitted character or two OK, for example), you might read about Levenshtein distance: http://en.wikipedia.org/wiki/Levenshtein_distance PHP has a levenshtein function but you'll have to figure out a way to use it efficiently with your data set. Or, if Levenshtein isn't quite right for your needs, the article above might at least point you in a useful direction. Ben -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] Searching on AlphaNumeric Content Only
Hi, It's definitely possible to do when you do it in PHP, but not sure about on the database side. You could read all records into memory and then iterate over it with something like: $toSearch = "4D24487PS" $charsToIgnore = array('-','+',...); foreach ($items as $k=>$item) { $itemVal = str_replace($charsToIgnore, '', $item); if (strcmp(str_replace($charsToIgnore, '', $toSearch), $itemVal) == 0) { $return = $item; break; } } This however might use a lot of memory, but if your DB is a manageable size it should be ok. You can probably optimise it by iterating over a db result set instead of reading everything into an array. Cheers, Tim ++Tim Hinnerk Heuer++ http://www.ihostnz.com 2009/9/3 >Is there is a way to search only for the alphanumeric content of > field in a db? I have an itemID field that contains item #'s that include > dashes, forward slashes, etc, and I want people to be able to search for an > item # even if they don't enter the punctuation exactly. > >Here's an example: let's say there is an itemID of 4D-2448-7PS but > someone omits the dashes and searches on 4D24487PS. Is it possible in PHP > to have the find be successful, even if the search criteria doesn't exactly > match what's stored in the field? > >If this is possible, I'd appreciate it if someone could just point > me in the right direction so I can read up on it. > > Thanks, > Frank > > -- > PHP General Mailing List (http://www.php.net/) > To unsubscribe, visit: http://www.php.net/unsub.php > >
[PHP] Searching on AlphaNumeric Content Only
Is there is a way to search only for the alphanumeric content of field in a db? I have an itemID field that contains item #'s that include dashes, forward slashes, etc, and I want people to be able to search for an item # even if they don't enter the punctuation exactly. Here's an example: let's say there is an itemID of 4D-2448-7PS but someone omits the dashes and searches on 4D24487PS. Is it possible in PHP to have the find be successful, even if the search criteria doesn't exactly match what's stored in the field? If this is possible, I'd appreciate it if someone could just point me in the right direction so I can read up on it. Thanks, Frank -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php