Re: [PHP-DB] Working with large datasets
-- From: "Jack van Zanen"
Sent: Tuesday, October 11, 2011 12:09 PM
To: "Jason Pruim"
Cc: "Thompson, Jimi" ; "Bastien" ;
Subject: Re: [PHP-DB] Working with large datasets

Hi,

You need to index the right fields. Even on a laptop, a select from 8 million rows with two rows returned should take a few seconds at most. The first time you run the query the data has to come from disk; the second time you run the same query you'd expect that data to sit in cache and be very quick.

Jack van Zanen

-- PHP Database Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP-DB] Working with large datasets
Exactly. That was my first guess: that his command-line request first has to download 8M records, which can take a long time. The OP's fear of "overhead from apache..." is not only unfounded; simply running the query on the server and avoiding the download would most definitely improve his response time.
Re: [PHP-DB] Working with large datasets
Hi,

You need to index the right fields. Even on a laptop, a select from 8 million rows with two rows returned should take a few seconds at most. The first time you run the query the data has to come from disk; the second time you run the same query you'd expect that data to sit in cache and be very quick.

Jack van Zanen

On Tue, Oct 11, 2011 at 10:39 AM, Jason Pruim wrote:

> Hi Jimi,
>
> I've done it from the command line a few times, first time I run a simple:
> SELECT * FROM Main WHERE state="test"; it takes: 2 rows in set (1 min 44.20
> sec)
> after that initial run it's pretty responsive, usually around 0.01
> seconds.
>
> If I do a select based on the new-york state, which is all of my almost 9
> million records after the initial run, it returns it all VERY quickly:
> 25000 rows in set (0.02 sec)
>
> So commandline is running fine after the initial... I'm leaning towards
> either a problem with web server, PHP setup, or my PHP code... I've used the
> code many times before but never on such a large dataset...
>
> When I pull the pagination out completely It's pretty much the same
> result... fetching the info for new-york works just fine but not "test"
>
> One thing I am noticing right now though is the fact that when I switch
> over to using the test state Right now it's not displaying anything...
> Not even able to view the source...
>
> Okay... Enough rambling right now... Need to do some more checking before I
> can come to a conclusion :)
Re: [PHP-DB] Working with large datasets
Jason Pruim
li...@pruimphotography.com

On Oct 10, 2011, at 5:27 PM, Thompson, Jimi wrote:

> I really think that you should try running it from the command line and see
> what the issues are. Get both Apache and PHP out of the way. I've seen some
> PHP scripts use up all the file handles (OS limit) even on a 64-bit server
> when they start doing complex things with data sets.
>
> If it works OK without PHP/Apache then you can start looking at PHP and
> Apache.
>
> ISOLATE the issue, don't complicate it.
>
> My 2 cents,

Hi Jimi,

I've done it from the command line a few times. The first time I run a simple:

SELECT * FROM Main WHERE state="test";

it takes:

2 rows in set (1 min 44.20 sec)

After that initial run it's pretty responsive, usually around 0.01 seconds.

If I do a select based on the new-york state, which is all but two of my almost 9 million records, then after the initial run it returns VERY quickly:

25000 rows in set (0.02 sec)

So the command line is running fine after the initial run... I'm leaning towards a problem with either the web server, the PHP setup, or my PHP code... I've used the code many times before, but never on such a large dataset...

When I pull the pagination out completely it's pretty much the same result... fetching the info for new-york works just fine, but not "test".

One thing I am noticing right now, though, is that when I switch over to using the test state it's not displaying anything... I'm not even able to view the source...

Okay... Enough rambling right now... I need to do some more checking before I can come to a conclusion :)
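A slow first run followed by near-instant repeats is consistent with a full table scan whose data is afterwards served from cache. One way to check, sketched here on the assumption of a MySQL server and the `Main` table and `state` column named in the message above, is to ask the server for its query plan instead of timing the query:

```sql
-- Ask MySQL how it would execute the slow query, without running it.
-- `Main` and `state` are the names used in the thread; adjust as needed.
EXPLAIN SELECT * FROM Main WHERE state = 'test';

-- Reading the result:
--   type = ALL and key = NULL  -> every row is examined (full table scan)
--   type = ref and key = <name> -> the named index is used for the lookup
-- The `rows` column is the optimizer's estimate of how many rows it must read.
```

If the plan shows a full scan over roughly 8 million rows, the web server and PHP are unlikely to be the main cause of the delay.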
Re: [PHP-DB] Working with large datasets
On Oct 10, 2011, at 4:27 PM, Thompson, Jimi wrote:

> I really think that you should try running it from the command line and see
> what the issues are. Get both Apache and PHP out of the way. I've seen some
> PHP scripts use up all the file handles (OS limit) even on a 64-bit server
> when they start doing complex things with data sets.
>
> If it works OK without PHP/Apache then you can start looking at PHP and
> Apache.
>
> ISOLATE the issue, don't complicate it.
>
> My 2 cents,
>
> Jimi

Hi Jason,

I'd start with your max execution time and max memory first and go from there before you change much on your server. Maybe try an .htaccess file in the directory of the PHP file that makes the call, to allow only that file the extra memory and execution time? (I believe I am correct in thinking that you can specify max memory and max execution time per directory or even per file with an .htaccess; please correct me if I am wrong.)

I think that and the indexing of your database would help. After that I would go with retrieving sets of info at a time instead of retrieving the whole database at once. Or, like Bastien stated, narrowing it to certain date ranges. Then you can create pagination if you're displaying results.

HTH,

Karl DeSaulniers
Design Drumm
http://designdrumm.com
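The "sets of info at a time" idea above can be sketched in SQL roughly as follows. This assumes MySQL, the `Main` table with the `ID`, `phone`, `config` and `state` columns shown elsewhere in the thread, and a page size of 25; the literal `75` stands in for a hypothetical "last ID seen on the previous page".

```sql
-- Offset-based paging: skip the first 50 matching rows, return the next 25.
SELECT ID, phone, config, state
FROM Main
WHERE state = 'new-york'
ORDER BY ID
LIMIT 25 OFFSET 50;

-- Keyset paging: remember the last ID shown and continue after it.
-- 75 is a placeholder for that remembered value.
SELECT ID, phone, config, state
FROM Main
WHERE state = 'new-york'
  AND ID > 75
ORDER BY ID
LIMIT 25;
```

The offset form is the simplest to wire into page links, but the server still has to step over the skipped rows, so deep pages tend to get slower. The keyset form usually stays quick because it can start from the primary key value directly.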
RE: [PHP-DB] Working with large datasets
I really think that you should try running it from the command line and see what the issues are. Get both Apache and PHP out of the way. I've seen some PHP scripts use up all the file handles (OS limit) even on a 64-bit server when they start doing complex things with data sets.

If it works OK without PHP/Apache then you can start looking at PHP and Apache.

ISOLATE the issue, don't complicate it.

My 2 cents,

Jimi

From: Bastien [phps...@gmail.com]
Sent: Monday, October 10, 2011 4:19 PM
To: Jason Pruim
Cc: php-db@lists.php.net
Subject: Re: [PHP-DB] Working with large datasets

Assuming MySQL, what is the my.cnf set for? Check that you are using the large dataset one. By default it's usually a small one. That will give you more memory and sort space to work with the data.

We routinely handle 8-10 million records and it's not tough. The tricks are:

1: ensure enough sort space
2: ensure enough memory for large sets
3: ensure enough PHP memory for results
4: try to add additional filters to reduce the data sets.

A cardinality of two on a status will always return tons of records and you want to reduce that, maybe with a date range.

Bastien Koert
905-904-0334
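To see how lopsided the values in a column really are, something like the following could be run once. It is only a sketch, assuming MySQL and the `Main` table and `state` column from the thread; note that the first query itself reads the whole table, so it will be slow on 8 million rows.

```sql
-- Count rows per state value, largest group first.
SELECT state, COUNT(*) AS row_count
FROM Main
GROUP BY state
ORDER BY row_count DESC;

-- List the table's indexes; the Cardinality column is MySQL's estimate
-- of the number of distinct values in each indexed column.
SHOW INDEX FROM Main;
```

With the distribution described in the thread (all but two rows sharing one state), an index on that column would be expected to help a lookup for the rare value and to do little for the common one, which is where extra filters or paging come in.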
Re: [PHP-DB] Working with large datasets
On 2011-10-10, at 11:30 AM, Jason Pruim wrote:

> Hey everyone,
>
> I am working with a database that has close to 8 million records in it and it
> will be growing. I have a state field in the data, and I am attempting to
> test some queries on it; all but 2 records right now have the same state.
>
> My test info won't get pulled up... I believe it keeps timing out the
> connection.
>
> Is there any advice for working with large datasets? I'm wanting this to be
> able to load quickly.
>
> Thanks in advance!
>
> Jason Pruim
> li...@pruimphotography.com

Assuming MySQL, what is the my.cnf set for? Check that you are using the large dataset one. By default it's usually a small one. That will give you more memory and sort space to work with the data.

We routinely handle 8-10 million records and it's not tough. The tricks are:

1: ensure enough sort space
2: ensure enough memory for large sets
3: ensure enough PHP memory for results
4: try to add additional filters to reduce the data sets.

A cardinality of two on a status will always return tons of records and you want to reduce that, maybe with a date range.

Bastien Koert
905-904-0334
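The memory and sort-space settings mentioned above can be inspected from a MySQL prompt before editing any configuration file. This is a sketch only: which variables matter depends on the storage engine in use, which the thread does not state.

```sql
-- Per-connection buffer used for sorts (ORDER BY / GROUP BY).
SHOW VARIABLES LIKE 'sort_buffer_size';

-- Index cache for MyISAM tables.
SHOW VARIABLES LIKE 'key_buffer_size';

-- Data and index cache for InnoDB tables.
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- Shows which storage engine the table uses (Engine column);
-- `Main` is the table name used elsewhere in the thread.
SHOW TABLE STATUS LIKE 'Main';
```

Comparing these values with the machine's available RAM gives a rough idea of whether the server is still running with small default settings.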
RE: [PHP-DB] Working with large datasets
You have a field in your WHERE clause that isn't indexed - you need an index. Try something like this:

ALTER TABLE `Database`.`Table` ADD INDEX `state`(`state`);

Think about it - you're asking for the rows that have a certain value in the 'state' field. If you don't provide the database with an index, it has to do a full table scan to retrieve the results.

Toby

-----Original Message-----
From: Jason Pruim [mailto:pru...@gmail.com]
Sent: Monday, October 10, 2011 2:37 PM
To: Jim Giner
Cc: php-db@lists.php.net
Subject: Re: [PHP-DB] Working with large datasets

Right now, though, I only have 1 state entered to work with. I may need to just increase the max execution time as well... But it still runs too slowly... Even from the command line, searching for a simple:

SELECT * from Table WHERE state="test";

takes 56.96 seconds to search and returns only 2 records with 4 columns... Could this just be a hardware problem?

Here is the structure of the table I'm working with:

+--------+---------+------+-----+---------+----------------+
| Field  | Type    | Null | Key | Default | Extra          |
+--------+---------+------+-----+---------+----------------+
| ID     | int(11) | NO   | PRI | NULL    | auto_increment |
| phone  | text    | NO   | MUL | NULL    |                |
| config | text    | NO   |     | NULL    |                |
| state  | text    | YES  |     | NULL    |                |
+--------+---------+------+-----+---------+----------------+
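One caveat when applying the suggestion above to the table as posted: `state` is a `text` column, and MySQL requires an explicit prefix length when indexing TEXT or BLOB columns; without one the statement is rejected with error 1170. A hedged variant might look like the sketch below. The table name `Main` comes from a later message in the thread, while the index name `idx_state_prefix` and the prefix length of 32 are assumptions made for illustration.

```sql
-- Index only the first 32 characters of the TEXT column.
-- 32 is an assumed length that should comfortably cover state names.
ALTER TABLE Main ADD INDEX idx_state_prefix (state(32));

-- Confirm the index exists and see its Sub_part (prefix length).
SHOW INDEX FROM Main;
```

Building the index on several million rows takes a while and may block writes to the table while it runs, so it is probably best done during a quiet period.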
Re: [PHP-DB] Working with large datasets
When you say machine, do you mean the client that you're sitting at, or the machine that hosts the data? As for doing it through a web server, the amount of time Apache, et al., would consume is minuscule. The web interface would not be involved in the reading of the data or the processing, just the outputting of the results. :)
Re: [PHP-DB] Working with large datasets
Jason Pruim
pru...@gmail.com

On Oct 10, 2011, at 2:42 PM, Jim Giner wrote:

> I don't do command line stuff so I may not be right in my thinking. If you
> are running a php query from a client, does the query get executed on the
> database server, or does all the data have to come down to you to be
> queried?

When you do it from the command line it gets executed directly on the machine. When you do it from PHP (or something else) it gets executed first on the server and then piped down to you.

Basically, the command line should be faster by a long shot since it's being executed directly. But when you run from a web browser, you have to throw in the overhead for Apache, or whatever web server you're using.
Re: [PHP-DB] Working with large datasets
I don't do command line stuff so I may not be right in my thinking. If you are running a php query from a client, does the query get executed on the database server, or does all the data have to come down to you to be queried?
Re: [PHP-DB] Working with large datasets
Jason Pruim
pru...@gmail.com

On Oct 10, 2011, at 2:20 PM, Jim Giner wrote:

> I doubt that the State field is a primary index, or that it would be used as
> one, which means that it could be a secondary one. If it is - that would be
> a pretty long record itself and could be the problem therein. With
> virtually all the records tied to one secondary key it is mostly a worthless
> secondary index. I'd try removing it and seeing what happens.

Actually it will be in the end if they keep going with the site... It's a "Report who called you and why" type of site. So for SEO purposes I'm changing the links from:

index.php?phone=X&state=NY

to:

/new-york/XX

Right now, though, I only have 1 state entered to work with. I may need to just increase the max execution time as well... But it still runs too slowly... Even from the command line, searching for a simple:

SELECT * from Table WHERE state="test";

takes 56.96 seconds to search and returns only 2 records with 4 columns... Could this just be a hardware problem?

Here is the structure of the table I'm working with:

+--------+---------+------+-----+---------+----------------+
| Field  | Type    | Null | Key | Default | Extra          |
+--------+---------+------+-----+---------+----------------+
| ID     | int(11) | NO   | PRI | NULL    | auto_increment |
| phone  | text    | NO   | MUL | NULL    |                |
| config | text    | NO   |     | NULL    |                |
| state  | text    | YES  |     | NULL    |                |
+--------+---------+------+-----+---------+----------------+

I'm starting to lean more towards it being a problem with hardware though... I'm going to try and get the specs of the machine it's running on... (Not my host :))
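The structure above stores short values such as a state name in `text` columns. One option, sketched here purely as an assumption about what would suit this data, is to move `state` to a bounded `VARCHAR` so it can be indexed in full. The table name `Main` is taken from a later message; the length of 32 and the index name `idx_state` are hypothetical.

```sql
-- Check the longest value currently stored before shrinking the type.
SELECT MAX(CHAR_LENGTH(state)) AS longest_state FROM Main;

-- Only if the result above fits: change the column type.
-- Values longer than 32 characters would be truncated or rejected,
-- depending on the server's SQL mode.
ALTER TABLE Main MODIFY state VARCHAR(32) NULL;

-- A VARCHAR column can be indexed without a prefix length.
ALTER TABLE Main ADD INDEX idx_state (state);
```

Both ALTER statements rewrite or scan the whole table, so on roughly 8 million rows they are best tried on a copy of the data first.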
Re: [PHP-DB] Working with large datasets
"Toby Hart Dyke" wrote in message news:00da01cc8768$ca9e9200$5fdbb600$@hartdyke.com...

> It sounds as though you don't have an index on the right field. 8 million
> records should be no problem if you have the right indexes applied, and
> you're not trying to do anything too complicated.
>
> Toby

I doubt that the State field is a primary index, or that it would be used as one, which means that it could be a secondary one. If it is, that would be a pretty long record itself and could be the problem therein. With virtually all the records tied to one secondary key it is mostly a worthless secondary index. I'd try removing it and seeing what happens.
RE: [PHP-DB] Working with large datasets
It sounds as though you don't have an index on the right field. 8 million records should be no problem if you have the right indexes applied, and you're not trying to do anything too complicated.

Toby

-----Original Message-----
From: Jason Pruim [mailto:li...@pruimphotography.com]
Sent: Monday, October 10, 2011 11:30 AM
To: php-db@lists.php.net
Subject: [PHP-DB] Working with large datasets

Hey everyone,

I am working with a database that has close to 8 million records in it and it will be growing. I have a state field in the data, and I am attempting to test some queries on it; all but 2 records right now have the same state.

My test info won't get pulled up... I believe it keeps timing out the connection.

Is there any advice for working with large datasets? I'm wanting this to be able to load quickly.

Thanks in advance!

Jason Pruim
li...@pruimphotography.com
[PHP-DB] Working with large datasets
Hey everyone,

I am working with a database that has close to 8 million records in it and it will be growing. I have a state field in the data, and I am attempting to test some queries on it; all but 2 records right now have the same state.

My test info won't get pulled up... I believe it keeps timing out the connection.

Is there any advice for working with large datasets? I'm wanting this to be able to load quickly.

Thanks in advance!

Jason Pruim
li...@pruimphotography.com