I think at this point, after working on it for the past few hours, I can safely say that it is NOT going into an endless loop. It's just dying at a certain point. Interestingly, it always seems to die at the same point. When indexing a particular site, I noticed that it was dying on a certain URL. I also noticed that it was trying to index some .jpg and .gif files. I put in a filter so it wouldn't do anything with those binary files, and then reran the app. Now, with far fewer files to go through and hence a lot less work to do, it still died on the same URL as before. I then cleared the database, put in that specific URL, and started there. It indexed that fine and did what it was supposed to do. No matter what, though, if I start at the root of the site, or anywhere else, it dies right there.

I've also noticed that when it gets to the point where it's dying, the hard drive does a massive write. Interestingly, this happens on the boot drive, and not on the RAID array where the database, PHP, and the web server live. I've got a fairly sizable swap file on both drives, and 1.5 GB of memory. I can't imagine it's a memory problem, but you never know.
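
For anyone curious about the binary-file filter mentioned above, here is a rough sketch of the idea. It is written in Python for brevity rather than the app's PHP, and the function name `should_index` and the extension list are assumptions for illustration only, not the actual code:

```python
from urllib.parse import urlparse
import posixpath

# Assumed list: only .jpg and .gif are mentioned in this thread.
BINARY_EXTENSIONS = {".jpg", ".gif"}

def should_index(url):
    """Return False when the URL path ends in a known binary extension."""
    # Use only the path so a query string such as ?x=1 does not hide the extension.
    path = urlparse(url).path
    extension = posixpath.splitext(path)[1].lower()
    return extension not in BINARY_EXTENSIONS
```

A check like this only looks at the URL itself. A spider could also inspect the Content-Type header of the response, which would catch binary files served from URLs without a telltale extension.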

I believe I have the right conditions specified, but I do plan to go and review all of that both in the app and in the server environment. As for the status of the code, I'm not sure yet. I need to make something from this, and I haven't quite figured out yet how people make money from open source without charging a fortune for support. I'd rather charge less up front and support it for free, but we'll see what happens.


Matthew Moldvan wrote:

Even if the system is working correctly the first couple times, it may go
into an endless loop if you do not specify the right conditions, for any
programming application ...

I am very curious about this project ... is it open source?  If so, I'd be
interested in taking a look at how you implemented it.

Matthew Moldvan.

System Administrator,
Trilogy International, Inc.

-----Original Message-----
From: Nicholas Fitzgerald [mailto:[EMAIL PROTECTED]
Sent: Wednesday, March 12, 2003 7:58 AM
Subject: Re: [PHP-DB] Re: Real Killer App!

Well, I'm not locking them out exactly, but for good reason. When a URL is first submitted, it goes into the database with a checksum value of 0 and a date of 0000-00-00. If the checksum is 0, the spider processes that URL and updates the record with the proper info. If the checksum is not 0, it checks the date. If the date for reindexing has passed, it goes ahead and updates the record. It also checks against the checksum to see if the URL has changed, in which case it updates.
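
To make that decision logic concrete, here is a minimal sketch of how I read it. It is in Python for brevity rather than PHP, and the function and parameter names are hypothetical. It also assumes the checksum comparison is an independent trigger, not one that only runs once the reindex date has passed:

```python
from datetime import date

def needs_processing(stored_checksum, reindex_date, current_checksum, today):
    """Decide whether the spider should (re)process a URL record."""
    # A checksum of 0 marks a newly submitted URL that has never been indexed.
    if stored_checksum == 0:
        return True
    # An already-indexed URL is refreshed once its reindex date has passed.
    if today > reindex_date:
        return True
    # Assumption: a changed checksum also triggers an update on its own.
    return current_checksum != stored_checksum
```

Under this reading, a URL that has been indexed, is not yet due, and has an unchanged checksum is skipped, which is what keeps the spider from revisiting the same pages on every pass.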

It does look like it's going into an endless loop, but the strange thing is that it goes through the loop successfully a couple of times first. That's what's got me confused.


Nelson Goforth wrote:

Do you "lock out" the URLs that have already been indexed? I'm wondering if your system is going into an endless loop?
