I've done screenscrapes that took an hour. Mostly, it was because the target server was sloooooow. So I just wrote the page so that it would do X scrapes, then push some JS to redirect to the next batch. It also had the advantage of letting me restart in the middle if the process hung.
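
A minimal sketch of the batch-and-redirect idea described above, written in Python rather than CFML purely for illustration. The names `run_batch`, `render_page`, the `start` URL parameter, the page name `scrape.cfm`, and the batch size of 5 are all assumptions and do not come from the original page. The fetch step is passed in as a function so the sketch runs without a network.

```python
from urllib.parse import urlencode

BATCH_SIZE = 5  # the "X" in "X scrapes" per request; the value is an assumption


def run_batch(targets, start, fetch, batch_size=BATCH_SIZE):
    """Scrape targets[start:start + batch_size] and report where to resume.

    Returns (results, next_start); next_start is None when nothing is left.
    """
    batch = targets[start:start + batch_size]
    results = [fetch(url) for url in batch]
    next_start = start + len(batch)
    if next_start >= len(targets):
        next_start = None
    return results, next_start


def render_page(results, next_start, page_url="scrape.cfm"):
    """Build the HTML response: a progress line plus a JS redirect if work remains."""
    body = "<p>Scraped %d pages in this batch.</p>" % len(results)
    if next_start is None:
        return body + "<p>Done.</p>"
    # page_url is a hypothetical name for the page that calls itself again.
    next_url = page_url + "?" + urlencode({"start": next_start})
    return body + '<script>window.location.href = "%s";</script>' % next_url
```

Because the offset travels in the URL, each request stays short, and a hung run can be restarted by loading the page again with the last `start` value that completed.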
--Ben

Michael Dinowitz wrote:
> That's why I'm confused. I've done complex screen scraping on ebay and
> others and it never takes more than a few seconds per request, even big
> ones. Maybe I write tighter regex, but the point is that it should never
> take hours. In addition, most places that you'd want data from have one form
> or another of web services which can speed everything up even more.
> I just gave a presentation on web services at CFUNITED and the power of 7
> (and 6.1) allows access to just about everything that web services have (6.0
> didn't allow custom soap headers). I'd like to hear a lot more about the app
> and issues as this sounds like something that should not be any sort of
> 'weight' on a machine.
> And yes, async processing is the bomb!!! I gave a presentation on using it
> for user logging and it is so sweet that Macromedia should allow it for
> every version of CF, not just enterprise.
>
>>>I've worked on some big systems in the past and on some
>>>lousy hardware and I
>>>can say that:
>>>1. CF can handle 2 hour templates
>>>2. CF should never have to handle 2 hour templates
>>>3. With some tighter coding and caching, just about
>>>anything can be sped up.
>>
>>Well it sounded to me like he was describing "screen-scraping" their
>>web-pages, which is why I recommended the Asynch CFML gateway...
>>although any given piece of software can always be optimized for
>>performance, I'd expect the gains from optimizing a "screen-scrape"
>>routine wouldn't be much in comparison to the total time for the
>>process.
>>
>>s. isaac dealey 954.522.6080
>>new epoch : isn't it time for a change?
>>
>>add features without fixtures with
>>the onTap open source framework
>>
>>http://www.fusiontap.com
>>http://coldfusion.sys-con.com/author/4806Dealey.htm

Message: http://www.houseoffusion.com/lists.cfm/link=i:5:162809
