> -----Original Message-----
> From: Meli Helmut [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, August 10, 2005 2:16 PM
> To: CF-Talk
> Subject: Re: Fetching a website?
> 
> >> I saw wget already. I know, why reinvent the wheel if
> >> it already exists... but I would like to write something like
> >> this on my own...
> >
> >I submit that CF is a spectacularly poor choice for doing this, though.
> >
> Could you also tell me please why CF is a poor choice to do this and which
> Script Language would you recommend??

The problem isn't one of capability, it's one of performance.

The main problem is that "plain" CF is a serialized, procedural language -
you can only do one thing at a time.  An app like the one you describe, to
be fast, really needs to have good control over threading and process
management (when a single web page might require several dozen HTTP
requests, doing them in serial just kills an app's performance).
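
To make the serial-versus-concurrent point concrete, here is a minimal
sketch in Python (not CF).  Everything in it is an assumption for
illustration: fake_fetch() is a hypothetical stand-in that just sleeps to
mimic network latency, the example.com URLs are placeholders, and the
worker count is arbitrary.  It is not a real fetcher, only a way to see
how the wall-clock time differs.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(url, latency=0.05):
    """Hypothetical stand-in for an HTTP GET: sleeps instead of using the network."""
    time.sleep(latency)
    return url, 200


def fetch_serial(urls):
    # One request at a time: total time is roughly len(urls) * latency.
    return [fake_fetch(u) for u in urls]


def fetch_concurrent(urls, workers=10):
    # Up to `workers` requests in flight at once; results keep input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fake_fetch, urls))


def timed(fn, urls):
    """Run fn(urls) and return (results, elapsed_seconds)."""
    start = time.perf_counter()
    results = fn(urls)
    return results, time.perf_counter() - start


if __name__ == "__main__":
    # Placeholder URLs standing in for the assets of a single page.
    urls = [f"http://example.com/asset{i}.gif" for i in range(20)]
    _, serial_s = timed(fetch_serial, urls)
    _, concurrent_s = timed(fetch_concurrent, urls)
    print(f"serial: {serial_s:.2f}s  concurrent: {concurrent_s:.2f}s")
```

With real requests the gap depends on the server and the network, but the
shape is the same: the serial version pays every round trip in turn, while
the pooled version overlaps them.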

Using the Asynch gateway to launch multiple processes at the same time
helps immensely, but each of these requests is a "full request" in CF - all
of the overhead related to any CF request is there.

A more "atomic" language (like C or Java) can launch threads in a very
efficient way that allows you to get your specific job done with as little
overhead as possible.

In general you're looking at a well-written CF application being orders of
magnitude slower than a well-written Java or C application.  (Although, of
course, a great CF app can still beat a crappy Java or C application.)

As for scripting languages... none of them is perfectly suited to this kind
of highly concurrent task.  Scripting languages are generally optimized for
ease of use, not for performance.

But you COULD build it in a lot of languages.  JavaScript or VBScript
(leveraging COM objects in Windows Script Host or an HTA container) could
do it.
 
Perl may be the best choice, but you'd really need to add on some of the
more esoteric extensions to get the request performance up.  Perl is very
good at parsing text, but it's not the parsing so much as the HTTP traffic
that worries me.  So Perl may not be significantly faster than CF on such a
job.

In the right container (something that isn't security-limited to a host
site, for example) pretty much any scripting language with the ability to do
HTTP and string manipulation could do it.  Python and Tcl come to mind.
Heck, you COULD do this in Lingo if you wanted.  ;^)
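
For the string-manipulation half of the job, here is a rough sketch in
Python of pulling out the extra URLs a page would need (images, scripts,
stylesheets) so they can be fetched afterwards.  The AssetCollector class,
the find_assets() helper and the tag/attribute table are names made up for
this example, and the table is deliberately incomplete; only the standard
library's html.parser and urllib.parse are used.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AssetCollector(HTMLParser):
    """Collects absolute URLs of resources referenced by a page (illustrative)."""

    # Assumed, incomplete mapping of tag -> attribute holding the resource URL.
    ATTRS = {"img": "src", "script": "src", "link": "href"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.assets = []

    def handle_starttag(self, tag, attrs):
        wanted = self.ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative references against the page's own URL.
                self.assets.append(urljoin(self.base_url, value))


def find_assets(html, base_url):
    """Return the resource URLs found in `html`, in document order."""
    collector = AssetCollector(base_url)
    collector.feed(html)
    return collector.assets
```

Feeding it a page's HTML and that page's URL gives back the list of
follow-up requests; plain <a href> links are ignored on purpose, since
those point to other pages rather than to parts of this one.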

Jim Davis