What would be really useful to me would be the ability to asynchronously
thread sections of my CF code. So, here's my recommendation. If I can get
some feedback from y'all, then I'll throw it at the development team at
Allaire/Macromedia and see what they think of it. I wish I'd thought to do
this a few months ago before the con, but ... whatever, moving on.
I propose a <CFASYNC> tag, and its sub-tag <CFTHREAD>. It's really simple
from a CFML point of view: An entire block of code is wrapped in a <CFASYNC>
tag, and any sub-blocks that can run independently are wrapped in <CFTHREAD>
tags. For example:
<CFASYNC>
  <CFTHREAD>
    <CFQUERY DATASOURCE="ds1" NAME="Data1">
    ...
    </CFQUERY>
  </CFTHREAD>
  <CFTHREAD>
    <CFQUERY DATASOURCE="ds2" NAME="Data2">
    ...
    </CFQUERY>
  </CFTHREAD>
</CFASYNC>
This example shows two queries, both of which could take a while, that could
run at once instead of sequentially. If each query took 5 seconds, the
current CF would produce the page in 10 seconds. With CFASYNC, it would take
about 5 seconds. The same could be done for simultaneous HTTP requests, etc.
Each thread executes simultaneously, but the page request doesn't move past
the closing CFASYNC tag until all the threads have completed. Optional
TIMEOUT="n" attributes in both tags (the one in CFASYNC overriding the ones
in the CFTHREADs) would probably be a good idea to kill off errant threads.
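Since the tags don't exist yet, here's a rough sketch of the behavior I have in mind, written in Python instead of CFML. The names slow_query and run_async are made up for illustration, the sleeps stand in for slow queries, and the timeout here only stops waiting on stragglers (it doesn't actually kill them, which a real CFASYNC would need to do).

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def slow_query(name, seconds):
    """Hypothetical stand-in for a slow CFQUERY or CFHTTP call."""
    time.sleep(seconds)
    return f"{name} rows"


def run_async(tasks, timeout=None):
    """Start every task at once; block until all finish or the timeout passes.

    tasks maps a result name to a (function, args) pair.
    Returns (results_by_name, names_still_running).
    """
    pool = ThreadPoolExecutor(max_workers=max(1, len(tasks)))
    futures = {pool.submit(fn, *args): name for name, (fn, args) in tasks.items()}
    done, not_done = wait(list(futures), timeout=timeout)
    # Assumption: stragglers are abandoned here, not killed.
    pool.shutdown(wait=False)
    results = {futures[f]: f.result() for f in done}
    return results, sorted(futures[f] for f in not_done)


start = time.perf_counter()
results, unfinished = run_async({
    "Data1": (slow_query, ("Data1", 0.5)),
    "Data2": (slow_query, ("Data2", 0.5)),
})
elapsed = time.perf_counter() - start  # close to one query's time, not the sum
```

The point is the closing line of run_async: nothing comes back to the caller until every task is done or the timeout has passed, which is what the closing CFASYNC tag would do.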
Okay, some background and explanation ...
I've been thinking about how to make a couple of our web apps better,
faster, stronger, etc., and the one thing that I just can't seem to program
my way around is our intranet home page. It uses a "portal"-style interface
where users can choose from a couple dozen different windows. Many of the
windows gather data from external sources and some are even real-time
displays of such external data. (User registration graphs, web hits,
Slashdot headlines; standard portal-style stuff.) Each window is
individually smart about its own retrieval and caching and knows how long
before it needs to refresh itself. (1 hour for Slashdot, 5 minutes for user
registration graphs, 1 minute for firewall statistics, etc.) I don't want
to set up a scheduled task for each window, as I don't want to tax our
incoming bandwidth any more than I have to. (There's no point in retrieving
Slashdot 24x a day if no one is in the office to see it.) So, each window
only refreshes its data when it absolutely has to.
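To make the caching behavior concrete, here's a minimal sketch in Python (not our actual CF code). The Window class and its parameter names are invented for illustration; the idea is just that each window remembers when it last fetched and only goes back out when someone asks for it and the data is older than its limit.

```python
import time


class Window:
    """One portal window that refreshes only on demand, and only when stale."""

    def __init__(self, name, max_age_seconds, fetch, clock=time.monotonic):
        self.name = name
        self.max_age_seconds = max_age_seconds
        self.fetch = fetch      # callable that goes out and gets fresh data
        self.clock = clock      # injectable so the sketch is easy to test
        self._data = None
        self._fetched_at = None

    def get(self):
        now = self.clock()
        stale = (
            self._fetched_at is None
            or now - self._fetched_at >= self.max_age_seconds
        )
        if stale:
            # Nobody asked recently enough, so pay for the retrieval now.
            self._data = self.fetch()
            self._fetched_at = now
        return self._data
```

With a limit of 3600 seconds, a Slashdot window built this way fetches once per hour at most, and not at all overnight when nobody calls get().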
So the problem is this: the first person in every morning inevitably has
quite a wait while everything goes out and reloads, especially if they have
more than a couple of the windows chosen. If any of the source sites is
slow then there is even the possibility of timing out the request. The
application handles this okay, but the user does not. I don't want to
arbitrarily start putting timeouts in my CFHTTP tags.
Some of the existing tags that don't allow you to specify a return variable
name and instead return into a global variable (like CFHTTP.FileContent and
CFQUERY.ExecutionTime) will need to be altered to include a VARIABLE
attribute to prevent re-entrancy problems. That is, if I have 3 threads
with CFHTTP requests which I'll be managing in-memory (instead of the FILE=
attribute), using CFHTTP.FileContent isn't reentrant. If two requests
finish at the exact same time, I've got no guarantee that the following
<CFSET foo=CFHTTP.FileContent> lines will be executed in the right order.
(Plus, it's pretty lame using global variables like that. Those attributes
should be implemented whether or not CFASYNC is.)
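Here's a small Python sketch of what a VARIABLE attribute buys you. The http_get function is a hypothetical stand-in for CFHTTP that returns fake content, and fetch_into and fetch_all are names I made up. Each thread writes its response into a slot the caller named, so there's no shared "last result" for two threads to fight over.

```python
import threading


def http_get(url):
    """Hypothetical stand-in for CFHTTP; returns fake content, no network."""
    return f"content of {url}"


def fetch_into(results, variable, url):
    """Store the response under a caller-chosen name (the VARIABLE idea)."""
    results[variable] = http_get(url)


def fetch_all(requests):
    """requests maps a variable name to a URL; returns name -> content."""
    results = {}
    threads = [
        threading.Thread(target=fetch_into, args=(results, variable, url))
        for variable, url in requests.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


pages = fetch_all({
    "foo": "http://a.example/",
    "bar": "http://b.example/",
})
```

No matter which request finishes first, pages["foo"] always holds the response for the first URL, which is exactly the guarantee a single CFHTTP.FileContent can't give.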
This also introduces the problem of locking for Page/Request-level
variables. Assuming that Macromedia/Allaire doesn't get their stuff
together and finally fix the memory corruption problem, this can be worked
around with adding SCOPE="REQUEST" to the CFLOCK tag. (Lame!)
Another consideration is this: where do all these threads come from? Do you
eat into the existing shared thread pool that is set in the CF Admin? Or do
you create a second shared thread pool? I'd actually advocate a second
shared thread pool. That way, if an administrator wanted to disable
threading s/he could just set the "In-Request Thread Pool Count" to zero,
which effectively single-threads the entire request. You're no worse off than
when you started. Also, in a hosting situation, you don't get unknown
developers eating into your primary thread pool using CFHTTP to download
pr0n from 15 different web sites at once.
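A rough Python sketch of the second-pool idea follows. InRequestPool and where_am_i are invented names, and the size argument plays the role of the "In-Request Thread Pool Count" setting. This is only one way it could work, assuming a count of zero means "run everything inline on the request's own thread".

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class InRequestPool:
    """A pool kept separate from the server's main request threads."""

    def __init__(self, size):
        # A size of zero means no extra threads at all.
        self._pool = ThreadPoolExecutor(max_workers=size) if size > 0 else None

    def run_all(self, tasks):
        """Run zero-argument callables and return their results in order."""
        if self._pool is None:
            # Threading disabled: same results, just one after another.
            return [task() for task in tasks]
        futures = [self._pool.submit(task) for task in tasks]
        return [f.result() for f in futures]


def where_am_i():
    """Report which thread ran the task."""
    return threading.current_thread().name


disabled = InRequestPool(0).run_all([where_am_i, where_am_i])
enabled = InRequestPool(2).run_all([where_am_i, where_am_i])
```

With the count at zero, every task reports the calling thread's name, so the page behaves the way it does today. With a count of two, the tasks run on the pool's own threads and the main pool is never touched.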
As for workarounds ... currently it would be astoundingly difficult. The
only way I can think to do it would be to:
1. Put each of my threads in a unique template file.
2. Code each thread to use Session/Application variables in place of Request
variables.
3. Write a CFX tag that spawns a process that calls the command line version
of CF with the thread template, and then returns immediately, not waiting
for CF to return output.
4. Repeat Step 3 for each thread.
5. Pause the calling template for 1 second. (CFLOCK trick.)
6. CFLOOP over step 5 until all threads have finished or a timeout is
reached. (You can tell when threads are finished because they define some
unique Application/Session variables. Good locking is essential here.)
7. Continue on with my code.
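The steps above can be sketched roughly like this in Python. To be clear about the assumptions: child Python processes stand in for the command-line CF calls, marker files stand in for the Application/Session variables, and spawn_and_wait is a made-up name. It's the spawn-and-poll pattern, not working CF code.

```python
import os
import subprocess
import sys
import tempfile
import time

# Each child stands in for one "thread template": it does its work, then
# leaves a marker file behind as its "I'm finished" flag.
CHILD_CODE = (
    "import sys, time; time.sleep(float(sys.argv[2])); "
    "open(sys.argv[1], 'w').close()"
)


def spawn_and_wait(work_seconds, timeout=5.0, poll_interval=0.1):
    """Launch one child per entry, then poll until all finish or time is up."""
    workdir = tempfile.mkdtemp()
    markers = []
    children = []
    for i, seconds in enumerate(work_seconds):
        marker = os.path.join(workdir, f"thread{i}.done")
        markers.append(marker)
        # Steps 3 and 4: start the child and return without waiting on it.
        children.append(subprocess.Popen(
            [sys.executable, "-c", CHILD_CODE, marker, str(seconds)]
        ))
    deadline = time.monotonic() + timeout
    while True:
        # Steps 5 and 6: check the flags, pause, repeat.
        finished = all(os.path.exists(m) for m in markers)
        if finished or time.monotonic() >= deadline:
            break
        time.sleep(poll_interval)
    for child in children:
        child.poll()  # reap children that already exited; never blocks
    return finished
```

Even in this toy form you can see the ugliness: the caller burns time sleeping, and a child that outlives the timeout just keeps running with nobody watching it.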
Of course, this solution is far from elegant and has a host of problems.
(Not the least of which is that you are eating into your global thread
pool.) This is why I want a better solution.
So, what do y'all think? Is there a demand for something like this? Is
this something we should (nicely) pressure/ask/beg Macromedia/Allaire to
implement for us?
--
Rick Osborne
Certified Advanced Cold Fusion Developer, Web Guru, and Large-Egoed
Programmer