> I'm writing the 2nd edition of Advanced Perl for ORA; one of the
> many changes is a chapter on POE. The outline is essentially complete, and
> will discuss the POE architecture, the "parts" at length, and build at
> least one application.
>
> I'd like the folks whose brainchild this is (Rocco et al.) to have fair
> input, but that will come later. For now I'm looking for input on the
> application itself. What types of programs do you feel not only show off
> the event-driven "paradigm", but also provide good introductions to
> using POE? I've written a few toys, but want something for this chapter
> that will be both useful, and have a decent "wow" factor.
>
> The ideas I've gotten from several POE users so far include:
>
> * A locking server
> * A port forwarder
> * A provisioning server; message queues, timed events, etc.
> * A recursive HTML-sucker; 'wget -r' == 'POE + Parse::RecDescent'

Well, I'm using it successfully for my site(s), whose parent is
www.exitexchange.com. We are essentially an equivalent type of service to
linkexchange, bannerexchange, etc., only our service doesn't use BANNERS, it
uses "Exits." An exit is triggered by the JavaScript that our code runs on
your page: when a user "exits" (window.onunload) your page, we open a new
window and send it to the background. For every two exits you generate, you
earn one exposure (appearance in the Exit window). Of course, we stay in
business by selling some of this exit traffic to paid advertisers.
Well, keeping track of who gets how many hits, at what time, in what
category, and how many times/day (expressed as a proportion of their total
hits_due over the estimated daily site traffic) is kinda complex.
First, we tried using tables, triggers, and stored procedures in the
database. But that degenerated VERY quickly into a full-time balancing act
between me, the database, the web servers, and the table locks. It was
AWFUL. Every day, I would have to run the daily queue shuffle (generate the
list of hits to be served that day) and PRAY that I didn't lock up the
service (which needed a complete webserver, stats daemon, and database
restart). By the 4th or 5th month, I'd gotten it close, and didn't see any
lockups for weeks. Except, of course, when I did... just as I stopped
worrying so much.
Enter POE.
In about a single month's time (remember, now, I'm the sole geek for a whole
dot-com, and every day I field tech support requests from trivial to severe,
and was developing in between all this), I developed my queue daemon.
The queue daemon (QD), on startup, reads a specially prepared view in the
database (how many hits/day per acctid are desired, the urls to redirect to,
account categories, and other sundry details) into a set of in-memory
queues. It initializes a TCP server to take exit requests, and waits there.
When an exit request comes in, the QD goes thru a selection heuristic based
on the acctid of the request (among other things), and chooses an account
and its url to serve back to the mod_perl enabled webserver that made the
request in the first place.
Alongside the TCP server, it also starts a simple Web server so I can ask the
QD how its numbers look. This is simple in form and function, and F***ING
BEAUTIFUL in how nice it is to have and how easy it was to put in.
BUT WAIT! There's more!... :-)
As a part of the original development of the ExitExchange system, I
developed a program called the Count Daemon whose SOLE purpose was to accept
UDP packets from the web server indicating hits and their types (raw,
unique, exit, exposure, etc), cache them, and update the database
periodically with the new, incoming stats.
Well, with the old system, people's credits were decremented at the time the
exit was requested, right? But my people didn't like that, because we
really had no way to guarantee that the request for the exit was accompanied
by the exposure (showing somebody in the Exit Window). Oh, sure, if the
javascript/html part was written correctly, they should be within oh, 97% or
so of each other. But we figured it was more appropriate to decrement on
the actual exposure. This means, since the QD is responsible for updating
hits_due in the database, that the UDP packets that were being sent to the
CountD had to be routed through the QD first. But we only wanted to act on
hits marked as "Exposures."
A bother, right? Not with POE. Add one more UDP SocketFactory to startup,
and one "got line" function. Then, point the webserver's configured Count
Daemon at the QD! When a UDP packet comes in, the got_line() function scans
it with a simple regex to see if it's an exposure. If so, we
$kernel->yield("mark_exposure", $acctid). In ANY case, we transmit the same
packet over to the REAL CountD, for recording as a hit.
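
A minimal sketch of that scan-and-forward step in plain Perl, with the POE and socket plumbing left out. The packet layout ("acctid type") and the names parse_hit and route_packet are assumptions made up for illustration; the post does not give the real CountD wire format. In the QD, the "mark" callback would be the yield to "mark_exposure", and the "forward" callback would send the datagram on to the CountD.

```perl
use strict;
use warnings;

# HYPOTHETICAL packet layout: "<acctid> <hit type>", e.g. "1234 exposure".
# The real format used by the CountD is not described in the post.
sub parse_hit {
    my ($packet) = @_;
    return unless $packet =~ /^(\d+)\s+(\w+)\s*$/;
    return { acctid => $1, type => lc $2 };
}

# Handle one datagram: exposures are marked, and every packet
# (exposure or not, parseable or not) is forwarded unchanged.
sub route_packet {
    my ($packet, $mark_exposure, $forward) = @_;
    my $hit = parse_hit($packet);
    $mark_exposure->($hit->{acctid}) if $hit && $hit->{type} eq 'exposure';
    $forward->($packet);
    return;
}

# Stand-in callbacks that just record what happened.
my (@marked, @forwarded);
for my $packet ("1234 exposure", "1234 raw", "garbage") {
    route_packet(
        $packet,
        sub { push @marked,    $_[0] },
        sub { push @forwarded, $_[0] },
    );
}
print "marked: @marked; forwarded: ", scalar(@forwarded), "\n";
```

Keeping the parsing in its own function means the regex can be tested without opening a socket.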
And that's not all... you'll also receive these beautiful STEAK KNI... oh
wait, this is POE.
So every X number of hits, the QD starts a FLUSH process, scanning thru the
complete list of accounts internally for any accounts that have received
hits. That list is iterated through, decrementing each user as appropriate.
Since internet traffic is far from evenly spread across accounts, this is a
HUGE performance savings: let's say acctid A gets 100 hits out of that, oh,
1000. Account B gets 40, and account C gets 1. Well, updating account C isn't
any better or worse than it could have been. But accounts A and B now cost
one database update each instead of 100 and 40, respectively.
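
The batching idea can be sketched in a few lines of core Perl. This is an illustration under assumptions, not the QD's actual code: the threshold of 1000, the names record_exposure and flush_pending, and account D (which soaks up the rest of the 1000 hits) are all invented for the demo, and the @updates array stands in for real database writes.

```perl
use strict;
use warnings;

my $FLUSH_EVERY = 1000;   # the post's "X"; the real value is not given
my %pending;              # acctid => exposures since the last flush
my $since_flush = 0;
my $total_hits  = 0;
my @updates;              # stand-in for database UPDATE statements

# Write one update per account that received hits, then reset the counters.
sub flush_pending {
    for my $acctid (sort keys %pending) {
        push @updates, [ $acctid, $pending{$acctid} ];
    }
    %pending     = ();
    $since_flush = 0;
    return;
}

# Count an exposure in memory; flush once enough have accumulated.
sub record_exposure {
    my ($acctid) = @_;
    $pending{$acctid}++;
    $total_hits++;
    flush_pending() if ++$since_flush >= $FLUSH_EVERY;
    return;
}

# The numbers from the text: A gets 100, B gets 40, C gets 1.
record_exposure('A') for 1 .. 100;
record_exposure('B') for 1 .. 40;
record_exposure('C');
record_exposure('D') for 1 .. 859;   # assumed filler to reach 1000

printf "%d hits flushed in %d updates\n", $total_hits, scalar @updates;
```

With these inputs the flush fires once, on the 1000th hit, and produces four updates in place of a thousand.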
Oh, wait... did I mention we're doing this at the same time as we're serving
exits and bouncing hits back to the count daemon? :-)
Also, every hour, the database is scanned again for any accounts whose urls
or active/inactive states have been changed by our customer support staff.
And FINALLY, every 24 hours, the view with that day's worth of hits is
re-read, to serve again!
The last change I made was to add some PROPER daemonization: setuid/setgid,
fork, syslog, and so forth, from Lincoln Stein's "Network Programming with
Perl."
Next time I get a chance, I'm going to add a proper SIGHUP handler so I can
make changes and restart the program w/o losing the changes that have
happened since the last load (i.e. the snapshot view is designed to load at
11am. If I reload the qd at 3pm, that's 4 hours' worth of hits that have
already been served, lost...). And I'm going to re-tune part of our selection
heuristics to be a little more accurate to what we, as a company, would like
to see.
How's that for a real-world app? :-)
L8r,
Rob
#!/usr/bin/perl -w
use Disclaimer qw/:standard/;