So, I've got a nice bunch of Bayesian filters to do spam detection,
tweet categorization and link canonicalization and classification.
The stuff runs great now, but I'm looking to
share the load across other properties I'm developing for other
locations/verticals.  In this load sharing, I want one processor doing
the link work, and one processor doing the tweet-processing work
across all properties. (Of course there will be N machines doing the
work, but I want to only do the work once per...)
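To make the "one processor per task type" idea concrete, here's a minimal Python sketch of the routing I have in mind. Everything here (the `route` function, the item dicts, the handler names) is my own invention for illustration, not anything from the real system:

```python
# Hypothetical dispatcher: each incoming item is routed to exactly one
# worker by task kind, so link work and tweet work are each done once
# no matter how many properties produced the item.
def route(item, handlers):
    """Dispatch an item to the single handler registered for its kind."""
    handler = handlers.get(item["kind"])
    if handler is None:
        raise ValueError("no handler for kind: %s" % item["kind"])
    return handler(item)

# One handler per task type, shared across all properties.
handlers = {
    "link": lambda item: ("link-processor", item["payload"]),
    "tweet": lambda item: ("tweet-processor", item["payload"]),
}
```

In a real deployment each handler would be a separate process (or pool of N machines) consuming from its own queue; the point is just that classification of a given kind happens in one place.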

So the ideal thing would be some way to emit the applicable metadata
as annotations in a new tweet in the tweet stream, placing the new
"classification, typing & labeling" information on the NEW tweet. When
I create that tweet, I would make it "in reply to" the original tweet
being classified to easily link the two.
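As a sketch of what emitting such an annotation tweet might look like, here's a small Python function that builds the payload. The JSON-in-the-status-body encoding and the field names are assumptions on my part; the only load-bearing piece is setting the in-reply-to id to link the annotation back to the original tweet:

```python
import json

def build_annotation_tweet(original_tweet_id, annotations):
    """Encode classification metadata as the body of a new tweet that
    replies to the tweet being classified (hypothetical format: the
    annotations dict is serialized as JSON into the status text)."""
    return {
        # The message body itself is essentially irrelevant; it just
        # carries the machine-readable metadata.
        "status": json.dumps(annotations),
        # Linking field: ties this annotation tweet to the original.
        "in_reply_to_status_id": original_tweet_id,
    }
```

A consumer that knows the convention can recover the metadata by parsing the status body and following the reply link.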

It SEEMS like this is the ideal solution, in general, to the post-
mutability of tweet annotations... just tweet another tweet with the
annotations that you want to apply to the original tweet, set the in-
reply-to-tweet-id and go about business. When that new tweet is seen
by the "in the know" application, it knows to apply the metadata
retroactively to the original tweet in whatever manner it wishes...
think things like a "read flag", a "star rating", a "sentiment
analysis", etc... heck, you could even track triggered "trouble ticket
numbers" like this...
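The retroactive-application side could be sketched like this, again as a hypothetical: assuming the annotation tweet carries JSON metadata in its body (as above) and `store` is whatever local record-keeping the "in the know" application uses, keyed by tweet id:

```python
import json

def apply_annotations(store, annotation_tweet):
    """When the 'in the know' application sees an annotation tweet,
    merge its metadata onto the original tweet's record, identified
    by the in-reply-to id."""
    original_id = annotation_tweet["in_reply_to_status_id"]
    metadata = json.loads(annotation_tweet["status"])
    record = store.setdefault(original_id, {})
    # e.g. a read flag, a star rating, a sentiment score, a ticket number
    record.update(metadata)
    return record
```

Later annotation tweets simply overlay earlier ones, which is what gives you the post-hoc mutability the tweets themselves lack.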

I don't mind someone else seeing all these tweets (honest, don't
care), but I wonder how Twitter will feel about what are essentially
just automated tweets used as a broadcast communication channel.
These tweets would not be very interesting in themselves because the
tweet message would be essentially irrelevant. How do I keep from
triggering spam filters, and how do I get over the tweets-per-day
limits for this sort of work?

So, any ideas?
