[pubsubhubbub] Re: PSHB Firehose

Ian Kallen Thu, 22 Oct 2009 11:51:57 -0700

The scenario I had in mind was one where the flags' origins were in the metadata; end users whose identity is untrusted can't be trusted to provide useful flags as reliably as a trusted partner's administrative flags. Back when I worked on a service that handled torrents of realtime data, we regularly found a high volume of sites from different services (blogger, livejournal/vox, wordpress, etc) that we didn't want to index. The changes.xml, 6a atom stream and pingomatic updates that are, in essence, firehoses, have no feedback mechanism. There's no way to tell the service that originated them that, 'this update element we got from you is one we're declining to process. HTH.' Sure, if I had time (occasionally) I'd send emails to folks I knew at the different services; a manual and ad hoc process; meanwhile resources continued to be wasted, at both ends, on noise. It seems like there's an opportunity to build quality/utility feedback into the ecosystem and I think it'd be a shame to let that opportunity pass.

Since I'm not working on a PSHB application ATM, this is almost moot for me. But if I were consuming or providing PSHB for firehose access I'd want a programmatic feedback mechanism built in.


John Panzer wrote:

Blogger would also like to know if a feed is getting spam flags
downstream of the hub.  It could be useful info.  More so if the
subscriber / reporter has a verified identity and reputation of
course.  Don't forget about intentional false positive attacks.

Gets complex quickly, will always be optional, thus perhaps a separate
spec or extension is better.

On Monday, October 19, 2009, Ian Kallen <[email protected]> wrote:

That's not a systemic problem, that's a terms-of-service problem. If I were 
operating a hub, I wouldn't re-publish the feedback nor offer any guarantee 
that an item is distributed to subscribers. If the subscribers I serve decided 
a site is a feed-scraper/affiliate/pills/pr0n site they don't like, that's 
between us.

I'm not a proponent of YAGNI complexity. I'm assuming folks are interested in directing 
creative energies outside of spam filtering; PSHB should be informed by the failings of 
XMLRPC pings, SMTP and the other spam-infected data streams we've all experienced with 
protocols of days gone by that were too "simple" to account for these issues. 
It'd be a damn shame if hubs later end up having to bolt on awkward afterthoughts (SPF, 
sender ID, etc.); that's why I'm raising it now.

Bob Wyman wrote:


A major problem with "feedback" is that folk may be offended when their data is 
flagged as spam. This can lead to lawsuits and expensive demands to know who flagged the 
content... This risk is one reason why spam is usually dropped silently.

bob wyman


On Oct 19, 2009 6:31 PM, "Ian Kallen" <[email protected]> wrote:

igrigorik wrote:
> There are a few things that would have to be resolved:
> - Spam. Just as with t...

It'd be nice to see a feedback mechanism built into this ecosystem lest
the hubs simply become open relays. Right now, services consuming the
SixApart atomstream have no way to programmatically tell SixApart "We
received updates for source X but found it to be spam". Similarly,
sources downstream from pingomatic can't tell PoM how (un)useful a ping
it relayed was; today > 90% of the old-school XMLRPC pings out there are
for spam sites, and each of the services downstream from ping relayers
has its own mechanisms for dealing with it but no feedback recourse.
I'd hate to see PSHB fail to learn from these lessons.
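
To make the "source X was spam" report concrete, here is a minimal sketch of
what a subscriber might send back upstream. Nothing like this exists in the
PSHB spec; the field names, the verdict vocabulary and build_feedback() are
all hypothetical, made up purely for illustration.

```python
import json

# Hypothetical verdict vocabulary; not defined by PSHB or any other spec.
VERDICTS = {"spam", "duplicate", "not_indexed", "useful"}


def build_feedback(topic_url, item_id, verdict):
    """Serialize one subscriber-to-hub feedback report as a JSON string."""
    if verdict not in VERDICTS:
        raise ValueError(f"unknown verdict: {verdict!r}")
    # Assumed report shape: which feed, which entry, and what we thought of it.
    report = {"topic": topic_url, "item": item_id, "verdict": verdict}
    return json.dumps(report, sort_keys=True)
```

How such a report would be delivered (an extra POST to the hub, a header on
the delivery response, etc.) is left open here.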

This gets hairy if the hub re-publishes the feedback; naturally spammers
would love to know when their publishes are detected as spam so they
can adapt their tactics. But if the hub simply knows that all of its
destinations rejected items from a source, it can decide whether or not
to continue relaying its content. I would imagine that hubs would
advertise the repertoire of feedback they'll accept and providing
feedback to the hub would be optional.
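
A rough sketch of the hub side of that idea, under stated assumptions: the
hub keeps feedback private, only accepts verdicts from a repertoire it
advertises, and stops relaying a source once every current subscriber has
rejected it. The FeedbackTally class and its method names are hypothetical,
and the "all destinations rejected" rule is just one possible policy.

```python
from collections import defaultdict


class FeedbackTally:
    """Per-source record of which subscribers rejected it (hypothetical)."""

    def __init__(self, accepted_verdicts=("spam",)):
        # The repertoire of feedback this hub advertises that it will accept.
        self.accepted_verdicts = set(accepted_verdicts)
        # source -> set of subscribers that reported it; never re-published.
        self._rejected_by = defaultdict(set)

    def record(self, source, subscriber, verdict):
        """Store a report; return False if the verdict is not in the repertoire."""
        if verdict not in self.accepted_verdicts:
            return False
        self._rejected_by[source].add(subscriber)
        return True

    def should_relay(self, source, subscribers):
        """Keep relaying unless every current subscriber has rejected the source."""
        subscribers = set(subscribers)
        if not subscribers:
            return True  # nobody to ask, so no basis for dropping the source
        return not subscribers <= self._rejected_by[source]
```

Since reporting is optional, a subscriber that never sends feedback keeps the
source alive under this rule; a real hub would probably want a threshold or
some reputation weighting instead of requiring unanimity.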

--
Ian Kallen
blog: http://www.arachna.com/roller/spidaman
tweetz: http://twitter.com/spidaman
vox: 925.385.8426

