[pubsubhubbub] Re: PSHB Firehose

John Panzer Thu, 22 Oct 2009 13:24:38 -0700

While this is not the primary use case, I think that Salmon (
http://salmon-protocol.org) would actually be a good fit for "spam flag"
actions.  It's intended to handle things like "likes" and ratings, and it
provides endpoint discovery and identity (of service and of flagger).  The
provenance problem is essentially the same whether you're processing "flag
as spam" or "like this" actions.
All of which reminds me that I should send an intro/announcement email to
this list about Salmon, as it has some PubSubHubbub in it.
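
For what it's worth, here is a minimal sketch of the hub-side bookkeeping
discussed in the quoted thread below: the hub keeps feedback private, counts
spam flags only from subscribers it already trusts, and stops relaying a
source only once every destination has rejected it. The class, method, and
subscriber names are all hypothetical; nothing here comes from the
PubSubHubbub or Salmon specs.

```python
from collections import defaultdict


class FeedbackTally:
    """Hypothetical hub-side record of subscriber spam flags per source."""

    def __init__(self, trusted_subscribers):
        # Assumption: the hub already knows which subscribers it trusts.
        # Counting only their flags limits intentional false-positive attacks.
        self.trusted = set(trusted_subscribers)
        self.destinations = defaultdict(set)  # source -> subscribers receiving it
        self.rejections = defaultdict(set)    # source -> trusted subscribers that flagged it

    def subscribe(self, source, subscriber):
        """Register a subscriber as a destination for a source."""
        self.destinations[source].add(subscriber)

    def flag_spam(self, source, subscriber):
        """Record a spam flag; return True if the flag was accepted."""
        if subscriber not in self.trusted:
            return False
        if subscriber not in self.destinations[source]:
            return False
        self.rejections[source].add(subscriber)
        return True

    def should_relay(self, source):
        """Keep relaying unless every destination has rejected the source."""
        dests = self.destinations[source]
        return not dests or self.rejections[source] != dests


if __name__ == "__main__":
    feed = "http://example.com/feed"  # placeholder URL
    tally = FeedbackTally(trusted_subscribers={"indexer-a", "indexer-b"})
    tally.subscribe(feed, "indexer-a")
    tally.subscribe(feed, "indexer-b")
    tally.flag_spam(feed, "indexer-a")
    print(tally.should_relay(feed))  # one of two destinations has flagged so far
```

Nothing is ever sent back to the publisher in this sketch, so spammers
learn nothing from it; the hub just decides for itself whether to keep
relaying.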


(Disclaimer:  I'm evangelizing Salmon in general right now.)
--
John Panzer / Google
[email protected] / abstractioneer.org <http://www.abstractioneer.org/> /
@jpanzer



On Thu, Oct 22, 2009 at 11:51 AM, Ian Kallen <[email protected]> wrote:

>
> The scenario I had in mind was one where the flags' origins were in the
> metadata; end users whose identity is untrusted can't be trusted to provide
> useful flags as reliably as a trusted partner's administrative flags. Back
> when I worked on a service that handled torrents of real time data, we
> regularly found a high volume of sites from different services (blogger,
> livejournal/vox, wordpress, etc) that we didn't want to index. The
> changes.xml, 6a atom stream and pingomatic updates that are, in essence,
> firehoses, have no feedback mechanism. There's no way to tell the service
> that originated them that, 'this update element we got from you is one we're
> declining to process. HTH.' Sure, if I had time (occasionally) I'd send
> emails to folks I knew at the different services; a manual and ad hoc
> process; meanwhile resources continued to be wasted, at both ends, on noise.
> It seems like there's an opportunity to build quality/utility feedback into
> the ecosystem and I think it'd be a shame to let that opportunity pass.
>
> Since I'm not working on a PSHB application ATM, this is almost moot for
> me. But if I were consuming or providing PSHB for firehose access I'd want a
> programmatic feedback mechanism built in.
>
>
> John Panzer wrote:
>
>> Blogger would also like to know if a feed is getting spam flags
>> downstream of the hub.  It could be useful info.  More so if the
>> subscriber / reporter has a verified identity and reputation of
>> course.  Don't forget about intentional false positive attacks.
>>
>> Gets complex quickly, will always be optional, thus perhaps a separate
>> spec or extension is better.
>>
>> On Monday, October 19, 2009, Ian Kallen <[email protected]> wrote:
>>
>>
>>> That's not a systemic problem, that's a terms-of-service problem. If I
>>> were operating a hub, I wouldn't re-publish the feedback nor offer any
>>> guarantee that an item is distributed to subscribers. If the subscribers I
>>> serve decided a site is a feed-scraper/affiliate/pills/pr0n site they don't
>>> like, that's between us.
>>>
>>> I'm not a proponent of YAGNI complexity. I'm assuming folks are
>>> interested in directing creative energies outside of spam filtering; PSHB
>>> should be informed by the failings of XMLRPC pings, SMTP and the other
>>> spam-infected data streams we've all experienced with protocols of days gone
>>> by that were too "simple" to account for these issues. It'd be a damn shame
>>> if hubs later end up having to bolt on awkward afterthoughts (SPF, Sender
>>> ID, etc.); that's why I'm raising it now.
>>>
>>> Bob Wyman wrote:
>>>
>>>
>>> A major problem with "feedback" is that folk may be offended when their
>>> data is flagged as spam. This can lead to lawsuits and expensive demands to
>>> know who flagged the content... This risk is one reason why spam is usually
>>> dropped silently.
>>>
>>> bob wyman
>>>
>>>
>>> On Oct 19, 2009 6:31 PM, "Ian Kallen" <[email protected] <mailto:
>>> [email protected]>> wrote:
>>>
>>> igrigorik wrote:
>>> > There are a few things that would have to be resolved:
>>> > - Spam. Just as with t...
>>>
>>> It'd be nice to see a feedback mechanism built into this ecosystem lest
>>> the hubs simply become open relays. Right now, services consuming the
>>> SixApart atomstream have no way to programmatically tell SixApart "We
>>> received updates for source X but found it to be spam". Similarly,
>>> sources downstream from pingomatic can't tell PoM how (un)useful a ping
>>> it relayed was; today > 90% of the old-school XMLRPC pings out there are
>>> for a spam site, and each of the services downstream from ping relayers
>>> has its own mechanisms for dealing with it but no feedback recourse.
>>> I'd hate to see PSHB fail to learn from these lessons.
>>>
>>> This gets hairy if the hub re-publishes the feedback; naturally spammers
>>> would love to know when their publishes are detected as spam so they can
>>> adapt their tactics. But if the hub simply knows that all of its
>>> destinations rejected items from a source, it can decide whether or not
>>> to continue relaying its content. I would imagine that hubs would
>>> advertise the repertoire of feedback they'll accept, and providing
>>> feedback to the hub would be optional.
>>>
>>> --
>>> Ian Kallen
>>> blog: http://www.arachna.com/roller/spidaman
>>> tweetz: http://twitter.com/spidaman
>>> vox: 925.385.8426
>>>
>>
>>
>>
>
>
> --
> Ian Kallen
> blog: http://www.arachna.com/roller/spidaman
> tweetz: http://twitter.com/spidaman
> vox: 925.385.8426
>
>
>
