Well... We could publish a firehose as well, but our business model is not currently aimed at that.
I was not talking about a service, but rather about a technology that could take the crawling business to another level by aggregating hundreds of hubs into something that can effectively deliver Tb/s of bandwidth through decentralized servers and data. We are still limited by our infrastructure, you know, even though it has Gb/s connectivity to the internet.

Since the realtime web is still very small, all of us need to poll something, even you I presume, to be able to create a pub/sub architecture. In that respect our companies are quite similar: you chose to aggregate and publish your data and make a business of it, while we aggregate and refine the data and make a business out of that.

Don't get me wrong, I really like what you do, but I am not looking for a data supplier at this time (that might change, though). If I were to look in the data supplier direction, you are currently on my/our top ten list :)

Sent from my iPhone

On Oct 21, 5:03 pm, Julien Genestoux <[email protected]> wrote:
> Hum... http://superfeedr.com?
>
> "Putting ressources in common" is definetely one of the key reasons why we
> built superfeedr. More about that there:
> http://blog.superfeedr.com/gospel/something-stupid/
>
> And yes, we have a firehose available.
>
> Julien
>
> --
> Julien Genestoux,
>
> http://twitter.com/julien51
> http://superfeedr.com
>
> +1 (415) 254 7340
> +33 (0)9 70 44 76 29
>
> On Wed, Oct 21, 2009 at 5:26 AM, Marcus Herou
> <[email protected]> wrote:
>
> > Feedtree looks cool.... but updated 2006 ?
>
> > On Wed, Oct 21, 2009 at 2:20 PM, Nick Johnson (Google) <
> > [email protected]> wrote:
>
> >> On Wed, Oct 21, 2009 at 1:14 PM, Alexis Richardson <
> >> [email protected]> wrote:
>
> >>> Hmmm ... gossiptorrent?
>
> >> Feedtree.
>
> >>> On Wed, Oct 21, 2009 at 7:23 AM, Marcus Herou
> >>> <[email protected]> wrote:
>
> >>> > Hi.
>
> >>> > We host a search app which is based on feeds of blogs/twitter/forums/
> >>> > news etc.
> >>> > We are as you are mentioning polling everything like crazy
> >>> > and it seems like a total waste of everyones resources.
>
> >>> > So this means that subscribing to something which would potentially
> >>> > deliver the material to us would be great not just for us but as well
> >>> > all sites we are crawling.
>
> >>> > However who would like to open up a firehose for free for everyone to
> >>> > consume ? It will for sure consume a lot of bandwidth and a few
> >>> > subscribers will consume most of the bandwidth with this model.
> >>> > I thought of something that might solve this issue. Consider the
> >>> > following:
>
> >>> > 1)
> >>> > * Charge for the bandwidth (wordpress.com does this with flat fee)
>
> >>> > 2)
> >>> > * Everyone that have firehose consuming needs should as well start a
> >>> >   hub to show good faith and morale.
> >>> > * Add support in firehose enabled hubs to share state (with a
> >>> >   master ?)
> >>> > * A firehose enabled hub can subscribe to a master hub which makes
> >>> >   sure that the subscriber as well fulfils some form of contract (i.e.
> >>> >   actually updating/delivering feeds)
> >>> > * Each firehose enabled hub must be public and everyone can subscribe
> >>> >   to the feeds like as of current.
> >>> > * To share load equally (morale part) then subscribers should
> >>> >   subscribe to a loadbalanced dns name or some form of delegate
> >>> >   lb.pshb.com = master hub
> >>> >   Example 1: lb.pshb.com resolves to pshb.tailsweep.com
> >>> >   pshb.google.com, effectively DNS-roundrobin
> >>> >   Example 2: lb.pshb.com delegates to any active master connected hub
> >>> >   in some way.
>
> >>> > This might be too complex to implement and bottlenecks occur at the
> >>> > master but systems like Hadoop have bottlenecks in terms of the
> >>> > NameNode (master) and it seems to perform just perfect so it can be
> >>> > done. However each firehose hub probably need to persist each feed for
> >>> > a certain amount of time before purging it.
>
> >>> > Anyway this was just a thought. We at Tailsweep probably could help in
> >>> > making this happen if there exists some interest.
>
> >>> > Cheers
>
> >>> > //Marcus
>
> >>> > On Oct 20, 8:41 pm, Bob Wyman <[email protected]> wrote:
> >>> >> On Tue, Oct 20, 2009 at 11:22 AM, igrigorik <[email protected]> wrote:
> >>> >> > Specifically, if we treat 'firehose' as any bundle of
> >>> >> > feeds (all, or some), then a hub could define
> >>> >> > multiple firehose streams.
>
> >>> >> There should be no question that there is tremendous utility in
> >>> >> being able to compose all sorts of "bundles" of topics into distinct
> >>> >> feeds. It is probably also the case that we can identify some number
> >>> >> of such bundles that would be useful to a large number of
> >>> >> subscribers. On the other hand, many bundles will be very specific
> >>> >> and only useful to one or a small number of subscribers. In fact, I
> >>> >> think what we'll see is that once we have the core PSHB defined,
> >>> >> we'll then see innovation in the definition of "down stream"
> >>> >> services whose function is precisely to build and deliver such
> >>> >> bundles. Some of these services will aggregate groups of topics
> >>> >> while others will focus instead on creating content-based streams --
> >>> >> they will bundle together individual entries based on the content of
> >>> >> those entries rather than simply combining all entries from some set
> >>> >> of topics.
>
> >>> >> I think we should be careful not to force too much of the burden of
> >>> >> bundling or aggregating into the core PSHB hub specification. If we
> >>> >> want to address the challenges of building bundles or aggregations,
> >>> >> I think it best to do so in secondary or companion specifications.
> >>> >> This will keep the core cleaner and easy to understand while also
> >>> >> allowing the core to be deployed without being delayed by
> >>> >> discussions over non-core issues.
>
> >>> >> Having argued against making the core more complicated by extending
> >>> >> it to include creating aggregate topics, I still suggest that it
> >>> >> would be useful to have the core system define a common means to
> >>> >> obtain a pure "firehose" feed of all topics. The current hub spec
> >>> >> works for people who only want "none or some" of the topics served
> >>> >> by the hub. I suggest that we expand this to have hubs know how to
> >>> >> provide "none, some or all" of the topics.
> >>> >> The reason for adding support of "all topics" is that we know,
> >>> >> without much question, that such an "all topics" feed will be
> >>> >> required by many of the downstream services that we will one day be
> >>> >> relying on to create more finely defined aggregations. Given that
> >>> >> this specific feed will be commonly required, it would be best if we
> >>> >> had a common mechanism for a downstream service/subscriber to
> >>> >> request that feed and that we set some expectations for how that
> >>> >> feed will be formatted and delivered (i.e. Atom entries, persistent
> >>> >> connections, chunked content model, ...). It would be very
> >>> >> cumbersome for a downstream filtering/aggregating service to need to
> >>> >> puzzle through service specific mechanisms for discovering how to
> >>> >> obtain a firehose feed of "all topics" from many different hubs.
>
> >>> >> bob wyman
>
> >>> >> On Tue, Oct 20, 2009 at 11:22 AM, igrigorik <[email protected]> wrote:
>
> >>> >> > Right, so how does the smart hub aggregate the feeds? Does it then
> >>> >> > have to crawl to find the list? That wouldn't be very useful. Having
> >>> >> > said that...
>
> >>> >> > +1 For 'smart, aggregating hub generating a synthetic feed'
> >>> >> > +1 For XRD discovery of the firehose endpoint.
>
> >>> >> > Thinking a bit more about the firehose, what about making it more
> >>> >> > flexible. Specifically, if we treat 'firehose' as any bundle of
> >>> >> > feeds (all, or some), then a hub could define multiple firehose
> >>> >> > streams. For example, at PostRank we classify feeds by topic, so if
> >>> >> > someone wanted to subscribe to "Technology", we could expose that
> >>> >> > as a firehose so the user doesn't have to subscribe to every feed
> >>> >> > in that topic. In essence, a firehose stream is then any bundle of
> >>> >> > feeds.
>
> >>> >> > This may be overloading the hub spec but the overall mechanics would
> >>> >> > be:
> >>> >> > - A (super)user can declare a firehose endpoint
> >>> >> > - A (super)user is then able to add or remove subscriptions from
> >>> >> >   the firehose to create arbitrary aggregation streams
> >>> >> > - A subscriber uses XRD to discover the available aggregation
> >>> >> >   streams
> >>> >> > - Firehose with 'all' feeds is a special case of the above, where
> >>> >> >   all feeds are present
>
> >>> >> > This definitely adds more complexity into the hub... The alternative
> >>> >> > is of course for the publisher to create a syndicated feed and
> >>> >> > publish that directly as a standalone feed. Still trying to weight
> >>> >> > the up/downsides in my head, but want to put it out there as an
> >>> >> > idea.
>
> >>> >> > --------
> >>> >> > Ilya Grigorik
> >>> >> > postrank.com
>
> >> --
> >> Nick Johnson, Developer Programs Engineer, App Engine
> >> Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration Number:
> >> 368047
>
> > --
> > Marcus Herou CTO and co-founder Tailsweep AB
> > +46702561312
> > [email protected]
> > http://www.tailsweep.com/

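For illustration, the bundle mechanics Ilya outlines in the oldest quoted message (declare a firehose endpoint, add or remove subscriptions, let subscribers discover the available streams, with "all" as a special case) could be sketched roughly as below. This is only a minimal in-memory Python sketch under my own assumptions, not part of any hub specification or existing implementation: the Hub class, its method names, and the example.com URLs are all hypothetical, and list_bundles merely stands in for XRD discovery.

```python
from dataclasses import dataclass, field


@dataclass
class Hub:
    """Toy in-memory hub. Every name here is hypothetical."""

    topics: set = field(default_factory=set)    # all feed URLs known to the hub
    bundles: dict = field(default_factory=dict) # bundle name -> set of feed URLs

    def publish_topic(self, topic_url: str) -> None:
        """Register a feed with the hub."""
        self.topics.add(topic_url)

    def declare_bundle(self, name: str) -> None:
        """A (super)user declares a named firehose endpoint."""
        if name == "all":
            raise ValueError('"all" is reserved for the full firehose')
        self.bundles.setdefault(name, set())

    def add_to_bundle(self, name: str, topic_url: str) -> None:
        """Add one known feed to a declared bundle."""
        if topic_url not in self.topics:
            raise KeyError(f"unknown topic: {topic_url}")
        self.bundles[name].add(topic_url)

    def remove_from_bundle(self, name: str, topic_url: str) -> None:
        """Remove a feed from a bundle (no error if it is absent)."""
        self.bundles[name].discard(topic_url)

    def list_bundles(self) -> list:
        """Stand-in for XRD discovery: names a subscriber could pick from."""
        return sorted(["all", *self.bundles])

    def resolve(self, name: str) -> set:
        """Feeds delivered to a subscriber of the given bundle."""
        if name == "all":
            # The full firehose is the special case containing every feed.
            return set(self.topics)
        return set(self.bundles[name])
```

A subscriber would call list_bundles() to see what is on offer and subscribe to one name; resolve("all") returns every feed, which matches the "none, some or all" framing Bob suggests without requiring the core spec to know about arbitrary bundles.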