The problem isn't libcurl per se. Simply running curl(1) from a shell
script doesn't necessarily give you the control you need to build a stable
client. You may want to set connect and socket timeouts, parse data before
persisting it to disk, use overlapping connections to update predicates
without data loss, provide an estimate for the count parameter, deal with
events from downstream processes, rotate journal files, or do any number
of other things that are non-obvious or difficult with a single-threaded
shell script wrapping a curl(1) process.
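
To make two of those items concrete, here is a minimal Python sketch of
parsing before persisting and of rotating journal files. It is only an
illustration under assumptions: the names parse_stream_lines and
RotatingJournal, the journal file naming, and the rotation threshold are
all made up for this example, and the lines iterable stands in for
whatever your transport yields (in practice, a socket opened with connect
and read timeouts).

```python
import json
import os


def parse_stream_lines(lines):
    """Yield parsed JSON objects, skipping blank and malformed lines."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank line: assumed here to be a keep-alive
        try:
            obj = json.loads(line)
        except ValueError:
            continue  # truncated or malformed: drop it, don't persist it
        if isinstance(obj, dict):
            yield obj


class RotatingJournal:
    """Append records to numbered files, rotating every max_records."""

    def __init__(self, directory, max_records=1000):
        self.directory = directory
        self.max_records = max_records  # hypothetical threshold
        self.index = 0
        self.count = 0
        self.handle = None

    def _path(self):
        # Hypothetical naming scheme: journal-00000.json, journal-00001.json
        return os.path.join(self.directory, "journal-%05d.json" % self.index)

    def write(self, obj):
        if self.handle is None:
            self.handle = open(self._path(), "a")
        self.handle.write(json.dumps(obj) + "\n")
        self.count += 1
        if self.count >= self.max_records:
            # Close the full file and point at the next one.
            self.close()
            self.index += 1
            self.count = 0

    def close(self):
        if self.handle is not None:
            self.handle.close()
            self.handle = None
```

Only well-formed records ever reach disk, so a connection that dies
mid-line cannot leave a half-written status in the journal.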

You will need to reconnect periodically to deal with predicate changes on
your end or operational bumps (server restarts after code deploys) on our
end. Most of the blogged Streaming API examples I've seen contain no error
handling and would cause you to get locked out and, eventually, your IP
address banned.
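
As a rough illustration of the kind of error handling those examples
lack, here is a Python sketch of a reconnect loop that backs off after
each failure instead of hammering the server. Everything in it is an
assumption made for the example: StreamError, run_client, and the
connect_and_consume callable are hypothetical names, and the delay
values are placeholders rather than recommended settings.

```python
import time


class StreamError(Exception):
    """Hypothetical error raised by the caller's connection code."""


def backoff_delays(initial=1.0, maximum=240.0, factor=2.0):
    """Yield an endless series of delays that grows up to a cap."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * factor, maximum)


def run_client(connect_and_consume, sleep=time.sleep, max_failures=8):
    """Reconnect until connect_and_consume returns False.

    connect_and_consume is a caller-supplied function that blocks while
    the connection is healthy. It returns True to ask for a reconnect,
    returns False to stop, and raises on a connection error.
    """
    delays = backoff_delays()
    failures = 0
    while True:
        try:
            keep_going = connect_and_consume()
        except (StreamError, OSError):
            failures += 1
            if failures >= max_failures:
                raise  # give up loudly rather than retry forever
            sleep(next(delays))
            continue
        if not keep_going:
            return
        # The connection ended cleanly (say, a server restart): reset
        # the schedule, but still pause briefly before reconnecting.
        failures = 0
        delays = backoff_delays()
        sleep(next(delays))
```

Passing sleep in as a parameter keeps the loop testable without
actually waiting.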

-John Kalucki
http://twitter.com/jkalucki
Infrastructure, Twitter Inc.


On Wed, Jun 30, 2010 at 2:47 AM, Mark Linsey <mjlin...@gmail.com> wrote:

> I am just getting started with the streaming API, and I was a bit
> puzzled by this line in the documentation:
>
> "While a client can be built around cycling connections, perhaps using
> curl for transport, the overall reliability will tend to be poor due
> to operational gotchas. Save curl for debugging and build upon a
> persistent process that does not require periodic reconnections."
>
> I am not at all familiar with the internals of how libcurl works, so
> maybe I'm missing something quite obvious, but can't curl/libcurl keep
> a persistent connection? Why does using it require periodic
> reconnections? In fact, many examples around the web of how to consume
> the streaming API seem to use libcurl. (I'm using Python so have been
> looking at this example in particular:
>
> http://arstechnica.com/open-source/guides/2010/04/tutorial-use-twitters-new-real-time-stream-api-in-python.ars
> )
>
