Re: Avro at Flurry

Anthony Watkins Thu, 12 Jul 2012 15:02:45 -0700

Hi Doug,

Thanks for the kind words. You are correct, we must maintain our endpoints
for older versions until all clients have upgraded. In our case this is a
constraint set by the environment. Our SDK is integrated into developer
apps and once those apps are distributed among their customer base there is
no mechanism to update the client. Therefore, any endpoint distributed with
our SDK must be maintained. We did previously use one endpoint across
several versions of our custom binary protocol. The issue for us was the
code supporting that endpoint grew in size and had many code paths, which
made regression testing and maintenance more difficult. This is inevitable
at some point, but we felt the element of accepting and parsing requests
should stay concise.

That said, having Avro handle the marshaling of requests to the appropriate
encoders/decoders would meet our need for a streamlined communication
protocol while providing us a single endpoint. I would be very interested
in using such a feature within our server infrastructure. Also, I will
certainly take another look at the handshake mechanism. Again, part of our
consideration was the existing infrastructure we have set up for
communication between our clients and server. I certainly hope my comments
on the embedded RPC mechanism weren't taken as a criticism by anyone.

As you probably know, Doug, Flurry is a huge proponent of the Hadoop family.
We wouldn't be able to support the scale we're at today without
Hadoop/HBase, and we think Avro is a great addition to our stack.

Best Regards,

Anthony

--
Anthony Watkins
Director of Partner Integration
www.flurry.com

On Thu, Jul 12, 2012 at 4:52 PM, Doug Cutting <[email protected]> wrote:

> On Thu, Jul 12, 2012 at 1:11 PM, Anthony Watkins <[email protected]>
> wrote:
> > http://tech.flurry.com/apache-avro-at-flurry.
>
> Nice article!  Thanks for writing it.
>
> Your idea of using different endpoints for different versions of the
> protocol is an interesting one.  For example, a protocol's fingerprint
> might be part of the URL path where requests are made.   If I
> understand correctly, this requires that an endpoint exists for the
> exact version of the protocol that every client uses.  As you upgrade
> clients, the endpoints for older versions must be kept until all
> clients are upgraded.  Is that right?
>
> Note that if the server already has the client's protocol version,
> then Avro's standard handshake mechanism does not transmit the
> protocol, only hashes of it.  Even when protocols do not match they're
> only transmitted by the first client to connect to the server with
> that version.  After that the server caches it.  However the handshake
> does add ~32 bytes to each connection, for the MD5 of both the
> client's and the server's protocol.
>
> The Java handshake implementation could be made more extensible.
> Currently the server is initially only aware of one version of the
> protocol, its own.  But it could instead be started with a set of
> known protocol versions that clients might use.  If a client connects
> using a known, compatible version then the server could accept it and
> also use the client's version to serialize responses, avoiding the
> transmission of any protocols.  If this sounds useful to someone,
> please file an issue in Jira.
>
> Doug
>
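
For illustration, the handshake idea quoted above (a server started with a
set of known protocol versions) could be sketched roughly as follows. This
is a simplified model, not Avro's Java implementation or any Avro library
API: the class name HandshakeServer, the fingerprint() helper, and the
sample protocol strings are all hypothetical. Only the MD5 hashing and the
BOTH / CLIENT / NONE match outcomes follow the Avro handshake as described.

```python
import hashlib


def fingerprint(protocol_json: str) -> bytes:
    """MD5 of the protocol's JSON text: 16 bytes per protocol."""
    return hashlib.md5(protocol_json.encode("utf-8")).digest()


class HandshakeServer:
    """Hypothetical server that starts with a set of known protocols."""

    def __init__(self, server_protocol, known_client_protocols=()):
        self.server_hash = fingerprint(server_protocol)
        # Cache of protocols by hash, pre-seeded with known client versions.
        self.known = {self.server_hash: server_protocol}
        for protocol in known_client_protocols:
            self.known[fingerprint(protocol)] = protocol

    def handshake(self, client_hash, server_hash, client_protocol=None):
        # A client sends its full protocol only when retrying after NONE;
        # the server caches it so later clients send hashes alone.
        if client_protocol is not None:
            self.known[fingerprint(client_protocol)] = client_protocol
        if client_hash not in self.known:
            return "NONE"  # client must resend with its protocol text
        # CLIENT: client protocol is known, but the client holds a stale
        # server hash, so the server would send its own protocol back.
        return "BOTH" if server_hash == self.server_hash else "CLIENT"


# Placeholder protocol texts standing in for real Avro protocol JSON.
V1 = '{"protocol": "Report", "doc": "v1"}'
V2 = '{"protocol": "Report", "doc": "v2"}'
V3 = '{"protocol": "Report", "doc": "v3"}'

server = HandshakeServer(V2, known_client_protocols=[V1])
first_result = server.handshake(fingerprint(V1), fingerprint(V2))
```

Here a V1 client is accepted on its first connection without sending any
protocol text, because V1 was preloaded. The two 16-byte hashes carried in
each request account for the ~32 bytes of handshake overhead mentioned
above.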
