Justin Santa Barbara wrote:
> What about Philip's point on existing Hadoop interfaces? Any plans for how
> we'll generate the Protocol object for these?
I'm hoping to use reflection initially for this. That's the motivation
for my renewed interest in AVRO-80.
https://issues.apache.org/jira/browse/AVRO-80
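To make that concrete, here's roughly what I have in mind. This is only
a sketch: the entry point mirrors Avro's ReflectData, and the interface
is a made-up stand-in for a real Hadoop protocol like NamenodeProtocol.

    import org.apache.avro.Protocol;
    import org.apache.avro.reflect.ReflectData;

    // Made-up stand-in for a Hadoop RPC interface.
    interface BlockReportProtocol {
      long getBlockCount(String datanodeId);
    }

    public class ReflectProtocolDemo {
      public static void main(String[] args) {
        // Derive the Avro Protocol (request and response schemas for each
        // method) directly from the Java interface; no .avpr file involved.
        Protocol p = ReflectData.get().getProtocol(BlockReportProtocol.class);
        System.out.println(p.toString(true));  // pretty-printed protocol JSON
      }
    }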
My rationale is that I don't want to assume we'll move Hadoop onto Avro
overnight. So I'd like to stage the work in a way that's easy to
maintain as a branch/patch. If we can get reflection to work, then we only need
to update two places per Hadoop protocol: where it calls RPC.getProxy()
and RPC.getServer(). Then we can start looking at performance. We
cannot commit Avro-based Hadoop RPC until performance is adequate, and
we don't want to have to maintain a patch that changes many central data
structures in Hadoop while we're testing and improving performance,
since that might take time.
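Concretely, those two call sites might change along these lines. Again a
sketch, not working Hadoop code: I've simplified the RPC.getProxy() and
RPC.getServer() signatures, and the Avro classes named here are the
reflect requestor/responder and socket transport from avro-ipc, whose
exact package names may differ.

    import java.net.InetSocketAddress;
    import org.apache.avro.ipc.SocketServer;
    import org.apache.avro.ipc.SocketTransceiver;
    import org.apache.avro.ipc.reflect.ReflectRequestor;
    import org.apache.avro.ipc.reflect.ReflectResponder;

    public class AvroRpcSwap {
      // Client side: where Hadoop calls RPC.getProxy(iface, version, addr, conf).
      static BlockReportProtocol getProxy(InetSocketAddress addr) throws Exception {
        return ReflectRequestor.getClient(
            BlockReportProtocol.class, new SocketTransceiver(addr));
      }

      // Server side: where Hadoop calls RPC.getServer(impl, bindAddress, port, conf).
      static SocketServer getServer(BlockReportProtocol impl, InetSocketAddress addr)
          throws Exception {
        return new SocketServer(
            new ReflectResponder(BlockReportProtocol.class, impl), addr);
      }
    }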
Once Avro-based RPC is committed to Hadoop, we can consider,
protocol-by-protocol, replacing Hadoop's Writable objects with Avro
generated objects. Until then, the protocol will be defined implicitly
by Java through reflection.
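In other words, under reflection a plain Java class carries the data
that a Writable carries today: Avro infers a record schema from its
fields, so nothing needs write()/readFields(). Another made-up example:

    // Hypothetical parameter type; Avro reflection derives a record
    // schema from the fields, so no explicit write()/readFields()
    // methods and no .avpr declaration are needed.
    public class BlockInfo {
      long blockId;
      long numBytes;
      String poolId;
    }

    // Used directly in a reflected interface method:
    //   BlockInfo getBlockInfo(long blockId);
    // The wire format is implied entirely by these Java declarations.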
Note that if performance is inadequate due to reflection itself, rather
than the client/server implementations, we might resort to byte-code
modification to accelerate it.
https://issues.apache.org/jira/browse/AVRO-143
This too would be a temporary measure. Longer-term, we should move
Hadoop to protocols declared in .avpr files, with classes generated from
them.
But I don't think that's practical in the short-term.
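For reference, an .avpr is just a JSON protocol declaration. Something
like the following (made-up names, matching the sketches above) would
replace the reflected interface, with Java classes generated from it:

    {
      "protocol": "BlockReportProtocol",
      "namespace": "org.apache.hadoop.hdfs",
      "types": [
        {"name": "BlockInfo", "type": "record", "fields": [
          {"name": "blockId", "type": "long"},
          {"name": "numBytes", "type": "long"},
          {"name": "poolId", "type": "string"}
        ]}
      ],
      "messages": {
        "getBlockInfo": {
          "request": [{"name": "blockId", "type": "long"}],
          "response": "BlockInfo"
        }
      }
    }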
My current short-term goal is to try to get Avro's reflection to the
point where it can implement NamenodeProtocol.
Does this make sense?
Doug