I agree that it makes sense. It also seems that the goal of seamlessly moving Hadoop interfaces to use Avro doesn't _require_ that the parameter names be correct, only that they be consistent. So, if we don't have paranamer or Java annotations, we can simply fall back to auto-generated names like string1, string2, int1, as in AVRO-164. That way we get a simple build (I really hate complicated builds, and it'll probably encourage Avro adoption if we don't need build-system changes).
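
Concretely, something like the sketch below could derive those fallback names from the method signature alone. This is just an illustration, not anything that exists in Avro or AVRO-164; the class and method names are made up:

    import java.lang.reflect.Method;
    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical helper: generates deterministic parameter names like
    // "string1", "string2", "int1" from a method's parameter types, so no
    // paranamer data or annotations are needed. The names stay consistent
    // as long as the method signature doesn't change.
    public class FallbackNames {
      public static String[] nameParameters(Method method) {
        Class<?>[] types = method.getParameterTypes();
        String[] names = new String[types.length];
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (int i = 0; i < types.length; i++) {
          String base = types[i].getSimpleName().toLowerCase();
          Integer seen = counts.get(base);
          int n = (seen == null) ? 1 : seen.intValue() + 1;
          counts.put(base, Integer.valueOf(n));
          names[i] = base + n;  // e.g. "string1", then "string2"
        }
        return names;
      }
    }

For example, a method rename(String src, String dst) would come out as string1 and string2 — consistent but not "correct", which is all we need.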
If in the future someone changes the method signature of a Hadoop interface in a way that would break serialization compatibility, then that might be the time to require annotations (my preference) or paranamer, or that they move to a .avpr! It would suck a little that this would effectively bake in names like 'string2' into the contract for Hadoop interfaces, but if we can make the barrier low enough, maybe someone would do a little work on the Hadoop interfaces. For me, a Java annotation or manually adding the paranamer metadata meets that threshold; adding a step to the build probably doesn't.

On the performance issue, I don't think reflection is actually that bad on Java 6 if you cache the Method/Field objects.

But I think we should allow external helper objects for serialization. The helper would implement the get/set field contract, but would act on a 'target object' (a rough sketch appears after the quoted thread below). This would allow us to use existing classes (which could have extra methods/logic). It would also let us use the existing Hadoop objects unchanged. We would have an implementation that works using runtime reflection, and we could also code-generate these just like we do in the Specific case. If people think this is a good idea I'll open a JIRA ticket for it (and maybe even work on it at the hackat...@digg!)

Justin

On Thu, Nov 12, 2009 at 10:43 PM, Philip Zeyliger <[email protected]> wrote:

> Makes sense to me. I think it may be useful to check in the .avpr files
> that are induced on the way, to let folks start trying to use different
> clients for certain operations.
>
> -- Philip
>
> On Wed, Nov 11, 2009 at 11:21 AM, Doug Cutting <[email protected]> wrote:
>
> > Justin Santa Barbara wrote:
> >
> >> What about Philip's point on existing Hadoop interfaces? Any plans for
> >> how we'll generate the Protocol object for these?
> >
> > I'm hoping to use reflection initially for this. That's the motivation
> > for my renewed interest in AVRO-80.
> >
> > https://issues.apache.org/jira/browse/AVRO-80
> >
> > My rationale is that I don't want to assume we'll move Hadoop onto Avro
> > overnight. So I'd like to move things in a way that's easy to maintain
> > in a branch/patch. If we can get reflection to work, then we only need
> > to update two places per Hadoop protocol: where it calls RPC.getProxy()
> > and RPC.getServer(). Then we can start looking at performance. We
> > cannot commit Avro-based Hadoop RPC until performance is adequate, and
> > we don't want to have to maintain a patch that changes many central
> > data structures in Hadoop while we're testing and improving
> > performance, since that might take time.
> >
> > Once we've committed Hadoop to using Avro, then we can consider,
> > protocol-by-protocol, replacing Hadoop's Writable objects with Avro
> > generated objects. Until then, the protocol will be defined implicitly
> > by Java through reflection.
> >
> > Note that if performance is inadequate due to reflection itself, rather
> > than the client/server implementations, we might resort to byte-code
> > modification to accelerate it.
> >
> > https://issues.apache.org/jira/browse/AVRO-143
> >
> > This would also be a temporary approach. Longer-term we should move
> > Hadoop to use protocols declared in .avpr files and generated classes.
> > But I don't think that's practical in the short-term.
> >
> > My current short-term goal is to try to get Avro's reflection to the
> > point where it can implement NamenodeProtocol.
> >
> > Does this make sense?
> >
> > Doug
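
Here's the rough sketch of the helper object mentioned above — just something to show the shape of the idea, not real Avro or Hadoop code (ReflectFieldAccessor and its methods are made-up names). The helper implements the get/set field contract against an external target object and caches the reflective Field lookups, so existing Hadoop classes can be serialized unchanged:

    import java.lang.reflect.Field;
    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical sketch: an external serialization helper that reads
    // and writes named fields on a target object via cached reflection.
    public class ReflectFieldAccessor {
      private final Class<?> type;
      private final Map<String, Field> cache = new HashMap<String, Field>();

      public ReflectFieldAccessor(Class<?> type) {
        this.type = type;
      }

      // Looks up (and caches) the Field object for a field name, so the
      // reflective lookup cost is paid only once per field.
      private Field field(String name) throws NoSuchFieldException {
        Field f = cache.get(name);
        if (f == null) {
          f = type.getDeclaredField(name);
          f.setAccessible(true);  // permit access to private fields
          cache.put(name, f);
        }
        return f;
      }

      // The "get" half of the contract: read a field from the target.
      public Object get(Object target, String name)
          throws NoSuchFieldException, IllegalAccessException {
        return field(name).get(target);
      }

      // The "set" half: write a field on the target.
      public void set(Object target, String name, Object value)
          throws NoSuchFieldException, IllegalAccessException {
        field(name).set(target, value);
      }
    }

A code-generated variant could implement the same contract with direct field access instead of reflection, which is the "just like we do in the Specific case" option.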
