[
https://issues.apache.org/jira/browse/HDFS-2060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068614#comment-13068614
]
Todd Lipcon commented on HDFS-2060:
-----------------------------------
We had a bit of discussion about this at the contributors meeting a few weeks
ago (the week of the summit). My takeaways from that meeting were:
- Several people expressed an opinion that it would be nicer to not have
protobuf-specific code in any HDFS classes. Sidd described the approach used in
MR2. If I understood him correctly, it uses a class structure like:
{code}
interface FooWireType {
  long getBlah();
  void setBlah(long x);
  // ... getters and setters ...
  // ... serialization/deserialization stuff? ...
}

class FooWireTypeProtoImpl implements FooWireType {
  // wraps FooWireProto, which is the generated class
}

interface WireTypeFactory {
  FooWireType createFooType();
  BarWireType createBarWireType();
}

class WireTypeProtoFactory implements WireTypeFactory {
  // returns *ProtoImpl implementations
}
{code}
The upside of this approach is that it would be possible to switch
serialization mechanisms (e.g., to Avro or Thrift) without changing any of the
code in the DFS layer: one would only need to implement a different
WireTypeFactory.
The downside of this approach is that it requires a bunch of boilerplate
interfaces and classes to be constructed. It would be possible to do this via
code-gen, but no one has a working code generator at this point.
- I argued that, while the above is nicer, it would be more expedient in the
short term to just implement this based on protobufs. I already summarized my
reasoning [in this
comment|https://issues.apache.org/jira/browse/HDFS-2058?focusedCommentId=13047289&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13047289].
The one-sentence version is that we need to move forward ASAP on this, and
having something that works now is better than taking months to do something
slightly more general.
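To make the factory-based structure from the first takeaway more concrete, here is a minimal Java sketch of how DFS-layer code could depend only on the interfaces. It is an illustration under assumptions, not the MR2 code: the toBytes/readFrom methods, FooWireTypeBufferImpl, BufferWireTypeFactory, and WireTypeSketch are hypothetical names, and a plain ByteBuffer encoding stands in for the generated protobuf class so that the example is self-contained.

```java
import java.nio.ByteBuffer;

// Serialization-agnostic view of a wire type, following the FooWireType idea.
interface FooWireType {
    long getBlah();
    void setBlah(long x);
    byte[] toBytes();            // hypothetical serialization hook
    void readFrom(byte[] data);  // hypothetical deserialization hook
}

interface WireTypeFactory {
    FooWireType createFooType();
}

// Stand-in for FooWireTypeProtoImpl. A real implementation would wrap the
// protobuf-generated class; this one uses a fixed 8-byte encoding instead.
final class FooWireTypeBufferImpl implements FooWireType {
    private long blah;

    @Override public long getBlah() { return blah; }
    @Override public void setBlah(long x) { blah = x; }

    @Override public byte[] toBytes() {
        return ByteBuffer.allocate(Long.BYTES).putLong(blah).array();
    }

    @Override public void readFrom(byte[] data) {
        blah = ByteBuffer.wrap(data).getLong();
    }
}

// Stand-in for WireTypeProtoFactory: the only place that names a concrete impl.
final class BufferWireTypeFactory implements WireTypeFactory {
    @Override public FooWireType createFooType() {
        return new FooWireTypeBufferImpl();
    }
}

final class WireTypeSketch {
    // "DFS layer" code: it sees only the interfaces, never the wire format.
    static byte[] encodeBlah(WireTypeFactory factory, long blah) {
        FooWireType msg = factory.createFooType();
        msg.setBlah(blah);
        return msg.toBytes();
    }

    static long decodeBlah(WireTypeFactory factory, byte[] data) {
        FooWireType msg = factory.createFooType();
        msg.readFrom(data);
        return msg.getBlah();
    }

    public static void main(String[] args) {
        WireTypeFactory factory = new BufferWireTypeFactory();
        byte[] wire = encodeBlah(factory, 42L);
        System.out.println(decodeBlah(factory, wire)); // prints 42
    }
}
```

Swapping serialization mechanisms in this sketch means passing a different WireTypeFactory to encodeBlah and decodeBlah. It also shows the boilerplate cost: every wire type needs an interface, an implementation, and a factory method.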
So, I would like to propose moving forward with the approach I outlined in this
JIRA and the demonstration patch. I can commit time to doing this. If others
find the approach unsatisfactory and can commit time to doing the more general
mechanism on trunk in the short term, that would be great, but I don't want to
put off client compatibility much longer. I also don't think we should move
forward with the general mechanism until we have a reasonable code-gen
infrastructure ready -- it's just too much boilerplate to write and maintain.
> DFS client RPCs using protobufs
> -------------------------------
>
> Key: HDFS-2060
> URL: https://issues.apache.org/jira/browse/HDFS-2060
> Project: Hadoop HDFS
> Issue Type: New Feature
> Affects Versions: 0.23.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Attachments: hdfs-2060-getblocklocations.txt
>
>
> The most important place for wire-compatibility in DFS is between clients and
> the cluster, since lockstep upgrade is very difficult and a single client may
> want to talk to multiple server versions. So, I'd like to focus this JIRA on
> making the RPCs between the DFS client and the NN/DNs wire-compatible using
> protocol buffer based serialization.