[jira] [Commented] (YARN-1871) We should eliminate writing *PBImpl code in YARN
[ https://issues.apache.org/jira/browse/YARN-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14276086#comment-14276086 ]

Wangda Tan commented on YARN-1871:
----------------------------------

Unassigning this issue since I don't have the bandwidth to work on it now.

We should eliminate writing *PBImpl code in YARN
------------------------------------------------

Key: YARN-1871
URL: https://issues.apache.org/jira/browse/YARN-1871
Project: Hadoop YARN
Issue Type: Improvement
Components: api
Affects Versions: 2.4.0
Reporter: Wangda Tan
Attachments: YARN-1871.demo.patch

Currently, we need to write PBImpl classes one by one. Running find . -name "*PBImpl*.java" | xargs wc -l under the Hadoop source directory shows more than 25,000 LOC. I think we should improve this; it would make it much easier for YARN developers to change YARN protocols.

There are only a limited number of patterns in the current *PBImpl classes:
* Simple types, like string, int32, float.
* List<?> types
* Map<?, ?> types
* Enum types

Code generation should be enough to produce such PBImpl classes. Some other requirements are:
* Leave other related code alone, like service implementations (e.g. ContainerManagerImpl).
* (If possible) Forward compatibility: developers can either write their own PBImpl classes or generate them.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1871) We should eliminate writing *PBImpl code in YARN
[ https://issues.apache.org/jira/browse/YARN-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14008650#comment-14008650 ]

Binglin Chang commented on YARN-1871:
-------------------------------------

Good idea to eliminate the PBImpl code. Some comments:

bq. Make record class become a non-abstract class, add simple getters/setters implementation.

I like using a simple getter/setter implementation rather than the current complex cache that mixes builder, proto, and fields. It would be better to keep the old API unchanged and hide the implementation. If we can generate PBImpl toProto/toRecord, why can't we generate simple getters/setters too?

bq. serialization a record to Proto type using reflection

We had better generate code. Reflection can be used in test code, but I'm afraid using reflection to serialize/deserialize in RPC code is not acceptable.

bq. There are only some limited patterns in current *PBImpl

There are some complex situations: read-only types/properties, generic types, recursive types, and name mismatches between record and proto. While working on YARN-2051, I found some situations that are hard to automate and need special treatment.
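To make the "generate simple getters/setters" suggestion concrete, here is a minimal sketch of a source generator for the simple-type pattern only. It is not the attached demo patch and not how real PBImpl classes are structured (those wrap a proto and a builder); the class name PBImplGeneratorSketch, its methods, and the field list are all assumptions made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical generator; class and method names are illustrative only.
class PBImplGeneratorSketch {

  // "id" -> "Id", used to build getter/setter names.
  static String capitalize(String name) {
    return Character.toUpperCase(name.charAt(0)) + name.substring(1);
  }

  // Emits Java source for a plain class with one private field plus a
  // getter and a setter per entry (field name -> Java type).
  static String generate(String className, Map<String, String> fields) {
    StringBuilder src = new StringBuilder();
    src.append("public class ").append(className).append(" {\n");
    for (Map.Entry<String, String> f : fields.entrySet()) {
      String name = f.getKey();
      String type = f.getValue();
      String cap = capitalize(name);
      src.append("  private ").append(type).append(' ').append(name).append(";\n");
      src.append("  public ").append(type).append(" get").append(cap)
         .append("() { return ").append(name).append("; }\n");
      src.append("  public void set").append(cap).append('(').append(type).append(' ')
         .append(name).append(") { this.").append(name).append(" = ").append(name)
         .append("; }\n");
    }
    src.append("}\n");
    return src.toString();
  }

  public static void main(String[] args) {
    // Assumed field list; a real tool would read it from the record or .proto.
    Map<String, String> fields = new LinkedHashMap<>();
    fields.put("clusterTimestamp", "long");
    fields.put("id", "int");
    System.out.print(generate("ApplicationIdRecord", fields));
  }
}
```

The complex situations listed above (read-only properties, generic and recursive types, record/proto name mismatches) are exactly what a generator this simple cannot handle, so they would still need hand-written code or extra generator rules.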
[jira] [Commented] (YARN-1871) We should eliminate writing *PBImpl code in YARN
[ https://issues.apache.org/jira/browse/YARN-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946460#comment-13946460 ]

Wangda Tan commented on YARN-1871:
----------------------------------

Here are some possible ways I have in mind to eliminate hand-written PBImpl source code:

1. Use a Java annotation processor (RetentionPolicy=SOURCE); an example is the [google auto|https://github.com/google/auto] project. We can put an annotation on record classes, like
{code}
@GeneratePBImpl(protoclass = "org.apache.hadoop.yarn.proto.YarnProtos.ApplicationIdProto")
public abstract class ApplicationId {
  ...
}
{code}
Then we can implement a GeneratePBImpl annotation processor that generates the PBImpl code at compile time.

2. Use a Protocol Buffers parser to parse the .proto files directly and generate the PBImpl code. With a PB parser we can read the message descriptions, fields, and types from the .proto file and generate code from them. Unfortunately, PB doesn't provide a Java-based parser, so we would need to write a C-based program that uses such a parser (see [issue-263|https://code.google.com/p/protobuf/issues/detail?id=263]).

3. Similar to the @AtMostOnce annotation, make ser/de a runtime behavior. With this method we don't need to generate PBImpl source code or classes. We can create a RetentionPolicy=RUNTIME annotation and mark record classes with it, such as
{code}
@RecordClass(protoclass = "org.apache.hadoop.yarn.proto.YarnProtos.ApplicationIdProto")
public abstract class ApplicationId {
  ...
}
{code}
As with the @AtMostOnce annotation, when we need to serialize/deserialize a class, we check at runtime whether it is a "record class". If it is, we can simply use its getters/setters and the PB-generated class (*Proto) to do the serialization/deserialization.

Any other thoughts on this? I hope to hear your ideas.
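As a rough illustration of method 3, the sketch below marks a toy record with a runtime-retained annotation and collects its getter values by reflection. Everything here is an assumption for illustration: the RecordClass annotation does not exist in YARN, DemoApplicationId is a stand-in for a real record, and a plain Map stands in for the PB-generated *Proto builder so that the example runs without the protobuf library.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical marker annotation, kept at runtime so it can be checked reflectively.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface RecordClass {
  String protoclass();
}

// Toy record standing in for a YARN record such as ApplicationId.
@RecordClass(protoclass = "org.apache.hadoop.yarn.proto.YarnProtos.ApplicationIdProto")
class DemoApplicationId {
  private long clusterTimestamp;
  private int id;
  public long getClusterTimestamp() { return clusterTimestamp; }
  public void setClusterTimestamp(long clusterTimestamp) { this.clusterTimestamp = clusterTimestamp; }
  public int getId() { return id; }
  public void setId(int id) { this.id = id; }
}

class RecordSerDeSketch {

  // Runtime check: is this type marked as a "record class"?
  static boolean isRecordClass(Class<?> type) {
    return type.isAnnotationPresent(RecordClass.class);
  }

  // Collects property name -> value from the public getters. A real
  // implementation would set these values on a *Proto builder instead.
  static Map<String, Object> toFieldMap(Object record) {
    Class<?> type = record.getClass();
    if (!isRecordClass(type)) {
      throw new IllegalArgumentException(type.getName() + " is not a record class");
    }
    Map<String, Object> fields = new TreeMap<>();
    try {
      for (Method m : type.getMethods()) {
        String name = m.getName();
        boolean isGetter = name.startsWith("get") && name.length() > 3
            && m.getParameterCount() == 0
            && m.getDeclaringClass() != Object.class; // skips getClass()
        if (isGetter) {
          String property = Character.toLowerCase(name.charAt(3)) + name.substring(4);
          fields.put(property, m.invoke(record));
        }
      }
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Failed to read record fields", e);
    }
    return fields;
  }

  public static void main(String[] args) {
    DemoApplicationId appId = new DemoApplicationId();
    appId.setClusterTimestamp(1234L);
    appId.setId(7);
    System.out.println(toFieldMap(appId));
  }
}
```

The trade-off is that every serialization pays for reflective lookups and invocations, whereas methods 1 and 2 move that work to build time.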