[jira] [Commented] (YARN-1871) We should eliminate writing *PBImpl code in YARN

2015-01-13 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14276086#comment-14276086
 ] 

Wangda Tan commented on YARN-1871:
--

Unassigning this since I don't have the bandwidth to work on it right now.

 We should eliminate writing *PBImpl code in YARN
 

 Key: YARN-1871
 URL: https://issues.apache.org/jira/browse/YARN-1871
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api
Affects Versions: 2.4.0
Reporter: Wangda Tan
 Attachments: YARN-1871.demo.patch


 Currently, we need to write PBImpl classes one by one. Running find . 
 -name "*PBImpl*.java" | xargs wc -l under the Hadoop source directory shows 
 more than 25,000 LOC. I think we should improve this, which will make it 
 much easier for YARN developers to change YARN protocols.
 There are only a limited number of patterns in the current *PBImpl classes:
 * Simple types, like string, int32, float.
 * List<?> types
 * Map<?, ?> types
 * Enum types
 Code generation should be enough to produce such PBImpl classes.
 Some other requirements are:
 * Leave other related code alone, like service implementations (e.g. 
 ContainerManagerImpl).
 * (If possible) Forward compatibility: developers can write their own PBImpl 
 classes or generate them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1871) We should eliminate writing *PBImpl code in YARN

2014-05-26 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14008650#comment-14008650
 ] 

Binglin Chang commented on YARN-1871:
-

Good idea on eliminating PBImpl code, some comments:
bq. Make record class become a non-abstract class, add simple getters/setters 
implementation.
I like using a simple getter/setter implementation rather than the current 
complex mix of builder, proto, and field caches.
It's better to keep the old API unchanged and hide the implementation. If we can 
generate PBImpl toProto/toRecord, why can't we generate simple getters/setters?

bq. serialization a record to Proto type using reflection
We'd better generate code. Reflection is fine in test code, but I'm afraid 
using reflection to serialize/deserialize in RPC code is not acceptable.

bq. There are only some limited patterns in current *PBImpl
There are some complex situations: read-only types/properties, generic types, 
recursive types, and name mismatches between record and proto. When doing 
YARN-2051, I found some situations that are hard to automate and need special 
treatment.
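
To illustrate what "generate simple getters/setters" could look like, here is a minimal sketch, assuming the generator is given a list of field name/type pairs. SimpleRecordGenerator and generate are hypothetical names, not existing YARN code, and the sketch covers only the simple-type pattern; none of the hard cases (read-only properties, generic types, recursive types, name mismatches) are handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical generator: emits a plain record class from field name/type pairs. */
public class SimpleRecordGenerator {

  /** Returns Java source for a class with one private field, getter, and setter per entry. */
  public static String generate(String className, Map<String, String> fields) {
    StringBuilder src = new StringBuilder();
    src.append("public class ").append(className).append(" {\n");
    // Field declarations, in the order the caller supplied them.
    for (Map.Entry<String, String> f : fields.entrySet()) {
      src.append("  private ").append(f.getValue()).append(' ')
         .append(f.getKey()).append(";\n");
    }
    // One getter and one setter per field.
    for (Map.Entry<String, String> f : fields.entrySet()) {
      String name = f.getKey();
      String type = f.getValue();
      String cap = Character.toUpperCase(name.charAt(0)) + name.substring(1);
      src.append("  public ").append(type).append(" get").append(cap)
         .append("() { return ").append(name).append("; }\n");
      src.append("  public void set").append(cap).append('(').append(type)
         .append(' ').append(name).append(") { this.").append(name)
         .append(" = ").append(name).append("; }\n");
    }
    src.append("}\n");
    return src.toString();
  }

  public static void main(String[] args) {
    // Assumed field list for illustration only; a real generator would read it
    // from the record class or the .proto definition.
    Map<String, String> fields = new LinkedHashMap<>();
    fields.put("clusterTimestamp", "long");
    fields.put("id", "int");
    System.out.print(generate("ApplicationId", fields));
  }
}
```

The same loop could emit toProto/toRecord bodies, so the RPC path would run generated code rather than reflection.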




 We should eliminate writing *PBImpl code in YARN
 

 Key: YARN-1871
 URL: https://issues.apache.org/jira/browse/YARN-1871
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api
Affects Versions: 2.4.0
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-1871.demo.patch




[jira] [Commented] (YARN-1871) We should eliminate writing *PBImpl code in YARN

2014-03-25 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946460#comment-13946460
 ] 

Wangda Tan commented on YARN-1871:
--

Some possible ways to eliminate hand-written PBImpl source code that come to mind:
1. Use a Java annotation processor (RetentionPolicy=SOURCE); an example is the 
[google auto|https://github.com/google/auto] project. We can put an annotation 
on record classes, like
{code}
@GeneratePBImpl(protoclass="org.apache.hadoop.yarn.proto.YarnProtos.ApplicationIdProto")
public abstract class ApplicationId {
   ...
}
{code}
Then we can implement a GeneratePBImpl annotation processor that generates the 
PBImpl code at compile time.

2. Use a Protocol Buffers parser to parse the .proto files directly and generate 
PBImpl code.
With a PB parser we can read the message descriptions, fields, and types from 
the .proto file and generate code from them. Unfortunately, PB doesn't provide 
a Java-based parser, so we would need to write a C-based program using such 
parsers (see [issue-263|https://code.google.com/p/protobuf/issues/detail?id=263]).

3. Similar to the @AtMostOnce annotation, make ser/de a runtime behavior.
With this method, we don't need to generate PBImpl source code or classes. We 
can create an annotation with RetentionPolicy=RUNTIME and mark record classes 
with it, such as,

{code}
@RecordClass(protoclass="org.apache.hadoop.yarn.proto.YarnProtos.ApplicationIdProto")
public abstract class ApplicationId {
   ...
}
{code} 
Similar to the @AtMostOnce annotation, when we need to serialize/deserialize 
this class, we check at runtime whether it is a "record class". If it is, we can 
simply use its getters/setters and the PB-generated class (*Proto) to do the 
serialization/deserialization.
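
To make option 3 concrete, here is a minimal sketch of the runtime check plus getter/setter-driven ser/de, not a proposal for the final design. Everything in it is hypothetical: the RecordClass annotation is declared locally, ApplicationId is a simplified non-abstract stand-in for the real record, and a plain Map stands in for the PB-generated *Proto builder so the sketch runs without protobuf on the classpath.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

public class RuntimeRecordSketch {

  /** Hypothetical runtime marker for record classes. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  public @interface RecordClass {
    String protoclass();
  }

  /** Simplified stand-in for a YARN record, with plain getters/setters. */
  @RecordClass(protoclass = "org.apache.hadoop.yarn.proto.YarnProtos.ApplicationIdProto")
  public static class ApplicationId {
    private long clusterTimestamp;
    private int id;

    public long getClusterTimestamp() { return clusterTimestamp; }
    public void setClusterTimestamp(long clusterTimestamp) { this.clusterTimestamp = clusterTimestamp; }
    public int getId() { return id; }
    public void setId(int id) { this.id = id; }
  }

  /** The runtime check: is this type marked as a record class? */
  public static boolean isRecordClass(Class<?> type) {
    return type.isAnnotationPresent(RecordClass.class);
  }

  /** Copies every getter value into a map; the map stands in for a *Proto builder. */
  public static Map<String, Object> serialize(Object record) {
    Class<?> type = record.getClass();
    if (!isRecordClass(type)) {
      throw new IllegalArgumentException(type.getName() + " is not a record class");
    }
    Map<String, Object> fields = new TreeMap<>();
    try {
      for (Method m : type.getMethods()) {
        String name = m.getName();
        // Skip getClass() and anything else inherited from Object.
        boolean isGetter = name.startsWith("get") && name.length() > 3
            && m.getParameterCount() == 0 && m.getDeclaringClass() != Object.class;
        if (isGetter) {
          fields.put(name.substring(3), m.invoke(record));
        }
      }
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
    return fields;
  }

  /** Creates a record through its no-arg constructor and fills it through its setters. */
  public static <T> T deserialize(Class<T> type, Map<String, Object> fields) {
    if (!isRecordClass(type)) {
      throw new IllegalArgumentException(type.getName() + " is not a record class");
    }
    try {
      T record = type.getDeclaredConstructor().newInstance();
      for (Method m : type.getMethods()) {
        String name = m.getName();
        if (name.startsWith("set") && name.length() > 3 && m.getParameterCount() == 1
            && fields.containsKey(name.substring(3))) {
          m.invoke(record, fields.get(name.substring(3)));
        }
      }
      return record;
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    ApplicationId appId = new ApplicationId();
    appId.setClusterTimestamp(1395705600000L);
    appId.setId(7);
    Map<String, Object> fields = serialize(appId);
    System.out.println(fields);
    System.out.println(deserialize(ApplicationId.class, fields).getId());
  }
}
```

A real version would look up the class named by protoclass and call its builder instead of filling a Map, and would need to cache the Method lookups to keep the per-call reflection cost down.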

Any other thoughts on this? Hope to get your ideas.

 We should eliminate writing *PBImpl code in YARN
 

 Key: YARN-1871
 URL: https://issues.apache.org/jira/browse/YARN-1871
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api
Affects Versions: 2.4.0
Reporter: Wangda Tan
Assignee: Wangda Tan
