[jira] [Commented] (HDFS-2058) DataTransfer Protocol using protobufs

2011-06-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13051036#comment-13051036
 ] 

Hudson commented on HDFS-2058:
--

Integrated in Hadoop-Hdfs-trunk #699 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/699/])


 DataTransfer Protocol using protobufs
 -

 Key: HDFS-2058
 URL: https://issues.apache.org/jira/browse/HDFS-2058
 Project: Hadoop HDFS
  Issue Type: New Feature
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 0.23.0

 Attachments: HDFS-2058.patch, hdfs-2058.txt, hdfs-2058.txt, 
 hdfs-2058.txt, hdfs-2058.txt, hdfs-2058.txt


 We've been talking about this for a long time... would be nice to use 
 something like protobufs or Thrift for some of our wire protocols.
 I knocked together a prototype of DataTransferProtocol on top of proto bufs 
 that seems to work.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2058) DataTransfer Protocol using protobufs

2011-06-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050115#comment-13050115
 ] 

Hudson commented on HDFS-2058:
--

Integrated in Hadoop-Hdfs-trunk-Commit #746 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/746/])






[jira] [Commented] (HDFS-2058) DataTransfer Protocol using protobufs

2011-06-10 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047049#comment-13047049
 ] 

Todd Lipcon commented on HDFS-2058:
---

P.S. don't be alarmed at the size of the patch. If you exclude generated code 
it's only:

 16 files changed, 805 insertions(+), 673 deletions(-)




[jira] [Commented] (HDFS-2058) DataTransfer Protocol using protobufs

2011-06-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047055#comment-13047055
 ] 

Hadoop QA commented on HDFS-2058:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12482031/hdfs-2058.txt
  against trunk revision 1134170.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 15 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The patch appears to cause tar ant target to fail.

-1 findbugs.  The patch appears to cause Findbugs (version 1.3.9) to fail.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:


-1 contrib tests.  The patch failed contrib unit tests.

-1 system test framework.  The patch failed system test framework compile.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/760//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/760//console

This message is automatically generated.



[jira] [Commented] (HDFS-2058) DataTransfer Protocol using protobufs

2011-06-10 Thread M. C. Srivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047215#comment-13047215
 ] 

M. C. Srivas commented on HDFS-2058:


I thought Hadoop was standardizing on Avro everywhere. Any reason this work is 
not in Avro?



[jira] [Commented] (HDFS-2058) DataTransfer Protocol using protobufs

2011-06-10 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047289#comment-13047289
 ] 

Todd Lipcon commented on HDFS-2058:
---

I don't want this JIRA to devolve into a discussion of the merits and demerits 
of various serialization frameworks. In the past those discussions have been 
what resulted in us picking _no_ framework instead of just getting it done with 
_something_.

That said, here is my quick summary of why I picked protobufs vs Avro and 
Thrift:

h3. Avro
Avro is a fantastic data serialization framework with the following unique 
features: (a) dynamic schema stored with the data, (b) very compact storage 
format, (c) a standardized container format, and (d) Java-based codegen that 
integrates easily into a build. Features A, B, and C are very good when you 
want to store a lot of data on disk: it's small, you can read it without 
knowing what someone else wrote, and it's splittable and compressible in MR. D 
is great since you don't need to make developers install anything.

For the case of the DataTransferProtocol and Hadoop RPC in general, features A 
through C are less useful. The different parts of HDFS may diverge slightly 
over time but there's no need to talk to a completely unknown server. 
Compactness is always a plus, but a 10-20% improvement on compactness of header 
data only translates to a 1% improvement of compactness on data transfer, 
since the ratio of data:header is very high. The storage format doesn't help 
any for RPC -- this is transient.
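
The compactness argument above can be checked with a quick calculation. This is only a sketch: the header and payload sizes are illustrative assumptions, not measurements of DataTransferProtocol, and the class and method names are made up for the example.

```java
// Back-of-the-envelope check of how much a smaller header shrinks a transfer.
// All byte counts used here are assumptions chosen for illustration.
final class HeaderOverhead {
    /** Fraction of total bytes saved when only the header shrinks by headerSavings. */
    static double totalSavings(long headerBytes, long payloadBytes, double headerSavings) {
        double before = headerBytes + payloadBytes;
        double after = headerBytes * (1.0 - headerSavings) + payloadBytes;
        return (before - after) / before;
    }

    public static void main(String[] args) {
        long header = 64;          // assumed header size in bytes
        long payload = 64 * 1024;  // assumed payload per packet in bytes
        for (double s : new double[] {0.10, 0.20}) {
            System.out.printf("header -%.0f%% => total -%.4f%%%n",
                    s * 100, totalSavings(header, payload, s) * 100);
        }
    }
}
```

With a payload this much larger than the header, even the 20% case saves well under one percent of the bytes on the wire.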

In addition, the dynamic nature of Avro requires the readers and writers know 
the schema of their peer in order to communicate. This has to be done with a 
handshake of some kind. It would certainly be possible to implement this, but 
in order to do it without an extra round trip you need to add schema 
dictionaries, hashes, etc. Plus, the peer's schema needs to be threaded 
throughout the places where serialization/deserialization is done. This is 
possible, but I didn't want to do this work.

h3. Thrift vs Protobufs
I like Thrift a lot -- in fact I'm a Thrift committer and PMC member. So it 
might seem strange that I didn't pick Thrift. Here's my thinking:
- Thrift and Protobuf are more or less equivalent: tagged serialization, 
codegen tool written in C++, good language support, mature wire format
- Thrift has the plus side that it's a true open source community at the ASF 
with some committer overlap with the people working on Hadoop
- Protobufs has the plus side that, apparently, MR2/YARN has chosen it for 
their RPC formats.
- Protobuf has two nice features that thrift doesn't have yet: 1) when unknown 
data is read, it is maintained in a map and then put back on the wire if the 
same object is rewritten. 2) it has a decent plugin system that makes it easy 
to modify the generated code -- even with a plugin written in python or Java, 
in theory. These could be implemented in Thrift, but again, I didn't want to 
take the time.
- Thrift's main advantage vs protobufs is a standardized RPC wire format and 
set of clients/servers. I don't think the Java implementations in Thrift are 
nearly as mature as the Hadoop RPC stack, and swapping out for entirely new RPC 
transport is a lot more work than just switching serialization mechanisms. 
Since we already have a pretty good (albeit nonstandard) RPC stack, this 
advantage of Thrift is less of a big deal.
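
The unknown-field behavior mentioned in the list above can be illustrated with a toy tagged record. This is not the protobuf wire format or API; the fixed-width (tag, value) layout, the class name, and the tag number are assumptions made purely to show the idea of keeping unrecognized fields in a map and writing them back out.

```java
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy tagged record: NOT the protobuf wire format, only the same idea. */
final class TaggedRecord {
    static final int TAG_BLOCK_ID = 1;  // the only tag this "old" reader understands

    long blockId;                                              // known field
    final Map<Integer, Long> unknown = new LinkedHashMap<>();  // fields from newer writers

    /** Reads (int tag, long value) pairs until the buffer is exhausted. */
    static TaggedRecord parse(byte[] wire) {
        TaggedRecord r = new TaggedRecord();
        ByteBuffer buf = ByteBuffer.wrap(wire);
        while (buf.hasRemaining()) {
            int tag = buf.getInt();
            long value = buf.getLong();
            if (tag == TAG_BLOCK_ID) {
                r.blockId = value;
            } else {
                r.unknown.put(tag, value);  // preserve instead of dropping
            }
        }
        return r;
    }

    /** Writes the known field, then replays every preserved unknown field. */
    byte[] serialize() {
        ByteBuffer buf = ByteBuffer.allocate(12 * (1 + unknown.size()));
        buf.putInt(TAG_BLOCK_ID).putLong(blockId);
        for (Map.Entry<Integer, Long> e : unknown.entrySet()) {
            buf.putInt(e.getKey()).putLong(e.getValue());
        }
        return buf.array();
    }
}
```

An intermediate node built against an older definition can parse a message, change `blockId`, and forward it without silently stripping fields added by a newer writer.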

h3. Conclusions

- In the end I was torn between protobufs and Thrift. Mostly since MR2 uses 
Protobuf already, I just went with it.
- I think protobufs is a good choice for wire format serialization. I still 
think Avro is a good choice for disk storage (eg perhaps using Avro to store a 
denser machine-readable version of the audit log). These two things can coexist 
just fine.
- There is working code attached to this JIRA. If you disagree with my thinking 
above, please do not post a comment; post a patch with your serialization of 
choice, showing that the code is at least as clean, the performance is 
comparable, and the tests pass.
- IMO, there are a lot of interesting things to work on and discuss in Hadoop; 
this is not one of them. Let's just get something (anything) in there that 
works and move on with our lives.


So, assuming I have support from a couple of committers, I will move forward to 
clean up this patch. As you can see above, it already works modulo some bug 
with block token. With some more comments, a little refactoring, and a build 
target to regenerate code, I think we could commit this.


[jira] [Commented] (HDFS-2058) DataTransfer Protocol using protobufs

2011-06-10 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047305#comment-13047305
 ] 

Suresh Srinivas commented on HDFS-2058:
---

This is really cool Todd! If you think this patch is ready for review, I will 
review it.



[jira] [Commented] (HDFS-2058) DataTransfer Protocol using protobufs

2011-06-10 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047308#comment-13047308
 ] 

Eli Collins commented on HDFS-2058:
---

@Todd, this is awesome, complementary to HADOOP-7347. I agree protobufs is a 
sound choice, thanks for spelling out your analysis. 

How about filing jiras for the other HDFS protocols/issues so we get a sense of 
the scope of the remaining effort?




[jira] [Commented] (HDFS-2058) DataTransfer Protocol using protobufs

2011-06-10 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047311#comment-13047311
 ] 

Todd Lipcon commented on HDFS-2058:
---

bq.  If you think this patch is ready for review, I will review it.

It's not polished as much as I'd like - eg each of the protobufs should have 
a nice bit of doc explaining when it's sent, and there's a little 
refactoring/moving of classes that could be done.

But, if you want to review it for the overall content and not polish, it 
should be fine to start on it now. I can post cleanup as a diff. Or, we could 
even commit it and then clean up and improve docs as a followup, since let's be 
honest - it's not that clean as it stands now :) Committing this quicker rather 
than slower will also save a lot of time. As you know, patches like this fall 
out of date fast and the merges are a pain.

bq. How about filing jiras for the other HDFS protocols/issues so we get a 
sense of the scope of the remaining effort?

Sure. I have an idea what I would tackle next. The nice thing is that this can 
be done incrementally -- ie it's OK to use the existing RPC for calls to the NN 
and use protobuf for data transfer. The two protocols are non-overlapping, and 
there is benefit even at the half done state.



[jira] [Commented] (HDFS-2058) DataTransfer Protocol using protobufs

2011-06-10 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047320#comment-13047320
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2058:
--

Nice work!



[jira] [Commented] (HDFS-2058) DataTransfer Protocol using protobufs

2011-06-10 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047328#comment-13047328
 ] 

Todd Lipcon commented on HDFS-2058:
---

btw, a lot of the credit for this work goes to Nicholas who has done a lot of 
good cleanup in DataTransferProtocol over the last few months.



[jira] [Commented] (HDFS-2058) DataTransfer Protocol using protobufs

2011-06-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047346#comment-13047346
 ] 

Hadoop QA commented on HDFS-2058:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12482078/hdfs-2058.txt
  against trunk revision 1134170.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 18 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 32 javac compiler warnings (more 
than the trunk's current 31 warnings).

-1 findbugs.  The patch appears to introduce 23 new Findbugs (version 
1.3.9) warnings.

-1 release audit.  The applied patch generated 3 release audit warnings 
(more than the trunk's current 0 warnings).

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.cli.TestHDFSCLI

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/765//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/765//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/765//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/765//console

This message is automatically generated.



[jira] [Commented] (HDFS-2058) DataTransfer Protocol using protobufs

2011-06-10 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047370#comment-13047370
 ] 

Todd Lipcon commented on HDFS-2058:
---

Cool, thanks Suresh. I'm working on some cleanup as well, I'll merge your 
changes into mine. A few comments:

bq. Should we make DatanodeInfoProto members required

As I was doing this I noted that it's kind of silly that we even send 
DatanodeInfo as part of the pipeline. All of the fields are unused for data 
transfer, right? All we really need is DatanodeId here, no?

bq. Eventually some of the messages can be common and could be moved out

Yep, in my patch I'm doing now, I'm refactoring out things like block/datanode 
into a different file, since as you said they're not datatransfer specific.

bq. I prefer generating DataTransferProtos.java instead of checking it in.

I agree long term. In the short term what I'd like to do is check it in but 
also provide an ant target or shell script that can be used to generate 
protobufs. This allows new developers to get up to speed quicker without having 
to set up the full toolchain. Longer term we could have an ant task which 
downloaded a protoc binary based on current platform and used that, perhaps.



[jira] [Commented] (HDFS-2058) DataTransfer Protocol using protobufs

2011-06-10 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047374#comment-13047374
 ] 

Suresh Srinivas commented on HDFS-2058:
---

 As I was doing this I noted that it's kind of silly that we even send 
 DatanodeInfo as part of the pipeline. All of the fields are unused for data 
 transfer, right? All we really need is DatanodeId here, no?

You are right. But do you think the information we are sending could be useful 
for the client? I prefer leaving it as it is for now.

+1 for the change. 

Since findbugs warnings are for generated code - we should just suppress them 
in src/test/findbugsExclude.xml. I have not looked at javac warnings.



[jira] [Commented] (HDFS-2058) DataTransfer Protocol using protobufs

2011-06-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047403#comment-13047403
 ] 

Hadoop QA commented on HDFS-2058:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12482087/HDFS-2058.patch
  against trunk revision 1134397.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 18 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 32 javac compiler warnings (more 
than the trunk's current 31 warnings).

-1 findbugs.  The patch appears to introduce 21 new Findbugs (version 
1.3.9) warnings.

-1 release audit.  The applied patch generated 2 release audit warnings 
(more than the trunk's current 0 warnings).

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.cli.TestHDFSCLI

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/767//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/767//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/767//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/767//console

This message is automatically generated.



[jira] [Commented] (HDFS-2058) DataTransfer Protocol using protobufs

2011-06-10 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047406#comment-13047406
 ] 

Arun C Murthy commented on HDFS-2058:
-

{quote}
Protobuf has two nice features that thrift doesn't have yet: 1) when unknown 
data is read, it is maintained in a map and then put back on the wire if the 
same object is rewritten. 2) it has a decent plugin system that makes it easy 
to modify the generated code – even with a plugin written in python or Java, in 
theory. These could be implemented in Thrift, but again, I didn't want to take 
the time.
{quote}

We had similar reasons for choosing PB for MRv2. Also, MRv2 has a pluggable 
layer, so in theory, it could be anything. We chose PB for the exact same 
reasons outlined by Todd.

But, I'd encourage keeping serialization and in-memory data-structures separate 
ala MR-279.

Mainly, we need to move ahead with something.

+1




[jira] [Commented] (HDFS-2058) DataTransfer Protocol using protobufs

2011-06-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047514#comment-13047514
 ] 

Hadoop QA commented on HDFS-2058:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12482098/hdfs-2058.txt
  against trunk revision 113.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 31 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.cli.TestHDFSCLI

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/768//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/768//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/768//console

This message is automatically generated.



[jira] [Commented] (HDFS-2058) DataTransfer Protocol using protobufs

2011-06-10 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047516#comment-13047516
 ] 

Suresh Srinivas commented on HDFS-2058:
---

Comments for the patch in this jira:
# Rename ProtoUtil.java to HDFSProtoUtil.java
# With change in build.xml, do you still need to add generated file?

Comments: (could be taken care of in a separate jira)
# Move ByteBufferOutputStream to common
# Create ProtoUtil.java in common and move vint related methods into that.
# ExactSizeInputStream constructor needs javadoc. Calling the parameter 
numBytes as remaining seems more appropriate.
# For the tests added instead of catching the exception with expected comments, 
you could do @Test(expected = EOFException.class)
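
For readers unfamiliar with the vint helpers mentioned above, here is a minimal sketch, assuming "vint" refers to protobuf-style base-128 variable-length integers. The class and method names are hypothetical and are not the ones in the patch.

```java
import java.io.ByteArrayOutputStream;

/** Illustrative base-128 varint codec for 32-bit values (hypothetical names). */
final class VarintSketch {
    /** Encodes value in 7-bit groups, least significant group first. */
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);  // high bit set: more bytes follow
            value >>>= 7;
        }
        out.write(value);                      // final byte has the high bit clear
        return out.toByteArray();
    }

    /** Decodes a single varint from the start of bytes. */
    static int decode(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            if (shift > 28) {
                throw new IllegalArgumentException("varint longer than 5 bytes");
            }
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
        throw new IllegalArgumentException("truncated varint");
    }
}
```

Small values fit in one byte, which is why this encoding suits length prefixes on short header messages.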




[jira] [Commented] (HDFS-2058) DataTransfer Protocol using protobufs

2011-06-10 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047753#comment-13047753
 ] 

Suresh Srinivas commented on HDFS-2058:
---

One thing missed - ExactSizeInputStream could move to common as well.



[jira] [Commented] (HDFS-2058) DataTransfer Protocol using protobufs

2011-06-10 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047767#comment-13047767
 ] 

Suresh Srinivas commented on HDFS-2058:
---

+1 for the change.

 it problematic since I want to only be sure that the exception is thrown at 
 that one spot in the test, and not earlier or later than it's supposed to be.

Your tests are small enough and your check is specific enough, so there is 
no other code that could throw EOFException.
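
For readers following the trade-off, here is a minimal sketch in plain Java 
(no JUnit, so it stays self-contained) of the two test shapes under 
discussion. The class and method names are hypothetical and none of this is 
taken from the patch. wholeMethodCheck only approximates what 
@Test(expected = EOFException.class) does, by accepting the exception from 
anywhere in the body; localizedCheck accepts it at one call only.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

public class ExpectedExceptionSketch {

  /** Localized check: only the marked call is allowed to throw EOFException. */
  public static void localizedCheck(byte[] data) throws IOException {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
    in.readByte(); // setup read; an EOFException here propagates as a failure
    try {
      in.readInt(); // the one spot expected to run out of input
      throw new AssertionError("expected EOFException");
    } catch (EOFException expected) {
      // expected: fewer than four bytes were left
    }
  }

  /** Whole-method check: any EOFException in the body counts as a pass. */
  public static boolean wholeMethodCheck(byte[] data) throws IOException {
    try {
      DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
      in.readByte();
      in.readInt();
      return false; // no exception at all: the test would fail
    } catch (EOFException e) {
      return true; // passes regardless of which call threw
    }
  }
}
```

With a two-byte input both shapes pass. With an empty input the setup read 
is the call that throws: wholeMethodCheck still reports a pass, while 
localizedCheck lets the exception escape. When the test body has only one 
call that can throw, as argued above, the two shapes are equivalent.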



[jira] [Commented] (HDFS-2058) DataTransfer Protocol using protobufs

2011-06-10 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047768#comment-13047768
 ] 

Suresh Srinivas commented on HDFS-2058:
---

 Your tests are small enough and your check is specific enough, so there 
 is no other code that could throw EOFException.

This is a minor thing. No need to hold up the patch for it. The patch is good 
to be committed.



[jira] [Commented] (HDFS-2058) DataTransfer Protocol using protobufs

2011-06-10 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047786#comment-13047786
 ] 

Todd Lipcon commented on HDFS-2058:
---

bq. This is a minor thing. No need to hold up the patch for it. The patch is 
good to be committed.

Awesome. I'll commit this assuming Hudson comes back green. I'm very excited 
by how fast this went from idea to completion -- 24 hrs! Thanks again.



[jira] [Commented] (HDFS-2058) DataTransfer Protocol using protobufs

2011-06-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047804#comment-13047804
 ] 

Hadoop QA commented on HDFS-2058:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12482115/hdfs-2058.txt
  against trunk revision 1134458.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 31 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.cli.TestHDFSCLI

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/771//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/771//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/771//console

This message is automatically generated.
