Re: [protobuf] Static allocation

2012-07-19 Thread Evan Jones
On Jul 18, 2012, at 16:14 , Jeremy wrote:
> I understand, but if one wants to keep a large persistent message allocated 
> and walk over it frequently, there is a price to pay on cache misses that can 
> be significant. 

I guess you are wishing that the memory layout was completely contiguous? Eg. 
if you have three string fields, that their memory would be laid out one field 
after another? Chances are good that with most dynamic memory allocators, if 
you allocate this specific sized message at one time, the fields will *likely* 
be contiguous or close to it, but obviously there are no guarantees. I would 
personally be surprised if these cache misses would be an important performance 
difference, but as usual there is only one way to tell: measure it.

If you want something like this in protobuf though, you would need to change a 
*lot* of the internals. This would not be a simple change. I suggest trying to 
re-use a message, and seeing if the performance is acceptable or not. If not, 
you'll need to find some other serialization solution. Good luck,

Evan

--
http://evanjones.ca/

-- 
You received this message because you are subscribed to the Google Groups 
"Protocol Buffers" group.
To post to this group, send email to protobuf@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] Static allocation

2012-07-17 Thread Evan Jones
On Jul 17, 2012, at 2:33 , Jeremy Swigart wrote:
> Is there a way to tell the proto compiler to generate message definitions for 
> which the message fields are statically defined rather than each individual 
> field allocated with dynamic memory? Obviously the repeated fields couldn't 
> be fully statically allocated (unless you could provide the compiler with a 
> max size), but it would be preferable to have the option to create messages 
> with minimal dynamic memory impact. Is this possible in the current library?

I'll assume you are talking C++. In this case, if you re-use a single message, 
it will re-use the dynamically allocated memory. This means that after the 
"maximal" message(s) have been parsed, it will no longer allocate memory. This 
is approximately equivalent to what you want. See Optimization Tips in:

https://developers.google.com/protocol-buffers/docs/cpptutorial

Hope that helps,

Evan

--
http://evanjones.ca/




Re: [protobuf] where is input_stream.py?

2012-07-16 Thread Evan Jones
On Jul 16, 2012, at 14:43 , jrf wrote:
> Is that because protobufs is "done" or not being further developed?  

You would need to get someone from Google to answer. The impression I get is 
that the open source release is, at the very least, in "maintenance" mode where 
they occasionally fix bugs etc.

Evan

--
http://evanjones.ca/




Re: [protobuf] where is input_stream.py?

2012-07-16 Thread Evan Jones
On Jul 14, 2012, at 10:55 , jrf wrote:
> Is there a reason that a python equivalent of CodedInputStream is not part of 
> protobuf?

I seem to recall that the answer is basically "yeah, it probably should be but 
no one really works on this stuff any more."

You can dig around in the google.protobuf.internal package to get what you 
need. See this thread:

https://groups.google.com/d/msg/protobuf/2m8ihEta1UU/1OOGmyfKP90J

Evan

--
http://evanjones.ca/




Re: [protobuf] Error while using parseFrom

2012-06-27 Thread Evan Jones
On Jun 26, 2012, at 11:08 , d34th4ck3r wrote:
> What is it that I am doing wrong?

Protocol buffers are a *binary* format. Those funny characters at the end of 
the string are probably part of the message, and you should leave them there. 
You also should not be passing them around as strings. They need to be passed 
as bytes. If you need to call getBytes("UTF-8") you are doing something wrong. 
Good luck,

Evan
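
To illustrate the bytes-versus-strings point, here is a minimal Python sketch (not from the original thread). The three-byte payload is the encoding of a message whose field 1 holds the varint 150; the variable names are made up for the example. Pushing those bytes through a UTF-8 string silently changes them, while a byte-safe text encoding such as base64 does not.

```python
import base64

# Serialized protobuf bytes: field 1 (varint) = 150.
# The byte 0x96 is not valid UTF-8 on its own.
payload = bytes([0x08, 0x96, 0x01])

# Wrong: treating the bytes as text. The invalid byte is replaced, so data is lost.
as_text = payload.decode("utf-8", errors="replace")
round_tripped = as_text.encode("utf-8")
print(round_tripped == payload)  # False

# If a text-only channel is unavoidable, encode the bytes (for example as base64).
encoded = base64.b64encode(payload).decode("ascii")
restored = base64.b64decode(encoded)
print(restored == payload)  # True
```

The same reasoning applies in Java: keep the serialized message in a byte[] (or ByteString) from end to end.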

--
http://evanjones.ca/




Re: [protobuf] Best practices for proto file organization in large projects

2012-06-20 Thread Evan Jones
On Jun 19, 2012, at 13:53 , Justin Muncaster wrote:
> 1>Running C++ protocol buffer compiler on common/bar/bar.proto
> 1>common/foo/foo.proto: File not found.
> 1>bar.proto: Import "common/foo/foo.proto" was not found or had errors.
> 
> I can fix the error by hacking FindProtobuf.cmake and passing in additional 
> include directories, but I run into problems down the line, which leads me to 
> think there must be a better way. Every example I see has all proto files in 
> one folder and does not have cross-library protobuf message dependencies.

This should work, and with the project you attached it does work (well, once I 
fixed a bad field number):

Yamnuska:project ej$ protoc --cpp_out=build common/bar/bar.proto
Yamnuska:project ej$ 


I don't know how CMake or this PROTOBUF_GENERATE_CPP rule works, but maybe you 
need to pass the appropriate --proto_path argument so it looks for the included 
.proto in the right place?
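
As a rough sketch of what passing --proto_path might look like when the compiler is not run from the project root, the Python helper below only builds the command line; it does not run protoc. The helper name and the /src/project path are assumptions for the example. --proto_path and --cpp_out are the real protoc flags.

```python
import os

def protoc_cpp_command(proto_root, out_dir, proto_file):
    """Build (but do not run) a protoc command line for one .proto file.

    proto_root is the directory that imports such as "common/foo/foo.proto"
    are relative to; proto_file is given relative to that root.
    """
    return [
        "protoc",
        "--proto_path=" + proto_root,
        "--cpp_out=" + out_dir,
        os.path.join(proto_root, proto_file),
    ]

# Hypothetical project location; bar.proto imports "common/foo/foo.proto".
cmd = protoc_cpp_command("/src/project", "build", "common/bar/bar.proto")
print(" ".join(cmd))
```

The list can be handed to subprocess.check_call(cmd) from a build script; the key point is that the import root, not the directory holding bar.proto, is what goes in --proto_path.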

Good luck,

Evan

--
http://evanjones.ca/




Re: [protobuf] Best practices for proto file organization in large projects

2012-06-19 Thread Evan Jones
On Jun 18, 2012, at 22:49 , Justin Muncaster wrote:
> I'm currently changing our build system to be cmake based and I'm again 
> finding myself fighting with the build system to get the .proto to be 
> automatically generated in a way where they build correctly.

What specific problems are you having? Errors in clean builds? Errors when 
modifying a .proto and rebuilding?


> How do you organize your proto files when you have many in common libraries? 
> Do all .proto files live in one folder? Should one avoid "import 
> a/b/c/d/f.proto"? Do you have any recommendations for how one ought one setup 
> the cmake build system to work with proto files that are organized as they 
> are above? Any general recommendations?

What I've done on my last project was to put all the .proto source code in 
their own "proto" directory. But this was a cross-language project, so I was 
accessing them from both C++ and Java, so that seemed to make the most sense to 
me. I configured the build to generate all C++ files into build/*, and the java 
files into build/java, then I included/compiled them from there.

The Chrome browser organizes its .proto files in a very different way:

http://src.chromium.org/viewvc/chrome/trunk/src/chrome/common/metrics/proto/

Hope this helps,

Evan

--
http://evanjones.ca/




Re: [protobuf] 1MB message limit (recommendation)

2012-05-30 Thread Evan Jones
On May 29, 2012, at 23:26 , msrobo wrote:
> According to the documentation, it's recommended that the message size
> be <= 1 Megabyte. I've searched around for the reason for this
> recommendation, but I can't seem to find anything. Based on some basic
> benchmarking serializing/unserializing messages ranging from a few KB
> to more than 1MB in C++ there doesn't seem to be a drastic increase in
> time. More specifically, it doesn't seem to be performance driven in a
> C++ application.

I think the main motivation is that there is no way to "seek" inside a protocol 
buffer, and you must load the entire thing into memory in one go. Hence when 
you get really large messages, you may need to allocate huge amounts of memory 
(the memory for the serialized buffer, and the memory for the entire protocol 
buffer object).

1 MB is just a recommendation, but there are also some internal default limits 
set to 64 MB for "security" issues: If you parse an enormous message, it 
requires allocating a ton of RAM. Hence the limits can prevent servers from 
running out of memory. If you have huge messages, you'll need to call the 
appropriate APIs to change the limits.

https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.io.coded_stream#CodedInputStream.SetTotalBytesLimit.details

Evan

--
http://evanjones.ca/




Re: [protobuf] message member case problem

2012-05-16 Thread Evan Jones
On May 16, 2012, at 5:02 , secondsquare wrote:
> After generating cpp files, the member becomes msgsize.
> 
> Big ‘S’ is changed to little 's'.

This is by design. Protocol buffers follows Google's style guide where C++ 
names_use_underscores while Java names useCamelCase. Protobuf will generate the 
appropriate names:

https://developers.google.com/protocol-buffers/docs/style


In other words, the recommendation is that you should use _ to separate words 
in your .proto.
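
A small Python sketch of the naming behaviour, as an approximation only: the real generators also handle keywords, digits and other corner cases, and the two helper names here are invented for the example. It shows why a field written as msgSize loses its capital letter in C++, while msg_size gives readable names in both languages.

```python
def cpp_accessor(field_name):
    # The C++ generator lowercases the field name: msgSize -> msgsize().
    return field_name.lower()

def java_getter(field_name):
    # The Java generator camel-cases on underscores: msg_size -> getMsgSize().
    words = [w for w in field_name.split("_") if w]
    return "get" + "".join(w[0].upper() + w[1:] for w in words)

for name in ("msgSize", "msg_size"):
    print(name, "->", cpp_accessor(name) + "()", java_getter(name) + "()")
```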

Evan

--
http://evanjones.ca/




Re: [protobuf] incompatible type changes philosophy

2012-05-10 Thread Evan Jones
On May 9, 2012, at 15:26 , Jeremy Stribling wrote:
> * There are two nodes, 1 and 2, running version A of the software.
> * They exchange messages containing protobuf P, which contains a string field 
> F.
> * We write a new version B of the software, which changes field F to an 
> integer as an optimization.
> * We upgrade node 1, but not node 2.
> * If node 1 sends a protobuf P to node 2, I want node 2 to be able to access 
> field F as a string, even though the wire format sent by node 1 was an 
> integer.


I think you can achieve your goals by building a layer on top of the existing 
protocol buffer parsing, possibly in combination with some custom options, a 
protoc plugin, and maybe a small tweak to the existing C++ code generator. You 
do the breaking change by effectively "renaming" the field, then using a protoc 
plugin to make it invisible to the application. To make this concrete, your 
Version A looks like:

    message P {
      optional string F = 1;
    }


Then Version B looks like the following:

    message P {
      optional string old_F = 1 [(custom_upgrade_option) = "some_upgrade_code"];
      optional int32 F = 2;
    }


With this structure, Version B can always parse a Version A message. Senders 
will always ensure there is only one version in the message, so the only thing 
you are "losing" here is a field number, which isn't a huge deal. However, you 
now want to automatically convert old_F to F. This can be done without 
changing the guts of the parser by writing a protoc plugin that generates a 
member function based on the custom option:

    // Emitted by the plugin. C++ accessors use the lower-cased field names.
    void UpgradeToLatest() {
      if (has_old_f()) {
        set_f(some_upgrade_code(old_f()));
        clear_old_f();
      }
    }


You then need to make sure that Version B of the software calls this everywhere 
it is needed. Maybe this argues that what is needed is a "post-processing" 
insertion point in ::MergePartialFromCodedStream? Then your protoc plugin could 
insert this call after a protocol buffer message is successfully parsed, so the 
application would only ever have to deal with the integer version.


In the other direction, I don't understand how the downgrading can possibly be 
done at the receiver, since it doesn't know how to do the downgrade (unless you 
are thinking about mobile code?). So in your example, Node 1 must create a 
Version A protocol buffer message when sending to Node 2. This means you need 
*some* sort of handshaking between Node 1 and Node 2, to indicate supported 
versions.

This is the reason I proposed adding some other member function that takes a 
"target_version", so the sender knows what to emit. If sending the same message 
to multiple recipients, you'll need to send the lowest version in the group. 
Based on the above, your plugin could emit:

    // Emitted by the plugin. C++ accessors use the lower-cased field names.
    void DowngradeToVersion(int target_version) {
      if (target_version < 0xB && has_f()) {
        set_old_f(some_downgrade_code(f()));
        clear_f();
      }
    }


There are many other ways you could do this, but it seems to me that this 
proposal is a way to do it without complicating the base protocol buffers 
library with application-specific details.
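
To make the upgrade/downgrade flow easier to try out, here is a plain-Python model of it. This is only a sketch: the class P stands in for the generated Version B message (None means "field not set"), and some_upgrade_code / some_downgrade_code are placeholder conversions (int() and str()) assumed for the example, not anything from a real plugin.

```python
class P:
    """Stand-in for the generated Version B message; None means "field not set"."""

    def __init__(self, old_f=None, f=None):
        self.old_f = old_f  # string field, number 1 (Version A's F)
        self.f = f          # int32 field, number 2 (Version B's F)

    def upgrade_to_latest(self):
        # Reading an older message: move the string value into the integer field.
        if self.old_f is not None:
            self.f = some_upgrade_code(self.old_f)
            self.old_f = None

    def downgrade_to_version(self, target_version):
        # Writing for an older receiver: emit the string field instead.
        if target_version < VERSION_B and self.f is not None:
            self.old_f = some_downgrade_code(self.f)
            self.f = None


VERSION_A, VERSION_B = 0xA, 0xB

def some_upgrade_code(text):      # placeholder conversion, assumed for this sketch
    return int(text)

def some_downgrade_code(number):  # placeholder conversion, assumed for this sketch
    return str(number)

received = P(old_f="42")          # as parsed from a Version A sender
received.upgrade_to_latest()
print(received.f, received.old_f)

outgoing = P(f=7)                 # built by Version B, destined for a Version A node
outgoing.downgrade_to_version(VERSION_A)
print(repr(outgoing.old_f), outgoing.f)
```

The sender still has to learn the receiver's version somehow (the handshaking mentioned above); the sketch just takes target_version as given.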

Evan

--
http://evanjones.ca/




Re: [protobuf] incompatible type changes philosophy

2012-05-09 Thread Evan Jones
On May 8, 2012, at 21:26 , Jeremy Stribling wrote:
> Thanks for the response.  As you say, this solution is painful because you 
> can't enable the optimization until the old version of the program is 
> completely deprecated.  This is somewhat simple in the case that you yourself 
> are deploying the software, but when you're shipping software to customers 
> (as we are) and have to support many old versions, it will take a very long 
> time (possibly years) before you can enable the optimization.  Also, it 
> breaks the downgrade path.  Once you enable the optimization, you can never 
> downgrade back to a version that did not know about the new field.

I think I now understand your problem. You want to add some additional stuff to 
your .proto file to indicate the incompatible change, then have the application 
code not need to know about it? Eg. you want to write the application code that 
only accesses "new_my_data" and never needs to check for "deprecated_my_data", 
but in fact the underlying protocol buffer supports both fields, or something 
like that.

It seems to me like this starts to end up in the territory of "too high 
level for the protocol buffer library itself" since I can't imagine this 
working without handshaking like Oliver talked about (e.g. "I understand 
everything up to version X"). My personal experience has been more like what 
Daniel describes: you keep both versions of the field, and your code has if 
statements to check for both. I believe this can be made to work, even in your 
scenario, but it does require ugly code in your application to handle it. My 
impression is that you are trying to avoid that.


Random brainstorming that may not be helpful in any way:

I'm curious about how you end up choosing to solve this, but I think you are 
going to need to use some combination of custom field options (to specify the 
change in a way that protoc can parse?), and then hacks in the C++ code 
generator  to call your custom upgrade / downgrade code. I think this can work 
somewhat seamlessly in the "reading older messages" case (eg. you just add code 
that says "if we see the old field, upgrade it to the new field"). However, 
this can't work in the "writing a newer message for an older receiver" case 
without making the Serialize* code aware of the version it should be *writing*. 
I think this is going to be pretty application specific?

My other thought: I think you might be able to get away with writing a protoc 
plugin that adds two functions to the class scope (which already exists as an 
insertion point):

static UpgradedMessage ParseAnyMessageVersion(…);
string SerializeToVersion(int target_version);

These functions can apply the appropriate upgrade/downgrading as needed. 
You then need to call the appropriate functions to read/write the 
messages, but I would argue that since in the serializing case you are 
going to need to know the target_version anyway, this might actually work?

Good luck, and again I'd be interested to know how you do end up solving this.

Evan

--
http://evanjones.ca/




Re: [protobuf] Protocol Buffers for IOS

2012-03-31 Thread Evan Jones
On Mar 31, 2012, at 4:31 , Dhanaraj G wrote:
> I have gone through the following link..
> http://code.google.com/p/metasyntactic/wiki/ProtocolBuffers


There is no "official" support but I've used the following distribution with 
success, with the latest protoc (I'm pretty sure):

https://github.com/booyah/protobuf-objc


Good luck,

Evan

--
http://evanjones.ca/




Re: [protobuf] [Java][Python][TCP] Reading messages that where written with writeDelimited and viceversa

2012-03-27 Thread Evan Jones
On Mar 26, 2012, at 21:49 , Galileo Sanchez wrote:
> Thanks man... It worked great! I guess I should read the documentation a 
> little more xP.

Sadly these functions aren't actually documented. The Python API doesn't expose 
these routines for some reason I don't understand / remember. Glad it worked!

Evan

--
http://evanjones.ca/




Re: [protobuf] [Java][Python][TCP] Reading messages that where written with writeDelimited and viceversa

2012-03-26 Thread Evan Jones
On Mar 25, 2012, at 18:09 , Galileo Sanchez wrote:
> else 
> if (Should I write the size as a raw bit string?)
> then How do I do that?


You need to use something like the following. Not 100% sure it works but it 
should be close? Hope this helps,

Evan


# Output a message to be read with Java's parseDelimitedFrom
import google.protobuf.internal.encoder as encoder
out = message.SerializeToString()
out = encoder._VarintBytes(len(out)) + out


# Read a message from Java's writeDelimitedTo:
import google.protobuf.internal.decoder as decoder

# Read length
(size, position) = decoder._DecodeVarint(buffer, 0)
# Read the message
message_object.ParseFromString(buffer[position:position+size])
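
If you would rather not depend on the underscore-prefixed internal helpers, the framing is small enough to write by hand. The following is a sketch in plain Python with no protobuf import; all four function names are made up for the example. It assumes the length prefix is a base-128 varint, which is what Java's writeDelimitedTo emits, and decode_varint returns the same (value, position) pair used above.

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)

def decode_varint(data, position=0):
    """Return (value, new_position) for the varint starting at position."""
    result = 0
    shift = 0
    while True:
        byte = data[position]
        position += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, position
        shift += 7

def write_delimited(serialized):
    """Prefix already-serialized message bytes with their varint length."""
    return encode_varint(len(serialized)) + serialized

def read_delimited(data, position=0):
    """Return (message_bytes, new_position) for one framed message."""
    size, position = decode_varint(data, position)
    return data[position:position + size], position + size

# Two framed payloads back to back; real code would pass each one to ParseFromString.
stream = write_delimited(b"hello") + write_delimited(b"x" * 200)
first, offset = read_delimited(stream)
second, offset = read_delimited(stream, offset)
print(first, len(second), offset == len(stream))
```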


--
http://evanjones.ca/




Re: [protobuf] Problem with accent

2012-03-26 Thread Evan Jones
On Mar 23, 2012, at 9:07 , Simon wrote:
> I have an annoying problem with some accent.
> I build my proto-object, no problem, and when i want to read it the
> browser, using .toString function, i have \303\240 instead of "à",
> \303\250 instead of "è", etc…

What do you mean "i want to read it the browser using .toString function"? Is 
this Java or C++ or something else? What does your message definition look like?

By default, protocol buffers encodes strings in UTF-8. These characters seem to 
be encoded correctly as UTF-8, so the "sending" side is doing the right thing, 
but the code that is reading them is not doing the correct decoding:

à = U+00E0

Escaped in hexadecimal this is: "\xc3\xa0"
Escaped in octal this is: "\303\240"
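
A quick way to check this, sketched in Python: the octal and hexadecimal escapes are the same two bytes, and decoding them as UTF-8 gives the single character U+00E0. Decoding as Latin-1 is shown only as one plausible example of the wrong decoding step; what your reading side actually does is unknown to me.

```python
raw = b"\303\240"                # octal escapes, as they appear in the output
print(raw == b"\xc3\xa0")        # True: the same two bytes written in hexadecimal

text = raw.decode("utf-8")       # correct: one character
print(len(text), hex(ord(text))) # 1 0xe0

wrong = raw.decode("latin-1")    # incorrect: each byte becomes its own character
print(len(wrong))                # 2
```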


So you need to decode from UTF-8 to get the correct characters. Hope this helps,

Evan

--
http://evanjones.ca/




Re: [protobuf] Protocol Buffers for version control of objects on a cache.

2012-03-21 Thread Evan Jones
On Mar 20, 2012, at 16:12 , Mick wrote:
> These objects are going to be accessible to multiple users, whose accessor 
> programs may be on different release cycles.  I have been looking into 
> protocol buffers as a way of managing dataloss/corruption between versions.  
>   Has anyone used protocol buffers to approach this type of problem before?

I'm not quite sure what you mean and what information you are looking for. 
However, protocol buffers were designed to help with this sort of problem, but 
it still requires care to make it work. Random notes off the top of my head:

* You may want to make all fields optional, since if a message is missing a 
required field it will fail to parse. Certainly all *new* fields *must* be 
optional.

* Protocol buffers only help with the parsing. You still need to think about 
forward and backwards compatibility (eg. how is your software going to process 
the messages).

* Passing messages through (eg. proxies or other tools) will work.

Hope this helps. Good luck,

Evan

--
http://evanjones.ca/




Re: [protobuf] Re: How to read continuous stream of messages from TCP

2012-03-08 Thread Evan Jones
On Mar 8, 2012, at 2:30 , waynix wrote:
> Since this is so common an issue and the suggested solution is almost
> de facto standard,  (saw this after my initial post:
> http://code.google.com/apis/protocolbuffers/docs/techniques.html), it
> begs the question of why not build it into protobuf proper.

Yeah, I would agree that something simple probably should have been included. 
The reasoning here is that this allows people to use protocol buffers with 
whatever other systems they might already be using (eg. HTTP, databases, files, 
RPC protocols, whatever), without being tied to a specific implementation. 
Compare the protocol buffer API to Thrift, for example, where the message 
serialization/deserialization is tied pretty tightly to the RPC system. There 
were proposals to possibly add a "protocol buffer utils" API, or a "streaming" 
API, but neither of those went anywhere. The closest thing is writeDelimitedTo 
/ mergeDelimitedFrom in the Java API:

http://code.google.com/apis/protocolbuffers/docs/reference/java/com/google/protobuf/MessageLite.html#writeDelimitedTo(java.io.OutputStream)

Evan

--
http://evanjones.ca/




Re: [protobuf] How to read continuous stream of messages from TCP

2012-03-06 Thread Evan Jones
On Feb 27, 2012, at 17:27 , waynix wrote:
> 1. Is this still the way to do it? Seems quite cumbersome (to lazy me ;-).  
> Is there  a wrapper built in to do this?

Yes. Sadly there is no wrapper included in the library.


> 2. If I understand Jason's suggestion right, the length is really not
> part of the message, and the sender has to explicitly set it, instead
> of having protobuf encode it in. Which means a generic third party
> sender using my .proto file would not be sufficient.  Plus how would
> they know the length before encoding the message proper? Filling it in
> after the fact would change the length again? or I am totally
> missing it.

As long as both sides encode the length in the same way, just having the right 
.proto will do the trick. 


> 3. A related question is in general do I have to manage reading of the
> socket, or for that matter any istream, and spoon feed the protobuf
> parser until it says OK, that's a whole message?

Basically yes. There is a sketch of some example code here:

https://groups.google.com/forum/?fromgroups#!searchin/protobuf/sequence/protobuf/pLwqN4jTVvY/60PBaEadW5IJ
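
For a rough idea of what "spoon feeding" looks like, here is a sketch in plain Python (no protobuf import). It is not the code from the linked thread. The class name is invented, and it assumes each message on the wire is preceded by a varint length; whatever prefix you pick, both ends must agree on it. Each chunk from recv() is fed in, and only complete serialized messages come out, ready for ParseFromString.

```python
class DelimitedReader:
    """Accumulates raw socket data and returns complete length-prefixed messages."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, chunk):
        """Add bytes from recv(); return the serialized messages now complete."""
        self._buffer.extend(chunk)
        messages = []
        while True:
            header = self._parse_length()
            if header is None:
                break  # the length prefix itself is still incomplete
            size, start = header
            if len(self._buffer) < start + size:
                break  # the body has not fully arrived yet
            messages.append(bytes(self._buffer[start:start + size]))
            del self._buffer[:start + size]
        return messages

    def _parse_length(self):
        """Return (size, header_length), or None if more bytes are needed."""
        result = 0
        for index, byte in enumerate(self._buffer):
            result |= (byte & 0x7F) << (7 * index)
            if not byte & 0x80:
                return result, index + 1
        return None

reader = DelimitedReader()
print(reader.feed(b"\x03ab"))    # []        (body incomplete)
print(reader.feed(b"c\x02h"))    # [b'abc']  (second message still partial)
print(reader.feed(b"i"))         # [b'hi']
```

A production version would also cap the accepted size, so a bogus length prefix cannot make the buffer grow without bound.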


Good luck,

Evan

--
http://evanjones.ca/




Re: [protobuf] Message thread safety in Java

2012-02-20 Thread Evan Jones
On Feb 20, 2012, at 16:20 , Christopher Smith wrote:
> Message objects *don't* have mutators and are conceptually a copy of the 
> relevant builder object.

Having attempted to refresh my knowledge of the Java Memory Model, I think 
there is a subtle difference between an object that has all final fields, and 
an "immutable" object like the protobuf messages. However, I don't think it 
matters in reality: As long as the message is "correctly published" to other 
threads (eg. a synchronized block, volatile reference, concurrent data 
structure), then everything is fine. Since everyone *should* be doing this 
already, Messages are safe to use across multiple threads.

Evan


PS. For language lawyers: I *think* the potential difference is as follows: 
Writes to final fields in a constructor are guaranteed to be visible to all 
threads when the constructor exits. So if you had the following:

static FinalImmutableObject someRef = ...;

Then if another thread sees a non-null value for someRef, it will correctly see 
all the values of the final fields. On the other hand, if you do this with a 
protobuf message, it *theoretically* could see a non-null value for someRef, 
but still see uninitialized or incorrectly initialized values for fields in 
someRef.

This is because this static variable is not synchronized or volatile, so there 
is no "happens-before" relationship between two threads. Thus, the reads on one 
thread *could* be reordered before the writes on the other thread. References:

http://java.sun.com/docs/books/jls/third_edition/html/memory.html#17.4
http://java.sun.com/docs/books/jls/third_edition/html/memory.html#17.5

--
http://evanjones.ca/




Re: [protobuf] Message thread safety in Java

2012-02-20 Thread Evan Jones
On Feb 20, 2012, at 8:25 , Frank Durden wrote:
> I'm sorry if this is explained somewhere, I couldn't find an answer.
> Are protobuf messages (in Java) thread safe for concurrent reads. I
> guess they're immutable in the sense that you can't modify them after
> they're built, but can a message object content be read from different
> threads safely? The generated variables in message objects don't seem
> to be final or volatile?

After you call .build() and get a Message, that message is immutable, as you 
observed. I'm not a Java memory model expert, but my understanding is that 
despite the fields not being marked final, this is in fact thread-safe. 
However, my only support is this quote from Brian Goetz:

"With some additional work, it is possible to write immutable classes that use 
some non-final fields (for example, the standard implementation of String uses 
lazy computation of the hashCode value), which may perform better than strictly 
final classes."

http://www.ibm.com/developerworks/java/library/j-jtp02183/index.html

I'm pretty sure the right people at Google have examined the protobuf code, so 
it should be safe. However, I don't have a good argument for *why* it is safe. 
Maybe someone who is a Java memory model expert knows the reasoning here?

Evan

--
http://evanjones.ca/




Re: [protobuf] Error: Byte size calculation and serialization were inconsistent

2012-02-07 Thread Evan Jones
On Feb 6, 2012, at 21:54 , Robby Zinchak wrote:
> It turned out to be an uninitialized boolean.  Properly setting the value in 
> question seems to allow things to proceed normally.

Ah! Interesting. So one of your .set_* properties is a boolean, and one of them 
was uninitialized? That would do it. This was discussed previously and 
dismissed as a "wont fix" problem, because it is hard/impossible to make 
portable code that will test for this:

http://code.google.com/p/protobuf/issues/detail?id=234

Although it's somewhat confusing since WireFormatLite::WriteBoolNoTag contains 
code to try to avoid this problem, which GCC helpfully optimizes away.

I am not able to get the exact crash as the one you reported, but I can get it 
to crash in MessageLite::SerializeWithCachedSizesToArray by creating a boolean 
with a value of 0x80 (serializing to two bytes instead of one, causing it to 
create a message larger than it expects). I can't figure out how it could crash 
at the point you report the crash, but that doesn't really matter.
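
The two-bytes-instead-of-one effect is easy to see with a little varint arithmetic. This Python sketch is just the arithmetic, not the library's code, and the helper name is made up: a well-formed bool (0 or 1) fits in one varint byte, but a raw value of 0x80 needs two, which is one more than the size calculation allowed for.

```python
def varint_size(value):
    """Number of bytes a non-negative integer occupies as a varint."""
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size

print(varint_size(1))     # 1: a properly initialized bool
print(varint_size(0x80))  # 2: a garbage "bool" whose raw byte is 0x80
```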

Glad you got it working,

Evan

--
http://evanjones.ca/




Re: [protobuf] Error: Byte size calculation and serialization were inconsistent

2012-02-06 Thread Evan Jones
This is weird. I don't see any clear potential cause, so I have a few questions:

> HTMud::EnvAdd item;
> item.set_id(ID);
> item.set_idtype(typeID);
> item.set_x(X);
> item.set_y(Y);
> item.set_z(Z);
> item.set_lockdown(lockdown);
> item.set_mapid(map);
> item.set_tilesetno(tilesetNo);
> item.set_tilesetx(tilesetX);
> item.set_regionx(regionX);
> item.set_regiony(regionZ);


Are all these values primitives? Are any of them protocol buffers?

Have you tried dumping the values that are being set when it dies, and trying a 
standalone program that sets the values and calls SerializeToString to see if 
it has the same problem?

Have you made any changes to the protocol buffers library? I'm assuming you are 
using the released version of 2.4.1?

Have you tried running this under valgrind? I'm wondering if there could be 
some other memory corruption happening. That seems to be a frequent cause of 
"this shouldn't be happening" type errors, particularly ones that appear or 
disappear when optimization is enabled or disabled.

Evan

--
http://evanjones.ca/




Re: [protobuf] Encoding question, datatypes & packed repeated fields

2012-02-06 Thread Evan Jones
On Feb 2, 2012, at 15:36 , Andreas Lie wrote:
> What i want to do essentially, is to store 4MB of data to a file.

If this is raw binary data, maybe you should consider the "bytes" type? That 
lets you store a variable length of raw binary data very efficiently.

The problem with integers is probably being caused by protocol buffers' 
variable-length integer encoding. Any value of 128 (0x80) or larger, that is, 
any byte with the most-significant bit set, takes two bytes instead of one. 
The tradeoff is that the field can hold larger integers, but if you don't need 
that, the "bytes" type is probably better.

You could also try using the fixed32 type.

See: http://code.google.com/apis/protocolbuffers/docs/encoding.html
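
As a rough illustration of the size difference, the sketch below compares the 
payload size of the same data stored three ways. The function names are 
hypothetical, and the numbers ignore field tags and length prefixes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of bytes a base-128 varint needs for `value`.
std::size_t VarintSize(std::uint64_t value) {
  std::size_t size = 1;
  while (value >= 0x80) {
    value >>= 7;
    ++size;
  }
  assert(size <= 10);  // a 64-bit varint never needs more than 10 bytes
  return size;
}

// Payload size if every data byte is stored as one varint-encoded integer.
std::size_t SizeAsVarints(const std::vector<std::uint8_t>& data) {
  std::size_t total = 0;
  for (std::uint8_t b : data) total += VarintSize(b);
  return total;
}

// Payload size if every data byte is stored as its own fixed32 value.
std::size_t SizeAsFixed32(const std::vector<std::uint8_t>& data) {
  return data.size() * 4;
}

// Payload size if the data is stored in a single "bytes" field.
std::size_t SizeAsBytes(const std::vector<std::uint8_t>& data) {
  return data.size();
}
```

For data containing each byte value 0 through 255 once, this gives 384 bytes 
as varints, 1024 bytes as fixed32 values, and 256 bytes as "bytes". Packing 
four data bytes into each fixed32 value would bring that option back down to 
the raw size.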

Evan

--
http://evanjones.ca/




Re: [protobuf] Re: Problem with C++ -writing multiple messages with a repeated field to a file

2011-05-16 Thread Evan Jones

On May 16, 2011, at 9:45 , Nigel Pickard wrote:

> I have actually got the code working, but it involves creating a new
> output stream everytime I write to it (surely got to be wasteful and
> not the right way?).


Definitely not needed, and it will be more efficient if you can re-use  
a single FileOutputStream, as it does buffering internally. You should  
probably create a CodedOutputStream for each message you write, but  
this can be stack allocated and is very lightweight.


Evan

--
http://evanjones.ca/




Re: [protobuf] Problem with C++ -writing multiple messages with a repeated field to a file

2011-05-13 Thread Evan Jones

On May 13, 2011, at 10:12 , Nigel Pickard wrote:

"libprotobuf FATAL google/protobuf/io/zero_copy_stream_impl_lite.cc:
346] CHECK failed: (buffer_used_) == (buffer_size_):  BackUp() can
only be called after Next()."


Off the top of my head, I *believe* this is happening because the  
CodedOutputStream destructor is trying to reposition the  
FileOutputStream, but the FileOutputStream has already been closed. In  
this case, you either want to put the CodedOutputStream into its own  
enclosing scope, to force the destructor to run before you close the  
FileOutputStream, or just let the FileOutputStream destructor flush  
and close the file automatically.
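
The scoping idea can be sketched without the library. FileSink and 
BufferedWriter below are hypothetical stand-ins (not protobuf classes) for a 
closable file stream and for a writer whose destructor touches that stream 
one last time.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a file-backed stream that can be closed.
class FileSink {
 public:
  void Append(const std::string& data) {
    assert(!closed_ && "write after Close()");
    contents_ += data;
  }
  void Close() { closed_ = true; }
  const std::string& contents() const { return contents_; }

 private:
  std::string contents_;
  bool closed_ = false;
};

// Hypothetical stand-in for a buffering writer that hands its remaining
// data back to the sink in its destructor.
class BufferedWriter {
 public:
  explicit BufferedWriter(FileSink* sink) : sink_(sink) {}
  ~BufferedWriter() { sink_->Append(buffer_); }
  void Write(const std::string& data) { buffer_ += data; }

 private:
  FileSink* sink_;
  std::string buffer_;
};

// The inner braces force ~BufferedWriter to run before Close().
std::string WriteThenClose(const std::string& payload) {
  FileSink sink;
  {
    BufferedWriter writer(&sink);
    writer.Write(payload);
  }  // writer is destroyed here, while the sink is still open
  sink.Close();
  return sink.contents();
}
```

Without the inner braces, the writer would outlive the Close() call and its 
destructor would touch a closed sink, which is the same ordering problem I 
suspect is behind the CHECK failure.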


I hope this helps,

Evan

--
http://evanjones.ca/




Re: [protobuf] Generated Comments

2011-05-06 Thread Evan Jones

On May 6, 2011, at 11:19 , Ben Wright wrote:

> I was wondering if there was any convenient way to add comments into
> a .proto file that would be included in generated code for context
> help.


This has been discussed before, but as far as I am aware, this has not  
been implemented in the protoc parser? See the following thread and  
the issue.


Evan

http://groups.google.com/group/protobuf/browse_thread/thread/b46653e9535766ab

http://code.google.com/p/protobuf/issues/detail?id=148

--
http://evanjones.ca/




Re: [protobuf] on the wire sizes

2011-04-01 Thread Evan Jones

On Apr 1, 2011, at 6:54 , AdrianPilko wrote:

> What is the [best] way to determine the on the wire size?


You probably want msg.ByteSize() in C++, msg.getSerializedSize() in  
Java.


Evan

--
http://evanjones.ca/




Re: [protobuf] Serializing part of a message and writing to disk

2011-03-08 Thread Evan Jones

On Mar 8, 2011, at 2:12 , Linus wrote:

> At a later stage in the code, the values of (say) Message A are
> changed by the user. Is there a way of modifying only Message A and
> updating the file on disk, without loading the composite Message C
> updating the Message A and flushing entire contents to disk again?


Not really: protocol buffers are a variable length encoding, so  
changing Message A could change the length, so overwriting doesn't  
really work, at least not without additional checks and effort.


Evan

--
http://evanjones.ca/




Re: [protobuf] A protocol message was rejected because it was too big ???

2011-03-07 Thread Evan Jones

On Mar 7, 2011, at 13:03 , ksamdev wrote:
> Hmm, thanks for the advice. It may work fine. Nevertheless, I have
> to skip previously read messages in this case every time
> CodedInputStream is read.


Not true: Creating a CodedInputStream does not change the position in  
the underlying stream. Your code can easily look like:



while (still more messages to read) {
  CodedInputStream in(&input_stream);
  in.Read*
  ...
  msg.ParseFromCodedStream(&in);  // parse from the stream created above
}


This creates and destroys the CodedInputStream for each message, which  
is efficient.
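
For what it's worth, the same loop shape can be shown without the library. 
This is only a sketch under my own assumptions: each record is a varint length 
followed by the payload, and ReadVarint32 / ReadAllMessages are hypothetical 
helpers operating on an in-memory buffer rather than a CodedInputStream.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Reads one base-128 varint starting at *pos. Returns false on truncated or
// overlong input; on success advances *pos past the varint.
bool ReadVarint32(const std::string& data, std::size_t* pos,
                  std::uint32_t* value) {
  assert(pos != nullptr && value != nullptr);
  std::uint32_t result = 0;
  std::size_t p = *pos;
  for (int shift = 0; shift < 35; shift += 7) {
    if (p >= data.size()) return false;
    const std::uint8_t byte = static_cast<std::uint8_t>(data[p++]);
    result |= static_cast<std::uint32_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) {
      *pos = p;
      *value = result;
      return true;
    }
  }
  return false;
}

// Splits a buffer of [varint length][payload] records into payloads.
// Returns false if the buffer ends in the middle of a record.
bool ReadAllMessages(const std::string& data,
                     std::vector<std::string>* messages) {
  assert(messages != nullptr);
  std::size_t pos = 0;
  while (pos < data.size()) {
    std::uint32_t size = 0;
    if (!ReadVarint32(data, &pos, &size)) return false;
    if (size > data.size() - pos) return false;
    messages->push_back(data.substr(pos, size));
    pos += size;
  }
  return true;
}
```

Each payload returned here is what you would hand to the message's parse 
method; all per-message state lives in locals, so nothing accumulates across 
iterations.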



> Unfortunately, reading does not work out after 2^31 bytes are read.
> Is there a way around?


You will need to destroy and re-create the CodedInputStream object. If  
you don't want to do it for each message, you need to at least do it  
occasionally.


Evan

--
http://evanjones.ca/




Re: [protobuf] A protocol message was rejected because it was too big ???

2011-03-07 Thread Evan Jones

On Mar 6, 2011, at 18:45 , ksamdev wrote:
> I think I found the source of the problem. The problem is that
> CodedInputStream has internal counter of how many bytes are read so
> far with the same object.


Ah, right. With the C++ API, the intention is that you will not reuse  
the CodedInputStream, and instead it will be created and destroyed for  
each message. It is very cheap to allocate / destroy if it is a local  
variable.


In your case, you should do something like change your ::write method  
to do:


CodedOutputStream out(_raw_out.get());
out.WriteVarint32(event.ByteSize());
event.SerializeWithCachedSizes(&out);


This will also save the extra copy that your code currently has. Hope  
this helps,


Evan

--
http://evanjones.ca/




Re: [protobuf] A protocol message was rejected because it was too big ???

2011-03-06 Thread Evan Jones

On Mar 6, 2011, at 12:19 , ksamdev wrote:
> libprotobuf ERROR google/protobuf/io/coded_stream.cc:147] A protocol
> message was rejected because it was too big (more than 67108864
> bytes).  To increase the limit (or to disable these warnings), see
> CodedInputStream::SetTotalBytesLimit() in
> google/protobuf/io/coded_stream.h.


Protocol buffers limit the parsed size to 64 MB by default. You have  
generated a very large message. You either need to set the limit  
larger, or split your message into multiple messages. See:


http://code.google.com/apis/protocolbuffers/docs/techniques.html#large-data

http://code.google.com/apis/protocolbuffers/docs/reference/cpp/google.protobuf.io.coded_stream.html#CodedInputStream.SetTotalBytesLimit.details

Hope this helps,

Evan

--
http://evanjones.ca/




Re: [protobuf] RuntimeException while parsing back the byte[] to protocol buffer message instance! (deserialization)

2011-03-04 Thread Evan Jones

On Mar 4, 2011, at 11:11 , Aditya Narayan wrote:
> Exception in thread "main" java.lang.RuntimeException:
> Uncompilable source code


This error means there is a build problem in your Eclipse project. You  
are trying to call some code that is not being compiled correctly.  
Fix your build errors and then your example should work. Good luck,


Evan

--
http://evanjones.ca/




Re: [protobuf] Chunking a large message

2011-03-04 Thread Evan Jones

On Mar 3, 2011, at 15:53 , Linus wrote:

> I am wondering if there are any examples of chunking large PB messages
> (about 1MB) into smaller chunks, to transmit over the wire.


This is going to be pretty application specific. Typically it involves  
taking one message with a huge repeated field and sending it / writing  
it as a sequence of messages with fewer items for each repeated field.  
So I can't really point you to any examples off the top of my head.
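
The general shape is simple enough to sketch, though. This is a generic 
illustration only: SplitIntoChunks is a hypothetical helper, and the vector 
stands in for the contents of your repeated field.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Splits `items` into consecutive groups of at most `max_per_chunk`
// elements. Each group would become the repeated field of one smaller
// message.
template <typename T>
std::vector<std::vector<T>> SplitIntoChunks(const std::vector<T>& items,
                                            std::size_t max_per_chunk) {
  assert(max_per_chunk > 0);
  std::vector<std::vector<T>> chunks;
  for (std::size_t start = 0; start < items.size(); start += max_per_chunk) {
    const std::size_t end = std::min(items.size(), start + max_per_chunk);
    chunks.emplace_back(
        items.begin() + static_cast<std::ptrdiff_t>(start),
        items.begin() + static_cast<std::ptrdiff_t>(end));
  }
  return chunks;
}
```

The receiver then appends the items of each incoming message to rebuild the 
full list. How you mark the last chunk (a count up front, a flag, closing the 
stream) is the application-specific part.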


That said: the documentation suggests keeping protocol buffers under  
roughly 1 MB in size, so if your messages are 1 MB, I personally wouldn't  
worry about it. Hope this helps,


Evan

--
http://evanjones.ca/




Re: [protobuf] How to get the byte[] from a serialized data ?

2011-03-04 Thread Evan Jones

On Mar 4, 2011, at 7:15 , Aditya Narayan wrote:
> I have created .proto files and compiled them to get the generated
> classes. Also I can build the message objects using the setters &
> finally build() method. But to store it to database, I need
> serialized data as byte[] or byte buffers. How do I finally get that
> from the message instances ??


You want .toByteArray():

http://code.google.com/apis/protocolbuffers/docs/reference/java/com/google/protobuf/MessageLite.html#toByteArray()

Evan

--
http://evanjones.ca/




Re: [protobuf] Can a message derive from another message?

2011-03-03 Thread Evan Jones

On 03/02/2011 10:04 AM, ZHOU Xiaobo wrote:

>  required string Content = 3;


WARNING: You should be using type bytes here, not type string. This 
doesn't matter for C++, but matters for other languages which will 
assume strings contain UTF-8 data.


Evan

--
http://evanjones.ca/




Re: [protobuf] Beginner's Q: Does protobuf generate underlying transport sockets as well

2011-02-28 Thread Evan Jones

On Feb 28, 2011, at 11:46 , footloose wrote:

> The tutorials talk only about marshalling and un marshalling the data
> structures. Do the sockets have to be written manually?


Yes. The protocol buffer library from Google does not include an RPC  
implementation. There are a bunch of third-party implementations though:


http://code.google.com/p/protobuf/wiki/ThirdPartyAddOns

Evan

--
http://evanjones.ca/




Re: [protobuf] Fwd: RpcChannel and RpcController Implementation

2011-02-21 Thread Evan Jones

On Feb 21, 2011, at 3:06 , Amit Pandey wrote:

> Did anyone get the chance to look into it.


If you want to use the RPC system, you need to provide your own  
implementation, or maybe use an existing one, such as:


http://code.google.com/p/protobuf/wiki/ThirdPartyAddOns#RPC_Implementations

If this doesn't answer your question, maybe you need to be more  
specific. What are you trying to do?


Evan

--
http://evanjones.ca/




Re: [protobuf] New protobuf feature proposal: Generated classes for streaming / visitors

2011-02-08 Thread Evan Jones

On Feb 8, 2011, at 13:34 , Kenton Varda wrote:
> I handle user messages by passing them as "bytes", embedded in my
> own outer message.


This is what I do as well, as does protobuf-socket-rpc:

http://code.google.com/p/protobuf-socket-rpc/source/browse/trunk/proto/rpc.proto


I guess I was thinking that if you already have to do some sort of  
"lookup" of the message type that is stored in that byte blob, then  
maybe you don't need the streaming extension. For example, you could  
just build a library that produces a sequence of byte strings, which  
the "user" of the library can then parse appropriately.


I see how you are using it though: it is a friendly wrapper around  
this simple "sequence of byte strings" model, that automatically  
parses that byte string using the tag and "schema message." This might  
be useful for some people.


> This is somewhat inefficient currently, as it will require an extra
> copy of all those bytes.  However, it seems likely that future
> improvements to protocol buffers will allow "bytes" fields to share
> memory with the original buffer, which will eliminate this concern.


Ah cool. I was considering changing my protocol to be two messages:  
the first one is the "descriptor" (eg. your CallRequest message), then  
the second would be the "body" of the request, which I would then  
parse based on the type passed in the CallRequest.



> Note that I expect people will generally only "stream" their
> top-level message.  Although the proposal allows for streaming
> sub-messages as well, I expect that people will normally want to parse
> them into message objects which are handled whole.  So, you only
> have to manually implement the top-level stream, and then you can
> invoke some reflective algorithm from there.


Right, but my concern is that I might want to use this streaming API  
to write messages into files. In this case, I might have a file  
containing the FooStream and another file containing the BarStream.  
I'll have to implement both these ::Writer interfaces, or hack the  
code generator to generate it for me. Although now that I think about  
this, the implementation of these two APIs will be relatively trivial...



>> features like being able to detect broken streams and "resume" in
>> the middle are useful.
> I'm not sure how this relates.  This seems like it should be handled
> at a lower layer, like in the InputStream -- if the connection is
> lost, it can re-establish and resume, without the parser ever
> knowing what happened.


Sorry, just an example of why you might want a different protocol. If  
I've streamed 10e9 messages to disk, I don't want this stream to break  
if there is some weird corruption in the middle, so I want some  
protocol that can "resume" from corruption.


Evan

--
http://evanjones.ca/




Re: [protobuf] New protobuf feature proposal: Generated classes for streaming / visitors

2011-02-08 Thread Evan Jones
I read this proposal somewhat carefully, and thought about it for a  
couple days. I think something like this might solve the problem that  
many people have with streams of messages. However, I was wondering a  
couple things about the design:



* It seems to me that this will solve the problem for people who know  
statically at compile time what types they need to handle from a  
stream, so they can define the "stream type" appropriately. Will users  
find themselves running into the case where they need to handle  
"generic" messages, and end up needing to "roll their own" stream  
support anyway?


I ask this question because I built my own RPC system on top of  
protocol buffers, and in this domain it is useful to be able to pass  
"unknown" messages around, typically as unparsed byte strings. Hence,  
this streams proposal wouldn't be useful to me, so I'm just wondering:  
am I an anomaly here, or could it be that many applications will find  
themselves needing to handle "any" protocol buffer message in their  
streams?



> The Visitor class has two standard implementations:  "Writer" and
> "Filler".  MyStream::Writer writes the visited fields to a
> CodedOutputStream, using the same wire format as would be used to
> encode MyStream as one big message.


Imagine I wanted a different protocol. Eg. I want something that  
checksums each message, or maybe compresses them, etc. Will I need to  
subclass MessageType::Visitor for each stream that I want to encode?  
Or will I need to change the code generator? Maybe this is an unusual  
enough need that the design doesn't need to be flexible enough to  
handle this, but it is worth thinking about a little, since features  
like being able to detect broken streams and "resume" in the middle  
are useful.


Thanks!

Evan

--
http://evanjones.ca/




Re: [protobuf] Re: protobuf not handling special characters between Java server and C++ client

2011-01-26 Thread Evan Jones

On Jan 26, 2011, at 3:43 , Hitesh Jethwani wrote:
> Can we encode the protobuf data in ISO-8859-1 from the server end
> itself?


Yes. In this case, you need to use the protocol buffer "bytes" type  
instead of the protocol buffer "string" type, since you want to  
exchange ISO-8859-1 bytes from program to program (bytes), not unicode  
text (string).


On the Java side, you'll need to use  
ByteString.copyFrom(myStringobject, "ISO-8859-1") to make a ByteString  
out of a Java string.


Hope this helps,

Evan

--
http://evanjones.ca/




Re: [protobuf] protobuf not handling special characters between Java server and C++ client

2011-01-25 Thread Evan Jones

On Jan 25, 2011, at 15:27 , Hitesh Jethwani wrote:
> As may be evident from above I am naive at Java and Protobuf. Any
> help on this is appreciated.



The Java protocol buffer API encodes strings as UTF-8. Since C++ has  
no unicode support, what you get on the other end is the raw UTF-8  
encoded data. You'll need to use some Unicode API to process it in  
whatever way your application requires. I suggest ICU:


http://site.icu-project.org/
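
If all you need is to turn the received UTF-8 into ISO-8859-1 for an existing 
C++ code path, a small hand-rolled conversion may be enough. This is only a 
sketch under that assumption: Utf8ToLatin1 is a hypothetical helper, and 
anything outside U+0000..U+00FF (or malformed input) is replaced with '?'.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Converts UTF-8 text to ISO-8859-1. Only code points up to U+00FF are
// representable; everything else, and malformed input, becomes '?'.
std::string Utf8ToLatin1(const std::string& utf8) {
  std::string out;
  std::size_t i = 0;
  while (i < utf8.size()) {
    const unsigned char lead = static_cast<unsigned char>(utf8[i]);
    if (lead < 0x80) {  // plain ASCII is identical in both encodings
      out.push_back(static_cast<char>(lead));
      ++i;
      continue;
    }
    // Two-byte sequences led by 0xC2 or 0xC3 cover U+0080..U+00FF.
    if ((lead == 0xC2 || lead == 0xC3) && i + 1 < utf8.size()) {
      const unsigned char next = static_cast<unsigned char>(utf8[i + 1]);
      if ((next & 0xC0) == 0x80) {
        out.push_back(static_cast<char>(((lead & 0x03) << 6) | (next & 0x3F)));
        i += 2;
        continue;
      }
    }
    // Not representable or malformed: emit '?' and skip the whole sequence.
    out.push_back('?');
    ++i;
    while (i < utf8.size() &&
           (static_cast<unsigned char>(utf8[i]) & 0xC0) == 0x80) {
      ++i;
    }
  }
  assert(out.size() <= utf8.size());  // every output byte consumed input
  return out;
}
```

For anything beyond this (normalization, collation, other character sets), a 
real Unicode library is the safer route.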

Hope this helps,

Evan

--
http://evanjones.ca/




Re: [protobuf] protocol buffers and client-server communication

2011-01-23 Thread Evan Jones

On Jan 22, 2011, at 16:33 , Marco@worldcorp wrote:
> I am guessing i will need 1 proto file for each type of message,
> correct?


That sounds like what you want. You may also end up needing  
some additional "header" message or wrapper message to be able to  
figure out "what is the next message in the stream?". See:


http://code.google.com/apis/protocolbuffers/docs/techniques.html#union

The archives of this group also contain many discussions on this  
subject.


Evan Jones

--
http://evanjones.ca/




Re: [protobuf] Dealing with Corrupted Protocol Buffers

2011-01-20 Thread Evan Jones

On Jan 20, 2011, at 2:48 , julius-schorzman wrote:

> My question is -- can anything be done to retrieve part of the file?
> It would be nice to know at which point in the file the problematic
> message occurred, and then I could crop to that point or do some
> manual exception -- but unfortunately this exception is very general.
> I find it hard to believe that a single mis-saved bit makes the whole
> file worthless.


You are correct: your entire data is not worthless, but at the point  
of the error, you will need some manual intervention to figure out  
what is going on.


It is probably possible to figure out the byte offset where this error  
occurs. The CodedInputStream tracks some sort of bytesRead counter, I  
seem to recall. However, this will require you to modify the source.




> I also find it curious that the source provides no way (that I can
> tell) to get at any lower level data in the p.b. since whenever I try
> to do anything with it it throws an exception.  Best I can tell I will
> have to write from scratch my own code to decode the p.b. file.


The lowest-level tool that is provided is CodedInputStream. But yes,  
you will effectively have to "parse" the message yourself. Look at the  
code that is generated for the mergeFrom method of your message to get  
an idea of how it works, and you can read the encoding documentation:


http://code.google.com/apis/protocolbuffers/docs/encoding.html
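
To give a feel for what a hand-written scan looks like, here is a rough 
standalone sketch based on the wire format described in that document. 
FieldInfo, ReadVarint and ScanFields are hypothetical names, groups (wire 
types 3 and 4) are not handled, and it only walks top-level fields.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct FieldInfo {
  std::uint32_t number;     // field number from the .proto file
  std::uint32_t wire_type;  // 0 varint, 1 64-bit, 2 length-delimited, 5 32-bit
  std::size_t offset;       // byte offset of the field's tag
};

// Reads one varint at *pos; returns false on truncated or overlong input.
bool ReadVarint(const std::string& data, std::size_t* pos,
                std::uint64_t* value) {
  std::uint64_t result = 0;
  std::size_t p = *pos;
  for (int shift = 0; shift < 64; shift += 7) {
    if (p >= data.size()) return false;
    const std::uint8_t byte = static_cast<std::uint8_t>(data[p++]);
    result |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) {
      *pos = p;
      *value = result;
      return true;
    }
  }
  return false;
}

// Walks the top-level fields. Returns data.size() if everything decoded,
// otherwise the offset of the first field that could not be decoded.
std::size_t ScanFields(const std::string& data,
                       std::vector<FieldInfo>* fields) {
  assert(fields != nullptr);
  std::size_t pos = 0;
  while (pos < data.size()) {
    const std::size_t start = pos;
    std::uint64_t key = 0, value = 0;
    if (!ReadVarint(data, &pos, &key)) return start;
    const std::uint32_t wire_type = static_cast<std::uint32_t>(key & 7);
    const std::uint32_t number = static_cast<std::uint32_t>(key >> 3);
    if (number == 0) return start;  // field number 0 is never valid
    std::uint64_t skip = 0;
    switch (wire_type) {
      case 0: if (!ReadVarint(data, &pos, &value)) return start; break;
      case 1: skip = 8; break;
      case 2: if (!ReadVarint(data, &pos, &skip)) return start; break;
      case 5: skip = 4; break;
      default: return start;  // groups and unknown wire types
    }
    if (skip > data.size() - pos) return start;
    pos += static_cast<std::size_t>(skip);
    fields->push_back({number, wire_type, start});
  }
  return pos;
}
```

The returned offset tells you where the readable prefix of the data ends, 
which is a starting point for cropping the file or inspecting the bytes 
around the damage.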

You can definitely figure out what is going on, but it will be a bit  
of a pain. Good luck,


Evan Jones

--
http://evanjones.ca/




Re: [protobuf] custom constructor

2011-01-14 Thread Evan Jones

On Jan 14, 2011, at 9:22 , Tim Wisniewski wrote:

> The reason for this is that I can't find a bitset type within the
> proto language.  Any thoughts?


This is not possible because the intention is that the .proto file  
will be portable between many different languages, so it only supports  
fairly portable types.


I have solved this either by writing a wrapper type that provides a  
"friendly" interface to the protocol buffer generated types, or by  
writing some utility methods (eg. maybe  
BitSetUtils.copyToByteString() / .copyFromByteString()). Either way is  
a bit of extra mechanical work, but it works.
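
As a sketch of the utility-method approach in C++ (the Java version would be 
analogous), the helpers below pack a bit set into a byte string suitable for 
a "bytes" field and unpack it again. PackBits and UnpackBits are hypothetical 
names, and the bit order (least-significant bit first) is just a choice made 
for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Packs bits into bytes, least-significant bit first within each byte.
std::string PackBits(const std::vector<bool>& bits) {
  std::string bytes((bits.size() + 7) / 8, '\0');
  for (std::size_t i = 0; i < bits.size(); ++i) {
    if (bits[i]) {
      const unsigned char current = static_cast<unsigned char>(bytes[i / 8]);
      bytes[i / 8] = static_cast<char>(current | (1u << (i % 8)));
    }
  }
  return bytes;
}

// Inverse of PackBits. The bit count has to be stored separately (for
// example in its own field) because the last byte may be partly unused.
std::vector<bool> UnpackBits(const std::string& bytes, std::size_t bit_count) {
  assert(bit_count <= bytes.size() * 8);
  std::vector<bool> bits(bit_count);
  for (std::size_t i = 0; i < bit_count; ++i) {
    const unsigned char byte = static_cast<unsigned char>(bytes[i / 8]);
    bits[i] = ((byte >> (i % 8)) & 1u) != 0;
  }
  return bits;
}
```

Whatever layout you pick, document it next to the field in the .proto file, 
since every language that reads the message has to agree on it.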


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Java Newbie Question: Attaching a CodedInputStream to an NIO Socket

2011-01-13 Thread Evan Jones

On Jan 13, 2011, at 1:55 , Nader Salehi wrote:

> It does help.  However, I seem to have some problem reading messages
> that way.  My guess is that it has something to do with the fact that
> the channels are non-blocking.  Is there any special thing to consider
> when working with such channels?


You need to know the length of the message you are reading, then only  
call the parse method once you have the entire thing buffered. So you  
send the size first, then the message. On the receiving side, you read  
the size, then you keep reading from the non-blocking socket  
until you have the whole thing buffered, then you parse it. I have  
open source code that actually does this, but it is "research  
quality" so it may not actually be helpful to others. But you may want  
to look at it:


http://people.csail.mit.edu/evanj/hg/index.cgi/javatxn/file/260423aa1c25/src/ca/evanjones/protorpc/ProtoConnection.java#l40
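
The buffering logic itself is language independent. Here is a rough sketch of 
it in C++ (not taken from the code linked above): FrameBuffer is a 
hypothetical class, and it assumes the size is sent as a varint in front of 
each message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Accumulates bytes from non-blocking reads and hands out complete
// [varint length][payload] frames once they are fully buffered.
class FrameBuffer {
 public:
  // Appends whatever the last read returned, which may be a partial frame.
  void Append(const std::string& chunk) { buffer_ += chunk; }

  // If a complete frame is buffered, copies its payload into *message,
  // removes the frame from the buffer and returns true. Otherwise returns
  // false and leaves the buffer untouched.
  bool NextMessage(std::string* message) {
    assert(message != nullptr);
    std::uint32_t size = 0;
    std::size_t pos = 0;
    bool have_size = false;
    for (int shift = 0; shift < 35 && pos < buffer_.size(); shift += 7) {
      const std::uint8_t byte = static_cast<std::uint8_t>(buffer_[pos++]);
      size |= static_cast<std::uint32_t>(byte & 0x7F) << shift;
      if ((byte & 0x80) == 0) {
        have_size = true;
        break;
      }
    }
    // Length prefix incomplete. (It could also be malformed; real code
    // would detect that and drop the connection.)
    if (!have_size) return false;
    if (buffer_.size() - pos < size) return false;  // payload incomplete
    message->assign(buffer_, pos, size);
    buffer_.erase(0, pos + size);
    return true;
  }

 private:
  std::string buffer_;
};
```

After every read from the socket you call Append and then loop on NextMessage 
until it returns false, parsing each payload it hands back.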

Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Re: Large message vs. a list message for smaller messages

2011-01-13 Thread Evan Jones

On Jan 13, 2011, at 4:38 , Meghana wrote:

> Is there any way of working around this problem?


You can increase the limit with CodedInputStream.setSizeLimit, which  
is an easy route. The problem is that the performance is bad for  
really large messages, because the whole thing needs to be  
serialized/deserialized to/from a single buffer.


The "high performance" version would be to encode your own simple  
protocol. Something like:


1. Write the number of messages with writeRawVarint32
2. Write each message with writeDelimitedTo

On the decoding side, do the opposite. I'm not familiar with the  
framework you are using, but this should be feasible. Hope this helps,


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Large message vs. a list message for smaller messages

2011-01-12 Thread Evan Jones

On Jan 12, 2011, at 8:38 , Meghana wrote:

> Would ListA also be considered a large message or will the encoding be
> done on each individual A message making it immune to the large
> message problem?


ListA itself will be a large message if it contains a large number of  
sub-messages. If you are really sending / writing a large number of  
messages, you want to read something like:


http://code.google.com/apis/protocolbuffers/docs/techniques.html#streaming

Good luck,

Evan Jones

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Java Newbie Question: Attaching a CodedInputStream to an NIO Socket

2011-01-12 Thread Evan Jones

On Jan 12, 2011, at 12:57 , Nader Salehi wrote:

> I have a Java-base TCP server which needs some modification.  It has
> to accept messages as CodedInputStream from C++ clients that send
> CodedOutputStream.  The server uses NIO class
> java.nio.channels.SocketChannel to read from the socket.  What would
> be the easiest way to attach a CodedInputStream to this?


I created a really thin InputStream implementation that wrapped my NIO
ByteBuffer(s), then passed it to CodedInputStream.newInstance(InputStream
stream). Besides the abstract single-byte read(), you really only need
to implement the read(byte[] destination, int offset, int length)
method of this class, so it is actually pretty straightforward. There
might be a "better" way but it works for me. Hope this helps,


Evan

--
Evan Jones
http://evanjones.ca/
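
For reference, a thin InputStream wrapper of the kind described in the reply above might look like the following. This is a minimal sketch, not the author's actual class: the name ByteBufferInputStream is made up, and it assumes a single ByteBuffer that has already been flipped for reading.

```java
import java.io.InputStream;
import java.nio.ByteBuffer;

// Minimal read-only view of a ByteBuffer as an InputStream.
// Reading advances the buffer's position.
final class ByteBufferInputStream extends InputStream {
    private final ByteBuffer buffer;

    ByteBufferInputStream(ByteBuffer buffer) {
        this.buffer = buffer;
    }

    @Override
    public int read() {
        // InputStream requires this single-byte variant; mask to 0..255.
        return buffer.hasRemaining() ? (buffer.get() & 0xFF) : -1;
    }

    @Override
    public int read(byte[] destination, int offset, int length) {
        if (length == 0) return 0;
        if (!buffer.hasRemaining()) return -1;      // end of stream
        int n = Math.min(length, buffer.remaining());
        buffer.get(destination, offset, n);         // bulk copy out of the buffer
        return n;
    }

    @Override
    public int available() {
        return buffer.remaining();
    }
}
```

After SocketChannel.read(buffer) you would call buffer.flip(), wrap the buffer in this stream, and hand the stream to CodedInputStream.newInstance(). Returning -1 when the buffer is empty means the parser sees end of stream, so a message that spans several socket reads still needs a length prefix and some buffering on top of this.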




Re: [protobuf] Using a ByteBuffer instead of a ByteString?

2011-01-11 Thread Evan Jones

On Jan 11, 2011, at 0:45 , Nicolae Mihalache wrote:

> But I have noticed in java that it is impossible to create a message
> containing a "bytes" fields without copying some buffers around. For
> example if I have a encoded message of 1MB with a few regular fields
> and one big bytes field, decoding the message will make a copy of the
> entire buffer instead of keeping a reference to it.


By "decoding" I'm assuming you mean deserializing the message from a  
file or something.


This is a disadvantage, but it makes things much easier: it means the  
buffer used to read data can be recycled for the next message. Without  
this copy, the library would need to do complicated tracking of chunks  
of memory to determine if they are "in use" or not.


However, now that you mention it: in the case of big buffers,  
CodedInputStream.readBytes() gets called, which currently makes 2  
copies of the data (it calls readRawBytes() then calls  
ByteString.copyFrom()). This could probably be "fixed" in  
CodedInputStream.readBytes(), which might improve performance a fair  
bit. I'll put this on my TODO list of things to look at, since I think  
my code does this pretty frequently.




> Even worse when encoding: if I read some data from file, does not seem
> possible to put it directly into a ByteString so I have to make first
> a byte[], then copy it into the ByteString and when encoding, it makes
> yet another byte[].


The copy is a deliberate trade-off: it makes the API simpler (thread
safety, no need to worry about the ByteBuffer being accidentally
changed, etc). The latest version of Protocol Buffers in Subversion
has ByteString.copyFrom(ByteBuffer) which will do what you want
efficiently.


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Protocol Buffers Python extension in C

2011-01-08 Thread Evan Jones

On Jan 7, 2011, at 6:36 , Atamurad Hezretkuliyev wrote:
> Currently our basic deserializer module is 17x faster than Google's
> implementation in pure Python.


The pure python code is pretty slow. However, the repository version  
(and the newly released 2.4.0 rc 1?) has C++ code to do  
serialization / deserialization. There is no documentation, but the  
following thread describes it:


http://groups.google.com/group/protobuf/browse_thread/thread/cfb13cd0a609b1c7/a5ada8791ca3c0ca#a5ada8791ca3c0ca

You may want to test that and see how it turns out. And/or contact  
Yang about this, since he was interested in the same problem.


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Re: protobuffer suport composite object in stack not in heap?

2011-01-04 Thread Evan Jones

On Jan 4, 2011, at 20:25 , Igor Gatis wrote:
> A while ago, a colleague had a "memory leak" reusing a PB message
> which contained a repeated field. If I'm not mistaken the problem
> was that pb_message::Clear() calls vector::clear() and
> string::clear() which does not really release the memory allocated.
> I can't really tell for sure actually.
>
> @Kenton, does that make any sense? If yes, is there a way to avoid it?


Yes, I have run into this same issue, when I occasionally read in a  
"huge" message. I think this is "by design." As Kenton noted: if you  
re-use the message, it never has to free / reallocate memory. See the  
"Optimization Tips" in this document:


http://code.google.com/apis/protocolbuffers/docs/cpptutorial.html

There is a ::SpaceUsed() method that can be helpful.

Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Does PrococolBuf support the buffer function when try to read the information from the inputstream?

2010-12-29 Thread Evan Jones

On Dec 27, 2010, at 4:43 , 飞 杨 wrote:
> Does PrococolBuf support the above buffer function? if does, what
> the code would be like...


I think what you are asking is to have a streaming protocol, where one  
connection or file contains multiple protocol buffers?


http://code.google.com/apis/protocolbuffers/docs/techniques.html#streaming

You may also want to search for "streaming" in the protobuf archives:

http://groups.google.com/group/protobuf/search?group=protobuf&q=streaming

Hope this helps,

Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] how can we deal with multi-lingual string? (c++)

2010-12-27 Thread Evan Jones

On Dec 25, 2010, at 21:10 , alcohol wrote:

> without std::wstring support,  how can we deal with strings consists
> of  Ascii, Chinese,Japanese, Korean characters?


It is expected that you put UTF-8 encoded characters into protocol  
buffer strings. Alternatively, you can use the bytes type and use any  
encoding you want, but then you'll need to handle the conversions,  
whereas if you use Java, the protocol buffer API handles conversions  
for you (to/from Java String, which uses Unicode). Hope that helps,


Evan

--
Evan Jones
http://evanjones.ca/
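
To illustrate the reply above: the conversion the Java API does for a "string" field amounts to a UTF-8 encode on the way out and a decode on the way in. The sketch below shows that round trip using only the JDK. The class name Utf8Demo is made up, and this is not the library's own code.

```java
import java.nio.charset.StandardCharsets;

final class Utf8Demo {
    // Encode text as UTF-8, the encoding a protobuf "string" field expects.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decode UTF-8 bytes back into a Java String.
    static String decode(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8);
    }
}
```

ASCII characters take one byte each, while the common Chinese, Japanese and Korean characters take three, so a mixed string survives the round trip without any wide-string type. In C++ the std::string in the generated accessors simply holds these same UTF-8 bytes.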




Re: [protobuf] Re: java parse with class known at runtime (and compiled proto)

2010-12-06 Thread Evan Jones

On Dec 6, 2010, at 10:31 , Koert Kuipers wrote:
> But that doesn't make a parseFrom() in message interface invalid,
> does it?
> Indeed some other information outside the raw bytes will be needed
> to pick to right Message subclass. But that's fine.


Oh, sorry, I misunderstood your question, so my answer is somewhat  
invalid.



> One could then:
> 1) pick the right subclass of Message based upon some information
> outside the raw bytes (in my case something stored in a protobuf
> wrapper around the raw bytes)
>
> 2) call subclass.parseFrom(bytes)
>
> now we have to jump through more hoops for step 2 (create instance
> of Message subclass, newBuilderForType, mergeFrom, isInitialized,
> build)


The MessageLite.Builder interface has a mergeFrom method that does  
what you want. What you should do is something like:


* Get a MessageLite instance for the message type you want to parse
  (eg. something like MyMessageType.getDefaultInstance(), or
  MessageLite.getDefaultInstanceForType())
* Hold on to that MessageLite instance in some sort of registry.
  (HashMap?)
* When you get a message, look at the protobuf wrapper to determine
  the type.
* Look up the "prototype" MessageLite instance in your registry.
* Call prototypeInstance.newBuilderForType().mergeFrom(bytes).build()

This only creates a single instance of the message each time.  
The .build() method will automatically check that the message is  
initialized, so you don't need to call isInitialized (although you may  
want to catch the exception it could throw?).
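
The registry part of the steps above can be sketched without the protobuf library. Everything named here is hypothetical: the Prototype interface stands in for the slice of MessageLite that matters, and with the real library its parse() would be prototype.newBuilderForType().mergeFrom(bytes).build(), wrapped to handle InvalidProtocolBufferException.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a MessageLite default instance.
interface Prototype {
    Object parse(byte[] bytes);
}

final class PrototypeRegistry {
    private final Map<String, Prototype> byTypeName = new HashMap<String, Prototype>();

    // Called once per message type, e.g. at startup.
    void register(String typeName, Prototype prototype) {
        byTypeName.put(typeName, prototype);
    }

    // typeName comes from the wrapper message; payload is the raw bytes.
    Object parse(String typeName, byte[] payload) {
        Prototype prototype = byTypeName.get(typeName);
        if (prototype == null) {
            throw new IllegalArgumentException("unknown message type: " + typeName);
        }
        return prototype.parse(payload);
    }
}
```

Keying on a type name is only one option. An enum or integer tag in the wrapper would work the same way, and it is up to you what to do with a type that was never registered.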


This Builder pattern is used so that the Message objects are  
immutable. This means they can be passed between threads without  
requiring any synchronization. See:


http://code.google.com/apis/protocolbuffers/docs/javatutorial.html#builders

Hope this helps,

Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Re: java parse with class known at runtime (and compiled proto)

2010-12-06 Thread Evan Jones

On Dec 6, 2010, at 9:05 , Koert Kuipers wrote:
> i am still confused why protobuffers does not have a parseFrom()
> method in the message Interface. that would have been a lot cleaner
> i think. or am i missing something?


Because the serialized representation does not include anything to  
describe the type. Thus, based on just the raw bytes, you can't tell  
what type of message it is. If you need this functionality, you need  
to build it yourself, either using union types:


http://code.google.com/apis/protocolbuffers/docs/techniques.html#union

Or by doing some custom thing (including some unique identifier, or  
the fully qualified message name, or something in some header first).


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] java parse with class known at runtime (and compiled proto)

2010-12-03 Thread Evan Jones

On Dec 3, 2010, at 14:21 , Koert Kuipers wrote:

> public class ProtobufDeserializer {
> public T fromByteBuffer(ByteBuffer byteBuffer) {


I don't *think* the generic type is going to be enough due to erasure,  
but I'm not a generics expert. I know something like the following  
works (I may be messing up the generics syntax since I'm not super  
familiar with it):


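 // The cast is unchecked, but safe as long as defaultInstance really is a T.
 @SuppressWarnings("unchecked")
 public <T extends Message> T fromByteBuffer(ByteBuffer byteBuffer,
     T defaultInstance) throws InvalidProtocolBufferException {
   Message.Builder b = defaultInstance.newBuilderForType();
   b.mergeFrom(ByteString.copyFrom(byteBuffer));
   return (T) b.build();
 }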

You can get defaultInstance from  
ConcreteMessageType.getDefaultInstance();


You may want to create a tiny InputStream wrapper around ByteBuffer to  
avoid an extra copy, or if you know it is a heap byte buffer, use the  
array mergeFrom().


Hope that helps,

Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] 2.4.0 and lazy UTF-8 conversions in Java

2010-12-01 Thread Evan Jones

On Nov 30, 2010, at 20:35 , Kenton Varda wrote:
> BTW, we actually ended up reverting your change and replacing it
> with the new implementation.  We found that having two references
> increased memory pressure too much.  I thought I had mentioned this
> to you; sorry if I forgot.


Ah; I'm not surprised, which is why it was conditional on the SPEED  
implementation. It wasn't just two references, but two copies of each  
string as well.


The instanceof approach to switch between the two is a good idea. When
I wrote my implementation, I was concerned about thread-safety
issues, although I don't think I ever considered this particular
version. However, I think this can be made thread-safe, even without
volatile (although I only understand the JMM enough to be dangerous).


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] 2.4.0 and lazy UTF-8 conversions in Java

2010-11-30 Thread Evan Jones

On Nov 30, 2010, at 15:58 , Blair Zajac wrote:
"""Added lazy conversion of UTF-8 encoded strings to String objects  
to improve performance."""


Is the lazyness thread safe?

Without looking at the implementation, then if it isn't thread safe,  
I would guess this isn't much overhead, but if it is thread safe and  
you know you're going to use all the string fields, then does it  
hurt performance instead?


Interesting! I looked at this sort of thing a bit, since I have a  
patch that makes string encoding somewhat faster, although it is quite  
intrusive, so probably not appropriate for including in the main  
source tree.


Guesses based on my knowledge of the Java implementation:

* It will be thread-safe, since that is the guarantee provided by the  
current protocol buffers implementation.


* I'll guess that it will not be slower if you access all the strings.  
Currently, the parsing process copies the raw bytes from the input  
buffer into an individual byte array, then converts that to a String.  
This is, sadly, the most efficient thing you can do, since you need  
"special" code to create Strings. Therefore, doing "lazy" conversion  
isn't going to be slower. The objects already have both byte[] and  
String fields for each string due to an encoding improvement I  
contributed, so this should be nearly a pure win.


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] fails to parse from string

2010-11-10 Thread Evan Jones

On Nov 10, 2010, at 14:13 , Brad Lira wrote:

> yes it was the null character, on the server side when copying buffer
> into string, i had add 1 to the
> size of the buffer (i guess for the null), then the parsing was ok
> with no error.


Just adding 1 is still probably not correct. You have similar  
incorrect code on the receive side:



recvfrom(socket, buf, )
mystr.assign(buf, strlen(buf));



strlen(buf) is not going to give you the right thing. You should be  
using the return value from recvfrom(), which gives you the number of  
bytes that were read from the network.


Note: If you are using UDP, it will end up not working as soon as you  
have a message which is bigger than either your buffer, or the maximum  
UDP packet size, whichever comes first.


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] fails to parse from string

2010-11-10 Thread Evan Jones

Brad Lira wrote:

> address_book.SerializeToString(&mystr)
> strncpy(buf, mystr.c_str(), strlen(mystr.c_str()));


strlen will return a shorter length than the real length, due to null
characters. Use mystr.size(), and copy with memcpy() rather than
strncpy(), which also stops at the first null character.





> Maybe this method is not the right way to send string across socket.
> I tried using SerializeToFileDescriptor(socket), that worked on the
> client side, but on the server side, i never get the message with UDP sockets.
> is there a better way of sending data across network?


You probably want to use TCP sockets, since it provides retransmissions 
for you. Also, you'll need to prepend a length. See:


http://code.google.com/apis/protocolbuffers/docs/techniques.html#streaming


Or search the group archives for threads such as:

http://groups.google.com/group/protobuf/browse_thread/thread/3af587ab16132a3f


Good luck,

Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] fails to parse from string

2010-11-09 Thread Evan Jones

On Nov 9, 2010, at 16:11 , Brad Lira wrote:

> it returns false, but it actually gets the message correctly from
> client side, so i am not sure why it thinks that parsing has failed.
> any ideas?


How are you putting data into mystr? Serialized protocol buffers can
contain null bytes, so you must pass both a char* and a length:


mystr.assign(data, length);

Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] help required

2010-11-03 Thread Evan Jones

On Nov 3, 2010, at 14:54 , Manoj Upadhyay wrote:

> I want to use the protobuf in my project. I have many java POJO's in
> project and there are few POJO are composition of other POJO's and
> these POJO's are used in service and other places.
> Please let me know how can I define the .protoc file for this.
>
> For example : Suppose I have 6 POJO's(CLASS A ,B,C,D,E,F)
> and Class A has composition of the reference of other 5  POJO's (CLASS
> B,C,D,E,F) and these 5 POJO are used many places in project. Then how
> can I proceed to define the proto file.


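You'll need to define protocol buffer messages for all these objects.
Composition carries over directly: message A can declare fields whose
types are the messages B through F. See: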

http://code.google.com/apis/protocolbuffers/docs/javatutorial.html

Evan

--
Evan Jones
http://evanjones.ca/

--
You received this message because you are subscribed to the Google Groups "Protocol 
Buffers" group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] CodedInputStream on top of sockets

2010-11-02 Thread Evan Jones

On Nov 2, 2010, at 10:37 , Jesper wrote:

> I'm trying to implement the writeDelimitedTo/parseDelimitedFrom
> methods in C++, but getting stuck on how to create a CodedInputStream
> on top of a socket in a portable manner. Can CodedInputStream work
> with windows sockets as well?


You can certainly make it work one way or another. You'll need to  
create an implementation of ZeroCopyInputStream that makes the  
appropriate Windows socket calls to read data from the socket. Note  
that it may be easier to implement the CopyingInputStream interface,  
then wrap it in the CopyingInputStreamAdaptor. This is typically  
easier since CopyingInputStreamAdaptor then implements the appropriate  
buffering logic.



See:

http://code.google.com/apis/protocolbuffers/docs/reference/cpp/google.protobuf.io.zero_copy_stream_impl_lite.html#CopyingInputStreamAdaptor

Good luck,

Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Re: Generic Message Dispatch and Message Handler

2010-10-27 Thread Evan Jones

On Oct 27, 2010, at 11:36 , Jimm wrote:

> How are you parsing arbitrary PB bytes into a Generated Message ? I am
> finding no class in API that can deserialize  PB byte buffer into
> GeneratedMessage?


I'm using the generic Service API that is included with protocol  
buffers, so I'm not using GeneratedMessage. Rather, I'm using a  
message instance itself. The "register" does something like this:


serviceRegister.registerCall(MyCustomMessage.getDefaultInstance());


Then you can parse this with code like the following:

Message requestPrototype = ...;  // stored in registerCall implementation

Message.Builder builder = requestPrototype.newBuilderForType();
builder.mergeFrom(requestByteString);


My code is actually available in the following hg repository. I don't   
recommend that people use it directly, since it is a bit hacky, but it  
could serve as an example:


http://people.csail.mit.edu/evanj/hg/index.cgi/javatxn/file/tip/src/ca/evanjones/protorpc/ServiceRegistry.java
http://people.csail.mit.edu/evanj/hg/index.cgi/javatxn/file/tip/src/ca/evanjones/protorpc/ProtoMethodInvoker.java


Good luck,

Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Generic Message Dispatch and Message Handler

2010-10-27 Thread Evan Jones

On Oct 26, 2010, at 15:45 , maninder batth wrote:

> My generic Handler would create a GeneratedMessage and look for the
> field messageType. Based on the value of the messageType, a particular
> handler will be invoked.


This is basically what I have done, for my protobuf RPC  
implementation. If you only need to choose between a limited set of  
types, you may want a union type or extensions instead:


http://code.google.com/apis/protocolbuffers/docs/techniques.html#union

http://code.google.com/apis/protocolbuffers/docs/proto.html#extensions

Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Re: Message missing required fields exception when parsing a message that has required fields defaulted

2010-10-26 Thread Evan Jones

On Oct 26, 2010, at 4:13 , locky wrote:

> The C++ side is setting things correctly.  My understanding is that
> default values are not sent over the wire.  When building a received
> message from a byte[ ] a check is done to see if required fields have
> been set.  Any required field that was not sent due to having a
> default value on the other side is not marked as being set and the
> exception gets thrown.


This is exactly correct. You should do two things:

1. Set this field on the sending side, but you mentioned that you are  
already doing this.


2. Verify that the bytes you are reading in on one side match the  
bytes being sent. I usually get this error when there is some sort of  
message handling error. For example, if you pass protobuf an empty  
array, you'll get this error message. You should write out the bytes  
that you are writing, and the bytes that you are reading and verify  
that they match. Also verify that the size you are passing in matches.


There is a difference between an unset field with a default value of  
"" and a set field with a value of "". The .hasProperty() method will  
return true for the set field, and false for the unset field. Thus,  
these messages are serialized differently.


Hope this helps,

Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] protocol buffer within a protocol buffer from C++ to Java

2010-10-26 Thread Evan Jones

On Oct 25, 2010, at 21:45 , Paul wrote:

>  optional string meas_rec_str = 2;


Change this to:

optional bytes meas_rec_str = 2;

Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] ParseFromArray -- in Java

2010-10-25 Thread Evan Jones

On Oct 25, 2010, at 16:52 , ury wrote:

> i.e.  Does the Java implementation has the Clear() method ?


No, the Java implementation has immutable objects, so this is
generally not possible. A new object must be created for each item.
Immutable objects have benefits like being thread safe (see
http://www.javapractices.com/topic/TopicAction.do?Id=29).


That said, I think you *might* be able to hack something like this  
using the Builder object. I would be interested to know if you try  
this, and if it has any performance benefits.


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Re: Delay in Sending Data

2010-10-22 Thread Evan Jones

Kevin wrote:

> My only concern now is in regards to messages sizes and the prepending
> the size at the beginning.  What is the best way to go about this?  My
> test message required on one byte but my next messages will probably
> require 2 if not 3  bytes.  What is the proper way to handle this in
> the C++ code as the Java code has this built-in?


If you want to use parseDelimitedFrom on the Java side, you must use
CodedOutputStream::WriteVarint32() on the C++ side to write the length
prefix. See this recent thread for some code that should do the trick:


http://groups.google.com/group/protobuf/browse_thread/thread/3af587ab16132a3f



> In addition, my colleague has used Thrift before and was extremely
> surprised that the C++ classes did not have matching function calls in
> Java and vice versa.  Can someone explain this short coming?


Someone added it as a convenience method to the Java implementation. No 
one has yet added it to the C++ implementation. I think mostly because 
protocol buffers are a fairly "low level" library, and other people wrap 
them in many different ways. However, this is probably just an 
oversight: if the Java side has parseDelimited/writeDelimited, the other 
implementations probably should as well.


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Delay in Sending Data

2010-10-21 Thread Evan Jones

On Oct 21, 2010, at 1:21 , Kevin wrote:

> Basically, the code that receives the data will wait until the stream
> is closed before reading the data.  I thought that flushing the data
> would cause the data to be sent but that apparently has no effect.  Is
> this my implementation or a problem with using the writeTo
> function?


The flush *should* be causing the data to be sent. The problem is on
the reader side: the default read methods read until the end of the
stream. You'll need to prepend a length. You may want to use
writeDelimitedTo() and parseDelimitedFrom(). See the following
document, or search the archives for many conversations about this.
Hope this helps,


Evan

http://code.google.com/apis/protocolbuffers/docs/techniques.html#streaming

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] buffer sizes when sending messages from c++ to java

2010-10-20 Thread Evan Jones

On Oct 20, 2010, at 2:13 , Kenton Varda wrote:
> But you are actually writing a varint32, which can be anywhere
> between 1 and 5 bytes depending on the value.


Use CodedOutputStream::VarintSize32() to compute the number of bytes
needed to encode a particular value.


This has the advantage that you can allocate a buffer of exactly the  
right size, rather than adding 100 as an estimate. However, you can  
also find the final size after all the writes with  
CodedOutputStream::ByteCount()
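
To make the 1-to-5-byte range concrete, here is a sketch of the size computation in plain Java. It does not call the protobuf library, the class and method names are made up, and it assumes the standard base-128 varint encoding with 7 payload bits per byte.

```java
final class VarintSize {
    // Number of bytes needed to encode value as an unsigned base-128 varint.
    static int varint32Size(int value) {
        if ((value & (0xFFFFFFFF << 7)) == 0) return 1;
        if ((value & (0xFFFFFFFF << 14)) == 0) return 2;
        if ((value & (0xFFFFFFFF << 21)) == 0) return 3;
        if ((value & (0xFFFFFFFF << 28)) == 0) return 4;
        return 5;
    }

    // Exact buffer size for one length-prefixed message body.
    static int delimitedSize(int bodyLength) {
        return varint32Size(bodyLength) + bodyLength;
    }
}
```

A 100-byte message therefore needs a 101-byte buffer and a 300-byte message needs 302, so the buffer can be sized exactly instead of padded with a guess.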


You should not need to do any byte swapping if you are serializing and  
deserializing integers using the protobuf API: it handles any required  
byte swapping for you.


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Re: valgrind invalid write and double free errors

2010-10-14 Thread Evan Jones

On Oct 14, 2010, at 11:32 , CB wrote:

> Actually, yes, we have a shared library containing our protobuf code,
> which we do load with dlopen.  A command line option tells the app
> which protocol it needs to use, and the app loads the appropriate
> library.  The open only happens once, very shortly after program
> launch.  We're not constantly loading and unloading.


Do you ever call dlclose() on this library? Protobuf has some  
complicated initialization time and shutdown clean up code buried in  
descriptor.cc that I don't really understand. At the very least, there  
is a call to this:


internal::OnShutdown(&DeleteGeneratedPool);


I'm a little surprised that I don't see that function appear on your  
stack trace, if that is in fact the problem, but it must be something  
like that. Could you try adding a printf() to the  
DeleteGeneratedPool() function in protobuf/descriptor.cc and see if  
that is getting called multiple times?


This FileDescriptorTable object is used internally by the protobuf  
library and I don't really understand it. I'm hoping someone who might  
understand this code might be able to suggest where this double free  
could be coming from.


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] valgrind invalid write and double free errors

2010-10-13 Thread Evan Jones

On Oct 13, 2010, at 16:53 , CB wrote:

> Any feedback on how to further debug this problem would be
> appreciated.


You aren't doing anything strange like using dlopen() to dynamically  
load/unload libraries, are you? I can't think of anything obvious that  
might cause this kind of error. The FileDescriptorTables are "static"  
objects of sorts, I think. Are you calling ShutdownProtobufLibrary()  
somewhere? Maybe more than once? Memory leaks *will* be reported by  
valgrind if you don't call ShutdownProtobufLibrary(), but I don't know  
what could cause a double free.


Evan


--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Re: sending a message over TCP from a C++ client to a Java server

2010-10-13 Thread Evan Jones

On Oct 13, 2010, at 16:49 , Paul wrote:

> Thanks for the suggestion.  However, I am already prepending the
> message size on the C++ side in the line:
> coded_output->WriteVarint64(snap1.ByteSize());


You may want to verify that the exact bytes that come out of  
msg.SerializeToString (or related) are coming out the other end and  
getting passed into parseDelimited. It might be helpful if you sent a  
snippet of code where you are sending and receiving the messages, but  
I can't think of anything off the top of my head.


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] sending a message over TCP from a C++ client to a Java server

2010-10-13 Thread Evan Jones

On Oct 13, 2010, at 15:13 , Paul wrote:

> On the client side (in C++), I open a TCP socket connection on the
> same port with the server's IP address.  I serialize the message using
> SerializeToCodedStream into an array using ArrayOutputStream.  After
> serializing it, I send it over the TCP connection using my sendTCP
> method which uses C++ sockets.


SerializeToCodedStream does *not* prepend the message size. The Java  
side is expecting that the message will start with the message length,  
so that is probably why you are getting parse errors. You need to do  
something like:



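codedOutput.WriteVarint32(msg.ByteSize());
msg.SerializeToCodedStream(&codedOutput);
// The C++ CodedOutputStream has no flush() method: destroy it (and flush
// the underlying stream, if it buffers) before sending the bytes.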

...


Hope this helps,

Evan

(as an aside: the C++ API really should have an equivalent to  
writeDelimitedTo and parseDelimited on the Java side).
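
A hand-written sketch of the sender side of that framing, assuming the layout described above: a varint length followed by the message bytes. AppendVarint32 and AppendDelimited are hypothetical helper names, and the `serialized` argument stands in for whatever msg.SerializeToString() produced.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Appends value as a base-128 varint: 7 bits per byte, with the high bit set
// on every byte except the last.
inline void AppendVarint32(std::uint32_t value, std::string* out) {
  assert(out != nullptr);
  while (value >= 0x80) {
    out->push_back(static_cast<char>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out->push_back(static_cast<char>(value));
}

// Frames one already-serialized message as: varint length, then the bytes.
// `serialized` is a stand-in for the output of msg.SerializeToString().
inline void AppendDelimited(const std::string& serialized, std::string* out) {
  AppendVarint32(static_cast<std::uint32_t>(serialized.size()), out);
  out->append(serialized);
}
```

Calling AppendDelimited once per message and then sending the resulting string over the socket gives the receiver enough information to find each message boundary.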


--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Feature proposal: mapped fields

2010-10-06 Thread Evan Jones

On Oct 6, 2010, at 9:23 , Igor Gatis wrote:

> It would be nice to have mapped fields, e.g. key-value pairs.


I think that map support would probably be useful. I've basically  
created my own maps in protocol buffers a couple times, either by  
using two repeated fields, or a repeated field of a custom "pair"  
type. In these cases, it would have been nice to be able to use the  
Protocol Buffer as a map directly, rather than needing to transfer the  
data to some other object that actually implements the map. I would be  
interested to hear the opinion of the Google maintainers. I'm assuming  
that there are probably many applications inside Google that exchange  
map-like messages.


This would be a big change, although it wouldn't be an impossible one,  
I don't think. I think it could be implemented as "syntactic sugar"  
over a repeated Pair message. I think the biggest challenge is that  
maps are a "higher level" abstraction than repeated fields, which  
leads to many design challenges:


* Are the maps ordered or unordered?
	* If ordered, how are keys compared? This needs to be consistent  
across programming languages.
	* If unordered, how are hash values computed? This could result in a  
message being parsed and re-serialized differently, if different  
languages compute the hashes differently.

* For both, how are "unknown" fields handled?
* Do the maps support repeated keys?
* If not, what happens when parsing a message with repeated keys?


Other message protocols contain map-like structures: JSON, Thrift, and  
Avro. Avro only supports string keys. JSON only supports primitive  
keys.  Thrift has a similar note about maps:


http://wiki.apache.org/thrift/ThriftTypes

For maximal compatibility, the key type for map should be a basic  
type rather than a struct or container type. There are some  
languages which do not support more complex key types in their  
native map types. In addition the JSON protocol only supports key  
types that are base types.
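
The "syntactic sugar over a repeated Pair message" idea can be sketched as follows. This is only an illustration: the Pair struct is a hand-written stand-in for a generated message, and the sketch picks one answer to two of the design questions (keys ordered by byte-wise comparison, last occurrence wins on repeated keys). Other answers are equally possible.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Stand-in for a generated "Pair" message with a string key and value
// (hypothetical; not produced by protoc).
struct Pair {
  std::string key;
  std::string value;
};

// Builds a map view over the repeated pairs. Keys are ordered by std::string's
// operator<, which compares bytes, and when a key repeats the last one wins.
inline std::map<std::string, std::string> ToMap(const std::vector<Pair>& pairs) {
  std::map<std::string, std::string> result;
  for (const Pair& p : pairs) {
    result[p.key] = p.value;
  }
  return result;
}

// Converts back to repeated pairs in key order, so that re-serializing the
// same map always produces the pairs in the same order.
inline std::vector<Pair> ToPairs(const std::map<std::string, std::string>& m) {
  std::vector<Pair> pairs;
  for (const auto& kv : m) {
    pairs.push_back(Pair{kv.first, kv.second});
  }
  return pairs;
}
```

Byte-wise ordering is chosen here because it is easy to reproduce in other languages; a locale-aware comparison would not be.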



Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Re: Timeouts for reading from a CodedInputStream

2010-09-28 Thread Evan Jones

On Sep 28, 2010, at 18:36 , Patrick wrote:

> I also have the problem that the RPC I wrote comes in a threaded model
> and a multi-process model. The multi-process one makes some things a
> bit harder. I was hoping to utilize a shm mutex to signal termination
> but this would only work if my message parsing loop timed out every so
> often and, therefore, could check the mutex.


This should be pretty easy to achieve by supplying your own  
implementation of FileInputStream that uses select() and a  
non-blocking read() rather than just read(). It can then fail the call  
to Next() whenever it is convenient.
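
A minimal POSIX sketch of the select()-plus-read() step such a stream could use internally. ReadWithTimeout and the ReadStatus names are invented for this example, and the stream class that would wrap it is left out. It assumes the descriptor number is below FD_SETSIZE.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

// Result codes for the sketch; the names are made up for illustration.
enum ReadStatus { kReadOk, kReadTimeout, kReadEof, kReadError };

// Waits up to timeout_ms for fd to become readable, then reads at most `size`
// bytes. On kReadOk, *bytes_read holds the number of bytes stored in buffer.
inline ReadStatus ReadWithTimeout(int fd, void* buffer, std::size_t size,
                                  int timeout_ms, std::size_t* bytes_read) {
  assert(buffer != nullptr && bytes_read != nullptr);
  *bytes_read = 0;

  fd_set read_fds;
  FD_ZERO(&read_fds);
  FD_SET(fd, &read_fds);

  timeval tv;
  tv.tv_sec = timeout_ms / 1000;
  tv.tv_usec = (timeout_ms % 1000) * 1000;

  const int ready = select(fd + 1, &read_fds, nullptr, nullptr, &tv);
  if (ready == 0) return kReadTimeout;  // nothing arrived in time
  if (ready < 0) return errno == EINTR ? kReadTimeout : kReadError;

  const ssize_t n = read(fd, buffer, size);
  if (n > 0) {
    *bytes_read = static_cast<std::size_t>(n);
    return kReadOk;
  }
  if (n == 0) return kReadEof;
  // A non-blocking fd can still report EAGAIN after select(); treat that
  // like a timeout so the caller can check its shutdown flag and retry.
  return (errno == EAGAIN || errno == EWOULDBLOCK) ? kReadTimeout : kReadError;
}
```

A parsing loop built on this can check its termination flag every time kReadTimeout comes back, which is the polling behaviour asked for.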


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Timeouts for reading from a CodedInputStream

2010-09-28 Thread Evan Jones

On Sep 28, 2010, at 15:33 , Patrick wrote:

> This is all fine and dandy except when I want to shutdown the server
> or connection (not client initiated). The ReadTag (as well as the
> other Read functions) blocks until data is received but I want it to
> timeout after a specified amount of time. So in essence a polling read
> instead of a blocking one. This will allow me to check that the
> connection is still valid and either re-enter my message parsing
> function or cleanup and exit.


One quick hack that might work: if you have threads anyway, if you  
close the file descriptor in the other thread, the read will fail.  
This causes input.ReadTag() to return 0.


The more complex hack is to supply your own ZeroCopyInputStream  
implementation, and in your implementation of ::Next, implement your  
own time out logic.


In my implementation, I manage this by manually managing my own  
buffer, so I never call the CodedInputStream routines unless I know  
there is sufficient data. This may not be ideal for your application,  
so your mileage may vary.
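
The "manage my own buffer" approach might look roughly like the sketch below. It assumes each message on the wire is preceded by a varint length; FrameBuffer is a hypothetical class name, and malformed length prefixes are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Accumulates raw bytes from the socket and only hands out complete
// varint-length-prefixed frames, so the parser never blocks waiting for data.
class FrameBuffer {
 public:
  // Appends bytes as they arrive (for example after a non-blocking read()).
  void Append(const char* data, std::size_t size) {
    assert(data != nullptr);
    pending_.append(data, size);
  }

  // If a whole frame is buffered, moves its payload into *frame and returns
  // true. Returns false when more bytes are needed.
  bool PopFrame(std::string* frame) {
    std::uint32_t length = 0;
    std::size_t header = 0;
    if (!PeekVarint32(&length, &header)) return false;
    if (pending_.size() - header < length) return false;
    frame->assign(pending_, header, length);
    pending_.erase(0, header + length);
    return true;
  }

 private:
  // Decodes the length prefix without consuming it; false if it is incomplete.
  bool PeekVarint32(std::uint32_t* value, std::size_t* header_size) const {
    std::uint32_t result = 0;
    for (std::size_t i = 0; i < pending_.size() && i < 5; ++i) {
      const std::uint8_t byte = static_cast<std::uint8_t>(pending_[i]);
      result |= static_cast<std::uint32_t>(byte & 0x7F) << (7 * i);
      if ((byte & 0x80) == 0) {
        *value = result;
        *header_size = i + 1;
        return true;
      }
    }
    return false;
  }

  std::string pending_;
};
```

Each payload returned by PopFrame can then be handed to the message's parse method in one piece, and the I/O loop stays free to apply whatever timeout logic it likes.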


Good luck,

Evan Jones

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Re: MyType_Parse() calls ParseNamedEnum() with 'const std::string' parameter instead of 'const string'

2010-09-22 Thread Evan Jones

On Sep 22, 2010, at 14:24 , Anand Ganesh wrote:
> This header is not using namespace std explicitly (protobuf-2.1.0).  
> Notice how it's gotten generated with 'const string&'.


Right, but at the top of google/protobuf/stubs/common.h is the  
following:



namespace google {
namespace protobuf {

using namespace std;  // Don't do this at home, kids.



That file is included via:

generated_message_reflection.h -> message_lite.h -> stubs/common.h


So there is some sort of weird namespace clashing going on. I wonder  
if maybe the issue is that the code in generated_message_reflection.h  
is in the google::protobuf::internal namespace, rather than in  
google::protobuf?


Good luck,

Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] MyType_Parse() calls ParseNamedEnum() with 'const std::string' parameter instead of 'const string'

2010-09-22 Thread Evan Jones

On Sep 21, 2010, at 20:19 , Kenton Varda wrote:
> That still seems strange.  The generated code explicitly refers  
> to ::std::string, so it couldn't be using your type.


Is your custom "string" type defined with macros? That would probably  
do the trick, since they don't respect namespaces.
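
A small demonstration of that effect, under the assumption that some project header defines `string` as a macro. The macro, the name CustomString and the helper ExpandedName are all hypothetical. The two-step stringification only makes the preprocessor's rewrite visible.

```cpp
#include <cassert>
#include <cstring>

// Two-step stringification so the argument is macro-expanded before quoting.
#define SKETCH_STR(x) #x
#define SKETCH_XSTR(x) SKETCH_STR(x)

// Hypothetical project header that replaces the token `string` everywhere.
#define string CustomString

// The preprocessor runs before namespaces are considered, so the qualified
// name lib::string is rewritten as well; the returned text shows the result.
inline const char* ExpandedName() { return SKETCH_XSTR(lib::string); }

#undef string  // keep the macro from leaking into later code
```

The returned text names CustomString rather than string, which is how a qualified reference such as ::std::string in generated code could end up pointing at a different type.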


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Re: Silly beginner question: Do the different RPC implementations inter-work?

2010-08-27 Thread Evan Jones

Navigateur wrote:

> Did you use the automatically-generated "abstract service" code, or
> did you do the "recently recommended" "make your own code-generator
> plugin" to do the implementation?


My implementation was started before the code generation plugins were 
done, so I used the existing abstract service. Were I to start it today, 
I would use the code generator, since there are a few small things in 
the automatically generated RPC interface that I would like to change.


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Silly beginner question: Do the different RPC implementations inter-work?

2010-08-27 Thread Evan Jones

Navigateur wrote:

> So, say I took a Java implementation from somewhere, and a C++
> implementation from somewhere else, and a C# implementation from
> somewhere else, would they just easily inter-work (say if I specified
> the same transport channel or something)?


Almost certainly not. While the service definitions and the data format 
are defined by the protobuf library, an RPC implementation needs to pass 
some additional data between the end points (eg. what method to invoke, 
potentially other options). The format of this data is not specified, so 
implementations are incompatible.




> If not, is it easy to make them inter-work? Is it a good idea or
> practical to try to do so? If so, what are the basic steps?


I suspect it wouldn't be very difficult. Mostly it would be an issue of 
having a bunch of implementors agree on a wire protocol. However, as a 
third party, I suspect it would be hard, since you won't understand the 
protocol.


I think part of the issue is that most people "roll their own" 
interchange protocol on top of protocol buffers, since it is pretty easy 
to do. However, if you want a "ready to use" RPC system, protocol 
buffers aren't your best choice.


That said: If you are desperate and willing to put in some effort, I 
have a C++ and Java implementation of the protocol buffers RPC 
interfaces that I've been using. It isn't the best thing in the world, 
and you'll need to put in some effort to get it to work in some other 
project, but it is available under a BSD license.


Hope this helps,

Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Status of protobufs

2010-08-26 Thread Evan Jones

On Aug 26, 2010, at 12:07 , Jean-Sebastien Stoezel wrote:

> More specifically how they are parsed from real time datastreams?


You should manually insert a leading "length of next message" field  
into the data stream. The Java implementation even has shortcut  
methods for this (see below). In C++ you have to implement it  
yourself, but it is only a few lines of code.



See:

http://code.google.com/apis/protocolbuffers/docs/techniques.html#streaming

http://code.google.com/apis/protocolbuffers/docs/reference/java/com/google/protobuf/MessageLite.html#writeDelimitedTo(java.io.OutputStream)
http://code.google.com/apis/protocolbuffers/docs/reference/java/com/google/protobuf/MessageLite.Builder.html#mergeDelimitedFrom(java.io.InputStream)


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Re: Performance of java proto buffers

2010-08-23 Thread Evan Jones

On Aug 23, 2010, at 11:41 , achintms wrote:

> Thanks Evan. That was very helpful. I got rid of the external object
> and created the internal objects directly. After that the only part
> that was taking time was decoding. I like the idea of using bytes for
> serialization and do my own encoding/decoding on top of that. That way
> I can delay decoding until it is needed. For example for comparisons I
> should just be able to use the bytes.


This is true, provided that everyone uses the same encoding without  
any bugs, and canonicalizes Unicode in the same way  
(http://unicode.org/reports/tr15). In general, this is tricky, and I  
would suggest using the built-in string type. However, if you have a  
very specific need, and the decoding is a bottleneck, this should work.




> Also do you think that if I
> encode/decode using utf-16 it would be faster? Clearly it is not as
> compressed.


I would think it should be, but I haven't done any performance  
measurements, so I can't confirm 100% that this is the case.


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Service can only receive one argument

2010-08-22 Thread Evan Jones

On Aug 22, 2010, at 4:36 , omer.c wrote:

> Can a service receive multiple arguments or only one?


Only one.



> How can I define a service which will accept both arguments:


Create a union message, or define two RPCs. Unions:

http://code.google.com/apis/protocolbuffers/docs/techniques.html#union

Two RPCs:

service Service {
  rpc sendObject1(Object1) returns (Result1);
  rpc sendObject2(Object2) returns (Result2);
}

Hope this helps,

Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Performance of java proto buffers

2010-08-22 Thread Evan Jones

On Aug 19, 2010, at 11:45 , achintms wrote:

> I have an application that is reading data from disk and is using
> proto buffers to create java objects. When doing performance analysis
> I was surprised to find out that most of the time was spent in and
> around proto buffers and not reading data from disk.


In my experience, protocol buffers are more than fast enough to be  
able to keep up with disk speeds. That is, when reading uncached data  
from the disk at 100 MB/s, protocol buffers can decode it at that  
speed. Now, if your data is cached, and your application is not doing  
much with the data, then I would expect protocol buffers to take 100%  
of the CPU time, since the disk read doesn't take CPU, and your  
application isn't doing much.


In other words: in a more "real" application, I would expect protocol  
buffers will take only a very small portion of your application's time.




> Again I expected that decoding strings would be almost all the time
> (although decoding here still seems slower than in C in my
> experience). I am trying to figure out why mergeFrom method for this
> message is taking 6 sec (own time).


Decoding strings in Java is way slower because it actually decodes the  
UTF-8 encoded strings into UTF-16 strings in memory. The C++ version  
just leaves the data in UTF-8. If this is a performance issue for your  
application, you may wish to consider using the bytes protocol buffer  
type rather than strings. This is less convenient, and means you can  
"screw up" by accidentally sending invalid data, but is faster.




> There are around 15 SubMessages.


This is basically the problem right here. Each time you parse one of  
these messages, it ends up allocating a new object for each of these  
sub messages, and a new object for each string inside them. This is  
pretty slow.


As I said above: I suspect that in a "real" application, this won't be  
a problem. However, it would be faster if you get rid of all the sub  
messages (assuming that you don't actually need them for some other  
reason).



Finally, I'll take a moment to promote my patch that improves Java  
message *encoding* performance, by optimizing string encoding. It is  
available at the following URL. Unfortunately, there is no similar  
approach to improving the decoding performance.


http://codereview.appspot.com/949044/

Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Re: How to retrieve parameters using tag numbers using CodedInputStream

2010-08-16 Thread Evan Jones

On Aug 16, 2010, at 10:56 , Prakash Rao wrote:

> I'm just looking for a easy way to write null response if data is not
> present in DB and write proto message object if data is present &
> parse these in client side appropriately. I didn't get a easy way to
> do this using CodedInputStream. Currently i'm creating a empty proto
> object on server side and checking for key attribute at client side as
> stated above.


The "empty" protocol buffer message serializes to zero bytes, so if  
your message has no content, you could just send a zero byte message.  
This would avoid creating a protocol buffer message. However, I  
suspect that isn't really a big overhead. You can also use  
YourProtocolMessage.getDefaultInstance() to avoid creating a message.


Hope this helps,

Evan Jones

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Re: Java implementation questions

2010-08-05 Thread Evan Jones

On Aug 5, 2010, at 9:16 , Ralf wrote:

> I might be mistaken, but didn't "groups" use this approach - use a
> special tag to indicate the end of a message? As only tags are
> checked, there is no need to escape any data.


Good point, I forgot about groups. They definitely do use that  
approach. Maybe one of the Googlers on this list will have a better  
idea about why groups are now deprecated in favour of nested messages.




> Anyway, I was referring more to the implementation. For example, we
> could first serialize the message to a ByteArrayOutputStream, then
> write the result and its size to the output. Obviously this approach
> is much slower, but I was wondering if there were other similar
> approaches.


That's true, and would work. The other option would be to use fixed  
width integers for the lengths, so then you could "reserve" space in  
the buffer, serialize the message, then go back and fill in the length  
field. This would be an incompatible change to the serialization  
format, however.
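
The reserve-then-backfill idea can be sketched like this. BeginFrame and EndFrame are made-up names, and the 4-byte little-endian length is an assumption chosen for the example; as noted above, it is not compatible with the varint lengths the real wire format uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Reserves a 4-byte slot for the length and returns its offset in *out.
inline std::size_t BeginFrame(std::string* out) {
  assert(out != nullptr);
  const std::size_t slot = out->size();
  out->append(4, '\0');
  return slot;
}

// Call after the payload has been appended directly to *out: computes the
// payload size and writes it into the reserved slot as little-endian fixed32.
inline void EndFrame(std::size_t slot, std::string* out) {
  assert(out != nullptr && out->size() >= slot + 4);
  const std::uint32_t length =
      static_cast<std::uint32_t>(out->size() - slot - 4);
  for (int i = 0; i < 4; ++i) {
    (*out)[slot + i] = static_cast<char>((length >> (8 * i)) & 0xFF);
  }
}
```

The payload is written exactly once and its size never has to be computed ahead of time, which is the property the fixed-width length buys.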


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Java implementation questions

2010-08-05 Thread Evan Jones

On Aug 5, 2010, at 7:06 , Ralf wrote:

> 1. CodedInputStream uses an internal byte[] buffer, instead of
> directly using the InputStream. Does this give significant performance
> improvements?


It does appear to, actually. I tried it with CodedOutputStream, but  
the issues should be the same. The issue is that CodedOutputStream  
produces output a byte at a time. It is significantly faster to do raw  
array accesses than to call out to OutputStream each time. I no longer  
have this code, but here is a post I made about my experiment:


http://groups.google.com/group/protobuf/browse_thread/thread/70cc0f632228195
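
The effect is not specific to Java. A rough C++ sketch of the same idea: batch single-byte writes in an array and call the underlying sink only when the array is full. BufferedByteWriter and its Sink type are invented for this illustration and are not the library's classes.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Collects single-byte writes in an array and forwards them to the sink in
// large chunks, so the sink's per-call overhead is paid once per chunk.
class BufferedByteWriter {
 public:
  using Sink = std::function<void(const char* data, std::size_t size)>;

  BufferedByteWriter(Sink sink, std::size_t capacity)
      : sink_(std::move(sink)), buffer_(capacity), used_(0) {
    assert(capacity > 0);
  }

  // Hot path: a plain array store. The sink is only called when the buffer
  // is already full.
  void WriteByte(char b) {
    if (used_ == buffer_.size()) Flush();
    buffer_[used_++] = b;
  }

  // Pushes any buffered bytes to the sink.
  void Flush() {
    if (used_ > 0) {
      sink_(buffer_.data(), used_);
      used_ = 0;
    }
  }

 private:
  Sink sink_;
  std::vector<char> buffer_;
  std::size_t used_;
};
```

Writing 10,000 bytes through a 4096-byte buffer reaches the sink three times instead of 10,000 times.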



> 2. Messages implement the method getSerializedSize(). Is this used for
> anything other than for serializing the message? Are there any
> alternative implementations you considered, that does not depend on
> pre-computing the size?


The protocol buffer format requires this, since the messages are not  
self delimiting. That is, to serialize them you need to prepend the  
length. The alternative would require a special "end of message"  
marker. I think the reason this seems less appealing is that you then  
need to "escape" this special marker in the output (eg. if it appears  
in an embedded byte string), but I'm just guessing here.


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Setting a nested message field without making a pointer?

2010-08-04 Thread Evan Jones

On Aug 4, 2010, at 9:50 , mark.t.macdon...@googlemail.com wrote:

> Is there a way I can do this without creating a Pixel pointer?
> Something like this (which doesn't compile):


Try fractal.mutable_pixel(0)->set_*

Hope this helps,

Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Total bytes limit on coded input stream in C++?

2010-08-04 Thread Evan Jones

On Aug 3, 2010, at 16:44 , Julian González wrote:
> I used the approach you mentioned and it worked. I just have a  
> problem I am writing 10,000 little messages in a file, first I write  
> the size of the message and then the message as it follows:
>
> codedOutput->WriteVarint32(sample.ByteSize());
> res = sample.SerializeToCodedStream(codedOutput);
>
> The problem is that when I try to read the 10,000 messages I just  
> wrote I just can read 9984 messages, when I try to read the 9885 an  
> error is thrown:
>
> libprotobuf ERROR c:\active\protobuf-2.3.0\src\google\protobuf\message_lite.cc:123] Can't parse message of type "apm.Sample" because it is missing required fields: timestamp
>
> what is happening? It looks like only 9886 messages were written  
> into the file, why the last 16 messages were not written?


It shouldn't be happening. Since the sender checks that all required  
fields are present, this indicates that some mismatch is occurring  
between the serialization and deserialization code. Are you sure the  
data that is being sent is exactly the same as the data being  
received? Normally these errors occur because the data is being  
truncated or changed in transit somehow (eg. truncating at a null  
byte? truncating at some buffer limit?).


The other thing that could be happening is that you could be  
mis-parsing the earlier messages. To parse multiple messages from a  
stream, you need to limit the number of bytes read (eg. using  
CodedInputStream::PushLimit, or  
MessageLite::ParseFromBoundedZeroCopyStream).


Good luck,

Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Total bytes limit on coded input stream in C++?

2010-08-03 Thread Evan Jones

On Aug 3, 2010, at 12:46 , Jon Schewe wrote:

> I know that I could create a new coded input stream for each message,
> but this seems rather wasteful and slow compared with just resetting a
> counter.


I complained about the same thing a little while ago:

http://groups.google.com/group/protobuf/browse_thread/thread/a4bc2a3788d356f6


Read that thread for details, but the summary is: patches welcomed.  
CodedInputStream is pretty lightweight though, so creating and  
destroying one per message should be pretty efficient.


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Error in repeated_field.h

2010-07-30 Thread Evan Jones

On Jul 30, 2010, at 8:29 , arborges wrote:
> "libprotobuf FATAL /usr/local/include/google/protobuf/repeated_field.h:637] CHECK failed: (index) < (size()): "


This means you are accessing an index past the end of the array. This  
is almost certainly a bug in your code. You should attach to this with  
a debugger to look at the entire stack trace to see where your bug is:




> const surroundsound::Arquivo::L1_Cena& cena_sonora =
>     projeto.cena(idxCena);
> numObj = cena_sonora.objetosonoro_size();
>
> for(int k = 0; k < numObj; k++)
> {
>     const surroundsound::Arquivo::L1_ObjetoSonoro& objeto_sonoro =
>         cena_sonora.objetosonoro(k);


I'm guessing this is happening because of this line. This code looks  
okay to me, since you check that k < _size(). Are you modifying this  
list somewhere in your code at the same time? Or could you have memory  
corruption somewhere? Try using valgrind if you might have memory  
corruption. Good luck,


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Cannot parse message with CodedInputStream over a pipe

2010-07-30 Thread Evan Jones

On Jul 30, 2010, at 11:18 , jetcube wrote:

> On the caller application i open a pipe to the previous app and write
> a pb message of 10766 bytes and don't close the pipe but the first
> application never finishes the if evaluation.


PushLimit() is a little funny: It doesn't stop the CodedInputStream  
from attempting to fill its buffer. Thus, I think your problem is that  
the IstreamInputStream is probably blocked on the pipe, waiting for  
more data. Try using request.ParseFromBoundedZeroCopyStream() instead.  
Or manually use a LimitingInputStream to limit the number of bytes  
read, which is what that method does under the covers (I think).
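
The idea behind limiting the underlying stream, rather than the coded stream on top of it, can be sketched as a small wrapper. LimitedReader and its Source type are hypothetical names for this illustration; this is not how the library's class is written.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Wraps a (possibly blocking) read function and refuses to read past `limit`
// bytes, so a consumer that tries to fill its buffer cannot end up waiting
// for bytes beyond the current message.
class LimitedReader {
 public:
  // Source reads up to `size` bytes into `buffer` and returns the count read
  // (0 means end of input).
  using Source = std::function<std::size_t(char* buffer, std::size_t size)>;

  LimitedReader(Source source, std::size_t limit)
      : source_(std::move(source)), remaining_(limit) {}

  std::size_t Read(char* buffer, std::size_t size) {
    assert(buffer != nullptr);
    const std::size_t want = std::min(size, remaining_);
    if (want == 0) return 0;  // limit reached: report end, leave source alone
    const std::size_t got = source_(buffer, want);
    assert(got <= want);
    remaining_ -= got;
    return got;
  }

 private:
  Source source_;
  std::size_t remaining_;
};
```

Because the wrapper never asks the pipe for more than the message length, a request for a full buffer returns early instead of blocking on data the writer has not sent.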


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Re: C++ syntax: how to set a singular enum field

2010-07-16 Thread Evan Jones

On Jul 16, 2010, at 5:47 , mark.t.macdon...@googlemail.com wrote:

> if (stabiliser.retraction()==device::HOUSED) cout<<"True\n";
> //but this doesn't compile
> stabiliser.set_retraction(device::RETRACTED);
> }


See the generated .h file to see what might be going wrong. For enums,  
a .set_* method should be generated. Note that the .proto you sent  
doesn't have RETRACTED defined, so maybe that is your problem?


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Help with basic concepts of descriptors and reflection

2010-07-16 Thread Evan Jones

On Jul 15, 2010, at 16:40 , mark.t.macdon...@googlemail.com wrote:

> cout<GetString( ref,
> stabiliser.GetDescriptor()->FindFieldByName("name"));


The difference between these two is HUGE. The first is (effectively) a
compiled reference to a local variable. The second has to do some sort
of table lookup by name. Use the first form if you care about
performance.


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Protocol Buffers RPC design questions

2010-07-16 Thread Evan Jones
One note is that the built-in service implementation is sort of  
considered to be "deprecated" at this point, due to the plugin  
infrastructure that allows people to generate their own service code.


On Jul 15, 2010, at 5:22 , Jamie McCrindle wrote:

> 2. I've been pondering how to inject Service references. I like the
> idea that I have a 'local' RPC implementation that could be swapped
> out for a 'remote' one without having to change the client class. It
> doesn't seem right to have this code in the client (i.e. recreate the
> stub for every call):


I do exactly what you do: I create the stub once and pass it in where  
needed. You may also find the TestService.Interface interface to be  
useful for this. It lets you "inject" testing versions of the  
services, for example.
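
A minimal sketch of that pattern, written in plain C++ rather than against generated protobuf code. Every name here (EchoService, LocalEchoService, FakeEchoService, Client) is hypothetical; in real code the abstract class would be the generated service interface.

```cpp
#include <cassert>
#include <string>

// Hypothetical service interface; generated service code would play this role.
class EchoService {
 public:
  virtual ~EchoService() = default;
  virtual std::string Echo(const std::string& text) = 0;
};

// In-process implementation, standing in for the 'local' service.
class LocalEchoService : public EchoService {
 public:
  std::string Echo(const std::string& text) override { return text; }
};

// Test double that counts calls, standing in for an injected testing version.
class FakeEchoService : public EchoService {
 public:
  std::string Echo(const std::string& text) override {
    ++calls;
    return "fake:" + text;
  }
  int calls = 0;
};

// The client is handed the service once and never builds a stub itself.
class Client {
 public:
  explicit Client(EchoService* service) : service_(service) {
    assert(service_ != nullptr);
  }
  std::string Greet() { return service_->Echo("hello"); }

 private:
  EchoService* service_;  // not owned
};
```

Swapping a local implementation for a remote stub then only changes the line that constructs the Client, not the Client class.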




> 3. Regarding extending RpcController. Adding a timeout and a timestamp
> seem pretty good candidates but the 'EnhancedRpcController' then
> becomes a pervasive cast as well as an RPC implementation lock-in.


This is probably the "worst" part of the built-in service API, in my
opinion. I end up with casts related to controllers in lots of places.
It's ugly, but I don't see any good way to fix it.


Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Protobuf for client/server messaging?

2010-07-14 Thread Evan Jones

On Jul 14, 2010, at 4:36 , bwp wrote:

> If we have to go down that route what would be a good identifier?


See Peter's email. But you can also use  
msg.getDescriptorForType().getFullName() to get a unique string for  
each protocol buffer message type. This is what I do for my own RPC  
system, which needs to be able to handle *any* message type (hence the  
"union" or "extension" approaches are not really correct). This needs  
the non-lite runtime, in order to have descriptors for messages. See:


http://code.google.com/apis/protocolbuffers/docs/reference/java/com/google/protobuf/Descriptors.Descriptor.html#getFullName()
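
One way to use such a name is as the key of a handler table. The following is a rough sketch in C++ under stated assumptions: Envelope and Dispatcher are hypothetical names, the payload is treated as opaque bytes, and a type name like "example.Ping" is made up. In C++ the analogous name would come from the message's GetDescriptor()->full_name().

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical envelope: the full type name travels with the serialized bytes.
struct Envelope {
  std::string type_name;  // full message type name, e.g. "example.Ping"
  std::string payload;    // serialized message bytes
};

class Dispatcher {
 public:
  using Handler = std::function<void(const std::string& payload)>;

  // Associates a handler with one message type name.
  void Register(const std::string& type_name, Handler handler) {
    assert(handler && "handler must be callable");
    handlers_[type_name] = std::move(handler);
  }

  // Returns false when no handler is registered for the envelope's type.
  bool Dispatch(const Envelope& envelope) const {
    const auto it = handlers_.find(envelope.type_name);
    if (it == handlers_.end()) return false;
    it->second(envelope.payload);
    return true;
  }

 private:
  std::map<std::string, Handler> handlers_;
};
```

Each handler would parse the payload into its own concrete message type, so the dispatcher itself needs no knowledge of any particular message.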

Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Can't read in Java message sent from C++

2010-07-10 Thread Evan Jones

On Jul 10, 2010, at 7:47 , Maxim Leonovich wrote:

> ArrayOutputStream(buffer,msg.ByteSize() + 4,sizeof(char));


The documentation states:

block_size is mainly useful for testing; in production you would  
probably never want to set it.


So you should get rid of the "sizeof(char)" part.


>    cos->WriteLittleEndian32(msg.ByteSize()); // Tried "WriteVariant32", didn't help
>    msg.SerializeToCodedStream(cos);


If you want to use Java's .parseDelimitedFrom, you *must* use
WriteVarint32, because that is the format it expects for the length
prefix. In this case, you'll need to call ArrayOutputStream::ByteCount()
to figure out how many bytes were actually serialized.
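
To make the difference between the two prefixes concrete, here is a standalone sketch that does not use the protobuf library. The helper names (AppendVarint32, AppendLittleEndian32, FrameDelimited) are hypothetical; they only mimic the byte layouts that a varint write and a fixed little-endian write produce.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Appends `value` as a base-128 varint: 7 payload bits per byte, least
// significant group first, high bit set on every byte except the last.
void AppendVarint32(uint32_t value, std::string* out) {
  assert(out != nullptr);
  while (value >= 0x80) {
    out->push_back(static_cast<char>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out->push_back(static_cast<char>(value));
}

// Appends `value` as 4 fixed bytes, least significant byte first.
void AppendLittleEndian32(uint32_t value, std::string* out) {
  assert(out != nullptr);
  for (int shift = 0; shift < 32; shift += 8) {
    out->push_back(static_cast<char>((value >> shift) & 0xFF));
  }
}

// Frames already-serialized bytes with a varint length prefix, which is
// the layout a varint-delimited reader expects.
std::string FrameDelimited(const std::string& payload) {
  std::string framed;
  AppendVarint32(static_cast<uint32_t>(payload.size()), &framed);
  framed += payload;
  return framed;
}
```

A length of 300 becomes the two bytes AC 02 as a varint but the four bytes 2C 01 00 00 as a fixed little-endian integer, so a reader expecting one cannot make sense of the other.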


You also probably should create the ArrayOutputStream and  
CodedOutputStream on the stack, rather than using new. This will be  
slightly faster.



That said, the only issue here that affects correctness is the  
WriteVarint32 part. The rest shouldn't matter unless I missed  
something. You should change your code to do that, then if you are  
still having problems you should try dumping the contents of the  
buffer on both the C++ and the Java side. Maybe the input/output is  
getting messed up somewhere?


Good luck,

Evan

--
Evan Jones
http://evanjones.ca/




Re: [protobuf] Basic message encoding/decoding

2010-07-07 Thread Evan Jones

On Jul 7, 2010, at 5:43 , Timothy Parez wrote:

> I'm aware I can simply use one of the various libraries, but it's
> important I understand basic encoding/decoding so I can
> pass this knowledge to teams who are using a language which is not
> supported by any of the libraries.


I don't understand: you want code for encoding/decoding protocol  
buffers that does not use the official protocol buffer library? Or you  
want an example that uses the protocol buffer library? If you want to  
know how raw messages are encoded and decoded, digging through the  
source code for CodedInputStream / CodedOutputStream is probably  
helpful.
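
As a starting point for a hand-written decoder, here is a small sketch of the two building blocks of the wire format: base-128 varints, and the field key, which is (field_number << 3) | wire_type. It is not taken from the library source; the names ReadVarint64, VarintField and ReadVarintField are hypothetical, and only wire type 0 (varint) is handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Decodes one base-128 varint from `data` starting at `*pos`.
// Returns false if the input ends early or the varint exceeds 10 bytes.
bool ReadVarint64(const std::string& data, std::size_t* pos, uint64_t* value) {
  assert(pos != nullptr && value != nullptr);
  uint64_t result = 0;
  for (int shift = 0; shift < 64; shift += 7) {
    if (*pos >= data.size()) return false;
    const uint8_t byte = static_cast<uint8_t>(data[(*pos)++]);
    result |= static_cast<uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) {  // high bit clear marks the last byte
      *value = result;
      return true;
    }
  }
  return false;
}

struct VarintField {
  uint32_t field_number;
  uint64_t value;
};

// Parses a single field whose wire type is 0 (varint).
bool ReadVarintField(const std::string& data, std::size_t* pos,
                     VarintField* field) {
  assert(field != nullptr);
  uint64_t key = 0;
  if (!ReadVarint64(data, pos, &key)) return false;
  if ((key & 0x7) != 0) return false;  // other wire types are out of scope here
  field->field_number = static_cast<uint32_t>(key >> 3);
  return ReadVarint64(data, pos, &field->value);
}
```

For the three bytes 08 96 01, this yields field number 1 with value 150. A full decoder would add the remaining wire types (fixed 64-bit, length-delimited, fixed 32-bit) in the same style.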


Also: did you look at the third party libraries? Many programming  
languages have implementations you could try using:


http://code.google.com/p/protobuf/wiki/ThirdPartyAddOns

Evan

--
Evan Jones
http://evanjones.ca/



