Re: [protobuf] Static allocation
On Jul 18, 2012, at 16:14 , Jeremy wrote: I understand, but if one wants to keep a large persistent message allocated and walk over it frequently, there is a price to pay on cache misses that can be significant. I guess you are wishing that the memory layout was completely contiguous? Eg. if you have three string fields, that their memory would be laid out one field after another? Chances are good that with most dynamic memory allocators, if you allocate this specific sized message at one time, the fields will *likely* be contiguous or close to it, but obviously there are no guarantees. I would personally be surprised if these cache misses would be an important performance difference, but as normal there is only one way to tell: measure it. If you want something like this in protobuf though, you would need to change a *lot* of the internals. This would not be a simple change. I suggest trying to re-use a message, and seeing if the performance is acceptable or not. If not, you'll need to find some other serialization solution. Good luck, Evan -- http://evanjones.ca/ -- You received this message because you are subscribed to the Google Groups Protocol Buffers group. To post to this group, send email to protobuf@googlegroups.com. To unsubscribe from this group, send email to protobuf+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/protobuf?hl=en.
Re: [protobuf] Static allocation
On Jul 17, 2012, at 2:33 , Jeremy Swigart wrote: Is there a way to tell the proto compiler to generate message definitions for which the message fields are statically defined rather than each individual field allocated with dynamic memory? Obviously the repeated fields couldn't be fully statically allocated (unless you could provide the compiler with a max size), but it would be preferable to have the option to create messages with minimal dynamic memory impact. Is this possible in the current library?

I'll assume you are talking about C++. In that case, if you re-use a single message, it will re-use the dynamically allocated memory. This means that after the maximal message(s) have been parsed, it will no longer allocate memory. This is approximately equivalent to what you want. See "Optimization Tips" in: https://developers.google.com/protocol-buffers/docs/cpptutorial

Hope that helps, Evan -- http://evanjones.ca/
Re: [protobuf] where is input_stream.py?
On Jul 14, 2012, at 10:55 , jrf wrote: Is there a reason that a python equivalent of CodedInputStream is not part of protobuf?

I seem to recall that the answer is basically yeah, it probably should be but no one really works on this stuff any more. You can dig around in the google.protobuf.internal package to get what you need. See this thread: https://groups.google.com/d/msg/protobuf/2m8ihEta1UU/1OOGmyfKP90J

Evan -- http://evanjones.ca/
Re: [protobuf] where is input_stream.py?
On Jul 16, 2012, at 14:43 , jrf wrote: Is that because protobufs is done or not being further developed?

You would need to get someone from Google to answer. The impression I get is that the open source release is, at the very least, in maintenance mode where they occasionally fix bugs etc.

Evan -- http://evanjones.ca/
Re: [protobuf] Error while using parseFrom
On Jun 26, 2012, at 11:08 , d34th4ck3r wrote: What is it that I am doing wrong?

Protocol buffers are a *binary* format. Those funny characters at the end of the string are probably part of the message, and you should leave them there. You also should not be passing them around as strings. They need to be passed as bytes. If you need to call getBytes(UTF-8) you are doing something wrong.

Good luck, Evan -- http://evanjones.ca/
Re: [protobuf] Best practices for proto file organization in large projects
On Jun 18, 2012, at 22:49 , Justin Muncaster wrote: I'm currently changing our build system to be cmake based, and I'm again finding myself fighting with the build system to get the .proto files to be automatically generated in a way where they build correctly.

What specific problems are you having? Errors in clean builds? Errors when modifying a .proto and rebuilding?

How do you organize your proto files when you have many in common libraries? Do all .proto files live in one folder? Should one avoid import a/b/c/d/f.proto? Do you have any recommendations for how one ought to set up the cmake build system to work with proto files that are organized as they are above? Any general recommendations?

What I've done on my last project was to put all the .proto source code in its own proto directory. This was a cross-language project, so I was accessing the messages from both C++ and Java, and that seemed to make the most sense to me. I configured the build to generate all C++ files into build/*, and the Java files into build/java, then I included/compiled them from there. The Chrome browser organizes its .proto files in a very different way: http://src.chromium.org/viewvc/chrome/trunk/src/chrome/common/metrics/proto/

Hope this helps, Evan -- http://evanjones.ca/
Re: [protobuf] 1MB message limit (recommendation)
On May 29, 2012, at 23:26 , msrobo wrote: According to the documentation, it's recommended that the message size be <= 1 megabyte. I've searched around for the reason for this recommendation, but I can't seem to find anything. Based on some basic benchmarking serializing/unserializing messages ranging from a few KB to more than 1 MB in C++, there doesn't seem to be a drastic increase in time. More specifically, it doesn't seem to be performance driven in a C++ application.

I think the main motivation is that there is no way to seek inside a protocol buffer, and you must load the entire thing into memory in one go. Hence when you get really large messages, you may need to allocate huge amounts of memory (the memory for the serialized buffer, plus the memory for the entire protocol buffer object). 1 MB is just a recommendation, but there are also some internal default limits set to 64 MB for security reasons: parsing an enormous message requires allocating a ton of RAM, so the limits can prevent servers from running out of memory. If you have huge messages, you'll need to call the appropriate APIs to change the limits. https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.io.coded_stream#CodedInputStream.SetTotalBytesLimit.details

Evan -- http://evanjones.ca/
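To make the security motivation concrete, here is a hand-rolled sketch (not the protobuf API) of the idea behind CodedInputStream's byte limit: read a length prefix, then the payload, refusing anything over a configurable limit so a hostile or corrupt length can't make the reader allocate unbounded memory. The 4-byte big-endian prefix is an arbitrary choice for illustration.

```python
import io
import struct

DEFAULT_LIMIT = 64 * 1024 * 1024  # mirrors the 64 MB default mentioned above

def read_limited_message(stream, limit=DEFAULT_LIMIT):
    # Read a 4-byte big-endian length prefix, then the payload, enforcing
    # the limit *before* allocating anything proportional to the length.
    header = stream.read(4)
    if len(header) < 4:
        raise EOFError("truncated length prefix")
    (size,) = struct.unpack(">I", header)
    if size > limit:
        raise ValueError("message of %d bytes exceeds limit of %d" % (size, limit))
    payload = stream.read(size)
    if len(payload) < size:
        raise EOFError("truncated message body")
    return payload
```

The key point is that the size check happens before the large read, which is exactly what the protobuf limit buys you.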
Re: [protobuf] message member case problem
On May 16, 2012, at 5:02 , secondsquare wrote: After generating cpp files, the member becomes msgsize. Big 'S' is changed to little 's'.

This is by design. Protocol buffers follows Google's style guide, where C++ names_use_underscores while Java names useCamelCase. Protobuf will generate the appropriate names: https://developers.google.com/protocol-buffers/docs/style In other words, the recommendation is that you should use _ to separate words in your .proto.

Evan -- http://evanjones.ca/
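As a rough approximation of that mapping (this is not protoc's actual implementation, just an illustration of the rule): an underscore-separated field name like msg_size stays msg_size in the C++ accessors but becomes msgSize in the generated Java ones.

```python
def to_java_camel_case(proto_field_name):
    # Drop underscores and capitalize each following word, as the
    # generated Java accessor names do: msg_size -> msgSize.
    head, *tail = proto_field_name.split("_")
    return head + "".join(part.capitalize() for part in tail)
```

So naming the field msg_size in the .proto gives you msg_size() in C++ and getMsgSize()-style accessors built on msgSize in Java.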
Re: [protobuf] incompatible type changes philosophy
On May 9, 2012, at 15:26 , Jeremy Stribling wrote:
* There are two nodes, 1 and 2, running version A of the software.
* They exchange messages containing protobuf P, which contains a string field F.
* We write a new version B of the software, which changes field F to an integer as an optimization.
* We upgrade node 1, but not node 2.
* If node 1 sends a protobuf P to node 2, I want node 2 to be able to access field F as a string, even though the wire format sent by node 1 was an integer.

I think you can achieve your goals by building a layer on top of the existing protocol buffer parsing, possibly in combination with some custom options, a protoc plugin, and maybe a small tweak to the existing C++ code generator. You do the breaking change by effectively renaming the field, then using a protoc plugin to make it invisible to the application. To make this concrete, your Version A looks like:

message P {
  optional string F = 1;
}

Then Version B looks like the following:

message P {
  optional string old_F = 1 [(custom_upgrade_option) = some_upgrade_code];
  optional int32 F = 2;
}

With this structure, Version B can always parse a Version A message. Senders will always ensure there is only one version of the field in the message, so the only thing you are losing here is a field number, which isn't a huge deal. However, you now want to automatically convert old_F to F. This can be done without changing the guts of the parser by writing a protoc plugin that generates a member function based on the custom option:

void UpgradeToLatest() {
  if (has_old_F()) {
    set_F(some_upgrade_code(old_F()));
    clear_old_F();
  }
}

You then need to make sure that Version B of the software calls this everywhere it is needed. Maybe this argues that what is needed is a post-processing insertion point in ::MergePartialFromCodedStream? Then your protoc plugin could insert this call after a protocol buffer message is successfully parsed, so the application would only ever have to deal with the integer version.
In the other direction, I don't understand how the downgrading can possibly be done at the receiver, since it doesn't know how to do the downgrade (unless you are thinking about mobile code?). So in your example, Node 1 must create a Version A protocol buffer message when sending to Node 2. This means you need *some* sort of handshaking between Node 1 and Node 2, to indicate supported versions. This is the reason I proposed adding some other member function that takes a target_version, so the sender knows what to emit. If sending the same message to multiple recipients, you'll need to send the lowest version in the group. Based on the above, your plugin could emit:

void DowngradeToVersion(int target_version) {
  if (target_version < 0xB && has_F()) {
    set_old_F(some_downgrade_code(F()));
    clear_F();
  }
}

There are many other ways you could do this, but it seems to me that this proposal is a way to do it without complicating the base protocol buffers library with application-specific details.

Evan -- http://evanjones.ca/
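The flow of those two generated hooks can be sketched in plain Python, using a dict as a stand-in for message P. The field names (old_F, F) come from the example above; the version constants and the int/str conversions standing in for some_upgrade_code / some_downgrade_code are hypothetical placeholders.

```python
VERSION_A, VERSION_B = 0xA, 0xB  # hypothetical version numbers

def upgrade_to_latest(msg):
    # Equivalent of the generated UpgradeToLatest(): convert the
    # string old_F into the integer F and drop the old field.
    if "old_F" in msg:
        msg["F"] = int(msg.pop("old_F"))

def downgrade_to_version(msg, target_version):
    # Equivalent of DowngradeToVersion(): re-emit the string form
    # for receivers that only understand Version A.
    if target_version < VERSION_B and "F" in msg:
        msg["old_F"] = str(msg.pop("F"))
```

A Version B node would run upgrade_to_latest after every parse and downgrade_to_version before serializing to a peer that handshook at Version A.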
Re: [protobuf] incompatible type changes philosophy
On May 8, 2012, at 21:26 , Jeremy Stribling wrote: Thanks for the response. As you say, this solution is painful because you can't enable the optimization until the old version of the program is completely deprecated. This is somewhat simple in the case that you yourself are deploying the software, but when you're shipping software to customers (as we are) and have to support many old versions, it will take a very long time (possibly years) before you can enable the optimization. Also, it breaks the downgrade path. Once you enable the optimization, you can never downgrade back to a version that did not know about the new field.

I think I now understand your problem. You want to add some additional stuff to your .proto file to indicate the incompatible change, then have the application code not need to know about it? Eg. you want to write application code that only accesses new_my_data and never needs to check for deprecated_my_data, but in fact the underlying protocol buffer supports both fields, or something like that. It seems to me like this starts to end up in territory too high level for the protocol buffer library itself, since I can't imagine this working without handshaking like Oliver talked about (e.g. "I understand everything up to version X"). My personal experience has been more like what Daniel describes: you keep both versions of the field, and your code has if statements to check for both. I believe this can be made to work, even in your scenario, but it does require ugly code in your application to handle it. My impression is that you are trying to avoid that.

Random brainstorming that may not be helpful in any way: I'm curious about how you end up choosing to solve this, but I think you are going to need to use some combination of custom field options (to specify the change in a way that protoc can parse?), and then hacks in the C++ code generator to call your custom upgrade / downgrade code.
I think this can work somewhat seamlessly in the reading older messages case (eg. you just add code that says "if we see the old field, upgrade it to the new field"). However, this can't work in the "writing a newer message for an older receiver" case without making the Serialize* code aware of the version it should be *writing*. I think this is going to be pretty application specific? My other thought: you might be able to get away with writing a protoc plugin that adds two functions to the class scope (which already exists as an insertion point):

static UpgradedMessage ParseAnyMessageVersion(…);
string SerializeToVersion(int target_version);

These functions can apply the appropriate upgrading/downgrading as needed. You then need to call the appropriate functions to read/write the messages. However, since in the serializing case you are going to need to know the target_version anyway, this might actually work?

Good luck, and again I'd be interested to know how you do end up solving this. Evan -- http://evanjones.ca/
Re: [protobuf] Protocol Buffers for IOS
On Mar 31, 2012, at 4:31 , Dhanaraj G wrote: I have gone through the following link.. http://code.google.com/p/metasyntactic/wiki/ProtocolBuffers

There is no official support, but I've used the following distribution with success, with the latest protoc (I'm pretty sure): https://github.com/booyah/protobuf-objc

Good luck, Evan -- http://evanjones.ca/
Re: [protobuf] [Java][Python][TCP] Reading messages that where written with writeDelimited and viceversa
On Mar 26, 2012, at 21:49 , Galileo Sanchez wrote: Thanks man... It worked great! I guess I should read the documentation a little more xP.

Sadly these functions aren't actually documented. The Python API doesn't expose these routines for some reason I don't understand / remember. Glad it worked!

Evan -- http://evanjones.ca/
Re: [protobuf] Problem with accent
On Mar 23, 2012, at 9:07 , Simon wrote: I have an annoying problem with some accents. I build my proto-object, no problem, and when I want to read it in the browser, using the .toString function, I have \303\240 instead of à, \303\250 instead of è, etc…

What do you mean by "read it in the browser using the .toString function"? Is this Java or C++ or something else? What does your message definition look like?

By default, protocol buffers encodes strings in UTF-8. These characters seem to be encoded correctly as UTF-8, so the sending side is doing the right thing, but the code that is reading them is not doing the correct decoding:

à = U+00E0
Escaped in hexadecimal this is: \xc3\xa0
Escaped in octal this is: \303\240

So you need to decode from UTF-8 to get the correct characters.

Hope this helps, Evan -- http://evanjones.ca/
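In Python terms, the escapes in the question are the raw UTF-8 bytes of the accented characters, and decoding them as UTF-8 recovers the text:

```python
# \303\240 is octal for the bytes 0xC3 0xA0, the UTF-8 encoding of U+00E0 (à);
# \303\250 is 0xC3 0xA8, the UTF-8 encoding of U+00E8 (è).
raw = b"\303\240 et \303\250"   # what the browser displayed, as bytes
text = raw.decode("utf-8")      # -> "à et è"
```

Whatever language the reading side is in, the fix is the same: treat the field as UTF-8 bytes and decode, rather than printing the bytes directly.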
Re: [protobuf] [Java][Python][TCP] Reading messages that where written with writeDelimited and viceversa
On Mar 25, 2012, at 18:09 , Galileo Sanchez wrote: else, if I should write the size as a raw bit string, then how do I do that?

You need to use something like the following. Not 100% sure it works, but it should be close?

# Output a message to be read with Java's parseDelimitedFrom
import google.protobuf.internal.encoder as encoder
out = message.SerializeToString()
out = encoder._VarintBytes(len(out)) + out

# Read a message from Java's writeDelimitedTo:
import google.protobuf.internal.decoder as decoder
# Read the length
(size, position) = decoder._DecodeVarint(buffer, 0)
# Read the message
message_object.ParseFromString(buffer[position:position+size])

Hope this helps, Evan -- http://evanjones.ca/
Re: [protobuf] Protocol Buffers for version control of objects on a cache.
On Mar 20, 2012, at 16:12 , Mick wrote: These objects are going to be accessible to multiple users, whose accessor programs may be on different release cycles. I have been looking into protocol buffers as a way of managing data loss/corruption between versions. Has anyone used protocol buffers to approach this type of problem before?

I'm not quite sure what you mean and what information you are looking for. However, protocol buffers were designed to help with this sort of problem, though it still requires care to make it work. Random notes off the top of my head:

* You may want to make all fields optional, since if a message is missing a required field it will fail to parse. Certainly all *new* fields *must* be optional.
* Protocol buffers only help with the parsing. You still need to think about forward and backwards compatibility (eg. how your software is going to process the messages).
* Passing messages through (eg. proxies or other tools) will work.

Hope this helps. Good luck, Evan -- http://evanjones.ca/
Re: [protobuf] Re: How to read continuous stream of messages from TCP
On Mar 8, 2012, at 2:30 , waynix wrote: Since this is so common an issue and the suggested solution is almost de facto standard, (saw this after my initial post: http://code.google.com/apis/protocolbuffers/docs/techniques.html), it begs the question of why not build it into protobuf proper.

Yeah, I would agree that something simple probably should have been included. The reasoning here is that this allows people to use protocol buffers with whatever other systems they might already be using (eg. HTTP, databases, files, RPC protocols, whatever), without being tied to a specific implementation. Compare the protocol buffer API to Thrift, for example, where the message serialization/deserialization is tied pretty tightly to the RPC system. There were proposals to possibly add a protocol buffer utils API, or a streaming API, but neither of those went anywhere. The closest thing is writeDelimitedTo / mergeDelimitedFrom in the Java API: http://code.google.com/apis/protocolbuffers/docs/reference/java/com/google/protobuf/MessageLite.html#writeDelimitedTo(java.io.OutputStream)

Evan -- http://evanjones.ca/
Re: [protobuf] How to read continuous stream of messages from TCP
On Feb 27, 2012, at 17:27 , waynix wrote: 1. Is this still the way to do it? Seems quite cumbersome (to lazy me ;-). Is there a wrapper built in to do this?

Yes. Sadly there is no wrapper included in the library.

2. If I understand Jason's suggestion right, the length is really not part of the message, and the sender has to explicitly set it, instead of having protobuf encode it in. Which means a generic third party sender using my .proto file would not be sufficient. Plus how would they know the length before encoding the message proper? Filling it in after the fact would change the length again? Or am I totally missing it.

As long as both sides encode the length in the same way, just having the right .proto will do the trick.

3. A related question is, in general do I have to manage reading of the socket, or for that matter any istream, and spoon feed the protobuf parser until it says "OK, that's a whole message"?

Basically yes. There is a sketch of some example code here: https://groups.google.com/forum/?fromgroups#!searchin/protobuf/sequence/protobuf/pLwqN4jTVvY/60PBaEadW5IJ

Good luck, Evan -- http://evanjones.ca/
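The length-prefixed framing discussed in this thread can be sketched with the varint codec written out by hand, so the example does not depend on protobuf internals. In real code the payload would be the serialized message bytes (message.SerializeToString() in Python); everything else here is the same idea as writeDelimitedTo.

```python
def encode_varint(value):
    # Standard base-128 varint: 7 bits per byte, high bit set on all
    # bytes except the last.
    out = bytearray()
    while True:
        low = value & 0x7F
        value >>= 7
        if value:
            out.append(low | 0x80)  # more bytes follow
        else:
            out.append(low)
            return bytes(out)

def decode_varint(buf, pos=0):
    # Returns (value, position just past the varint).
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def frame(payload):
    # Prefix the payload with its varint-encoded length.
    return encode_varint(len(payload)) + payload

def read_frame(buf, pos=0):
    # Returns (payload, position just past it), so a stream of
    # concatenated frames can be consumed in a loop.
    size, pos = decode_varint(buf, pos)
    return buf[pos:pos + size], pos + size
```

Reading from a socket then becomes: accumulate bytes until a whole varint plus that many payload bytes are available, hand the payload to ParseFromString, and repeat.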
Re: [protobuf] Message thread safety in Java
On Feb 20, 2012, at 8:25 , Frank Durden wrote: I'm sorry if this is explained somewhere, I couldn't find an answer. Are protobuf messages (in Java) thread-safe for concurrent reads? I guess they're immutable in the sense that you can't modify them after they're built, but can a message object's content be read from different threads safely? The generated variables in message objects don't seem to be final or volatile?

After you call .build() and get a Message, that message is immutable, as you observed. I'm not a Java memory model expert, but my understanding is that despite the fields not being marked final, this is in fact thread-safe. However, my only support is this quote from Brian Goetz: "With some additional work, it is possible to write immutable classes that use some non-final fields (for example, the standard implementation of String uses lazy computation of the hashCode value), which may perform better than strictly final classes." http://www.ibm.com/developerworks/java/library/j-jtp02183/index.html

I'm pretty sure the right people at Google have examined the protobuf code, so it should be safe. However, I don't have a good argument for *why* it is safe. Maybe someone who is a Java memory model expert knows the reasoning here?

Evan -- http://evanjones.ca/
Re: [protobuf] Message thread safety in Java
On Feb 20, 2012, at 16:20 , Christopher Smith wrote: Message objects *don't* have mutators and are conceptually a copy of the relevant builder object.

Having attempted to refresh my knowledge of the Java Memory Model, I think there is a subtle difference between an object that has all final fields, and an immutable object like the protobuf messages. However, I don't think it matters in reality: As long as the message is correctly published to other threads (eg. a synchronized block, volatile reference, concurrent data structure), then everything is fine. Since everyone *should* be doing this already, Messages are safe to use across multiple threads.

Evan

PS. For language lawyers: I *think* the potential difference is as follows: Writes to final fields in a constructor are guaranteed to be visible to all threads when the constructor exits. So if you had the following:

static FinalImmutableObject someRef = ...;

Then if another thread sees a non-null value for someRef, it will correctly see all the values of the final fields. On the other hand, if you do this with a protobuf message, it *theoretically* could see a non-null value for someRef, but still see uninitialized or incorrectly initialized values for fields in someRef. This is because this static variable is not synchronized or volatile, so there is no happens-before relationship between two threads. Thus, the reads on one thread *could* be reordered before the writes on the other thread. References:

http://java.sun.com/docs/books/jls/third_edition/html/memory.html#17.4
http://java.sun.com/docs/books/jls/third_edition/html/memory.html#17.5

-- http://evanjones.ca/
Re: [protobuf] Error: Byte size calculation and serialization were inconsistent
On Feb 6, 2012, at 21:54 , Robby Zinchak wrote: It turned out to be an uninitialized boolean. Properly setting the value in question seems to allow things to proceed normally.

Ah! Interesting. So one of your .set_* calls takes a boolean, and the value passed to it was uninitialized? That would do it. This was discussed previously and dismissed as a "won't fix" problem, because it is hard/impossible to write portable code that will test for this: http://code.google.com/p/protobuf/issues/detail?id=234

Although it's somewhat confusing, since WireFormatLite::WriteBoolNoTag contains code to try to avoid this problem, which GCC helpfully optimizes away. I am not able to get the exact crash as the one you reported, but I can get it to crash in MessageLite::SerializeWithCachedSizesToArray by creating a boolean with a value of 0x80 (serializing to two bytes instead of one, causing it to create a message larger than it expects). I can't figure out how it could crash at the point you report the crash, but that doesn't really matter.

Glad you got it working, Evan -- http://evanjones.ca/
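The size discrepancy is easy to see with a hand-written helper (this is not the protobuf API, just an illustration): ByteSize() assumes a bool encodes as a single varint byte (0 or 1), but a garbage byte like 0x80 takes two bytes as a varint, so serialization writes more than the computed size.

```python
def varint_size(value):
    # Number of bytes a non-negative integer occupies as a base-128 varint.
    size = 1
    while value > 0x7F:
        value >>= 7
        size += 1
    return size
```

varint_size(0) and varint_size(1), the only values a bool should ever hold, are both 1, while varint_size(0x80) is 2, which is exactly the one-byte overrun that trips the "byte size calculation and serialization were inconsistent" check.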
Re: [protobuf] Error: Byte size calculation and serialization were inconsistent
This is weird. I don't see any clear potential cause, so I have a few questions:

HTMud::EnvAdd item;
item.set_id(ID);
item.set_idtype(typeID);
item.set_x(X);
item.set_y(Y);
item.set_z(Z);
item.set_lockdown(lockdown);
item.set_mapid(map);
item.set_tilesetno(tilesetNo);
item.set_tilesetx(tilesetX);
item.set_regionx(regionX);
item.set_regiony(regionZ);

Are all these values primitives? Are any of them protocol buffers? Have you tried dumping the values that are being set when it dies, and trying a standalone program that sets the values and calls SerializeToString to see if it has the same problem? Have you made any changes to the protocol buffers library? I'm assuming you are using the released version of 2.4.1? Have you tried running this under valgrind? I'm wondering if there could be other weird memory corruption happening. That seems to be a frequent cause of these "shouldn't be happening" type errors, particularly things that appear/disappear when optimization is enabled/disabled.

Evan -- http://evanjones.ca/
Re: [protobuf] Re: Problem with C++ -writing multiple messages with a repeated field to a file
On May 16, 2011, at 9:45 , Nigel Pickard wrote: I have actually got the code working, but it involves creating a new output stream every time I write to it (surely got to be wasteful and not the right way?).

Definitely not needed, and it will be more efficient if you can re-use a single FileOutputStream, as it does buffering internally. You should probably create a CodedOutputStream for each message you write, but this can be stack allocated and is very lightweight.

Evan -- http://evanjones.ca/
Re: [protobuf] Problem with C++ -writing multiple messages with a repeated field to a file
On May 13, 2011, at 10:12 , Nigel Pickard wrote: libprotobuf FATAL google/protobuf/io/zero_copy_stream_impl_lite.cc:346] CHECK failed: (buffer_used_) == (buffer_size_): BackUp() can only be called after Next(). Off the top of my head, I *believe* this is happening because the CodedOutputStream destructor is trying to reposition the FileOutputStream, but the FileOutputStream has already been closed. In this case, you either want to put the CodedOutputStream into its own enclosing scope, to force its destructor to run before you close the FileOutputStream, or just let the FileOutputStream destructor flush and close the file automatically. I hope this helps, Evan -- http://evanjones.ca/
Re: [protobuf] on the wire sizes
On Apr 1, 2011, at 6:54 , AdrianPilko wrote: What is the [best] way to determine the on the wire size? You probably want msg.ByteSize() in C++, msg.getSerializedSize() in Java. Evan -- http://evanjones.ca/
Re: [protobuf] A protocol message was rejected because it was too big ???
On Mar 6, 2011, at 18:45 , ksamdev wrote: I think I found the source of the problem. The problem is that CodedInputStream has an internal counter of how many bytes have been read so far with the same object. Ah, right. With the C++ API, the intention is that you will not reuse the CodedInputStream; instead it should be created and destroyed for each message. It is very cheap to allocate / destroy if it is a local variable. In your case, you should change your ::write method to do something like: CodedOutputStream out(_raw_out.get()); out.WriteVarint32(event.ByteSize()); event.SerializeWithCachedSizes(&out); This will also save the extra copy that your code currently has. Hope this helps, Evan -- http://evanjones.ca/
Re: [protobuf] A protocol message was rejected because it was too big ???
On Mar 7, 2011, at 13:03 , ksamdev wrote: Hmm, thanks for the advice. It may work fine. Nevertheless, I have to skip previously read messages in this case every time CodedInputStream is read. Not true: creating a CodedInputStream does not change the position in the underlying stream. Your code can easily look like: while (still more messages to read) { CodedInputStream in(input_stream); in.Read* ... msg.ParseFromCodedStream(&in); } This creates and destroys the CodedInputStream for each message, which is efficient. Unfortunately, reading does not work out after 2^31 bytes are read. Is there a way around? You will need to destroy and re-create the CodedInputStream object. If you don't want to do it for each message, you need to at least do it occasionally. Evan -- http://evanjones.ca/
Re: [protobuf] How to get the byte[] from a serialized data ?
On Mar 4, 2011, at 7:15 , Aditya Narayan wrote: I have created .proto files and compiled them to get the generated classes. Also, I can build the message objects using the setters and finally the build() method. But to store it in a database, I need the serialized data as byte[] or byte buffers. How do I finally get that from the message instances? You want .toByteArray(): http://code.google.com/apis/protocolbuffers/docs/reference/java/com/google/protobuf/MessageLite.html#toByteArray() Evan -- http://evanjones.ca/
Re: [protobuf] Chunking a large message
On Mar 3, 2011, at 15:53 , Linus wrote: I am wondering if there are any examples of chunking large PB messages (about 1MB) into smaller chunks, to transmit over the wire. This is going to be pretty application specific. Typically it involves taking one message with a huge repeated field and sending / writing it as a sequence of messages with fewer items in each repeated field. So I can't really point you to any examples off the top of my head. That said: the documentation suggests keeping protocol buffers to ~1 MB in size, so if your messages are 1 MB, I personally wouldn't worry about it. Hope this helps, Evan -- http://evanjones.ca/
Re: [protobuf] RuntimeException while parsing back the byte[] to protocol buffer message instance! (deserialization)
On Mar 4, 2011, at 11:11 , Aditya Narayan wrote: Exception in thread "main" java.lang.RuntimeException: Uncompilable source code This error means there is a build problem in your Eclipse project. You are trying to call some code that is not compiling correctly. Fix your build errors and then your example should work. Good luck, Evan -- http://evanjones.ca/
Re: [protobuf] Can a message derive from another message?
On 03/02/2011 10:04 AM, ZHOU Xiaobo wrote: required string Content = 3; WARNING: You should be using type bytes here, not type string. This doesn't matter for C++, but matters for other languages which will assume strings contain UTF-8 data. Evan -- http://evanjones.ca/
Re: [protobuf] Beginner's Q: Does protobuf generate underlying transport sockets as well
On Feb 28, 2011, at 11:46 , footloose wrote: The tutorials talk only about marshalling and unmarshalling the data structures. Do the sockets have to be written manually? Yes. The protocol buffer library from Google does not include an RPC implementation. There are a bunch of third-party implementations, though: http://code.google.com/p/protobuf/wiki/ThirdPartyAddOns Evan -- http://evanjones.ca/
Re: [protobuf] Fwd: RpcChannel and RpcController Implementation
On Feb 21, 2011, at 3:06 , Amit Pandey wrote: Did anyone get the chance to look into it? If you want to use the RPC system, you need to provide your own implementation, or maybe use an existing one, such as: http://code.google.com/p/protobuf/wiki/ThirdPartyAddOns#RPC_Implementations If this doesn't answer your question, maybe you need to be more specific. What are you trying to do? Evan -- http://evanjones.ca/
Re: [protobuf] New protobuf feature proposal: Generated classes for streaming / visitors
I read this proposal somewhat carefully, and thought about it for a couple of days. I think something like this might solve the problem that many people have with streams of messages. However, I was wondering about a couple of things in the design: * It seems to me that this will solve the problem for people who know statically, at compile time, what types they need to handle from a stream, so they can define the stream type appropriately. Will users find themselves running into the case where they need to handle generic messages, and end up needing to roll their own stream support anyway? I ask this question because I built my own RPC system on top of protocol buffers, and in this domain it is useful to be able to pass unknown messages around, typically as unparsed byte strings. Hence, this streams proposal wouldn't be useful to me, so I'm just wondering: am I an anomaly here, or could it be that many applications will find themselves needing to handle any protocol buffer message in their streams? The Visitor class has two standard implementations: Writer and Filler. MyStream::Writer writes the visited fields to a CodedOutputStream, using the same wire format as would be used to encode MyStream as one big message. Imagine I wanted a different protocol. Eg. I want something that checksums each message, or maybe compresses them, etc. Will I need to subclass MessageType::Visitor for each stream that I want to encode? Or will I need to change the code generator? Maybe this is an unusual enough need that the design doesn't need to be flexible enough to handle it, but it is worth thinking about a little, since features like being able to detect broken streams and resume in the middle are useful. Thanks! Evan -- http://evanjones.ca/
Re: [protobuf] New protobuf feature proposal: Generated classes for streaming / visitors
On Feb 8, 2011, at 13:34 , Kenton Varda wrote: I handle user messages by passing them as bytes, embedded in my own outer message. This is what I do as well, as does protobuf-socket-rpc: http://code.google.com/p/protobuf-socket-rpc/source/browse/trunk/proto/rpc.proto I guess I was thinking that if you already have to do some sort of lookup of the message type that is stored in that byte blob, then maybe you don't need the streaming extension. For example, you could just build a library that produces a sequence of byte strings, which the user of the library can then parse appropriately. I see how you are using it though: it is a friendly wrapper around this simple sequence of byte strings model, that automatically parses the byte string using the tag and schema message. This might be useful for some people. This is somewhat inefficient currently, as it will require an extra copy of all those bytes. However, it seems likely that future improvements to protocol buffers will allow bytes fields to share memory with the original buffer, which will eliminate this concern. Ah cool. I was considering changing my protocol to be two messages: the first one is the descriptor (eg. your CallRequest message), then the second would be the body of the request, which I would then parse based on the type passed in the CallRequest. Note that I expect people will generally only stream their top-level message. Although the proposal allows for streaming sub-messages as well, I expect that people will normally want to parse them into message objects which are handled whole. So, you only have to manually implement the top-level stream, and then you can invoke some reflective algorithm from there. Right, but my concern is that I might want to use this streaming API to write messages into files. In this case, I might have a file containing the FooStream and another file containing the BarStream. 
I'll have to implement both of these ::Writer interfaces, or hack the code generator to generate them for me. Although now that I think about it, the implementation of these two APIs would be relatively trivial... features like being able to detect broken streams and resume in the middle are useful. I'm not sure how this relates. This seems like it should be handled at a lower layer, like in the InputStream -- if the connection is lost, it can re-establish and resume, without the parser ever knowing what happened. Sorry, just an example of why you might want a different protocol. If I've streamed 10e9 messages to disk, I don't want the stream to break if there is some weird corruption in the middle, so I want some protocol that can resume after corruption. Evan -- http://evanjones.ca/
Re: [protobuf] Re: protobuf not handling special characters between Java server and C++ client
On Jan 26, 2011, at 3:43 , Hitesh Jethwani wrote: Can we encode the protobuf data in ISO-8859-1 from the server end itself? Yes. In this case, you need to use the protocol buffer bytes type instead of the protocol buffer string type, since you want to exchange ISO-8859-1 bytes from program to program (bytes), not unicode text (string). On the Java side, you'll need to use ByteString.copyFrom(myStringObject, "ISO-8859-1") to make a ByteString out of a Java String. Hope this helps, Evan -- http://evanjones.ca/
Re: [protobuf] protobuf not handling special characters between Java server and C++ client
On Jan 25, 2011, at 15:27 , Hitesh Jethwani wrote: As may be evident from above I am naive at Java and Protobuf. Any help on this is appreciated. The Java protocol buffer API encodes strings as UTF-8. Since C++ has no unicode support, what you get on the other end is the raw UTF-8 encoded data. You'll need to use some Unicode API to process it in whatever way your application requires. I suggest ICU: http://site.icu-project.org/ Hope this helps, Evan -- http://evanjones.ca/
Re: [protobuf] protocol buffers and client-server communication
On Jan 22, 2011, at 16:33 , Marco@worldcorp wrote: I am guessing I will need 1 proto file for each type of message, correct? That sounds like what you want, to me. You may also end up needing some additional header message or wrapper message to be able to figure out what the next message in the stream is. See: http://code.google.com/apis/protocolbuffers/docs/techniques.html#union The archives of this group also contain many discussions on this subject. Evan Jones -- http://evanjones.ca/
Re: [protobuf] Dealing with Corrupted Protocol Buffers
On Jan 20, 2011, at 2:48 , julius-schorzman wrote: My question is -- can anything be done to retrieve part of the file? It would be nice to know at which point in the file the problematic message occurred, and then I could crop to that point or do some manual exception -- but unfortunately this exception is very general. I find it hard to believe that a single mis-saved bit makes the whole file worthless. You are correct: your data is not all worthless, but at the point of the error, you will need some manual intervention to figure out what is going on. It is probably possible to figure out the byte offset where the error occurs. The CodedInputStream tracks some sort of bytesRead counter, I seem to recall. However, this will require you to modify the source. I also find it curious that the source provides no way (that I can tell) to get at any lower level data in the p.b. since whenever I try to do anything with it it throws an exception. Best I can tell I will have to write from scratch my own code to decode the p.b. file. The lowest-level tool provided is CodedInputStream. But yes, you will effectively have to parse the message yourself. Look at the code that is generated for the mergeFrom method of your message to get an idea of how it works, and read the encoding documentation: http://code.google.com/apis/protocolbuffers/docs/encoding.html You can definitely figure out what is going on, but it will be a bit of a pain. Good luck, Evan Jones -- http://evanjones.ca/
Re: [protobuf] custom constructor
On Jan 14, 2011, at 9:22 , Tim Wisniewski wrote: The reason for this is that I can't find a bitset type within the proto language. Any thoughts? This is not possible because the intention is that the .proto file will be portable between many different languages, so it only supports fairly portable types. I have solved this either by writing a wrapper type that provides a friendly interface to the protocol buffer generated types, or by writing some utility methods (eg. maybe BitSetUtils.copyToByteString() / .copyFromByteString()). Either way is a bit of extra mechanical work, but it works. Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] Re: Large message vs. a list message for smaller messages
On Jan 13, 2011, at 4:38 , Meghana wrote: Is there any way of working around this problem? You can increase the limit with CodedInputStream.setSizeLimit, which is an easy route. The problem is that the performance is bad for really large messages, because the whole thing needs to be serialized/deserialized to/from a single buffer. The high performance version would be to encode your own simple protocol. Something like: 1. Write the number of messages with writeRawVarint32. 2. Write each message with writeDelimitedTo. On the decoding side, do the opposite. I'm not familiar with the framework you are using, but this should be feasible. Hope this helps, Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] Java Newbie Question: Attaching a CodedInputStream to an NIO Socket
On Jan 13, 2011, at 1:55 , Nader Salehi wrote: It does help. However, I seem to have some problem reading messages that way. My guess is that it has something to do with the fact that the channels are non-blocking. Is there any special thing to consider when working with such channels? You need to know the length of the message you are reading, then only call the parse method once you have the entire thing buffered. So you send the size first, then the message. On the receiving side, you read the size, then you keep reading from the non-blocking socket until you have the whole thing buffered, then you parse it. I have open source code that actually does this, but it is research quality so it may not actually be helpful to others. You may want to look at it anyway: http://people.csail.mit.edu/evanj/hg/index.cgi/javatxn/file/260423aa1c25/src/ca/evanjones/protorpc/ProtoConnection.java#l40 Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] Java Newbie Question: Attaching a CodedInputStream to an NIO Socket
On Jan 12, 2011, at 12:57 , Nader Salehi wrote: I have a Java-based TCP server which needs some modification. It has to accept messages as CodedInputStream from C++ clients that send CodedOutputStream. The server uses the NIO class java.nio.channels.SocketChannel to read from the socket. What would be the easiest way to attach a CodedInputStream to this? I created a really thin InputStream implementation that wrapped my NIO ByteBuffer(s), then used CodedInputStream.newInstance(InputStream stream). You really only need to implement the read(byte[] destination, int offset, int length) method of this class, so it is actually pretty straightforward. There might be a better way, but it works for me. Hope this helps, Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] Large message vs. a list message for smaller messages
On Jan 12, 2011, at 8:38 , Meghana wrote: Would ListA also be considered a large message or will the encoding be done on each individual A message, making it immune to the large message problem? ListA itself will be a large message if it contains a large number of sub-messages. If you are really sending / writing a large number of messages, you want to read something like: http://code.google.com/apis/protocolbuffers/docs/techniques.html#streaming Good luck, Evan Jones -- Evan Jones http://evanjones.ca/
Re: [protobuf] Using a ByteBuffer instead of a ByteString?
On Jan 11, 2011, at 0:45 , Nicolae Mihalache wrote: But I have noticed in Java that it is impossible to create a message containing a bytes field without copying some buffers around. For example if I have an encoded message of 1MB with a few regular fields and one big bytes field, decoding the message will make a copy of the entire buffer instead of keeping a reference to it. By decoding I'm assuming you mean deserializing the message from a file or something. This is a disadvantage, but it makes things much easier: it means the buffer used to read data can be recycled for the next message. Without this copy, the library would need to do complicated tracking of chunks of memory to determine if they are in use or not. However, now that you mention it: in the case of big buffers, CodedInputStream.readBytes() gets called, which currently makes 2 copies of the data (it calls readRawBytes() then calls ByteString.copyFrom()). This could probably be fixed in CodedInputStream.readBytes(), which might improve performance a fair bit. I'll put this on my TODO list of things to look at, since I think my code does this pretty frequently. Even worse when encoding: if I read some data from a file, it does not seem possible to put it directly into a ByteString, so I have to first make a byte[], then copy it into the ByteString; and when encoding, it makes yet another byte[]. The copy cannot be avoided because it makes the API simpler (thread-safety, not needing to worry about the ByteBuffer being accidentally changed, etc). The latest version of Protocol Buffers in Subversion has ByteString.copyFrom(ByteBuffer), which will do what you want efficiently. Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] Protocol Buffers Python extension in C
On Jan 7, 2011, at 6:36 , Atamurad Hezretkuliyev wrote: Currently our basic deserializer module is 17x faster than Google's implementation in pure Python. The pure Python code is pretty slow. However, the repository version (and the newly released 2.4.0 rc 1?) has C++ code to do serialization / deserialization. There is no documentation, but the following thread describes it: http://groups.google.com/group/protobuf/browse_thread/thread/cfb13cd0a609b1c7/a5ada8791ca3c0ca#a5ada8791ca3c0ca You may want to test that and see how it turns out. And/or contact Yang about this, since he was interested in the same problem. Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] Re: java parse with class known at runtime (and compiled proto)
On Dec 6, 2010, at 10:31 , Koert Kuipers wrote: But that doesn't make a parseFrom() in the message interface invalid, does it? Indeed some other information outside the raw bytes will be needed to pick the right Message subclass. But that's fine. Oh, sorry, I misunderstood your question, so my answer is somewhat invalid. One could then: 1) pick the right subclass of Message based upon some information outside the raw bytes (in my case something stored in a protobuf wrapper around the raw bytes) 2) call subclass.parseFrom(bytes) now we have to jump through more hoops for step 2 (create instance of Message subclass, newBuilderForType, mergeFrom, isInitialized, build) The MessageLite.Builder interface has a mergeFrom method that does what you want. What you should do is something like: * Get a MessageLite instance for the message type you want to parse (eg. something like MyMessageType.getDefaultInstance(), or MessageLite.getDefaultInstanceForType()) * Hold on to that MessageLite instance in some sort of registry (e.g. a HashMap<Integer, MessageLite>?) * When you get a message, look at the protobuf wrapper to determine the type. * Look up the prototype MessageLite instance in your registry. * Call prototypeInstance.newBuilderForType().mergeFrom(bytes).build() This only creates a single instance of the message each time. The .build() method will automatically check that the message is initialized, so you don't need to call isInitialized (although you may want to catch the exception it could throw?). This Builder pattern is used so that the Message objects are immutable. This means they can be passed between threads without requiring any synchronization. See: http://code.google.com/apis/protocolbuffers/docs/javatutorial.html#builders Hope this helps, Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] java parse with class known at runtime (and compiled proto)
On Dec 3, 2010, at 14:21 , Koert Kuipers wrote: public class ProtobufDeserializer<T extends Message> { public T fromByteBuffer(ByteBuffer byteBuffer) { I don't *think* the generic type is going to be enough due to erasure, but I'm not a generics expert. I know something like the following works (I may be messing up the generics syntax since I'm not super familiar with it): public <T extends MessageLite> T fromByteBuffer(ByteBuffer byteBuffer, T defaultInstance) { Builder b = defaultInstance.newBuilderForType(); b.mergeFrom(ByteString.copyFrom(byteBuffer)); return (T) b.build(); } You can get defaultInstance from ConcreteMessageType.getDefaultInstance(). You may want to create a tiny InputStream wrapper around the ByteBuffer to avoid an extra copy, or if you know it is a heap byte buffer, use the byte array mergeFrom(). Hope that helps, Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] 2.4.0 and lazy UTF-8 conversions in Java
On Nov 30, 2010, at 20:35 , Kenton Varda wrote: BTW, we actually ended up reverting your change and replacing it with the new implementation. We found that having two references increased memory pressure too much. I thought I had mentioned this to you; sorry if I forgot. Ah; I'm not surprised, which is why it was conditional on the SPEED implementation. It wasn't just two references, but two copies of each string as well. The instanceof approach to switch between the two is a good idea. When I wrote my implementation, I was concerned about the thread-safety issues, although I don't think I ever considered this particular version. However, I think this can be made thread-safe, even without volatile (although I only understand the JMM enough to be dangerous). Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] 2.4.0 and lazy UTF-8 conversions in Java
On Nov 30, 2010, at 15:58 , Blair Zajac wrote:
> > Added lazy conversion of UTF-8 encoded strings to String objects to improve performance.
>
> Is the laziness thread safe? Without looking at the implementation: if it isn't thread safe, I would guess this isn't much overhead, but if it is thread safe and you know you're going to use all the string fields, does it hurt performance instead?

Interesting! I looked at this sort of thing a bit, since I have a patch that makes string encoding somewhat faster, although it is quite intrusive, so probably not appropriate for including in the main source tree. Guesses based on my knowledge of the Java implementation:

* It will be thread-safe, since that is the guarantee provided by the current protocol buffers implementation.
* I'll guess that it will not be slower if you access all the strings. Currently, the parsing process copies the raw bytes from the input buffer into an individual byte array, then converts that to a String. This is, sadly, the most efficient thing you can do, since you need special code to create Strings. Therefore, doing lazy conversion isn't going to be slower. The objects already have both byte[] and String fields for each string due to an encoding improvement I contributed, so this should be nearly a pure win.

Evan

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] fails to parse from string
Brad Lira wrote:
> address_book.SerializeToString(&mystr);
> strncpy(buf, mystr.c_str(), strlen(mystr.c_str()));

strlen() will return a shorter length than the real length, due to null characters. Use mystr.size().

> Maybe this method is not the right way to send a string across a socket. I tried using SerializeToFileDescriptor(socket); that worked on the client side, but on the server side I never get the message with UDP sockets. Is there a better way of sending data across the network?

You probably want to use TCP sockets, since TCP provides retransmissions for you. Also, you'll need to prepend a length. See:

http://code.google.com/apis/protocolbuffers/docs/techniques.html#streaming

Or search the group archives for threads such as:

http://groups.google.com/group/protobuf/browse_thread/thread/3af587ab16132a3f

Good luck,

Evan

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] fails to parse from string
On Nov 10, 2010, at 14:13 , Brad Lira wrote:
> Yes, it was the null character. On the server side, when copying the buffer into a string, I had to add 1 to the size of the buffer (I guess for the null); then the parsing was ok with no error.

Just adding 1 is still probably not correct. You have similar incorrect code on the receive side:

recvfrom(socket, buf, )
mystr.assign(buf, strlen(buf));

strlen(buf) is not going to give you the right thing. You should be using the return value from recvfrom(), which gives you the number of bytes that were read from the network.

Note: if you are using UDP, it will end up not working as soon as you have a message which is bigger than either your buffer or the maximum UDP packet size, whichever comes first.

Evan

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] help required
On Nov 3, 2010, at 14:54 , Manoj Upadhyay wrote:
> I want to use protobuf in my project. I have many Java POJOs in the project; a few POJOs are compositions of other POJOs, and these POJOs are used in services and other places. Please let me know how I can define the .proto file for this. For example: suppose I have 6 POJOs (classes A, B, C, D, E, F), class A is a composition of references to the other 5 POJOs (classes B, C, D, E, F), and these 5 POJOs are used in many places in the project. How can I proceed to define the proto file?

You'll need to define protocol buffer messages for all these objects. See:

http://code.google.com/apis/protocolbuffers/docs/javatutorial.html

Evan

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] CodedInputStream on top of sockets
On Nov 2, 2010, at 10:37 , Jesper wrote:
> I'm trying to implement the writeDelimitedTo/parseDelimitedFrom methods in C++, but getting stuck on how to create a CodedInputStream on top of a socket in a portable manner. Can CodedInputStream work with Windows sockets as well?

You can certainly make it work one way or another. You'll need to create an implementation of ZeroCopyInputStream that makes the appropriate Windows socket calls to read data from the socket. Note that it may be easier to implement the CopyingInputStream interface, then wrap it in a CopyingInputStreamAdaptor. This is typically easier since CopyingInputStreamAdaptor implements the appropriate buffering logic for you. See:

http://code.google.com/apis/protocolbuffers/docs/reference/cpp/google.protobuf.io.zero_copy_stream_impl_lite.html#CopyingInputStreamAdaptor

Good luck,

Evan

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] Generic Message Dispatch and Message Handler
On Oct 26, 2010, at 15:45 , maninder batth wrote:
> My generic Handler would create a GeneratedMessage and look for the field messageType. Based on the value of the messageType, a particular handler will be invoked.

This is basically what I have done for my protobuf RPC implementation. If you only need to choose between a limited set of types, you may want a union type or extensions instead:

http://code.google.com/apis/protocolbuffers/docs/techniques.html#union
http://code.google.com/apis/protocolbuffers/docs/proto.html#extensions

Evan

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] Re: Generic Message Dispatch and Message Handler
On Oct 27, 2010, at 11:36 , Jimm wrote:
> How are you parsing arbitrary PB bytes into a GeneratedMessage? I am finding no class in the API that can deserialize a PB byte buffer into a GeneratedMessage.

I'm using the generic Service API that is included with protocol buffers, so I'm not using GeneratedMessage. Rather, I'm using a message instance itself. The register does something like this:

serviceRegister.registerCall(MyCustomMessage.getDefaultInstance());

Then you can parse this with code like the following:

Message requestPrototype = ...;  // stored in registerCall implementation
Message.Builder builder = requestPrototype.newBuilderForType();
builder.mergeFrom(requestByteString);

My code is actually available in the following hg repository. I don't recommend that people use it directly, since it is a bit hacky, but it could serve as an example:

http://people.csail.mit.edu/evanj/hg/index.cgi/javatxn/file/tip/src/ca/evanjones/protorpc/ServiceRegistry.java
http://people.csail.mit.edu/evanj/hg/index.cgi/javatxn/file/tip/src/ca/evanjones/protorpc/ProtoMethodInvoker.java

Good luck,

Evan

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] protocol buffer within a protocol buffer from C++ to Java
On Oct 25, 2010, at 21:45 , Paul wrote:
> optional string meas_rec_str = 2;

Change this to:

optional bytes meas_rec_str = 2;

Evan

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] Re: Message missing required fields exception when parsing a message that has required fields defaulted
On Oct 26, 2010, at 4:13 , locky wrote:
> The C++ side is setting things correctly. My understanding is that default values are not sent over the wire. When building a received message from a byte[] a check is done to see if required fields have been set. Any required field that was not sent due to having a default value on the other side is not marked as being set and the exception gets thrown.

This is exactly correct. You should do two things:

1. Set this field on the sending side, but you mentioned that you are already doing this.
2. Verify that the bytes you are reading in on one side match the bytes being sent.

I usually get this error when there is some sort of message handling error. For example, if you pass protobuf an empty array, you'll get this error message. You should write out the bytes that you are writing and the bytes that you are reading, and verify that they match. Also verify that the size you are passing in matches.

There is a difference between an unset field (which reports its default value) and a field that is explicitly set to that same value. The .hasProperty() method will return true for the set field, and false for the unset field. Thus, these messages are serialized differently.

Hope this helps,

Evan

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] ParseFromArray -- in Java
On Oct 25, 2010, at 16:52 , ury wrote:
> i.e. Does the Java implementation have the Clear() method?

No, the Java implementation has immutable objects, so this is generally not possible. A new object must be created for each item. Immutable objects have benefits like being thread safe (see http://www.javapractices.com/topic/TopicAction.do?Id=29).

That said, I think you *might* be able to hack something like this using the Builder object. I would be interested to know if you try this, and if it has any performance benefits.

Evan

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] Re: Delay in Sending Data
Kevin wrote:
> My only concern now is in regards to message sizes and prepending the size at the beginning. What is the best way to go about this? My test message required only one byte, but my next messages will probably require 2 if not 3 bytes. What is the proper way to handle this in the C++ code, as the Java code has this built in?

If you want to use parseDelimited on the Java side, you must use CodedOutputStream::WriteVarint32() on the C++ side. See this recent thread for some code that should do the trick:

http://groups.google.com/group/protobuf/browse_thread/thread/3af587ab16132a3f

> In addition, my colleague has used Thrift before and was extremely surprised that the C++ classes did not have matching function calls in Java and vice versa. Can someone explain this shortcoming?

Someone added it as a convenience method to the Java implementation. No one has yet added it to the C++ implementation, I think mostly because protocol buffers are a fairly low-level library, and other people wrap them in many different ways. However, this is probably just an oversight: if the Java side has parseDelimited/writeDelimited, the other implementations probably should as well.

Evan

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] Delay in Sending Data
On Oct 21, 2010, at 1:21 , Kevin wrote:
> Basically, the code that receives the data will wait until the stream is closed before reading the data. I thought that flushing the data would cause the data to be sent, but that apparently has no effect. Is this my implementation or a problem with using the writeTo function?

The flush *should* be causing the data to be sent. The problem is on the reader side: the default read methods read until the end of the stream. You'll need to prepend a length. You may want to use parseDelimited(). See the following document, or search the archives for many conversations about this:

http://code.google.com/apis/protocolbuffers/docs/techniques.html#streaming

Hope this helps,

Evan

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] buffer sizes when sending messages from c++ to java
On Oct 20, 2010, at 2:13 , Kenton Varda wrote:
> But you are actually writing a varint32, which can be anywhere between 1 and 5 bytes depending on the value.

Use CodedOutputStream::VarintSize32() to compute the number of bytes needed to encode a particular value. This has the advantage that you can allocate a buffer of exactly the right size, rather than adding 100 as an estimate. However, you can also find the final size after all the writes with CodedOutputStream::ByteCount().

You should not need to do any byte swapping if you are serializing and deserializing integers using the protobuf API: it handles any required byte swapping for you.

Evan

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] Re: valgrind invalid write and double free errors
On Oct 14, 2010, at 11:32 , CB wrote:
> Actually, yes, we have a shared library containing our protobuf code, which we do load with dlopen. A command line option tells the app which protocol it needs to use, and the app loads the appropriate library. The open only happens once, very shortly after program launch. We're not constantly loading and unloading.

Do you ever call dlclose() on this library? Protobuf has some complicated initialization-time and shutdown cleanup code buried in descriptor.cc that I don't really understand. At the very least, there is a call to this:

internal::OnShutdown(DeleteGeneratedPool);

I'm a little surprised that I don't see that function appear in your stack trace, if that is in fact the problem, but it must be something like that. Could you try adding a printf() to the DeleteGeneratedPool() function in protobuf/descriptor.cc and see if it is getting called multiple times?

The FileDescriptorTable object is used internally by the protobuf library and I don't really understand it. I'm hoping someone who does understand this code might be able to suggest where this double free could be coming from.

Evan

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] sending a message over TCP from a C++ client to a Java server
On Oct 13, 2010, at 15:13 , Paul wrote:
> On the client side (in C++), I open a TCP socket connection on the same port with the server's IP address. I serialize the message using SerializeToCodedStream into an array using ArrayOutputStream. After serializing it, I send it over the TCP connection using my sendTCP method, which uses C++ sockets.

SerializeToCodedStream does *not* prepend the message size. The Java side is expecting that the message will start with the message length, so that is probably why you are getting parse errors. You need to do something like:

codedOutput.WriteVarint32(msg.ByteSize());
msg.SerializeToCodedStream(&codedOutput);
...

Hope this helps,

Evan

(As an aside: the C++ API really should have an equivalent to writeDelimitedTo and parseDelimitedFrom on the Java side.)

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] Re: sending a message over TCP from a C++ client to a Java server
On Oct 13, 2010, at 16:49 , Paul wrote:
> Thanks for the suggestion. However, I am already prepending the message size on the C++ side in the line:
>
> coded_output->WriteVarint64(snap1.ByteSize());

You may want to verify that the exact bytes that come out of msg.SerializeToString (or related) are coming out the other end and getting passed into parseDelimited. It might be helpful if you sent a snippet of code where you are sending and receiving the messages, but I can't think of anything off the top of my head.

Evan

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] valgrind invalid write and double free errors
On Oct 13, 2010, at 16:53 , CB wrote:
> Any feedback on how to further debug this problem would be appreciated.

You aren't doing anything strange like using dlopen() to dynamically load/unload libraries, are you? I can't think of anything obvious that might cause this kind of error. The FileDescriptorTables are static objects of sorts, I think.

Are you calling ShutdownProtobufLibrary() somewhere? Maybe more than once? Memory leaks *will* be reported by valgrind if you don't call ShutdownProtobufLibrary(), but I don't know what could cause a double free.

Evan

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] Feature proposal: mapped fields
On Oct 6, 2010, at 9:23 , Igor Gatis wrote:
> It would be nice to have mapped fields, e.g. key-value pairs.

I think that map support would probably be useful. I've basically created my own maps in protocol buffers a couple of times, either by using two repeated fields, or a repeated field of a custom pair type. In these cases, it would have been nice to be able to use the protocol buffer as a map directly, rather than needing to transfer the data to some other object that actually implements the map.

I would be interested to hear the opinion of the Google maintainers. I'm assuming that there are probably many applications inside Google that exchange map-like messages.

This would be a big change, although not an impossible one, I don't think. It could be implemented as syntactic sugar over a repeated Pair message. I think the biggest challenge is that maps are a higher-level abstraction than repeated fields, which leads to many design challenges:

* Are the maps ordered or unordered?
* If ordered, how are keys compared? This needs to be consistent across programming languages.
* If unordered, how are hash values computed? This could result in a message being parsed and re-serialized differently, if different languages compute the hashes differently.
* For both, how are unknown fields handled?
* Do the maps support repeated keys?
* If not, what happens when parsing a message with repeated keys?

Other message protocols contain map-like structures: JSON, Thrift, and Avro. Avro only supports string keys. JSON only supports primitive keys. Thrift has a similar note about maps (http://wiki.apache.org/thrift/ThriftTypes):

> For maximal compatibility, the key type for map should be a basic type rather than a struct or container type. There are some languages which do not support more complex key types in their native map types. In addition the JSON protocol only supports key types that are base types.
Evan

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] Timeouts for reading from a CodedInputStream
On Sep 28, 2010, at 15:33 , Patrick wrote:
> This is all fine and dandy except when I want to shut down the server or connection (not client initiated). ReadTag (as well as the other Read functions) blocks until data is received, but I want it to time out after a specified amount of time; in essence, a polling read instead of a blocking one. This will allow me to check that the connection is still valid and either re-enter my message parsing function or clean up and exit.

One quick hack that might work: if you have threads anyway, you can close the file descriptor in the other thread, and the read will fail. This causes input.ReadTag() to return 0.

The more complex hack is to supply your own ZeroCopyInputStream implementation, and in your implementation of ::Next, implement your own timeout logic. In my implementation, I manage this by manually managing my own buffer, so I never call the CodedInputStream routines unless I know there is sufficient data. This may not be ideal for your application, so your mileage may vary.

Good luck,

Evan Jones

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] Re: Timeouts for reading from a CodedInputStream
On Sep 28, 2010, at 18:36 , Patrick wrote:
> I also have the problem that the RPC I wrote comes in a threaded model and a multi-process model. The multi-process one makes some things a bit harder. I was hoping to use a shm mutex to signal termination, but this would only work if my message parsing loop timed out every so often and, therefore, could check the mutex.

This should be pretty easy to achieve by supplying your own implementation of FileInputStream that uses select() and a non-blocking read() rather than just read(). It can then fail the call to Next() whenever it is convenient.

Evan

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] Re: MyType_Parse() calls ParseNamedEnum() with 'const std::string' parameter instead of 'const string'
On Sep 22, 2010, at 14:24 , Anand Ganesh wrote:
> This header is not using namespace std explicitly (protobuf-2.1.0). Notice how it's gotten generated with 'const string'.

Right, but at the top of google/protobuf/stubs/common.h is the following:

namespace google {
namespace protobuf {

using namespace std;  // Don't do this at home, kids.

That file is included via:

generated_message_reflection.h -> message_lite.h -> stubs/common.h

So there is some sort of weird namespace clashing going on. I wonder if maybe the issue is that the code in generated_message_reflection.h is in the google::protobuf::internal namespace, rather than in google::protobuf?

Good luck,

Evan

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] Re: Silly beginner question: Do the different RPC implementations inter-work?
Navigateur wrote:
> Did you use the automatically generated abstract service code, or did you follow the recently recommended approach of making your own code-generator plugin to do the implementation?

My implementation was started before the code generation plugins were done, so I used the existing abstract service. Were I to start it today, I would use the code generator, since there are a few small things in the automatically generated RPC interface that I would like to change.

Evan

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] Status of protobufs
On Aug 26, 2010, at 12:07 , Jean-Sebastien Stoezel wrote:
> More specifically, how are they parsed from real-time data streams?

You should manually insert a leading "length of next message" field into the data stream. The Java implementation even has shortcut methods for this (see below). In C++ you have to implement it yourself, but it is only a few lines of code. See:

http://code.google.com/apis/protocolbuffers/docs/techniques.html#streaming
http://code.google.com/apis/protocolbuffers/docs/reference/java/com/google/protobuf/MessageLite.html#writeDelimitedTo(java.io.OutputStream)
http://code.google.com/apis/protocolbuffers/docs/reference/java/com/google/protobuf/MessageLite.Builder.html#mergeDelimitedFrom(java.io.InputStream)

Evan

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] Performance of java proto buffers
On Aug 19, 2010, at 11:45 , achintms wrote:
> I have an application that is reading data from disk and is using proto buffers to create Java objects. When doing performance analysis I was surprised to find that most of the time was spent in and around proto buffers and not reading data from disk.

In my experience, protocol buffers are more than fast enough to keep up with disk speeds. That is, when reading uncached data from the disk at 100 MB/s, protocol buffers can decode it at that speed. Now, if your data is cached, and your application is not doing much with the data, then I would expect protocol buffers to take 100% of the CPU time, since the disk read doesn't take CPU, and your application isn't doing much. In other words: in a more realistic application, I would expect protocol buffers to take only a very small portion of your application's time.

> Again I expected that decoding strings would be almost all the time (although decoding here still seems slower than in C in my experience). I am trying to figure out why the mergeFrom method for this message is taking 6 sec (own time).

Decoding strings in Java is much slower because it actually decodes the UTF-8 encoded strings into UTF-16 strings in memory. The C++ version just leaves the data in UTF-8. If this is a performance issue for your application, you may wish to consider using the bytes protocol buffer type rather than string. This is less convenient, and means you can screw up by accidentally sending invalid data, but it is faster.

> There are around 15 SubMessages.

This is basically the problem right here. Each time you parse one of these messages, it ends up allocating a new object for each of these sub-messages, and a new object for each string inside them. This is pretty slow. As I said above: I suspect that in a real application, this won't be a problem. However, it would be faster if you got rid of all the sub-messages (assuming that you don't actually need them for some other reason).
Finally, I'll take a moment to promote my patch that improves Java message *encoding* performance by optimizing string encoding. It is available at the following URL. Unfortunately, there is no similar approach for improving the decoding performance.

http://codereview.appspot.com/949044/

Evan

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] Service can only receive one argument
On Aug 22, 2010, at 4:36 , omer.c wrote:
> Can a service receive multiple arguments or only one?

Only one.

> How can I define a service which will accept both arguments?

Create a union message, or define two RPCs.

Unions: http://code.google.com/apis/protocolbuffers/docs/techniques.html#union

Two RPCs:

service Service {
    rpc sendObject1(Object1) returns (Result1);
    rpc sendObject2(Object2) returns (Result2);
}

Hope this helps,

Evan

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] Re: How to retrieve parameters using tag numbers using CodedInputStream
On Aug 16, 2010, at 10:56 , Prakash Rao wrote: I'm just looking for an easy way to write a null response if data is not present in the DB, and write a proto message object if data is present, then parse these on the client side appropriately. I didn't find an easy way to do this using CodedInputStream. Currently I'm creating an empty proto object on the server side and checking for the key attribute at the client side as stated above. The empty protocol buffer message serializes to zero bytes, so if your message has no content, you could just send a zero byte message. This would avoid creating a protocol buffer message. However, I suspect that isn't really a big overhead. You can also use YourProtocolMessage.getDefaultInstance() to avoid creating a message. Hope this helps, Evan Jones -- Evan Jones http://evanjones.ca/
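The zero-byte idea is easy to see with explicit framing: an absent record costs only its length prefix, and no message object is ever built. A minimal Java sketch follows; the framing scheme and names are made up for illustration:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

class NullableFraming {
    // Server side: write the record bytes, or a zero length for "not found".
    // An empty protobuf message would likewise serialize to zero bytes here.
    static byte[] respond(byte[] recordOrNull) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int len = (recordOrNull == null) ? 0 : recordOrNull.length;
        out.write(len);            // single-byte length prefix (records < 128 bytes)
        if (len > 0) out.write(recordOrNull);
        return out.toByteArray();
    }

    // Client side: a zero length means the key was absent in the DB.
    static boolean isPresent(byte[] response) {
        return response.length > 0 && response[0] != 0;
    }
}
```

The client never needs to inspect a sentinel field inside a parsed message; presence is decided before parsing.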
Re: [protobuf] Re: Java implementation questions
On Aug 5, 2010, at 9:16 , Ralf wrote: I might be mistaken, but didn't groups use this approach - use a special tag to indicate the end of a message? As only tags are checked, there is no need to escape any data. Good point, I forgot about groups. They definitely do use that approach. Maybe one of the Googlers on this list will have a better idea about why groups are now deprecated in favour of nested messages. Anyway, I was referring more to the implementation. For example, we could first serialize the message to a ByteArrayOutputStream, then write the result and its size to the output. Obviously this approach is much slower, but I was wondering if there were other similar approaches. That's true, and would work. The other option would be to use fixed width integers for the lengths, so then you could reserve space in the buffer, serialize the message, then go back and fill in the length field. This would be an incompatible change to the serialization format, however. Evan -- Evan Jones http://evanjones.ca/
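The reserve-space-then-backfill idea can be sketched without the protobuf runtime; here a fixed-width 4-byte little-endian length is patched in after the payload is written (class and method names are illustrative):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class BackfillFraming {
    // Write a length-prefixed payload by reserving 4 bytes up front,
    // writing the payload, then patching the length field in place.
    // With a varint length this would be impossible, because the prefix
    // width isn't known until the payload size is known.
    static byte[] frame(byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(4 + payload.length)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        buf.position(4);                 // reserve space for the fixed-width length
        buf.put(payload);                // "serialize the message"
        buf.putInt(0, payload.length);   // go back and fill in the length
        return buf.array();
    }

    static byte[] unframe(byte[] framed) {
        ByteBuffer buf = ByteBuffer.wrap(framed).order(ByteOrder.LITTLE_ENDIAN);
        int len = buf.getInt();
        byte[] payload = new byte[len];
        buf.get(payload);
        return payload;
    }
}
```

This avoids the intermediate ByteArrayOutputStream copy, at the cost of always spending 4 bytes on the prefix.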
Re: [protobuf] Total bytes limit on coded input stream in C++?
On Aug 3, 2010, at 16:44 , Julian González wrote: I used the approach you mentioned and it worked. I just have a problem: I am writing 10,000 little messages in a file; first I write the size of the message and then the message, as follows:

codedOutput->WriteVarint32(sample.ByteSize());
res = sample.SerializeToCodedStream(codedOutput);

The problem is that when I try to read the 10,000 messages I just wrote, I can only read 9984 messages; when I try to read the 9885th an error is thrown:

libprotobuf ERROR c:\active\protobuf-2.3.0\src\google\protobuf\message_lite.cc:123] Can't parse message of type "apm.Sample" because it is missing required fields: timestamp

What is happening? It looks like only 9886 messages were written into the file. Why were the last 16 messages not written? It shouldn't be happening. Since the sender checks that all required fields are present, this indicates that some mismatch is occurring between the serialization and deserialization code. Are you sure the data that is being sent is exactly the same as the data being received? Normally these errors occur because the data is being truncated or changed in transit somehow (eg. truncating at a null byte? truncating at some buffer limit?). The other thing that could be happening is that you could be mis-parsing the earlier messages. To parse multiple messages from a stream, you need to limit the number of bytes read (eg. using CodedInputStream::PushLimit, or MessageLite::ParseFromBoundedZeroCopyStream). Good luck, Evan -- Evan Jones http://evanjones.ca/
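The write-size-then-message format, and the need to stop reading at each message boundary, can be mimicked in plain Java. This sketch uses the same base-128 varint encoding protobuf uses for length prefixes; the class and method names are illustrative:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class VarintFraming {
    // Base-128 varint: 7 payload bits per byte, high bit = "more bytes follow".
    static void writeVarint32(OutputStream out, int value) throws IOException {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Returns -1 on clean EOF (no bytes at all), else the decoded value.
    static int readVarint32(InputStream in) throws IOException {
        int b = in.read();
        if (b < 0) return -1;
        int result = b & 0x7F;
        int shift = 7;
        while ((b & 0x80) != 0) {
            b = in.read();
            if (b < 0) throw new IOException("truncated varint");
            result |= (b & 0x7F) << shift;
            shift += 7;
        }
        return result;
    }

    // Write each payload preceded by its varint-encoded size.
    static byte[] writeAll(byte[][] messages) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] m : messages) {
            writeVarint32(out, m.length);
            out.write(m);
        }
        return out.toByteArray();
    }

    // Read messages back, stopping at each size boundary: the moral
    // equivalent of CodedInputStream::PushLimit once per message.
    static int countMessages(byte[] stream) throws IOException {
        ByteArrayInputStream in = new ByteArrayInputStream(stream);
        int count = 0;
        int size;
        while ((size = readVarint32(in)) >= 0) {
            if (in.skip(size) != size) throw new IOException("truncated message");
            count++;
        }
        return count;
    }
}
```

Without the per-message limit, a parser would read past the first message's boundary into the next message's bytes, which is exactly the mis-parsing failure mode described above.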
Re: [protobuf] Setting a nested message field without making a pointer?
On Aug 4, 2010, at 9:50 , mark.t.macdon...@googlemail.com wrote: Is there a way I can do this without creating a Pixel pointer? Something like this (which doesn't compile): Try fractal.mutable_pixel(0)->set_* Hope this helps, Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] Total bytes limit on coded input stream in C++?
On Aug 3, 2010, at 12:46 , Jon Schewe wrote: I know that I could create a new coded input stream for each message, but this seems rather wasteful and slow compared with just resetting a counter. I complained about the same thing a little while ago: http://groups.google.com/group/protobuf/browse_thread/thread/a4bc2a3788d356f6 Read that thread for details, but the summary is: patches welcomed. CodedInputStream is pretty lightweight though, so creating and destroying one per message should be pretty efficient. Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] Cannot parse message with CodedInputStream over a pipe
On Jul 30, 2010, at 11:18 , jetcube wrote: On the caller application I open a pipe to the previous app and write a pb message of 10766 bytes and don't close the pipe, but the first application never finishes the if evaluation. PushLimit() is a little funny: It doesn't stop the CodedInputStream from attempting to fill its buffer. Thus, I think your problem is that the IstreamInputStream is probably blocked on the pipe, waiting for more data. Try using request.ParseFromBoundedZeroCopyStream() instead. Or manually use a LimitingInputStream to limit the number of bytes read, which is what that method does under the covers (I think). Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] Error in repeated_field.h
On Jul 30, 2010, at 8:29 , arborges wrote: libprotobuf FATAL /usr/local/include/google/protobuf/repeated_field.h:637] CHECK failed: (index) < (size()): This means you are accessing an index past the end of the array. This is almost certainly a bug in your code. You should attach to this with a debugger and look at the entire stack trace to see where your bug is:

const surroundsound::Arquivo::L1_Cena cena_sonora = projeto.cena(idxCena);
numObj = cena_sonora.objetosonoro_size();
for (int k = 0; k < numObj; k++) {
  const surroundsound::Arquivo::L1_ObjetoSonoro objeto_sonoro = cena_sonora.objetosonoro(k);

I'm guessing this is happening because of this line. This code looks okay to me, since you check that k < objetosonoro_size(). Are you modifying this list somewhere in your code at the same time? Or could you have memory corruption somewhere? Try using valgrind if you might have memory corruption. Good luck, Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] Protocol Buffers RPC design questions
One note is that the built-in service implementation is sort of considered to be deprecated at this point, due to the plugin infrastructure that allows people to generate their own service code. On Jul 15, 2010, at 5:22 , Jamie McCrindle wrote: 2. I've been pondering how to inject in Service references. I like the idea that I have a 'local' RPC implementation that could be swapped out for a 'remote' one without having to change the client class. It doesn't seem right to have this code in the client (i.e. recreate the stub for every call): I do exactly what you do: I create the stub once and pass it in where needed. You may also find the TestService.Interface interface to be useful for this. It lets you inject testing versions of the services, for example. 3. Regarding extending RpcController. Adding a timeout and a timestamp seem like pretty good candidates, but the 'EnhancedRpcController' then becomes a pervasive cast as well as an RPC implementation lock-in. This is probably the worst part of the built-in service API, in my opinion. I end up with casts related to controllers in lots of places. It's ugly, but I don't see any good way to fix it. Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] Help with basic concepts of descriptors and reflection
On Jul 15, 2010, at 16:40 , mark.t.macdon...@googlemail.com wrote:

cout << Message1.name();
cout << Message1.GetReflection()->GetString(ref, stabiliser.GetDescriptor()->FindFieldByName("name"));

The differences between these two are HUGE. The first one is a compiled local variable reference (effectively). The second has to do some sort of table lookup. You want to use the first form if you care about performance. Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] Re: C++ syntax: how to set a singular enum field
On Jul 16, 2010, at 5:47 , mark.t.macdon...@googlemail.com wrote:

if (stabiliser.retraction() == device::HOUSED) cout << "True\n";
// but this doesn't compile:
stabiliser.set_retraction(device::RETRACTED);
}

See the generated .h file to see what might be going wrong. For enums, a .set_* method should be generated. Note that the .proto you sent doesn't have RETRACTED defined, so maybe that is your problem? Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] Protobuf for client/server messaging?
On Jul 14, 2010, at 4:36 , bwp wrote: If we have to go down that route what would be a good identifier? See Peter's email. But you can also use msg.getDescriptorForType().getFullName() to get a unique string for each protocol buffer message type. This is what I do for my own RPC system, which needs to be able to handle *any* message type (hence the union or extension approaches are not really correct). This needs the non-lite runtime, in order to have descriptors for messages. See: http://code.google.com/apis/protocolbuffers/docs/reference/java/com/google/protobuf/Descriptors.Descriptor.html#getFullName() Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] Can't read in Java message sent from C++
On Jul 10, 2010, at 7:47 , Maxim Leonovich wrote: ArrayOutputStream(buffer,msg.ByteSize() + 4,sizeof(char)); The documentation states: block_size is mainly useful for testing; in production you would probably never want to set it. So you should get rid of the sizeof(char) part.

cos->WriteLittleEndian32(msg.ByteSize()); // Tried WriteVarint32, didn't help
msg.SerializeToCodedStream(cos);

If you want to use Java's .parseDelimitedFrom, you *must* use WriteVarint32, because that is the format it expects for the length prefix. In this case, you'll need to call ArrayOutputStream::ByteCount() to figure out how many bytes were actually serialized. You also probably should create the ArrayOutputStream and CodedOutputStream on the stack, rather than using new. This will be slightly faster. That said, the only issue here that affects correctness is the WriteVarint32 part. The rest shouldn't matter unless I missed something. You should change your code to do that, then if you are still having problems you should try dumping the contents of the buffer on both the C++ and the Java side. Maybe the input/output is getting messed up somewhere? Good luck, Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] Basic message encoding/decoding
On Jul 7, 2010, at 5:43 , Timothy Parez wrote: I'm aware I can simply use one of the various libraries, but it's important I understand basic encoding/decoding so I can pass this knowledge to teams who are using a language which is not supported by any of the libraries. I don't understand: you want code for encoding/decoding protocol buffers that does not use the official protocol buffer library? Or you want an example that uses the protocol buffer library? If you want to know how raw messages are encoded and decoded, digging through the source code for CodedInputStream / CodedOutputStream is probably helpful. Also: did you look at the third party libraries? Many programming languages have implementations you could try using: http://code.google.com/p/protobuf/wiki/ThirdPartyAddOns Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] Serialized Message Field Order
On Jul 6, 2010, at 7:33 , Srivats P wrote: @evan: That's what I'm doing currently - serializing and writing the bytestream for the magic, checksum and actual content messages separately and in that order - I was just wondering if I could put them all into one message { ... } and just serialize one message instead of three. Well, the checksum must be a separate message from the content, since you need the content bytes in order to compute the checksum. At least with the official implementation. But you could use the content message for the magic bytes part. I would just use a CodedOutputStream/CodedInputStream directly to do this, with something like the following:

CodedOutputStream out = ...;
ByteString msgBytes = msg.toByteString();
byte[] checksum = computeChecksum(msgBytes);
out.writeRawBytes(magicBytes);
out.writeRawBytes(checksum);
out.writeBytesNoTag(msgBytes);

-- Evan Jones http://evanjones.ca/
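The magic/checksum/content layout can be sketched with standard streams. CRC32 stands in for whatever checksum is actually used, and the 'PBUF' magic bytes are invented for the example:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

class ChecksumFraming {
    static final byte[] MAGIC = {'P', 'B', 'U', 'F'};  // made-up magic bytes

    // The checksum is computed over the serialized content, which is why
    // the content must be turned into bytes before the frame is assembled.
    static byte[] frame(byte[] msgBytes) throws IOException {
        CRC32 crc = new CRC32();
        crc.update(msgBytes);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(MAGIC);
        out.write(ByteBuffer.allocate(8).putLong(crc.getValue()).array());
        out.write(msgBytes);
        return out.toByteArray();
    }

    // Verify magic and checksum; return the content on success, null otherwise.
    static byte[] unframe(byte[] framed) {
        if (framed.length < 12) return null;
        for (int i = 0; i < 4; i++) {
            if (framed[i] != MAGIC[i]) return null;
        }
        long expected = ByteBuffer.wrap(framed, 4, 8).getLong();
        byte[] msgBytes = Arrays.copyOfRange(framed, 12, framed.length);
        CRC32 crc = new CRC32();
        crc.update(msgBytes);
        return crc.getValue() == expected ? msgBytes : null;
    }
}
```

A corrupted payload makes the recomputed checksum disagree with the stored one, so unframe rejects the frame before any parsing happens.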
Re: [protobuf] Detecting end of CodedInputStream
On Jun 25, 2010, at 4:40 , Louis-Marie wrote: My question is then: how can I safely detect end of file? I guess I could do something like calling Next() on the underlying FileInputStream until it returns false (end of file) or a non empty buffer (and then call BackUp() to re-queue this buffer before creating the CodedInputStream), but it seems a bit overkill (and probably not the best thing from a performance point of view...) I think that detecting the end of file may depend on your underlying input stream. I have some code that uses the built-in FileInputStream, and I simply keep trying to read values until I get an error:

bool success = in.ReadVarint32(&size);
if (!success) {
  // we are probably at EOF
  close();
  return;
}

Then my close() method looks like:

assert(input_->GetErrno() == 0);
bool success = input_->Close();
assert(success);

This works for me. Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] using compression on protobuf messages
On Jun 22, 2010, at 13:54 , sheila miguez wrote: When I have a message to compress, I know the size of the byte array stream buffer to allocate. Then call the writeTo on it. Is there anything I should do other than this, given a message? writeTo should be pretty performant, yes? In unit tests, when measuring the speed that takes, it is pretty good. I don't quite understand what you are doing. Are you allocating a ByteArrayOutputStream, writing the message to it, then passing the byte[] from the ByteArrayOutputStream to some LZO library? You could just call message.toByteArray() if that is what you want, which will be faster. I haven't tested this carefully, but my experience is that if you want the absolute best performance while using the Java API:

* If you are writing to an OutputStream, you want to re-use a single CodedOutputStream. It has an internal buffer, and allocating this buffer multiple times seems to slow things down. You probably want this option if you are writing many messages. It's typically pretty easy to provide your own implementation of OutputStream if you need to pass data to something else (eg. LZO).
* If you have a byte[] array that is big enough, pass it in to CodedOutputStream.newInstance() to avoid an extra copy.
* If you just want a byte[] array that is the exact right size, just call message.toByteArray()

Does the LZO library have an OutputStream API? This would allow you to compress large protobuf messages as they are written out, rather than needing to serialize the entire thing to a byte[] array, then compress it. This could be better, but as always you'll have to measure it. Hope this helps, Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] using compression on protobuf messages
On Jun 22, 2010, at 15:35 , sheila miguez wrote: I've got a servlet filter which wraps the HttpServletResponse. So, the servlet response's output stream, which is wrapped in a stream from the lzo library, is compressing data as it is getting written to. Ah, so the best case is probably message.writeTo(servletOutputStream) If you are writing multiple messages, you'll probably want to explicitly create a single CodedOutputStream to write all of them. If you experiment with this and find something different, I would be interested to know. Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] Re: using compression on protobuf messages
Alex Antonov wrote: When I use .writeTo(...) and pass it a CompressionOutputStream as an input, it takes only 38,226,661 ns to compress 112,178 bytes. Wow! Glad to hear this helped so much. If you have a sequence of messages, you could try using a single CodedOutputStream. Something like:

CodedOutputStream out = CodedOutputStream.newInstance(compressionStream);
for (Message msg : messages) { msg.writeTo(out); }
out.flush();

This should be slightly faster than using msg.writeTo(compressionStream); because it avoids re-allocating the CodedOutputStream (and its internal buffer). It should be quite a bit better for small messages. Now I'm trying to figure out how I can speed up the decompression on the receiving side. What I have right now is: * Take the CompressionInputStream, convert it into a byte[] * Take the resulting byte[] and do .parseFrom(byte[]) This seems to be a faster route than just doing .parseFrom(CompressionInputStream). Interesting. The only reason I can think of which might make the byte[] version faster is that maybe you use a big read, while .parseFrom(InputStream) defaults to 4096 bytes. You could try editing the source to make BUFFER_SIZE in CodedInputStream bigger, if you care. The only thing I can think of is if you are reading a sequence of many messages, you can again re-use a single CodedInputStream, although this requires some work. Again, this will be better for small messages but probably not large messages. This is trickier than re-using a single CodedOutputStream. If you are interested, I can send the details about what I have used. Although to be honest: I haven't tested it carefully to see if this is *actually* faster than doing the simple thing such as .parseFrom() and friends. Evan -- Evan Jones http://evanjones.ca/
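The write-many-messages-through-one-compressed-stream approach looks roughly like this in plain Java, with GZIP standing in for LZO (which has no stdlib equivalent) and a one-byte length prefix for simplicity; names are illustrative:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class CompressedBatch {
    // Write every message through one compression stream, finishing the
    // compressor exactly once, instead of opening a new stream (and its
    // internal buffer) per message.
    static byte[] compressAll(byte[][] messages) throws IOException {
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        GZIPOutputStream gz = new GZIPOutputStream(raw);
        for (byte[] m : messages) {
            gz.write(m.length);   // one-byte length prefix (messages < 128 bytes)
            gz.write(m);
        }
        gz.finish();              // flush the compressor exactly once
        return raw.toByteArray();
    }

    // Read the length-prefixed messages back out of the compressed stream.
    static int countMessages(byte[] compressed) throws IOException {
        GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed));
        int count = 0;
        int len;
        while ((len = gz.read()) != -1) {
            byte[] m = new byte[len];
            int off = 0;
            while (off < len) {
                int n = gz.read(m, off, len - off);
                if (n < 0) throw new IOException("truncated message");
                off += n;
            }
            count++;
        }
        return count;
    }
}
```

The compressor's dictionary is shared across the whole batch, which is also why many small messages compress better through one stream than individually.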
Re: [protobuf] Re: Efficiently reading/writing multiple messages in C++
On Jun 18, 2010, at 17:06 , Kenton Varda wrote: But I doubt there is really much overhead in constructing a new CodedInputStream on the stack for each message. No heap space is allocated in the process. If I end up doing some performance intensive stuff with this code, I'll look into it at some point and report back. For now, what I'm doing is plenty fast enough. I was mostly just slightly surprised that I can't do what I do on the Java side. What we really need is a MessageStream class which handles this kind of stuff at a higher level, but I haven't gotten around to writing such a thing. Huh. Probably like most people on this list, I have bits and pieces of protocol buffer related support code lying around. One of the pieces is something that is like a MessageStream. It may be a bit too specific for my application at the moment, but I certainly wouldn't be opposed to putting some effort into including it in protobuf, or in a protobuf-utils type project. Evan -- Evan Jones http://evanjones.ca/
[protobuf] Re: Efficiently reading/writing multiple messages in C++
On Jun 17, 2010, at 17:05 , Evan Jones wrote: I'm working around this by moving the CodedInputStream inside my loop, which is fine, but seems sub-optimal. At the very least, since I have lots of small messages, this means my ZeroCopyInputStream's methods get called many times. Based on previous mailing list discussions, this is the recommended way to do this. I don't care enough at the moment to test it, but it seems like using a single CodedInputStream for many small messages would be more efficient. Maybe at some point I'll try some benchmarks, but for now I'll ignore this. Evan -- Evan Jones http://evanjones.ca/
[protobuf] String Encoding Performance Improvement
Serialize to /dev/null reusing FileChannel: 7373 iterations in 31.919s; 18.629936MB/s

OPTIMIZED

Benchmarking benchmarks.GoogleSize$SizeMessage1 with file google_message1.dat
Serialize to byte string: 3432701 iterations in 29.986s; 24.891575MB/s
Serialize to byte array: 3455325 iterations in 30.373s; 24.73638MB/s
Serialize to memory stream: 3398582 iterations in 30.742s; 24.038122MB/s
Serialize to /dev/null with FileOutputStream: 2932259 iterations in 28.331s; 22.504812MB/s
Serialize to /dev/null reusing FileOutputStream: 2779893 iterations in 26.785s; 22.566872MB/s
Serialize to /dev/null with FileChannel: 3129454 iterations in 28.526s; 23.854078MB/s
Serialize to /dev/null reusing FileChannel: 3183935 iterations in 28.779s; 24.056MB/s

Benchmarking benchmarks.GoogleSize$SizeMessage2 with file google_message2.dat
Serialize to byte string: 6497 iterations in 26.656s; 19.657772MB/s
Serialize to byte array: 7231 iterations in 29.827s; 19.552631MB/s
Serialize to memory stream: 6643 iterations in 27.582s; 19.424726MB/s
Serialize to /dev/null with FileOutputStream: 7078 iterations in 27.844s; 20.501957MB/s
Serialize to /dev/null reusing FileOutputStream: 7434 iterations in 30.969s; 19.360287MB/s
Serialize to /dev/null with FileChannel: 6988 iterations in 29.144s; 19.338385MB/s
Serialize to /dev/null reusing FileChannel: 7279 iterations in 30.338s; 19.3509MB/s
Deserialize from byte string: 5254 iterations in 29.942s; 14.152257MB/s
Deserialize from byte array: 5429 iterations in 30.481s; 14.3650465MB/s
Deserialize from memory stream: 6156 iterations in 32.337s; 15.353779MB/s

-- Evan Jones http://evanjones.ca/
Re: [protobuf] Issues with Large Coded Stream Files?
On Jun 3, 2010, at 14:18 , Nader Salehi wrote: I was told that coded streams have issues when they are larger than 2GB. Is it true, and, if so, what are the issues? If you have a single object that is 2GB in size, there are 32-bit integers that will overflow. However, provided that you call .resetSizeCounter() occasionally, I think it should work just fine. I'm certainly using a single Java CodedInputStream per long lived connection without any trouble. Unclear if I've sent 2GB of data over a single connection though. Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] Issues with Large Coded Stream Files?
On Jun 3, 2010, at 15:29 , Nader Salehi wrote: It is not a single object; I am writing into a coded output stream file which could grow to much larger than 2GB (it's more like 100GB). I also have to read from this file. Is there a performance hit in the above-mentioned scenario? No, this should work just fine. On the input side, you'll need to call CodedInputStream.resetSizeCounter() after each message, otherwise you'll run into the size limit. Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] Re: Implementing protobuf in symbian
anup wrote: I am getting an undefined symbol error for the mutex implementations. You need to link with libprotobuf_lite.a or libprotobuf.a. The Mutex class is defined in src/google/protobuf/stubs/common.cc. This suggests you are not linking with the protocol buffer runtime library. Hope this helps, Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] Java UTF-8 encoding/decoding: possible performance improvements
On Jun 1, 2010, at 2:29 , David Dabbs wrote: Even with the extra call to access the offset, I would think there would be some advantage to not making the data copies, which generate garbage cruft. However, the way I am doing it doesn't generate any garbage: I keep a temporary char[] buffer around to use with String.getChars(). The cost is copying chars VS using reflection to access two fields. With the small strings I tested (average ~30 bytes per string), the copy is a bit cheaper than the reflection access. I assume that for larger strings, the reflection approach will probably be better. Which reminds me: I really need to test this with larger strings to make sure it isn't dog slow in that case. I seem to remember you saying that using an Encoder/Decoder didn't pay off when the number of strings to en/decode was small. Did the same hold true when using a ThreadLocal? From memory, the ThreadLocal appears to be very cheap, and not make much performance difference, but I should double check this as well. Evan -- Evan Jones http://evanjones.ca/
Re: [protobuf] Java UTF-8 encoding/decoding: possible performance improvements
On May 31, 2010, at 14:25, David Dabbs wrote: you may access a String's internals via reflection in a safe, albeit potentially implementation-specific way. See the class code below. As long as your java.lang.String uses value for the char[] and offset for the storage offset, this should work. No sun.misc.Unsafe used. Only tested/used on JDK6. Good idea! Unfortunately, this isn't much faster for small strings. It is faster if you just get the value char[] array. However, when I modified my implementation to get both the char[] value and the int offset, it ended up being about the same speed for my test data set, which is composed mostly of short UTF-8 and ASCII strings. Unfortunately, a correct implementation will need to get both values. Since this is also somewhat dangerous, it doesn't seem like a great idea for my data. At any rate: I'll try to find some time to prepare a protocol buffer patch with my encode-to-a-temporary-ByteBuffer trick, which does make things a bit faster. I won't necessarily advocate for this patch to be included, but after having wasted this much time on this stuff, I'll certainly try to maintain the patch for a while, in case others are interested. Evan -- Evan Jones http://evanjones.ca/
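For illustration, the reflection trick David describes could be sketched as below. The class name is mine, and the "value"/"offset" field names match the JDK 6 String layout only: "offset" was removed in 7u6 and "value" became a byte[] in JDK 9, which is exactly the "implementation-specific" danger mentioned, so a safe fallback path is included.

```java
import java.lang.reflect.Field;

// Hypothetical sketch of reading java.lang.String's internal fields via
// reflection instead of copying chars out. On any JDK where the fields
// are missing or inaccessible, it falls back to the plain getChars() copy.
public class StringInternals {
    public static char[] charsOf(String s) {
        try {
            Field valueField = String.class.getDeclaredField("value");
            Field offsetField = String.class.getDeclaredField("offset");
            valueField.setAccessible(true);
            offsetField.setAccessible(true);
            char[] value = (char[]) valueField.get(s);  // the backing array
            int offset = offsetField.getInt(s);         // where this string starts
            char[] out = new char[s.length()];
            // Copied here for safety of the example; the thread's version
            // would read value[offset..] in place to avoid this copy.
            System.arraycopy(value, offset, out, 0, s.length());
            return out;
        } catch (Exception e) {
            // Layout mismatch or access denied: fall back to the safe copy.
            char[] out = new char[s.length()];
            s.getChars(0, s.length(), out, 0);
            return out;
        }
    }
}
```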
Re: [protobuf] Java UTF-8 encoding/decoding: possible performance improvements
On May 18, 2010, at 0:33, Kenton Varda wrote: What if you did a fast scan of the bytes first to see if any are non-ASCII? Maybe only do this fast scan if the data is short enough to fit in L1 cache? I didn't try this exact idea, but my guess is that it may not be a win: to get fast access to the chars in the string, you need to copy them into a separate char[] array. If you are doing this much work, you might as well encode them into UTF-8 while you are at it.

Conclusion: 10% performance win for ASCII (garbage collection savings); no win for general UTF-8 text. Not worth it for protocol buffers, but I'll try digging into the decoding.

The approach I found worked the best:

1. Copy the string into a pre-allocated and re-used char[] array. This is needed since the JDK does not permit access to the String's char[], to enforce immutability. This is a performance loss versus the JDK, which can access the char[] directly.

2. Encode into a pre-allocated byte[] array. This completely handles short strings. For long strings, you end up needing to allocate additional temporary space. This is better than the JDK, which allocates a new temporary buffer of 4 * str.length() bytes.

3. Allocate the final byte[] array, and System.arraycopy into it. This is the same as the JDK.

Conclusion: This is only better than the JDK in that it reduces allocation and garbage collection. It is worse than the JDK because it requires a copy from the String into another char[]. In my tests with ASCII-only data, it ends up ~10% faster. In my tests with UTF-8 data, it ends up about the same speed. In other words: this probably isn't worth it for Protocol Buffers, since it is a small performance improvement but complicates the code significantly. However, for applications that can encode the string directly to the output buffer, this custom code can be significantly faster.
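The three steps above could be sketched as follows. The class name and the hand-rolled UTF-8 loop are my own illustration under the constraints described, not the actual patch code.

```java
import java.util.Arrays;

// Hypothetical sketch of the three-step approach: (1) copy into a reused
// char[], (2) encode into a reused byte[], (3) arraycopy into the final
// byte[]. Only steps 1 and 2 avoid allocation; step 3 matches the JDK.
public class ThreeStepEncoder {
    private char[] chars = new char[64];   // step 1 scratch, reused
    private byte[] bytes = new byte[256];  // step 2 scratch, reused

    public byte[] encode(String s) {
        int n = s.length();
        if (chars.length < n) chars = new char[n];
        if (bytes.length < n * 3) bytes = new byte[n * 3]; // 3 bytes/char worst case
        s.getChars(0, n, chars, 0);        // step 1: copy into reused char[]
        int p = 0;
        for (int i = 0; i < n; i++) {      // step 2: encode into reused byte[]
            char c = chars[i];
            if (c < 0x80) {
                bytes[p++] = (byte) c;
            } else if (c < 0x800) {
                bytes[p++] = (byte) (0xC0 | (c >> 6));
                bytes[p++] = (byte) (0x80 | (c & 0x3F));
            } else if (Character.isHighSurrogate(c) && i + 1 < n
                    && Character.isLowSurrogate(chars[i + 1])) {
                int cp = Character.toCodePoint(c, chars[++i]);
                bytes[p++] = (byte) (0xF0 | (cp >> 18));
                bytes[p++] = (byte) (0x80 | ((cp >> 12) & 0x3F));
                bytes[p++] = (byte) (0x80 | ((cp >> 6) & 0x3F));
                bytes[p++] = (byte) (0x80 | (cp & 0x3F));
            } else {
                // Lone surrogates would need replacement handling in real
                // code; this sketch encodes them naively.
                bytes[p++] = (byte) (0xE0 | (c >> 12));
                bytes[p++] = (byte) (0x80 | ((c >> 6) & 0x3F));
                bytes[p++] = (byte) (0x80 | (c & 0x3F));
            }
        }
        return Arrays.copyOf(bytes, p);    // step 3: final allocation + copy
    }
}
```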
However, since protocol buffers needs to encode to another buffer first to get the string length, this advantage doesn't apply here. I may spend some time poking around at string *decoding*, because there we should be able to decode directly from the input buffer, saving a copy and an allocation of a temporary byte[] array. It is unclear if this will actually be significantly faster, but it might be slightly faster. Evan -- Evan Jones http://evanjones.ca/
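The decoding idea at the end (decoding straight from the input buffer instead of first copying the string's bytes into a temporary byte[]) might be sketched as below; the class and method names are mine.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: ByteBuffer.wrap views a slice of the input buffer
// without copying it, and a reused char[] receives the decoded chars, so
// the temporary byte[] allocation and copy are both avoided.
public class DirectDecoder {
    private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
    private char[] scratch = new char[64]; // reused, grown on demand

    // Decodes length bytes starting at offset, straight from the input buffer.
    public String decode(byte[] input, int offset, int length) {
        // UTF-8 never produces more chars than it has bytes.
        if (scratch.length < length) scratch = new char[length];
        ByteBuffer in = ByteBuffer.wrap(input, offset, length); // no byte copy
        CharBuffer out = CharBuffer.wrap(scratch);
        decoder.reset();
        decoder.decode(in, out, true);
        decoder.flush(out);
        return new String(scratch, 0, out.position());
    }
}
```

The final String constructor still copies the chars once, which is unavoidable given String's immutability; the saving is the temporary byte[].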