[protobuf] Re: Issue 66 in protobuf: cannot install using easy_install
Comment #20 on issue 66 by spaans: cannot install using easy_install
http://code.google.com/p/protobuf/issues/detail?id=66

You could release a 2.3.1 specifically to fix this issue; I've had to add a 2.5 and a 2.6 egg to our svn tree to prevent buildout from fetching the (wrong) zipfile when using python2.6.

-- You received this message because you are listed in the owner or CC fields of this issue, or because you starred this issue. You may adjust your issue notification preferences at: http://code.google.com/hosting/settings

-- You received this message because you are subscribed to the Google Groups Protocol Buffers group. To post to this group, send email to proto...@googlegroups.com. To unsubscribe from this group, send email to protobuf+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/protobuf?hl=en.
[protobuf] ProtocolBuffer + compression in hadoop?
I tried to use protocol buffers in Hadoop. So far it works fine with SequenceFile, after I hook it up with a simple wrapper, but once I put a compressor into the SequenceFile it fails: it reads all the messages and yet still wants to advance the read pointer, then readTag() returns 0, so mergeFrom() returns a message with no fields set.

Is anybody familiar with both SequenceFile and protocol buffers, and has an idea why it fails like this? I find it difficult to understand, because the InputStream is simply the same whether it comes through a compressor or not.

Thanks
Yang
[protobuf] Trying to debug coredump in/near protobuf
We have a core dump here:

#0 0xf749b0d9 in std::string::size () from /usr/lib/libstdc++.so.6
#1 0x0822119c in bd::Header::ByteSize ()
#2 0x082323b3 in google::protobuf::internal::WireFormat::MessageSizeNoVirtual<bd::Header> ()
#3 0x082257cd in bd::Request::ByteSize ()
#4 0x08233773 in google::protobuf::internal::WireFormat::MessageSizeNoVirtual<bd::Request> ()
#5 0x082258b8 in bd::Data::ByteSize ()
#6 0x08233797 in google::protobuf::internal::WireFormat::MessageSizeNoVirtual<bd::Data> ()
#7 0x0822598e in bd::Main::ByteSize ()

Question #1: Any clue what could cause this? If bd::Request defines a repeated bd::Header field and this core dump is already in bd::Header::ByteSize, does that mean there must be at least one bd::Header in the repeated list, or could it still be empty (nothing ever added)?

We are trying to track down the state of the data in the frame that has a pointer to the protobuf message (which is named bd::Main, i.e. package bd; message Main { ... } in the proto). We are trying to get at it in gdb, without success. For example:

(gdb) p *pb_pip_thread_data->bd_pb_data_msg.main_msg_hnd
Attempt to dereference a generic pointer.
(gdb) p *(bd::Main *)pb_pip_thread_data->bd_pb_data_msg.main_msg_hnd
A syntax error in expression, near `)pb_pip_thread_data->bd_pb_data_msg.main_msg_hnd'.

Question #2: How do we dereference the pointer to the message in GDB? It is defined as a void * but we can't convince gdb to cast it properly - any clues?

Thanks
--edan
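A hedged sketch for Question #2, assuming the binary carries debug info for bd::Main (these commands are illustrative, not tested against this core): gdb often rejects a cast to a C++ type containing `::` unless the type name is quoted, so the following may get past the syntax error:

```
(gdb) p *('bd::Main' *) pb_pip_thread_data->bd_pb_data_msg.main_msg_hnd
```

If gdb still cannot resolve bd::Main, the type's debug info may have been stripped from the binary. Note also that in a live process one could pretty-print the message by calling its methods, but function calls are not possible when debugging a core dump.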
[protobuf] How to prevent different protocol buffers from being parseFrom(byte[])
Hi all. I am trying to detect when I receive a protocol buffer that is not of the expected type. For example:

void myMethod(byte[] theData) {
    TheMsgTypeExpected theMsg = TheMsgTypeExpected.parseFrom(theData);
}

Now, if I pass in a byte[] of a protocol buffer of a type that I do not expect, parseFrom() still returns successfully (I would have thought an IOException would be thrown or something), and some of the fields get merged (the fields with the same IDs?). I am confused about how to detect this scenario, and ultimately how to prevent such things occurring.
[protobuf] Re: Java Protocol Buffers classes serialization
Just need to add my 10c worth, seeing that my name was mentioned :-)

Final fields are considered immutable by the JMM. The fact that you CAN change them does not matter to that definition. Final fields are allowed to be inlined at runtime, which means that changing them might cause visibility issues.

Serializable also modifies final fields. The ObjectInputStream first constructs an object using a special method in sun.reflect (see http://www.javaspecialists.eu/archive/Issue175.html). It then sets the final fields. Incidentally, in Java 1.1 you could NOT set final fields. Then in 1.2 you COULD. In 1.3 and 1.4 you could NOT. In 1.5+ you CAN. That is with just standard reflection. It's actually bad, because setting a final field could cause serious concurrency issues, but the reflection API does not give you a warning when you try to do it.

Externalizable is quite dangerous as a general solution and does not offer many performance benefits over Serializable with custom read/writeObject methods. Have a look at my talk from JavaZone: http://www.javaspecialists.eu/talks/oslo09/ReflectionMadness.pdf and search for "Externalizable Hack".

Heinz

On Feb 17, 10:47 pm, David Dabbs dmda...@gmail.com wrote:

"Obviously, if you want you can use it to mutate the object, but if you like you can mutate even perfectly final and immutable objects in Java (using some trickery)."

"I'm pretty sure that's not true. Effective Java, for example, suggests using immutable classes as a security measure, which would be a terrible suggestion if they were not really immutable."

Actually, unless there's a SecurityManager in place, any member (even final) is accessible. For instance, I've used code similar to the following (valid on the Sun JVM only, I think) to directly read a String's char[] to avoid data copying. I *could* have modified the value.
Cheers, david

///
import java.lang.reflect.*;

public final class ReflectionUtils {

    public static final Field STRING_VALUE_FIELD =
        getFieldAccessible(String.class, "value");
    public static final Field STRING_OFFSET_FIELD =
        getFieldAccessible(String.class, "offset");
    public static final Constructor<?> STRING_PP_CTOR =
        getConstructor(String.class, 0, int.class, int.class, char[].class);

    public static char[] getChars(final String s) {
        // use reflection to read the char[] value from the string...
        try {
            return (char[]) STRING_VALUE_FIELD.get(s);
        } catch (Throwable e) {
            //
        }
        return null;
    }

    public static String sharedString(final char[] chars, final int offset, final int length) {
        try {
            return (String) STRING_PP_CTOR.newInstance(offset, length, chars);
        } catch (InstantiationException e) {
            e.printStackTrace();
        } catch (IllegalAccessException e) {
            e.printStackTrace();
        } catch (InvocationTargetException e) {
            e.printStackTrace();
        } catch (Exception e) {
            e.printStackTrace();
        }
        return null;
    }

    public static Field getFieldAccessible(final Class<?> clazz, final String fieldName) {
        Field fld = null;
        try {
            fld = clazz.getDeclaredField(fieldName);
            fld.setAccessible(true);
        } catch (NoSuchFieldException e) {
            e.printStackTrace();
        } catch (SecurityException e) {
            e.printStackTrace();
        }
        return fld;
    }

    public static Constructor<?> getConstructor(final Class<?> clazz, final int searchModifier, Class<?>... paramTypes) {
        // N.B. there is no explicit value for package-private, so pass 0.
        try {
            Constructor<?>[] allConstructors = clazz.getDeclaredConstructors();
            for (Constructor<?> ctor : allConstructors) {
                if (searchModifier == (ctor.getModifiers() & (Modifier.PUBLIC | Modifier.PRIVATE | Modifier.PROTECTED))) {
                    // access modifier match...
                    final Class<?>[] parameterTypes = ctor.getParameterTypes();
                    if (parameterTypes.length == paramTypes.length) {
                        // same number of parameters...
                        for (int i = 0; i < parameterTypes.length; i++) {
                            if (!parameterTypes[i].equals(paramTypes[i])) {
                                ctor = null;
                                break;
                            } else {
                                // Type[] gpType = ctor.getGenericParameterTypes();
                                // for (int j = 0; j < gpType.length; j++) {
                                //     char ch = (gpType[j].equals(paramTypes[i]) ? '*'
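Heinz's point above, that standard reflection will read and even write a final field without any warning, can be seen in a small standalone sketch (the class and field names here are invented for illustration; whether the write succeeds is JVM-version dependent):

```java
import java.lang.reflect.Field;

public class FinalFieldPeek {
    static final class Holder {
        private final int secret;
        Holder(int s) { secret = s; }
    }

    public static void main(String[] args) throws Exception {
        Holder h = new Holder(42);
        Field f = Holder.class.getDeclaredField("secret");
        f.setAccessible(true);  // no warning, even though the field is final
        System.out.println("read=" + f.getInt(h));
        try {
            // The write Heinz warns about: reflection permits it on many
            // JVM versions, and it is a potential concurrency hazard.
            f.setInt(h, 7);
            System.out.println("after-write=" + f.getInt(h));
        } catch (IllegalAccessException e) {
            System.out.println("write-blocked");
        }
    }
}
```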
Re: [protobuf] How to prevent different protocol buffers from being parseFrom(byte[])
"I am confused on how to detect this scenario, and ultimately prevent such things occurring."

You can't, at least not in a simple way. Protocol buffers on the wire have no type information associated with them, and they're explicitly designed to accept fields they don't expect (for forwards compatibility), so serializing a message of one type and deserializing it as another type will work sometimes.

A couple of potential solutions come to mind. One is to send all your messages wrapped in another message that includes a type identifier. Another would be to have all your messages include a key field that must be filled in in a specific way (e.g., tag 1 in all messages is an int32 field named tag, and there's a single value for each message type that you require to be filled in). Which is most appropriate depends on how you expect a message of one type to end up being sent to something expecting a different type. (Obviously, the best solution is to set up your well-typed language APIs to keep that from happening.)

- Adam
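Adam's first suggestion, a wrapper carrying a type identifier, might look roughly like this; the message and field names are made up for illustration:

```proto
// Hypothetical envelope; every message on the wire is wrapped in one of these.
message Envelope {
  required string message_type = 1;  // e.g. "TheMsgTypeExpected"
  required bytes payload = 2;        // serialized bytes of the inner message
}
```

The receiver parses the Envelope first, compares message_type against what it expects, and only then calls TheMsgTypeExpected.parseFrom(envelope.getPayload()).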
[protobuf] Re: Issue 165 in protobuf: can not link for mips architecture
Comment #6 on issue 165 by ken...@google.com: can not link for mips architecture
http://code.google.com/p/protobuf/issues/detail?id=165

As I said before, after running configure, just edit config.h (in the root directory of the package) and change the hash-related parts to use hash_map instead of unordered_map. These values should work:

#define HASH_MAP_H <ext/hash_map>
#define HASH_NAMESPACE stdext
#define HASH_SET_H <ext/hash_set>

If you re-run configure you may have to re-edit config.h, as it is generated as part of the configure script.
Re: [protobuf] ProtocolBuffer + compression in hadoop?
Please reply-all so the mailing list stays CC'd.

I don't know anything about the libraries you are using, so I can't really help you further. Maybe someone else can.

On Thu, Feb 18, 2010 at 12:46 PM, Yang tedd...@gmail.com wrote:

Thanks Kenton. I thought about the same. What I did was use a splitter stream to split the actual input stream into two, dumping out one copy for debugging and feeding the other to PB. My code for Hadoop is:

Writable.readFields(DataInput in) {
    SplitterInputStream ios = new SplitterInputStream(in);
    pb_object = MyPBClass.parseFrom(ios);
}

SplitterInputStream dumps out the actual bytes, and the resulting byte stream is indeed (decimal) 10 2 79 79 16 1 ... repeating 20 times, which is 20 records of

message {
    1: string name;  // taking a value of yy
    2: i32 Id;       // taking a value of 1
}

Indeed, in compression and non-compression mode the dumped-out byte stream is the same.

On Thu, Feb 18, 2010 at 12:03 PM, Kenton Varda ken...@google.com wrote:

You should verify that the bytes that come out of the InputStream really are the exact same bytes that were written by the serializer to the OutputStream originally. You could do this by computing a checksum at both ends and printing it, then inspecting visually. You'll probably find that the bytes differ somehow, or don't end at the same point.

On Thu, Feb 18, 2010 at 2:48 AM, Yang tedd...@gmail.com wrote:

I tried to use protocol buffers in Hadoop. So far it works fine with SequenceFile, after I hook it up with a simple wrapper, but once I put a compressor into the SequenceFile it fails: it reads all the messages and yet still wants to advance the read pointer, then readTag() returns 0, so mergeFrom() returns a message with no fields set. Is anybody familiar with both SequenceFile and protocol buffers, and has an idea why it fails like this?
I find it difficult to understand, because the InputStream is simply the same whether it comes through a compressor or not.

Thanks
Yang
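Kenton's checksum suggestion can be sketched as follows; the byte values are a hypothetical stand-in for one serialized record (this is not Hadoop/SequenceFile code, just the verification idea):

```java
import java.util.zip.CRC32;

public class ChecksumCompare {
    // Checksum the bytes the serializer wrote, and separately the bytes
    // the parser receives after (de)compression; the values should match.
    static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] written = {10, 2, 79, 79, 16, 1};  // bytes fed to the compressor
        byte[] read    = {10, 2, 79, 79, 16, 1};  // bytes read back out
        System.out.println("written=" + checksum(written));
        System.out.println("read=" + checksum(read));
        System.out.println("match=" + (checksum(written) == checksum(read)));
    }
}
```

If the two printed checksums differ, the compression codec is corrupting or truncating the stream before it ever reaches parseFrom().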
Re: [protobuf] how to get RepeatedField object
On Wed, Feb 17, 2010 at 12:19 AM, Romain Francois romain.francois.r.enthusi...@gmail.com wrote:

On 02/17/2010 12:51 AM, Kenton Varda wrote:

The Reflection interface already provides a way -- FieldSize() and GetRepeatedX(). The only problem is that it's a bit slower than the generated accessors because these methods aren't inlineable.

Sure. I meant STL algorithms and iterating.

You could easily write an STL-like iterator on top of these if you really need it. BTW, have you observed an actual performance problem, or are you just speculating that this performance difference may be a problem for you?

In similar (non-protobuf-related) settings, we have observed quite a bit of difference between using plain loops and accessors as opposed to iterators.

It would obviously depend on the data structure involved. For example, on an STL vector, the following two loops will have equivalent performance:

vector<int> v;
for (int i = 0; i < v.size(); ++i) {
    DoSomething(v[i]);
}
for (vector<int>::const_iterator i = v.begin(); i != v.end(); ++i) {
    DoSomething(*i);
}

In fact, a good compiler may even produce identical assembly code for both loops.

But this is mainly speculation on the performance. It's generally a bad idea to try to fix theoretical performance problems. But when you use std::vector, it is best to first reserve the target size as opposed to creating an empty vector and push_back'ing each element.

It's slightly faster, but either way is still O(n). Reserving is often not worth the effort, especially if you are good about reusing objects, in which case they will already have space reserved from the previous use.