[protobuf] Re: Issue 66 in protobuf: cannot install using easy_install

2010-02-18 Thread protobuf


Comment #20 on issue 66 by spaans: cannot install using easy_install
http://code.google.com/p/protobuf/issues/detail?id=66

you could release a 2.3.1 specifically to fix this issue - I've had to add
a 2.5 and a 2.6 egg to our svn tree to prevent buildout from fetching the
(wrong) zipfile when using python2.6

--
You received this message because you are listed in the owner
or CC fields of this issue, or because you starred this issue.
You may adjust your issue notification preferences at:
http://code.google.com/hosting/settings

--
You received this message because you are subscribed to the Google Groups Protocol 
Buffers group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



[protobuf] ProtocolBuffer + compression in hadoop?

2010-02-18 Thread Yang
I tried to use protocol buffers in Hadoop.

So far it works fine with SequenceFile, after I hooked it up with a simple
wrapper.

But after I put a compressor into SequenceFile, it fails: it reads all the
messages and yet still wants to advance the read pointer, then readTag()
returns 0, so mergeFrom() returns a message with no fields set.

Does anybody familiar with both SequenceFile and protocol buffers have an
idea why it fails like this? I find it difficult to understand, because the
InputStream is simply the same whether it comes through a compressor or not.


thanks
Yang




[protobuf] Trying to debug coredump in/near protobuf

2010-02-18 Thread edan
We have a core dump here:

#0  0xf749b0d9 in std::string::size () from /usr/lib/libstdc++.so.6
#1  0x0822119c in bd::Header::ByteSize ()
#2  0x082323b3 in
google::protobuf::internal::WireFormat::MessageSizeNoVirtual<bd::Header> ()
#3  0x082257cd in bd::Request::ByteSize ()
#4  0x08233773 in
google::protobuf::internal::WireFormat::MessageSizeNoVirtual<bd::Request> ()
#5  0x082258b8 in bd::Data::ByteSize ()
#6  0x08233797 in
google::protobuf::internal::WireFormat::MessageSizeNoVirtual<bd::Data> ()
#7  0x0822598e in bd::Main::ByteSize ()

Question #1: Any clue what could cause this?  If bd::Request defines a
repeated bd::Header field and this core dump is already in
bd::Header::ByteSize, does that mean there must be at least one bd::Header
in the repeated list, or could it be empty still (never added anything)?

We are trying to track down the state of the data in the frame that has a
pointer to the protobuf message (which is named bd::Main i.e. package bd;
message Main { ... } in the proto).
We are trying in gdb to get to it, without success.  For example:

(gdb) p *pb_pip_thread_data->bd_pb_data_msg.main_msg_hnd
Attempt to dereference a generic pointer.
(gdb) p *(bd::Main *)pb_pip_thread_data->bd_pb_data_msg.main_msg_hnd
A syntax error in expression, near
`)pb_pip_thread_data->bd_pb_data_msg.main_msg_hnd'.

Question #2: How do we dereference the pointer to the message in GDB?  Note
that it is defined as a void * but we can't convince gdb to cast it
properly - any clues?

Thanks
--edan




[protobuf] How to prevent different protocol buffers from being parseFrom(byte[])

2010-02-18 Thread flyte
Hi All.

I am trying to detect when I receive a protocol buffer that is not of
the expected type. For example.

void myMethod(byte[] theData) throws InvalidProtocolBufferException
{
    TheMsgTypeExpected theMsg = TheMsgTypeExpected.parseFrom(theData);
}


Now, if I pass in a byte[] of a protocol buffer that is of a type I do not
expect, parseFrom() still returns successfully (I would have thought an
IOException would be thrown or something), and some of the fields get merged
(the fields with the same IDs?).

I am confused about how to detect this scenario, and ultimately how to
prevent such things from occurring.




[protobuf] Re: Java Protocol Buffers classes serialization

2010-02-18 Thread HeinzMK
Just need to add my 10c worth, seeing that my name was mentioned :-)

Final fields are considered immutable by the JMM.  The fact that you
CAN change them does not matter to that definition.  Final fields are
allowed to be inlined at runtime, which means that changing them might
cause visibility issues.

Serializable also modifies final fields.  The ObjectInputStream first
constructs an object, using a special method in sun.reflect (see
http://www.javaspecialists.eu/archive/Issue175.html).  It then sets
the final fields.

Incidentally in Java 1.1 you could NOT set final fields.  Then in 1.2
you COULD.  In 1.3 and 1.4 you could NOT.  In 1.5+ you CAN.  That is
with just standard reflection.  It's actually bad, because setting a
final field could cause serious concurrency issues, but the reflection
API does not give you a warning when you try to do it.

Externalizable is quite dangerous as a general solution and does not
offer many performance benefits over Serializable with custom read/
writeObject methods.  Have a look at my talk from JavaZone:
http://www.javaspecialists.eu/talks/oslo09/ReflectionMadness.pdf and
search for Externalizable Hack.

Heinz

On Feb 17, 10:47 pm, David Dabbs dmda...@gmail.com wrote:
   Obviously, if you want you can use it to mutate the object, but if you
   like you can mutate even perfectly final and immutable objects in Java
   (using some trickery).

  I'm pretty sure that's not true.  Effective Java, for example, suggests
  using immutable classes as a security measure, which would be a terrible
  suggestion if they were not really immutable.

 Actually, unless there's a SecurityManager in place any member (even final)
 is accessible. For instance, I've used code similar to the following (valid
 on Sun JVM only, I think) to directly read a String's char[] to avoid data
 copying.  I *could* have modified the value.

 Cheers,

 david

 ///

 import java.lang.reflect.*;

 public final class ReflectionUtils {

     public static final Field STRING_VALUE_FIELD =
         getFieldAccessible(String.class, "value");
     public static final Field STRING_OFFSET_FIELD =
         getFieldAccessible(String.class, "offset");
     public static final Constructor<?> STRING_PP_CTOR =
         getConstructor(String.class, 0, int.class, int.class, char[].class);

     public static char[] getChars(final String s) {
         //
         // use reflection to read the char[] value from the string. . .
         //
         try {
             return (char[]) STRING_VALUE_FIELD.get(s);
         } catch (Throwable e) {
             //
         }
         return null;
     }

     public static String sharedString(final char[] chars, final int offset,
             final int length) {
         try {

             return (String) STRING_PP_CTOR.newInstance(offset, length, chars);

         } catch (InstantiationException e) {
             e.printStackTrace();
         } catch (IllegalAccessException e) {
             e.printStackTrace();
         } catch (InvocationTargetException e) {
             e.printStackTrace();
         } catch (Exception e) {
             e.printStackTrace();
         }
         return null;
     }

     public static Field getFieldAccessible(final Class<?> clazz,
             final String fieldName) {
         Field fld = null;
         try {

             fld = clazz.getDeclaredField(fieldName);
             fld.setAccessible(true);

         } catch (NoSuchFieldException e) {
             e.printStackTrace();
         } catch (SecurityException e) {
             e.printStackTrace();
         }
         return fld;
     }

     public static Constructor<?> getConstructor(final Class<?> clazz,
             final int searchModifier, Class<?>... paramTypes) {
         //
         // N.B. there is no explicit value for package-private, so pass 0.
         //
         try {

             Constructor<?>[] allConstructors =
                 clazz.getDeclaredConstructors();

             for (Constructor<?> ctor : allConstructors) {

                 if (searchModifier == (ctor.getModifiers() &
                         (Modifier.PUBLIC | Modifier.PRIVATE | Modifier.PROTECTED))) {
                     //
                     // access modifier match. . .
                     //
                     final Class<?>[] parameterTypes =
                         ctor.getParameterTypes();
                     if (parameterTypes.length == paramTypes.length) {
                         //
                         // same number of parameters. . .
                         //
                         for (int i = 0; i < parameterTypes.length; i++) {
                             if (!parameterTypes[i].equals(paramTypes[i])) {
                                 ctor = null;
                                 break;
                             } else {
 //                                Type[] gpType =
 //                                    ctor.getGenericParameterTypes();
 //                                for (int j = 0; j < gpType.length; j++) {
 //                                    char ch =
 //                                        (gpType[j].equals(paramTypes[i]) ? '*'

Re: [protobuf] How to prevent different protocol buffers from being parseFrom(byte[])

2010-02-18 Thread Adam Vartanian
 I am confused on how to detect this scenario, and ultimately prevent
 such things occurring.

You can't, at least in the simple way.  Protocol buffers on the wire
have no type information associated with them, and they're explicitly
designed so that they can accept fields they don't expect to be
present (for forwards compatibility), so serializing a message of one
type and deserializing it as another type will work sometimes.

A couple of potential solutions come to mind.  One is to send all your
messages wrapped in another message that includes a type identifier.
Another would be to have all your messages include a key field that
must be filled in in a specific way (e.g., tag 1 in all messages is an
int field named tag, and there's a single value for each message type
that you require to be filled in).  What is most appropriate depends
on how you expect a message of one type to be sent to something
expecting a different type.  (Obviously, the best solution is to set
up your well-typed language APIs to keep that from happening.)
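The wrapper approach can be sketched as a hypothetical .proto (the message,
enum, and field names here are made up for illustration, not an existing API):

```proto
// Envelope carried on the wire; the receiver switches on `type`
// before parsing `payload` as the corresponding inner message.
message Wrapper {
  enum Type {
    FOO_REQUEST = 1;
    BAR_RESPONSE = 2;
  }
  required Type type = 1;      // discriminator for the payload
  required bytes payload = 2;  // serialized message of the declared type
}
```

The receiver first parses the Wrapper, checks `type`, and only then parses
`payload` with the matching generated class.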

- Adam




[protobuf] Re: Issue 165 in protobuf: can not link for mips architecture

2010-02-18 Thread protobuf


Comment #6 on issue 165 by ken...@google.com: can not link for mips  
architecture

http://code.google.com/p/protobuf/issues/detail?id=165

As I said before, after running configure, just edit config.h (in the root
directory of the package) and change the hash-related parts to use hash_map
instead of unordered_map.  These values should work:

#define HASH_MAP_H <ext/hash_map>
#define HASH_NAMESPACE stdext
#define HASH_SET_H <ext/hash_set>

If you re-run configure you may have to re-edit config.h, as it is
generated as part of the configure script.




Re: [protobuf] ProtocolBuffer + compression in hadoop?

2010-02-18 Thread Kenton Varda
Please reply-all so the mailing list stays CC'd.  I don't know anything
about the libraries you are using so I can't really help you further.  Maybe
someone else can.

On Thu, Feb 18, 2010 at 12:46 PM, Yang tedd...@gmail.com wrote:

 thanks Kenton,

 I thought about the same,
 what I did was that I use a splitter stream, and split the actual input
 stream into 2, dumping out one for debugging, and feeding the other one to
 PB.


 my code for Hadoop is

 Writable.readFields( DataInput in ) {

 SplitterInputStream ios = new SplitterInputStream(in);

 pb_object = MyPBClass.parseFrom(ios);
 }

 SplitterInputStream dumps out the actual bytes, and the resulting byte
 stream is
 indeed (decimal)

 10 2 79 79 16 1 ... repeating 20 times

 which is 20 records of

 message {
   1: string name;  // taking a value of yy
   2: i32 Id;       // taking a value of 1
 }



 indeed, in compression or non-compression mode, the dumped out bytestream
 is the same.



 On Thu, Feb 18, 2010 at 12:03 PM, Kenton Varda ken...@google.com wrote:

 You should verify that the bytes that come out of the InputStream really
 are the exact same bytes that were written by the serializer to the
 OutputStream originally.  You could do this by computing a checksum at both
 ends and printing it, then inspecting visually.  You'll probably find that
 the bytes differ somehow, or don't end at the same point.

 On Thu, Feb 18, 2010 at 2:48 AM, Yang tedd...@gmail.com wrote:

 I tried to use protocol buffer in hadoop,

 so far it works fine with SequenceFile, after I hook it up with a simple
 wrapper,

 but after I put in a compressor in sequenceFile, it fails, because it
 read all the messages and yet still wants to advance the read pointer, and
 then readTag() returns 0, so the mergeFrom() returns a message with no
 fields set.

 anybody familiar with both SequenceFile and protocol buffer has an idea
 why it fails like this?
 I find it difficult to understand because the InputStream is simply the
 same, whether it comes through a compressor or not


 thanks
 Yang









Re: [protobuf] how to get RepeatedField object

2010-02-18 Thread Kenton Varda
On Wed, Feb 17, 2010 at 12:19 AM, Romain Francois 
romain.francois.r.enthusi...@gmail.com wrote:

 On 02/17/2010 12:51 AM, Kenton Varda wrote:

 The Reflection interface already provides a way -- FieldSize() and
 GetRepeatedX().  The only problem is that it's a bit slower than the
 generated accessors because these methods aren't inlineable.


 Sure. I meant STL algorithms iterating.


You could easily write an STL-like iterator on top of these if you really
need it.



  BTW, have you observed an actual performance problem or are you just
 speculating that this performance difference may be a problem for you?


 In similar (non protobuf-related) settings, we have observed quite a bit of
 difference between using plain loops and accessors as opposed to iterators.


It would obviously depend on the data structure involved.  For example, on
an stl vector, the following two loops will have equivalent performance:

  vector<int> v;
  for (int i = 0; i < v.size(); ++i) {
    DoSomething(v[i]);
  }

  for (vector<int>::const_iterator i = v.begin(); i != v.end(); ++i) {
    DoSomething(*i);
  }

In fact, a good compiler may even produce identical assembly code for both
loops.


 But this is mainly speculation on the performance.


It's generally a bad idea to try to fix theoretical performance problems.

 But when you use std::vector, it is best to reserve the target size first,
 as opposed to creating an empty vector and calling push_back for each
 element.


It's slightly faster, but either way is still O(n).  Reserving is often not
worth the effort, especially if you are good about reusing objects, in which
case they will already have space reserved from the previous use.
