Re: [protobuf] Why does protobuf-net append to a byte[] Property/field on deserialize?

2011-02-01 Thread Marc Gravell
I'll see if I can add something when I get a chance, but note that you could
also work around this by using serialization callbacks to detect when
deserialization is starting, and wipe the default BLOB. Also, the elusively
unreleased v2 also optionally supports ctor-skipping (like
WCF/DataContractSerializer), so any default field values will not be
assigned (the object prior to deserialization is a vanilla all-zero chunk of
memory, regardless of any constructor / field-initializer logic).
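
For illustration only, a minimal sketch of the callback route (assuming the
[ProtoBeforeDeserialization] callback attribute is available in your build;
the type and member names here are hypothetical):

using ProtoBuf;

[ProtoContract]
public class WindowState
{
    // field initializer supplies a default layout for freshly constructed objects
    [ProtoMember(1)]
    public byte[] LayoutXml = DefaultLayout();

    // invoked by protobuf-net before the incoming fields are applied,
    // so the default BLOB is wiped instead of being appended to
    [ProtoBeforeDeserialization]
    public void OnDeserializing()
    {
        LayoutXml = null;
    }

    private static byte[] DefaultLayout()
    {
        return new byte[0]; // placeholder default
    }
}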

Marc

On 1 February 2011 00:31, Richard Geary tristra...@googlemail.com wrote:

 You can repro this with a trivial example using a byte[] as an initialized
 member field. In my particular case, it was storing some XML describing the
 layout of the window class
 (AvalonDock.DockingManager.SaveLayout/RestoreLayout), and the bug was
 duplicating the XML. This subsequently caused the XmlDocument to throw an
 exception because there were two roots to the XML tree. I fixed it by
 altering BlobSerializer.RequiresOldValue to return false.



 Now you mention it, I see it’s part of the “append to the list of existing
 items”/Merge feature. Might be useful to have an attribute to set this
 append behaviour per field.
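
 For what it's worth, protobuf-net v2 exposes an OverwriteList flag on
 [ProtoMember] along roughly those lines (at least in later builds); whether
 it is honoured for byte[] as well as for lists here is an assumption. A
 sketch, with a hypothetical type:

 using ProtoBuf;

 [ProtoContract]
 public class WindowState
 {
     // ask protobuf-net to replace the existing value on deserialize
     // rather than merge/append into it (assumption: applies to byte[] too)
     [ProtoMember(1, OverwriteList = true)]
     public byte[] LayoutXml;
 }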



 Cheers,

 Richard



 *From:* Marc Gravell [mailto:marc.grav...@gmail.com]
 *Sent:* 31 January 2011 01:04
 *To:* NYCBrit
 *Cc:* Protocol Buffers
 *Subject:* Re: [protobuf] Why does protobuf-net append to a byte[]
 Property/field on deserialize?



 Hmmm - good question. A bit of an edge case, really, deserializing over the
 top of an existing byte[], or having duplicated byte[]. But thinking about
 it, it probably should adhere to the singular scalar fields logic and
 replace rather than accumulate.
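
 (For context, the "deserializing over the top" case is the Serializer.Merge
 path; a minimal sketch with hypothetical type names, illustrating the
 behaviour described in this thread:)

 using System.IO;
 using ProtoBuf;

 [ProtoContract]
 public class WindowState
 {
     [ProtoMember(1)]
     public byte[] LayoutXml = new byte[] { 1, 2, 3 }; // pre-existing default
 }

 class Demo
 {
     static void Main()
     {
         var original = new WindowState { LayoutXml = new byte[] { 4, 5, 6 } };
         using (var ms = new MemoryStream())
         {
             Serializer.Serialize(ms, original);
             ms.Position = 0;

             // merging into an instance that already holds a byte[] value is
             // exactly where append-vs-replace matters
             var target = new WindowState();
             Serializer.Merge(ms, target);
             // with the append behaviour discussed above, target.LayoutXml
             // ends up as { 1, 2, 3, 4, 5, 6 } rather than { 4, 5, 6 }
         }
     }
 }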



 I'll log that as a bug. Out of curiosity, is there a specific scenario
 where this is noticed? (i.e. where it is common)



 Marc

 On 30 January 2011 20:06, NYCBrit richard...@gmail.com wrote:

 I'm using protobuf-net v2 (trunk) to serialize my C# app. I'm confused
 by the byte[] serializer (BlobSerializer) which always appends the
 deserialized byte[] array to the initialized value, rather than
 replacing it. Why does it do this?

 Thanks,
 Richard

 --
 You received this message because you are subscribed to the Google Groups
 Protocol Buffers group.
 To post to this group, send email to protobuf@googlegroups.com.
 To unsubscribe from this group, send email to
 protobuf+unsubscr...@googlegroups.com.
 For more options, visit this group at
 http://groups.google.com/group/protobuf?hl=en.




 --
 Regards,

 Marc




-- 
Regards,

Marc

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to protobuf@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] Recursion support in Protobuf-net

2011-02-01 Thread Marc Gravell
Yes, that is already on my list of things I really, really want it to do. It
will inevitably be in v2 at some point, but to repeat your point: this
will be 100% implementation-specific and not generically portable between
platforms, so it will have to be by explicit opt-in.

Marc

On 1 February 2011 02:49, NYCBrit tristra...@googlemail.com wrote:

 Would it be possible to support recursive graphs in protobuf-net, as an
 implementation-specific custom message extension? For example, if
 protobuf-net could automatically insert an optional int32 uniqueID =
 1000 field into each message, it could store an auto-generated
 unique identifier for each object. Then, if you come across a
 previously serialized object, you need only read/write the uniqueID
 field and leave the other optional fields blank. The protobuf-net
 deserializer could output the previously constructed object. The
 uniqueIDs would only be valid for a defined scope, e.g. one contiguous
 wire message.
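
 Purely to illustrate the idea (hypothetical names, no real protobuf-net API
 implied): each reference becomes an id plus a payload that is written only
 the first time, with the id table scoped to one serialization pass:

 using System.Collections.Generic;
 using ProtoBuf;

 // what each object reference becomes on the wire
 [ProtoContract]
 class NodeReference
 {
     [ProtoMember(1000)] public int UniqueId;   // always written
     [ProtoMember(1)]    public string Payload; // omitted for repeat references
 }

 // id table valid for one contiguous wire message only
 class ReferenceScope
 {
     // assumes reference equality is adequate for the keys
     private readonly Dictionary<object, int> ids = new Dictionary<object, int>();

     public bool TryGetExistingId(object obj, out int id)
     {
         return ids.TryGetValue(obj, out id);
     }

     public int Register(object obj)
     {
         int id = ids.Count + 1;
         ids[obj] = id;
         return id;
     }
 }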

 --
 You received this message because you are subscribed to the Google Groups
 Protocol Buffers group.
 To post to this group, send email to protobuf@googlegroups.com.
 To unsubscribe from this group, send email to
 protobuf+unsubscr...@googlegroups.com.
 For more options, visit this group at
 http://groups.google.com/group/protobuf?hl=en.




-- 
Regards,

Marc

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to protobuf@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] Serializing Dictionary<string,object>

2011-02-01 Thread Marc Gravell
Both, really. And of course by the time you are embedding **type** metadata
into the file you have ceased to be portable (or contract-based). This is
potentially something that could be added as a non-portable feature of a
single implementation, but *in most cases* there are ways to do it
**inside** the portable spec, and I'd rather push people towards using
portable options unless there is no feasible alternative.
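
As one example of staying inside the portable spec (a sketch only; Foo and
Bar stand in for whatever [ProtoContract] types the dictionary can actually
hold): wrap the value in a contract with one optional member per candidate
type, i.e. the union pattern, and serialize Dictionary<string, ValueWrapper>
instead of Dictionary<string, object>:

using System.Collections.Generic;
using ProtoBuf;

[ProtoContract] public class Foo { [ProtoMember(1)] public int Id { get; set; } }
[ProtoContract] public class Bar { [ProtoMember(1)] public string Name { get; set; } }

[ProtoContract]
public class ValueWrapper
{
    // at most one of these is expected to be non-null per entry
    [ProtoMember(1)] public Foo AsFoo { get; set; }
    [ProtoMember(2)] public Bar AsBar { get; set; }
    [ProtoMember(3)] public string AsString { get; set; }
}

[ProtoContract]
public class Bag
{
    // Dictionary<string, ValueWrapper> has a well-defined contract,
    // unlike Dictionary<string, object>
    [ProtoMember(1)] public Dictionary<string, ValueWrapper> Items { get; set; }
}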

Marc

On 1 February 2011 01:58, Richard Geary tristra...@googlemail.com wrote:

 What sort of restrictions are in the Google protobuf spec? Is this possible
 as a user extension of the protobuf-net library? E.g. could I create an
 object surrogate, serializing the name of the derived type as a string, then
 dynamically read/write the type to the stream based on the type string? Or
 does the spec make this not possible because it forces a static message
 structure?



 Thanks!





 *From:* Marc Gravell [mailto:marc.grav...@gmail.com]
 *Sent:* 31 January 2011 01:06
 *To:* NYCBrit
 *Cc:* Protocol Buffers
 *Subject:* Re: [protobuf] Serializing Dictionary<string,object>



 I assume you mean with protobuf-net there; in which case, no.



 Because protobuf-net follows the general protobuf spec, there is no
 type-specific metadata that would allow me to encode/decode an arbitrary
 object, or to store the details of which type of object is stored.



 Marc (protobuf-net)

 On 29 January 2011 16:03, NYCBrit richard...@gmail.com wrote:

 Is it possible to serialize a Dictionary<string,object> where all the
 objects point to classes marked up with [ProtoContract]?

 --
 You received this message because you are subscribed to the Google Groups
 Protocol Buffers group.
 To post to this group, send email to protobuf@googlegroups.com.
 To unsubscribe from this group, send email to
 protobuf+unsubscr...@googlegroups.com.
 For more options, visit this group at
 http://groups.google.com/group/protobuf?hl=en.




 --
 Regards,

 Marc




-- 
Regards,

Marc

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to protobuf@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] Support for generic surrogates, eg. Queue<T>

2011-02-01 Thread Marc Gravell
Nice - I'm glad that was a v2 patch, though ;p

Before I look at that at any length, though, can you confirm that you can
freely release this patch under the existing license terms? (just my CYA)

Marc

On 1 February 2011 03:03, NYCBrit tristra...@googlemail.com wrote:

 I've made a small modification to add support for user-specified
 surrogate classes for generic types, via the same mechanism as
 KeyValueSurrogate. This will let you serialize generic types such as
 Queue<T> by defining and registering a [ProtoContract] surrogate type,
 QueueSurrogate<T>.
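
 For anyone curious, such a surrogate might look roughly like this (a sketch
 only; the conversion-operator shape is the standard surrogate mechanism,
 while registering it once for the open generic is what the patch adds):

 using System.Collections.Generic;
 using ProtoBuf;

 [ProtoContract]
 public class QueueSurrogate<T>
 {
     [ProtoMember(1)]
     public List<T> Items { get; set; }

     // protobuf-net uses these conversions to move between the real type
     // and the surrogate during (de)serialization
     public static implicit operator Queue<T>(QueueSurrogate<T> surrogate)
     {
         return surrogate == null ? null : new Queue<T>(surrogate.Items ?? new List<T>());
     }

     public static implicit operator QueueSurrogate<T>(Queue<T> queue)
     {
         return queue == null ? null : new QueueSurrogate<T> { Items = new List<T>(queue) };
     }
 }

 (With stock v2 each closed type would be registered explicitly, e.g.
 RuntimeTypeModel.Default.Add(typeof(Queue<int>), false)
 .SetSurrogate(typeof(QueueSurrogate<int>)); presumably the patch removes
 the need to do that per closed type.)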

 I thought I'd share this in case anyone else has the same issue. Link
 to the modified source :
 http://www.filedropper.com/supportgenericsurrogates

 --
 You received this message because you are subscribed to the Google Groups
 Protocol Buffers group.
 To post to this group, send email to protobuf@googlegroups.com.
 To unsubscribe from this group, send email to
 protobuf+unsubscr...@googlegroups.com.
 For more options, visit this group at
 http://groups.google.com/group/protobuf?hl=en.




-- 
Regards,

Marc

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to protobuf@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



[protobuf] Repeated Fields Encoding

2011-02-01 Thread Timothy Parez
Hello,

Considering the following proto file:

message FileDescriptor
{
required string Filename = 1;
optional int64 Size = 2 [default = 0];
}

message FileList
{
repeated FileDescriptor Files = 1;
}

If you create something like this:
(and I'm duplicating the data because it made it easier to spot in a
hex editor)

files.Files.Add(new FileDescriptor() { Filename = "AAAAAAA", Size = 100 });
files.Files.Add(new FileDescriptor() { Filename = "AAAAAAA", Size = 100 });
files.Files.Add(new FileDescriptor() { Filename = "AAAAAAA", Size = 100 });
files.Files.Add(new FileDescriptor() { Filename = "AAAAAAA", Size = 100 });

and then serialize it using the Protobuf.Serializer I expected it to
generate something like

Tag for the FileList - Id 1, WireType 2 = 0x0A
Length of the payload (all the bytes for all the files that follow)

But instead I found everything is simply repeated.

0A 0B 0A 07 41 41 41 41 41 41 41 10 64
0A 0B 0A 07 41 41 41 41 41 41 41 10 64
0A 0B 0A 07 41 41 41 41 41 41 41 10 64
0A 0B 0A 07 41 41 41 41 41 41 41 10 64

I'm wondering: is this an implementation detail (and allowed by the
protocol buffer specification), or a requirement of the Google protocol
buffer specification?

It does seem to add quite a bit of overhead; if the FileList had other
properties, would they be repeated for every instance of FileDescriptor?

Or am I missing something ?


-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to protobuf@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] Repeated Fields Encoding

2011-02-01 Thread Marc Gravell
I think this also came to me directly and I answered earlier, but this is the 
expected layout of repeated data, where each item in a list is mapped 
separately in the data stream.

Marc

On 1 Feb 2011, at 11:01, Timothy Parez timothypa...@gmail.com wrote:

 Hello,
 
 Considering the following proto file:
 
 message FileDescriptor
 {
required string Filename = 1;
optional int64 Size = 2 [default = 0];
 }
 
 message FileList
 {
repeated FileDescriptor Files = 1;
 }
 
 If you create something like this:
 (and I'm duplicating the data because it made it easier to spot in a
 hex editor)
 
 files.Files.Add(new FileDescriptor() { Filename = "AAAAAAA", Size = 100 });
 files.Files.Add(new FileDescriptor() { Filename = "AAAAAAA", Size = 100 });
 files.Files.Add(new FileDescriptor() { Filename = "AAAAAAA", Size = 100 });
 files.Files.Add(new FileDescriptor() { Filename = "AAAAAAA", Size = 100 });
 
 and then serialize it using the Protobuf.Serializer I expected it to
 generate something like
 
 Tag for the FileList - Id 1, WireType 2 = 0x0A
 Length of the payload (all the bytes for all the files that follow)
 
 But instead I found everything is simply repeated.
 
 0A 0B 0A 07 41 41 41 41 41 41 41 10 64
 0A 0B 0A 07 41 41 41 41 41 41 41 10 64
 0A 0B 0A 07 41 41 41 41 41 41 41 10 64
 0A 0B 0A 07 41 41 41 41 41 41 41 10 64
 
 I'm wondering: is this an implementation detail (and allowed by the
 protocol buffer specification), or a requirement of the Google protocol
 buffer specification?

 It does seem to add quite a bit of overhead; if the FileList had other
 properties, would they be repeated for every instance of FileDescriptor?
 
 Or am I missing something ?
 
 
 -- 
 You received this message because you are subscribed to the Google Groups 
 Protocol Buffers group.
 To post to this group, send email to protobuf@googlegroups.com.
 To unsubscribe from this group, send email to 
 protobuf+unsubscr...@googlegroups.com.
 For more options, visit this group at 
 http://groups.google.com/group/protobuf?hl=en.
 

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to protobuf@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



[protobuf] New protobuf feature proposal: Generated classes for streaming / visitors

2011-02-01 Thread Kenton Varda
Hello open source protobuf users,

*Background*

Probably the biggest deficiency in the open source protocol buffers
libraries today is a lack of built-in support for handling streams of
messages.  True, it's not too hard for users to support it manually, by
prefixing each message with its size as described here:

  http://code.google.com/apis/protocolbuffers/docs/techniques.html#streaming

However, this is awkward, and typically requires users to reach into the
low-level CodedInputStream/CodedOutputStream classes and do a lot of work
manually.  Furthermore, many users want to handle streams
of heterogeneous message types.  We tell them to wrap their messages in an
outer type using the union pattern:

  http://code.google.com/apis/protocolbuffers/docs/techniques.html#union

But this is kind of ugly and has unnecessary overhead.

These problems never really came up in our internal usage, because inside
Google we have an RPC system and other utility code which builds on top of
protocol buffers and provides appropriate abstraction. While we'd like to
open source this code, a lot of it is large, somewhat messy, and highly
interdependent with unrelated parts of our environment, and no one has had
the time to rewrite it all cleanly (as we did with protocol buffers itself).

*Proposed solution:  Generated Visitors*

I've been wanting to fix this for some time now, but didn't really have a
good idea how.  CodedInputStream is annoyingly low-level, but I couldn't
think of much better an interface for reading a stream of messages off the
wire.

A couple weeks ago, though, I realized that I had been failing to consider
how new kinds of code generation could help this problem.  I was trying to
think of solutions that would go into the protobuf base library, not
solutions that were generated by the protocol compiler.

So then it became pretty clear:  A protobuf message definition can also be
interpreted as a definition for a streaming protocol.  Each field in the
message is a kind of item in the stream.

  // A stream of Foo and Bar messages, and also strings.
  message MyStream {
option generate_visitors = true;  // enables generation of streaming classes
repeated Foo foo = 1;
repeated Bar bar = 2;
repeated string baz = 3;
  }

All we need to do is generate code appropriate for treating MyStream as a
stream, rather than one big message.

My approach is to generate two interfaces, each with two provided
implementations.  The interfaces are Visitor and Guide.
 MyStream::Visitor looks like this:

  class MyStream::Visitor {
   public:
virtual ~Visitor();

virtual void VisitFoo(const Foo& foo);
virtual void VisitBar(const Bar& bar);
virtual void VisitBaz(const std::string& baz);
  };

The Visitor class has two standard implementations:  Writer and Filler.
 MyStream::Writer writes the visited fields to a CodedOutputStream, using
the same wire format as would be used to encode MyStream as one big message.
 MyStream::Filler fills in a MyStream message object with the visited
values.

Meanwhile, Guides are objects that drive Visitors.

  class MyStream::Guide {
   public:
virtual ~Guide();

// Call the methods of the visitor on the Guide's data.
virtual void Accept(MyStream::Visitor* visitor) = 0;

// Just fill in a message object directly rather than use a visitor.
virtual void Fill(MyStream* message) = 0;
  };

The two standard implementations of Guide are Reader and Walker.
 MyStream::Reader reads items from a CodedInputStream and passes them to the
visitor.  MyStream::Walker walks over a MyStream message object and passes
all the fields to the visitor.

To handle a stream of messages, simply attach a Reader to your own Visitor
implementation.  Your visitor's methods will then be called as each item is
parsed, kind of like SAX XML parsing, but type-safe.

*Nonblocking I/O*

The Reader type declared above is based on blocking I/O, but many users
would prefer a non-blocking approach.  I'm less sure how to handle this, but
my thought was that we could provide a utility class like:

  class NonblockingHelper {
   public:
template <typename MessageType>
NonblockingHelper(typename MessageType::Visitor* visitor);

// Push data into the buffer.  If the data completes any fields,
// they will be passed to the underlying visitor.  Any left-over data
// is remembered for the next call.
void PushData(void* data, int size);
  };

With this, you can use whatever non-blocking I/O mechanism you want, and
just have to push the data into the NonblockingHelper, which will take care
of calling the Visitor as necessary.

*C++ implementation*

I've written up a patch implementing this for C++ (not yet including the
nonblocking part):

  http://codereview.appspot.com/4077052

*Feedback*

What do you think?

I know I'm excited to use this in some of my own side projects (which is why
I spent my weekend working on it), but before adding this to the official
implementation we 

Re: [protobuf] Repeated Fields Encoding

2011-02-01 Thread Kenton Varda
The encoding is documented in detail here:

http://code.google.com/apis/protocolbuffers/docs/encoding.html

The short answer is: yes, repeated fields are literally encoded as repeated
individual values, unless you use packed encoding.
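
For reference, each 13-byte record in the dump breaks down as:

  0A       field 1 (Files), wire type 2 (length-delimited)
  0B       length of this FileDescriptor payload: 11 bytes
  0A 07    field 1 (Filename), wire type 2, length 7
  41 x 7   "AAAAAAA"
  10 64    field 2 (Size), wire type 0 (varint), value 100

So the per-element cost is just the outer tag and length prefix; any
non-repeated fields of FileList itself would still be written once, not once
per FileDescriptor.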

On Tue, Feb 1, 2011 at 3:01 AM, Timothy Parez timothypa...@gmail.comwrote:

 Hello,

 Considering the following proto file:

 message FileDescriptor
 {
required string Filename = 1;
optional int64 Size = 2 [default = 0];
 }

 message FileList
 {
repeated FileDescriptor Files = 1;
 }

 If you create something like this:
 (and I'm duplicating the data because it made it easier to spot in a
 hex editor)

 files.Files.Add(new FileDescriptor() { Filename = "AAAAAAA", Size = 100 });
 files.Files.Add(new FileDescriptor() { Filename = "AAAAAAA", Size = 100 });
 files.Files.Add(new FileDescriptor() { Filename = "AAAAAAA", Size = 100 });
 files.Files.Add(new FileDescriptor() { Filename = "AAAAAAA", Size = 100 });

 and then serialize it using the Protobuf.Serializer I expected it to
 generate something like

 Tag for the FileList - Id 1, WireType 2 = 0x0A
 Length of the payload (all the bytes for all the files that follow)

 But instead I found everything is simply repeated.

 0A 0B 0A 07 41 41 41 41 41 41 41 10 64
 0A 0B 0A 07 41 41 41 41 41 41 41 10 64
 0A 0B 0A 07 41 41 41 41 41 41 41 10 64
 0A 0B 0A 07 41 41 41 41 41 41 41 10 64

 I'm wondering: is this an implementation detail (and allowed by the
 protocol buffer specification), or a requirement of the Google protocol
 buffer specification?

 It does seem to add quite a bit of overhead; if the FileList had other
 properties, would they be repeated for every instance of FileDescriptor?

 Or am I missing something ?


 --
 You received this message because you are subscribed to the Google Groups
 Protocol Buffers group.
 To post to this group, send email to protobuf@googlegroups.com.
 To unsubscribe from this group, send email to
 protobuf+unsubscr...@googlegroups.com.
 For more options, visit this group at
 http://groups.google.com/group/protobuf?hl=en.



-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to protobuf@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



[protobuf] v2.4 question

2011-02-01 Thread David Dabbs
Hello.

 

From what I gather from catching up on list messages, 2.4 is final but not
officially released, perhaps because some documentation is still in the works.

I ask because I would like to incorporate 2.4 in my product's upcoming build
cycle, but only if there is reasonable certainty that the bits won't change
(or will change very little) and that it will be officially released in,
say, six weeks.

Are these reasonable assumptions?

 

Thanks,

 

David

 

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to protobuf@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



[protobuf] Services not compiling

2011-02-01 Thread jdoliner
Hi,

I'm new to protocol buffers so apologies if this is something stupid.
(I searched the forum and didn't find anything). I'm having a problem
getting services to compile. I have the following population.proto
file:


package population;

message addrinfo {
required fixed32 ip = 1;
required uint32 port = 2;
required uint32 id = 3;
}

message Join_initial {
required addrinfo addr = 1;
}

message Join_propose {
required addrinfo addr = 1;
}

message Join_respond {
required bool agree = 1;
}

message Join_mk_official {
required addrinfo addr = 1;
}

message Join_ack_official {
}

message Join_welcome {
repeated addrinfo addrs= 1;
}

service JoinService {
rpc Join (Join_initial) returns (Join_welcome);
rpc Propose (Join_propose) returns (Join_respond);
rpc Make_official (Join_mk_official) returns (Join_ack_official);
}

And when I compile it with:
protoc --cpp_out=. population.proto

I get population.pb.{cc|h} but nowhere in either one is there mention
of the JoinService class. Am I doing something wrong?

My libprotoc version is 2.4.0.

Thanks.

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to protobuf@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] Services not compiling

2011-02-01 Thread Kenton Varda
The *_generic_services options now default to false.  You must explicitly
set them to true, e.g.:

  option cc_generic_services = true;
  option java_generic_services = true;
  option py_generic_services = true;

We decided to make these off-by-default because we want to encourage RPC
implementations to provide their own code generator plugins that generate
code appropriate for that implementation, rather than relying on protoc's
least-common-denominator generic code.

The documentation has not been updated for 2.4.0 yet.

On Tue, Feb 1, 2011 at 12:46 PM, jdoliner jdoli...@gmail.com wrote:

 Hi,

 I'm new to protocol buffers so apologies if this is something stupid.
 (I searched the forum and didn't find anything). I'm having a problem
 getting services to compile. I have the following population.proto
 file:


 package population;

 message addrinfo {
required fixed32 ip = 1;
required uint32 port = 2;
required uint32 id = 3;
 }

 message Join_initial {
required addrinfo addr = 1;
 }

 message Join_propose {
required addrinfo addr = 1;
 }

 message Join_respond {
required bool agree = 1;
 }

 message Join_mk_official {
required addrinfo addr = 1;
 }

 message Join_ack_official {
 }

 message Join_welcome {
repeated addrinfo addrs= 1;
 }

 service JoinService {
rpc Join (Join_initial) returns (Join_welcome);
rpc Propose (Join_propose) returns (Join_respond);
rpc Make_official (Join_mk_official) returns (Join_ack_official);
 }

 And when I compile it with:
 protoc --cpp_out=. population.proto

 I get population.pb.{cc|h} but nowhere in either one is there mention
 of the JoinService class. Am I doing something wrong?

 My libprotoc version is 2.4.0.

 Thanks.

 --
 You received this message because you are subscribed to the Google Groups
 Protocol Buffers group.
 To post to this group, send email to protobuf@googlegroups.com.
 To unsubscribe from this group, send email to
 protobuf+unsubscr...@googlegroups.com.
 For more options, visit this group at
 http://groups.google.com/group/protobuf?hl=en.



-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to protobuf@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



[protobuf] Re: New protobuf feature proposal: Generated classes for streaming / visitors

2011-02-01 Thread Jason Hsueh
Conceptually this sounds great; the big question to me is whether this
should be implemented as an option in the compiler or as a separate plugin.
I haven't taken a thorough look at the patch, but I'd guess it adds a decent
amount to the core code generator. I have a preference for the plugin
approach, but of course I'm primarily an internal protobuf user, so I'm
willing to be convinced otherwise :-) Would using a plugin, possibly even
shipped with the standard implementation, make this feature too inconvenient
to use? Or is there enough demand for this that it warrants implementing as
an option?

Regarding the proposed interfaces: I can imagine some applications where the
const refs passed to the visitor methods may be too restrictive - the user
may instead want to take ownership of the object. e.g., suppose the stream
is a series of requests, and each of the visitor handlers needs to start
some asynchronous work. It would be good to hear if users have use cases
that don't quite fit into this model (or at least if the existing use cases
will work).

On Tue, Feb 1, 2011 at 10:45 AM, Kenton Varda ken...@google.com wrote:

 Hello open source protobuf users,

 *Background*

 Probably the biggest deficiency in the open source protocol buffers
 libraries today is a lack of built-in support for handling streams of
 messages.  True, it's not too hard for users to support it manually, by
 prefixing each message with its size as described here:


 http://code.google.com/apis/protocolbuffers/docs/techniques.html#streaming

 However, this is awkward, and typically requires users to reach into the
 low-level CodedInputStream/CodedOutputStream classes and do a lot of work
 manually.  Furthermore, many users want to handle streams
 of heterogeneous message types.  We tell them to wrap their messages in an
 outer type using the union pattern:

   http://code.google.com/apis/protocolbuffers/docs/techniques.html#union

 But this is kind of ugly and has unnecessary overhead.

 These problems never really came up in our internal usage, because inside
 Google we have an RPC system and other utility code which builds on top of
 protocol buffers and provides appropriate abstraction. While we'd like to
 open source this code, a lot of it is large, somewhat messy, and highly
 interdependent with unrelated parts of our environment, and no one has had
 the time to rewrite it all cleanly (as we did with protocol buffers itself).

 *Proposed solution:  Generated Visitors*

 I've been wanting to fix this for some time now, but didn't really have a
 good idea how.  CodedInputStream is annoyingly low-level, but I couldn't
 think of much better an interface for reading a stream of messages off the
 wire.

 A couple weeks ago, though, I realized that I had been failing to consider
 how new kinds of code generation could help this problem.  I was trying to
 think of solutions that would go into the protobuf base library, not
 solutions that were generated by the protocol compiler.

 So then it became pretty clear:  A protobuf message definition can also be
 interpreted as a definition for a streaming protocol.  Each field in the
 message is a kind of item in the stream.

   // A stream of Foo and Bar messages, and also strings.
   message MyStream {
 option generate_visitors = true;  // enables generation of streaming classes
 repeated Foo foo = 1;
 repeated Bar bar = 2;
 repeated string baz = 3;
   }

 All we need to do is generate code appropriate for treating MyStream as a
 stream, rather than one big message.

 My approach is to generate two interfaces, each with two provided
 implementations.  The interfaces are Visitor and Guide.
  MyStream::Visitor looks like this:

   class MyStream::Visitor {
public:
 virtual ~Visitor();

 virtual void VisitFoo(const Foo& foo);
 virtual void VisitBar(const Bar& bar);
 virtual void VisitBaz(const std::string& baz);
   };

 The Visitor class has two standard implementations:  Writer and Filler.
  MyStream::Writer writes the visited fields to a CodedOutputStream, using
 the same wire format as would be used to encode MyStream as one big message.
  MyStream::Filler fills in a MyStream message object with the visited
 values.

 Meanwhile, Guides are objects that drive Visitors.

   class MyStream::Guide {
public:
 virtual ~Guide();

 // Call the methods of the visitor on the Guide's data.
 virtual void Accept(MyStream::Visitor* visitor) = 0;

 // Just fill in a message object directly rather than use a visitor.
 virtual void Fill(MyStream* message) = 0;
   };

 The two standard implementations of Guide are Reader and Walker.
  MyStream::Reader reads items from a CodedInputStream and passes them to the
 visitor.  MyStream::Walker walks over a MyStream message object and passes
 all the fields to the visitor.

 To handle a stream of messages, simply attach a Reader to your own Visitor
 implementation.  Your visitor's methods will then be called 

Re: [protobuf] Protocol Buffers Python extension in C

2011-02-01 Thread Atamurad Hezretkuliyev
I've implemented a code generator so others can test / use our C
implementation in Python.

Another new thing is lazy decoding support: messages are decoded on the fly
as attributes are accessed for the first time.

More info:
http://blog.connex.io/introducing-cypb-improving-the-performance-of

You can get the source here:
https://github.com/connexio/cypb

As always, feedback and contributions are welcome!

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to protobuf@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



RE: [protobuf] Support for generic surrogates, eg. Queue<T>

2011-02-01 Thread Richard Geary
Yes, I've no problem with releasing it under the existing (Apache 2) license
terms.

 

From: Marc Gravell [mailto:marc.grav...@gmail.com] 
Sent: 01 February 2011 03:18
To: NYCBrit
Cc: Protocol Buffers
Subject: Re: [protobuf] Support for generic surrogates, eg. Queue<T>

 

Nice - I'm glad that was a v2 patch, though ;p

 

Before I look at that at any length, though, can you confirm that you can
freely release this patch under the existing license terms? (just my CYA)

 

Marc

On 1 February 2011 03:03, NYCBrit tristra...@googlemail.com wrote:

I've made a small modification to add support for user-specified
surrogate classes for generic types, via the same mechanism as
KeyValueSurrogate. This will let you serialize generic types such as
Queue<T> by defining and registering a [ProtoContract] surrogate type,
QueueSurrogate<T>.

I thought I'd share this in case anyone else has the same issue. Link
to the modified source :
http://www.filedropper.com/supportgenericsurrogates

--
You received this message because you are subscribed to the Google Groups
Protocol Buffers group.
To post to this group, send email to protobuf@googlegroups.com.
To unsubscribe from this group, send email to
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at
http://groups.google.com/group/protobuf?hl=en.




-- 
Regards, 

Marc

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to protobuf@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



[protobuf] Re: New protobuf feature proposal: Generated classes for streaming / visitors

2011-02-01 Thread Kenton Varda
On Tue, Feb 1, 2011 at 3:17 PM, Jason Hsueh jas...@google.com wrote:

 Conceptually this sounds great; the big question to me is whether this
 should be implemented as an option in the compiler or as a separate plugin.
 I haven't taken a thorough look at the patch, but I'd guess it adds a decent
 amount to the core code generator. I have a preference for the plugin
 approach, but of course I'm primarily an internal protobuf user, so I'm
 willing to be convinced otherwise :-) Would using a plugin, possibly even
 shipped with the standard implementation, make this feature too inconvenient
 to use? Or is there enough demand for this that it warrants implementing as
 an option?


First of all, note that this feature is off by default.  You have to turn it
on with the generate_visitors message-level option.  The only new code added
to the base library is a couple templates in WireFormatLite, which are of
course never instantiated if you don't generate visitor code.

There are a few reasons I prefer to make this part of the base code
generator:

- If you look at the patch, you'll see that the code generation for the two
Guide classes actually shares a lot with the code generation for
MergeFromCodedStream and SerializeWithCachedSizes.  To make this a plugin,
either we'd have to expose parts of the C++ code generator internals
publicly (eww) or we'd have to reproduce a lot of code (also eww).

- The Reader and Writer classes directly use WireFormatLite, which is a
private interface.

- It seems clear that this feature is widely desired by open source users.
 We're not talking about a niche use case here.


 Regarding the proposed interfaces: I can imagine some applications where
 the const refs passed to the visitor methods may be too restrictive - the
 user may instead want to take ownership of the object. e.g., suppose the
 stream is a series of requests, and each of the visitor handlers needs to
 start some asynchronous work. It would be good to hear if users have use
 cases that don't quite fit into this model (or at least if the existing use
 cases will work).


Interesting point.  In the Reader case, it's creating new objects, so in
theory it ought to be able to hand off ownership to the Visitor it calls.
 But, the Walker is walking an existing object and thus clearly cannot give
up ownership.  It seems clear that some use cases need const references,
which means that the only way we could support ownership passing is by
adding another parallel set of methods.  I suppose they could have default
implementations that delegate to the const reference versions, in which case
only people who wanted to optimize for them would need to override them.
 But I'd like to see that this is really desired first -- it's easy enough
to add later.

Also note that my code currently doesn't reuse message objects, but
improving it to do so would be straightforward.  A Reader could allocate one
object of each sub-message type for reuse.  But, it seems like that wouldn't
play well with ownership-passing.



 On Tue, Feb 1, 2011 at 10:45 AM, Kenton Varda ken...@google.com wrote:

 Hello open source protobuf users,

 *Background*

 Probably the biggest deficiency in the open source protocol buffers
 libraries today is a lack of built-in support for handling streams of
 messages.  True, it's not too hard for users to support it manually, by
 prefixing each message with its size as described here:


 http://code.google.com/apis/protocolbuffers/docs/techniques.html#streaming

 However, this is awkward, and typically requires users to reach into the
 low-level CodedInputStream/CodedOutputStream classes and do a lot of work
 manually.  Furthermore, many users want to handle streams
 of heterogeneous message types.  We tell them to wrap their messages in an
 outer type using the union pattern:

   http://code.google.com/apis/protocolbuffers/docs/techniques.html#union

 But this is kind of ugly and has unnecessary overhead.

 These problems never really came up in our internal usage, because inside
 Google we have an RPC system and other utility code which builds on top of
 protocol buffers and provides appropriate abstraction. While we'd like to
 open source this code, a lot of it is large, somewhat messy, and highly
 interdependent with unrelated parts of our environment, and no one has had
 the time to rewrite it all cleanly (as we did with protocol buffers itself).

 *Proposed solution:  Generated Visitors*

 I've been wanting to fix this for some time now, but didn't really have a
 good idea how.  CodedInputStream is annoyingly low-level, but I couldn't
 think of much better an interface for reading a stream of messages off the
 wire.

 A couple weeks ago, though, I realized that I had been failing to consider
 how new kinds of code generation could help this problem.  I was trying to
 think of solutions that would go into the protobuf base library, not
 solutions that were generated by the protocol compiler.

 So then it became 

Re: [protobuf] Re: inheritance.. well sort of... and FieldDescriptors

2011-02-01 Thread Kenton Varda
You can't use the same field descriptors for the four classes.  But, note
that one thing you *can* do is define a base type that just contains the
shared fields, and then parse any of the other types as this type in order
to access the common fields.  Since the field numbers match, they are
compatible.

On Sat, Jan 29, 2011 at 1:32 PM, koert koertkuip...@gmail.com wrote:

 I thought about these options and settled on multiple classes that all
 share a few fields. So indeed duplicate every shared field.

 On Jan 28, 2:28 pm, TJ Rothwell tj.rothw...@gmail.com wrote:
  Is there a best practice for this use case?
 
  Here are some options.
 
  // Duplicate every field (sounds like you're doing this)
  message FooRequest {
required string prompt = 1;
required int64 timeout = 2;
required Foo foo = 3;
 
  }
 
  message BarRequest {
required string prompt = 1;
required int64 timeout = 2;
required Bar bar = 3;
 
  }
 
  // Client code--something like this anyway
  Foo foo = Foo.newBuilder().build();
  FooRequest fooRequest =
 FooRequest.newBuilder().setPrompt().setTimeout(0).
  setFoo(foo).build();
  Bar bar = Bar.newBuilder().build();
  BarRequest barRequest =
 BarRequest.newBuilder().setPrompt().setTimeout(0).
  setBar(bar).build();
 
  // Share a base message and extend it
  message BaseRequest {
required string prompt = 1;
required int64 timeout = 2;
 extensions 1000 to max; // reserved for extensions
   }
 
  message BazRequest {
 extend BaseRequest {
  optional BazRequest baz = 1000;
}
required Baz baz = 1;
 
  }
 
  // Client Code
  Baz baz = Baz.newBuilder().build();
  BazRequest bazRequest = BazRequest.newBuilder().setBaz(baz).build();
  BaseRequest request =
 
 BaseRequest.newBuilder().setPrompt().setTimeout(0).setExtension(BazRequest.baz,
  bazRequest).build();
  // so given a BaseRequest, how do we know if it's is a Baz or something
  else? -- lots of hasExtension() calls? that doesn't sound good and you
 have
  to keep track of an ExtensionRegistry if you're using a generic
  Reader/Writer.
  // baz is a base? not so much
  // baz has a base? sounds right
  // this approach doesn't seem right
 
  // Composition
  message CommonRequest {
required string prompt = 1;
 required int64 timeout = 2;
   }
 
  message QuxRequest {
required CommonRequest common = 1;
required Qux qux = 2;
 
  }
 
  // Client Code
   Qux qux = Qux.newBuilder().build();
   CommonRequest.Builder crb =
   CommonRequest.newBuilder().setPrompt().setTimeout(0);
   QuxRequest quxRequest
   = QuxRequest.newBuilder().setCommon(crb).setQux(qux).build();
  // the challenge here would be determining how the common message is
  defined. If you have dozens of requests that match on some fields here,
 some
  there... it may get complicated.
 
  YMMV,
  -- TJ
 
  On Fri, Jan 28, 2011 at 12:59 PM, koert koertkuip...@gmail.com wrote:
   I have several proto message definitions that all share the first 4
   fields. It's as if they are all subclasses of one protobuf message
   format.
   In Java, can I create the FieldDescriptors for these 4 fields once and
   use them for the getters and setters of all these message classes? It
   would save me a lot in terms of logic and maybe also lead to somewhat
   better performance.
   Best, koert
 
   --
   You received this message because you are subscribed to the Google
 Groups
   Protocol Buffers group.
   To post to this group, send email to protobuf@googlegroups.com.
   To unsubscribe from this group, send email to
   protobuf+unsubscr...@googlegroups.com.
   For more options, visit this group at
  http://groups.google.com/group/protobuf?hl=en.

 --
 You received this message because you are subscribed to the Google Groups
 Protocol Buffers group.
 To post to this group, send email to protobuf@googlegroups.com.
 To unsubscribe from this group, send email to
 protobuf+unsubscr...@googlegroups.com.
 For more options, visit this group at
 http://groups.google.com/group/protobuf?hl=en.



-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to protobuf@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.