Thanks for your input. I think your comment helps me clarify my query:
Most applications or services that act as "producers" will generate data with
N fields in it. Consumers may be interested in only m of those fields, where
m could be 5 and N could be 20. For example, an address book service will
generate an address with 25 fields in it. An application that consumes the
service will want only 3, say name, phone number, and zip code.
In the current implementation, there is a way of picking only 5 fields.
Ideally, the time taken to pick only 3 fields should be a lot less than the
time taken to pick all 25.
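
To make the m-out-of-N point concrete, here is a rough sketch in Python of what I have in mind. It is not how protobuf-net or any other implementation does it; it only walks the documented wire format by hand. The function names and the address-book field numbers (1 = name, 2 = phone, 3 = zip, 4 = notes) are made up for illustration. The point is that a string or nested message we do not want can be stepped over using its length alone, without decoding its contents.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint at buf[pos]; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def pick_fields(buf, wanted):
    """Return {field_number: value} for the wanted field numbers only.

    Varint fields come back as int, all other fields as raw bytes.
    If a field number occurs more than once, the last occurrence wins.
    """
    picked = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x7
        if wire_type == 0:                      # varint: must be read to skip it
            value, end = read_varint(buf, pos)
        else:
            if wire_type == 1:                  # fixed 64-bit
                start, end = pos, pos + 8
            elif wire_type == 5:                # fixed 32-bit
                start, end = pos, pos + 4
            elif wire_type == 2:                # length-delimited
                length, start = read_varint(buf, pos)
                end = start + length
            else:
                raise ValueError(f"unsupported wire type {wire_type}")
            # Copy the payload only when the caller asked for this field.
            value = buf[start:end] if number in wanted else None
        if number in wanted:
            picked[number] = value
        pos = end
    return picked


# Hypothetical address record: name, phone, zip (as a varint), notes.
address = (b"\x0a\x03" + b"Ann" + b"\x12\x08" + b"555-0100"
           + b"\x18\xdb\xde\x05" + b"\x22\x05" + b"hello")
name_and_zip = pick_fields(address, {1, 3})     # {1: b"Ann", 3: 94043}
```

Even in this form every field key still has to be read, so the saving comes from the skipped payloads rather than from skipping fields outright.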
An even better implementation would screen records based on field values. I
do not agree that this is "making it a database". XML has allowed query
processing for at least 10 years. XML even allows joining 2 XML records
based on a common key. In a database, whether a traditional RDBMS or a
NoSQL kind, one has to pay the price for ACID properties or for the "CAP"
trade-offs: consistency, availability, and partition tolerance. These
problems do not exist if one is screening 10,000 protocol buffers looking
for a particular field value.
I would imagine that there are many applications that read thousands of
Protocol Buffers records and pick only a small fraction of them.
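
As a sketch of the kind of screening I mean, the Python below reads records one at a time from a stream and keeps only those where a given field has a given value. Two things here are my assumptions, not part of Protocol Buffers itself: each record is preceded by a varint length (the format does not delimit messages on its own, so some framing is needed), and the first occurrence of the field is the one compared. All function names and the line-item field numbers (1 = product code, 2 = quantity) are hypothetical.

```python
import io


def read_varint(buf, pos):
    """Decode a base-128 varint at buf[pos]; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_length_prefix(stream):
    """Read one varint from a binary stream; return None at a clean end."""
    result = shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            if shift == 0:
                return None
            raise EOFError("stream ended inside a length prefix")
        result |= (chunk[0] & 0x7F) << shift
        if not chunk[0] & 0x80:
            return result
        shift += 7


def field_value(record, field_number):
    """Return the first value of field_number (int or bytes), or None."""
    pos = 0
    while pos < len(record):
        key, pos = read_varint(record, pos)
        number, wire_type = key >> 3, key & 0x7
        if wire_type == 0:                      # varint
            value, end = read_varint(record, pos)
        elif wire_type == 1:                    # fixed 64-bit
            start, end = pos, pos + 8
        elif wire_type == 5:                    # fixed 32-bit
            start, end = pos, pos + 4
        elif wire_type == 2:                    # length-delimited
            length, start = read_varint(record, pos)
            end = start + length
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        if number == field_number:              # stop scanning at first match
            return value if wire_type == 0 else record[start:end]
        pos = end
    return None


def screen(stream, field_number, expected):
    """Yield raw records whose field equals expected, one record at a time."""
    while True:
        size = read_length_prefix(stream)
        if size is None:
            return
        record = stream.read(size)
        if len(record) != size:
            raise EOFError("stream ended inside a record")
        if field_value(record, field_number) == expected:
            yield record


# Three hypothetical line items; only the middle one has product code P-42.
item_a = b"\x0a\x04" + b"P-17" + b"\x10\x02"
item_b = b"\x0a\x04" + b"P-42" + b"\x10\x05"
data = b"\x08" + item_a + b"\x08" + item_b + b"\x08" + item_a
matches = list(screen(io.BytesIO(data), 1, b"P-42"))   # [item_b]
```

Only one record is held in memory at a time, and a record that does not match is never turned into an object. There is no index, no transaction, and no consistency guarantee involved, which is why I do not see this as a database.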
I appreciate the simplicity of Protocol Buffers, but adding features like
these would not compromise that simplicity. They would be a layer that adds
value without overhead: applications that want to screen based on field
values can do so.
On Fri, May 14, 2010 at 11:52 PM, Marc Gravell <marc.grav...@gmail.com> wrote:
> Firstly, I must note that those benchmarks are specific to protobuf-net (a
> specific implementation), not "protocol buffers" (which covers a range of
> implementations). Re "is it not more realistic"; well, that depends entirely
> on what your use-case *is*. It /sounds/ like you are really talking about
> querying ad-hoc data; if so a file-based database may be more appropriate.
> But it depends entirely on your scenario.
> It /would/ be possible (with protobuf-net at least; I can't comment beyond
> that) to construct a type that represents the data that you *are* interested
> in - the other fields would be quietly dropped without having to fully
> process them, avoiding some CPU. Likewise, it is possible to read items in a
> non-buffered way (i.e. you only have 1 object directly available in memory;
> any others are discarded immediately, available for GC). However; again - it
> sounds like you *really* want a database. Which "protocol buffers" isn't.
> Marc Gravell
> On 14 May 2010 11:31, Kevin Apte- SOA and Cloud Computing Architect <
> technicalarchitect2...@gmail.com> wrote:
>> I saw that ProtoBuf has been benchmarked using the Northwind data
>> set- a data set of size 130K, with 3000 objects including orders and
>> order line items.
>> This is an excellent review:
>> Is it not more realistic to have a benchmark with a much larger file,
>> in which we are interested in only a few records, and a few fields
>> within those records?
>> For example: 10,000 order line items, we want only a line item with a
>> particular product code.
>> Or we want to pick orders for a particular customer type, or with a
>> particular description.
>> Are there use cases where data is stored in Protocol Buffer Format in
>> a file, and read into memory?
>> Another issue is that the size seems rather small: it is only 256
>> bytes per object. I would imagine there are many use cases where the
>> objects are much bigger.
>> Many use cases are going to involve much larger objects and will
>> select m out of N fields, where m will be 5 and N will be 20. This is
>> because an application will very rarely want all of the information in
>> a protocol buffer generated by another program.
>> Any comments?
>> You received this message because you are subscribed to the Google Groups
>> "Protocol Buffers" group.
>> To post to this group, send email to proto...@googlegroups.com.
>> To unsubscribe from this group, send email to
>> For more options, visit this group at