Re: Immutability of generated data structures

2009-04-23 Thread Jon Skeet sk...@pobox.com

On Apr 23, 1:03 pm, Kannan Goundan kan...@cakoose.com wrote:
 The code generated by protoc seems to go to great lengths to make sure
 that once a message object is created, it can't be modified.  I'm
 guessing that this is to avoid cycles in the object graph, so that the
 serialization routine doesn't have to detect cycles.  Is this
 correct?  Would a cycle in the object graph put the current serializer
 into an infinite loop?

I think it's more because our experience in Google is that immutable
objects are easier to reason about - basically a lesson from
functional programming.

But yes, I suspect the side-benefit of preventing cycles is a
generally good thing too :)
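
To illustrate the side-benefit, here is a minimal sketch in plain Python rather than protoc output. The `Node` class and the helper names are hypothetical; a frozen dataclass merely stands in for a generated immutable message. Because a child must exist before its parent is built and cannot be changed afterwards, a naive recursive walk has no cycle to get stuck in.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Node:
    """Hypothetical immutable message: fields are fixed at construction."""
    name: str
    children: tuple = ()


# Children have to be built first, so references only point "downwards".
leaf = Node("leaf")
root = Node("root", children=(leaf,))


def try_make_cycle() -> bool:
    """Try to point leaf back at root; return True only if it worked."""
    try:
        leaf.children = (root,)  # rejected: the instance is frozen
    except FrozenInstanceError:
        return False
    return True


def depth(node: Node) -> int:
    """Naive recursive walk with no cycle detection at all."""
    return 1 + max((depth(child) for child in node.children), default=0)
```

Python can still bypass the freeze deliberately (for example via `object.__setattr__`), so this only shows the shape of the argument, not a guarantee.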

Jon
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to protobuf@googlegroups.com
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en
-~--~~~~--~~--~--~---



Re: Performance comparison of Thrift, JSON and Protocol Buffers

2009-04-19 Thread Jon Skeet sk...@pobox.com

On Apr 18, 4:23 am, TimYang iso1...@gmail.com wrote:
 Alkis is quite right, sorry for the typo.

Which JIT were you using, by the way? I found that using the -server
option made the Java ProtoBuf code run more than twice as quickly. Of
course, it could be that the Thrift code would get the same boost...

Jon





Re: Protocol Buffers Vs. XML Fast Infoset

2009-04-08 Thread Jon Skeet sk...@pobox.com

On Apr 3, 10:40 am, ShirishKul shirish...@gmail.com wrote:
 I worked to see the difference between the *XML fast infoset* and the
 *Protocol Buffers* (although I'm not aware about what are internal
 things happening therein).

 I found that for a typical data to be transferred across the wire for
 size of 500KB that a XML file would represent has corresponding file
 size as 300KB for PB binary and around 130KB for XML Fast Infoset
 binary file.

Just going back to these numbers, a less-than-50% benefit for going
from XML to PB is surprisingly bad.
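
As a quick check of those quoted sizes (500KB XML, 300KB PB, 130KB Fast Infoset), the relative savings can be worked out as below. The helper name `saving_percent` is just for illustration.

```python
def saving_percent(original_kb: float, encoded_kb: float) -> int:
    """Size reduction relative to the original, rounded to a whole percent."""
    return round(100 * (1 - encoded_kb / original_kb))


pb_saving = saving_percent(500, 300)            # XML -> Protocol Buffers
fast_infoset_saving = saving_percent(500, 130)  # XML -> XML Fast Infoset
```

On those figures PB saves about 40% and Fast Infoset about 74%.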

Do you have a sample file with non-confidential data that we could
look at?

Jon





Re: Streaming different types of messages

2009-03-27 Thread Jon Skeet sk...@pobox.com

On Mar 27, 1:32 pm, achin...@gmail.com wrote:
 If I understand correctly there is no good way to use proto buffers to
 stream different types of messages, right? For example if my stream
 has a mix of several messages of type m1 and m2, I will have to devise
 a scheme outside of proto buffers to separate it into 2 streams and
 then pass it through parsers for each.

 In other words is there a way to do event based parsing using proto
 buffers, or even a way to say don't parse a repetitive field unless
 needed.

No, you don't have to do it into separate streams. Instead, stream a
sequence of messages each of which has either an m1, or an m2, or an
m3 etc. This basically ends up being (tag) (message) (tag) (message),
where the tag is effectively identifying the type of message.

All you need to do is create the wrapper message, and the rest should
work fine.
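
A rough sketch of what that (tag) (message) sequence looks like on the wire, in plain Python with no protobuf library. It assumes a hypothetical wrapper message whose field 1 holds an m1 and field 2 holds an m2; the payloads here are placeholder bytes standing in for serialized messages. Each record is a varint key `(field_number << 3) | 2` (wire type 2, length-delimited), a varint length, then the payload.

```python
import io

LENGTH_DELIMITED = 2  # protobuf wire type used for embedded messages


def encode_varint(value: int) -> bytes:
    """Encode a non-negative int as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def read_varint(stream):
    """Read one varint; return None on a clean end of stream."""
    result, shift = 0, 0
    while True:
        raw = stream.read(1)
        if not raw:
            if shift == 0:
                return None
            raise EOFError("truncated varint")
        result |= (raw[0] & 0x7F) << shift
        if not raw[0] & 0x80:
            return result
        shift += 7


def write_record(stream, field_number: int, payload: bytes) -> None:
    """Write (tag) (length) (message bytes) for one wrapped message."""
    stream.write(encode_varint((field_number << 3) | LENGTH_DELIMITED))
    stream.write(encode_varint(len(payload)))
    stream.write(payload)


def read_records(stream):
    """Yield (field_number, payload) pairs until the stream ends."""
    while True:
        key = read_varint(stream)
        if key is None:
            return
        if key & 0x7 != LENGTH_DELIMITED:
            raise ValueError("sketch only handles length-delimited fields")
        length = read_varint(stream)
        payload = stream.read(length or 0)
        if length is None or len(payload) != length:
            raise EOFError("truncated record")
        yield key >> 3, payload


buf = io.BytesIO()
write_record(buf, 1, b"m1-bytes")  # placeholder for a serialized m1
write_record(buf, 2, b"m2-bytes")  # placeholder for a serialized m2
buf.seek(0)
records = list(read_records(buf))
```

The reader can dispatch on the field number and hand each payload to the matching parser, so nothing outside the wrapper is needed to separate the two types.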

Jon



Re: optimize_for option default

2009-03-06 Thread Jon Skeet sk...@pobox.com

On Mar 5, 11:39 pm, Kenton Varda ken...@google.com wrote:
 As you know if you've read the docs carefully, when using C++ or Java
 protocol buffers, for best performance you need to add a line to your .proto
 files:

   option optimize_for = SPEED;

[snip commentary]

I think there are three issues here:

1) Yes, it's really easy to miss that. Shortly after PBs were released
I saw a blog post showing how slow PBs are - and then I pointed out
the optimize_for option...
2) It's a pain to have to use a whole different .proto file just to
specify this option. While I believe many options *should* be in
the .proto file (particularly where they might affect individual
fields etc) I think this would make sense to have as a compiler/
generator flag (it could be in either place, for situations where the
two are split). For instance, you may have a memory-limited client
where speed doesn't matter, and a memory-rich server processing
gazillions of these things - they should be able to use the
same .proto file.
3) Backward compatibility.

I suspect we could really do with the compiler working in four
different modes:

1) Default to SPEED when otherwise undefined; obey proto file
otherwise
2) Default to CODE_SIZE when otherwise undefined; obey proto file
otherwise (current mode)
3) Generate code using SPEED regardless of proto file
4) Generate code using CODE_SIZE regardless of proto file
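
A sketch of how those four modes could resolve, in Python. The `Mode` names are made up for illustration and are not real protoc flags; `file_option` stands for whatever `optimize_for` the .proto file sets, or `None` when it sets nothing.

```python
from enum import Enum
from typing import Optional


class Optimize(Enum):
    SPEED = "SPEED"
    CODE_SIZE = "CODE_SIZE"


class Mode(Enum):
    """Hypothetical compiler modes, numbered as in the list above."""
    DEFAULT_SPEED = 1      # obey the file; SPEED when undefined
    DEFAULT_CODE_SIZE = 2  # obey the file; CODE_SIZE when undefined
    FORCE_SPEED = 3        # ignore the file
    FORCE_CODE_SIZE = 4    # ignore the file


def effective_optimize(mode: Mode, file_option: Optional[Optimize]) -> Optimize:
    """Pick the optimisation actually used for code generation."""
    if mode is Mode.FORCE_SPEED:
        return Optimize.SPEED
    if mode is Mode.FORCE_CODE_SIZE:
        return Optimize.CODE_SIZE
    if file_option is not None:
        return file_option  # modes 1 and 2 obey the .proto file
    if mode is Mode.DEFAULT_SPEED:
        return Optimize.SPEED
    return Optimize.CODE_SIZE
```

With this shape, a memory-limited client and a memory-rich server could share one .proto file and differ only in the mode passed to the generator.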

I think we should at least be able to specify "I want the old
behaviour" on the command line, just because that makes the backward
compatibility story easy: use this argument and it's all as it was -
but I'd be happy for the actual default to be changed.

(Evil thought: make the default a build-time setting for the compiler
itself, so if you want to build protoc with the old behaviour you can.
Almost certainly not a good idea, but it's amusing to think of the
number of places this *could* be set...)

Jon




Re: Initial benchmarking committed to svn (r100)

2009-03-06 Thread Jon Skeet sk...@pobox.com

On Mar 6, 1:24 pm, Justin Azoff justin.az...@gmail.com wrote:
 On Mar 6, 1:13 am, Justin Azoff justin.az...@gmail.com wrote:
  I did a quick port to Python (pasted at the end, hopefully it won't be
  garbled)

 well, that didn't work.
 I threw it up at http://bouncybouncy.net/ramblings/files/ProtoBench.py
 if anyone is interested.

Sounds like a good thing to include in svn, if you're happy with the
licensing of it etc. The more languages we can benchmark, the
better :) (I'd be really interested in seeing the C++ results, but
I've no idea how the reflection side of things would work, if at all.
I'm not a C++ person...)

Kenton, who's our most Python-aware committer?

Jon




Re: optimize_for option default

2009-03-06 Thread Jon Skeet sk...@pobox.com

On Mar 6, 2:23 pm, aepensky apen...@gmail.com wrote:
 +1 for making it a compiler command-line option.

 Pretty much all other IDLs get this wrong to some degree also.
 Having annotations or options in the IDL file is nice, but make sure
 they are only helping to define the message and the service, not the
 implementation.
 When I get a service definition from a service author I don't want to
 be told how to optimize, or what namespace my generated classes should
 go into.
 Those things can be different for every client.  As it is now, a
 client developer would have to mark up the .proto file that s/he
 received from the service developer.

Obviously I agree about the optimisation, but why the namespace?
Surely the provider of the proto owns which namespace it should be
in, don't they?

Jon



Re: optimize_for option default

2009-03-06 Thread Jon Skeet sk...@pobox.com

On Mar 6, 4:55 pm, aepensky apen...@gmail.com wrote:
 Sorry, I realize that wasn't a very clear statement...

 What I mean is, if there is an option which does not leave any
 fingerprint in either the serialized message or the
 FileDescriptorSet, so that you can't tell how the option was set by
 looking at either of these, then the option is controlling only code
 generation and is not affecting the service contract.  So it should
 not be in the .proto file.

 I think that applies to the package statement as well as
 optimize_for.  Protocol Buffers does not put globally unique
 signatures into the messages or descriptors based on your package
 declaration.  It only affects the code generation.

It's definitely in the descriptor set - because that's what my C#
generator uses!

I agree that it doesn't affect the wire format of the messages
themselves, but I still think a world in which everyone uses the same
package/namespace for the same proto in a given language is a saner
one (i.e. all Java users will see one package; all C# users will see
one namespace, etc. There can be differences between languages, but at
least two users of the same language have a common namespace).

It's certainly a personal thing, and again maybe you should be able to
*override* it from the command line, but I think it makes sense to at
least put default package/namespace options into the proto file.

Jon



Re: Initial benchmarking committed to svn (r100)

2009-03-05 Thread Jon Skeet sk...@pobox.com

On Mar 5, 11:47 am, ijuma isma...@gmail.com wrote:
  I haven't optimised with a profiler very recently - I suspect there
  are some improvements which could be made by skipping the null
  handling when merging/parsing (as it should be unnecessary). I didn't
  use any particular options when running the Java version
  (1.6.0_11-b03) so I'm sure there are tweaks to be made there too.

 Before any other settings are tried, it would be worth benchmarking it
 with -server as it can make a large difference when compared to
 -client. The default varies based on OS and machine specification so it
 makes sense to use an explicit setting to make it clear what JIT was
 used.

Right. Somewhat embarrassingly, this laptop doesn't actually *have*
the server JIT installed. I'm mostly working on the C# code at the
moment, but I'll come back and rerun the test with the server JVM when
I've got a bit more time.

Jon



Re: Performance comparison of Thrift, JSON and Protocol Buffers

2009-03-03 Thread Jon Skeet sk...@pobox.com

On Mar 3, 7:37 pm, Dave Bailey d...@daveb.net wrote:
 Thanks for writing this up; I think it's a nice real world example.

 I ran an equivalent test (using your same .proto files) in Perl to
 compare JSON::XS, protobuf-perlxs, and Storable.  I did this on an
 x86_64 quad-core Xeon (2.5 GHz) and found:

 1) Your original dns.proto (with strings), serializing and
 deserializing a DnsResponse with 5000 random DnsRecord elements:

[snip]

Could I ask you to keep hold of the .proto files and generated files?
I'm hoping to commit the Java version of my C# benchmark code on
Thursday... it would be nice to have more data.

Jon





Re: protocol buffers in .net

2009-03-03 Thread Jon Skeet sk...@pobox.com

On Mar 3, 11:06 pm, Kenton Varda ken...@google.com wrote:
 sk...@pobox.com wrote:
  There's one major blocker at the moment though: all my copies of the
  Google test .proto files are decorated with the C# options. *At the
  moment* that means the Java and C++ code would have to build the C#
  options as well. I'm looking into adding something to the import
  syntax so that an import could be marked as options only which would
  mean that the types within it could only be used when applying options
  (rather than defining fields) but the import wouldn't be in the list
  of dependencies. At that point, the C# options could be applied
  harmlessly in terms of the Java/C++. (If anyone has any better
  suggestions for getting round this, I'm all ears.)

 You could also just fork those files.

Well, the files are effectively forked now. But there's a bigger issue
- I don't want the C# options to force anyone to generate C++ and Java
files if I can help it. It'll only get worse if different languages
add their own options. Imagine in 5 years - an open source project
comes out with a .proto file with support for 5 languages. Suddenly
your Java code has dependencies on Perl options, Python options, C#
options, Ruby options etc. Ick.

Anyway, I'll experiment for a while and report when I get back.

  There's also the matter of working out how the C# port would end up
  getting built - I currently use NAnt/MSBuild, and my make knowledge is
  very limited. That's a relatively minor issue though.

 Well, currently we only use make for C++.  Java uses Maven, Python uses
 setuptools.  Every language seems to have their own favorite
 ostensibly-language-independent build system.

:) I've spoken to the Mono folks recently and apparently the C# port
*very nearly* builds now on Mono - I'm going to try to get that
working very soon. I'd really like to be able to give perf figures for
both .NET and Mono.

Jon





Re: Performance comparison of Thrift, JSON and Protocol Buffers

2009-03-02 Thread Jon Skeet sk...@pobox.com

On Mar 2, 10:14 am, Adewale Oshineye adew...@gmail.com wrote:
 This article has some surprising results from its performance
 comparison of Thrift, Protocol Buffers and JSON:
 http://bouncybouncy.net/ramblings/posts/thrift_and_protocol_buffers/

More specifically, it's comparing the performance of the Python
implementations of all of those. That only really says that our Python
implementation is relatively slow. I think the numbers for C++/Java
are somewhat better :)

Jon




Re: protocol buffers in .net

2009-03-02 Thread Jon Skeet sk...@pobox.com

On Mar 2, 9:35 pm, Marc Gravell marc.grav...@gmail.com wrote:
 I'm pretty happy with ongoing general maintenance, but with Jon
 working for Google, and it being a closer port of the existing
 implementations... I'd realistically expect things may be a little
 stacked in favour of that one.

I hope that working at Google wouldn't be a significant factor - but I
think the closeness of the port is. Basically it would mean that if
someone downloaded just the main distribution, they could use a very
similar API in C++, Java and C# (and Python? I don't know how close
the API is there).

In a similar vein, a change in Google's Java code currently prompts me
to change the C# code (I recently caught up in terms of
Message.toBuilder and nullity checks, for example).

There's one major blocker at the moment though: all my copies of the
Google test .proto files are decorated with the C# options. *At the
moment* that means the Java and C++ code would have to build the C#
options as well. I'm looking into adding something to the import
syntax so that an import could be marked as options only which would
mean that the types within it could only be used when applying options
(rather than defining fields) but the import wouldn't be in the list
of dependencies. At that point, the C# options could be applied
harmlessly in terms of the Java/C++. (If anyone has any better
suggestions for getting round this, I'm all ears.)

There's also the matter of working out how the C# port would end up
getting built - I currently use NAnt/MSBuild, and my make knowledge is
very limited. That's a relatively minor issue though.

In short, I don't think that integration would be appropriate right
now, but I'd be happy for it to happen some time in the future. I
think we'd have to work out what the tangible benefits and costs would
be though.

 Even if that comes to pass, I'll still maintain protobuf-net anyway,
 simply because it seems useful to a reasonable number of people ;-p

I'd be shocked and disappointed for you to say anything else :) We
must port my benchmark test to protobuf-net at some point...

Jon

