Re: [protobuf] Integers are being padded in protobufs on ARM

2016-05-24 Thread Christopher Smith
I'm guessing your cross-compiler setup is somehow mucking up the endian logic.
Do you have a simple reproducible setup?
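For what it's worth, the wire format itself is a sequence of varints and is
byte-order independent, so the padding almost certainly comes in on the
application side rather than in the encoding. A minimal sketch (my assumption,
based only on the raw bytes quoted below: 0x10 0x01 looks like field 2 = 1 and
0x48 0x32 like field 9 = 50) of decoding the buffer by hand with
CodedInputStream to rule out the wire format:

    // Hedged sketch: field numbers/types are inferred from the raw bytes in
    // the report, not from the actual .proto, which I don't have.
    #include <google/protobuf/io/coded_stream.h>
    #include <cstdint>
    #include <cstdio>

    int main() {
      const unsigned char raw[] = {0x10, 0x01, 0x48, 0x32};
      google::protobuf::io::CodedInputStream in(raw, sizeof(raw));
      uint32_t tag;
      while ((tag = in.ReadTag()) != 0) {
        uint32_t value = 0;
        in.ReadVarint32(&value);              // varints are byte-order neutral
        std::printf("field %u = %u\n", tag >> 3, value);  // expect: field 9 = 50
      }
      return 0;
    }

If that prints 50 on both boxes, I'd look at the cross-compiled generated code
or at how the value is copied out of the message afterwards.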
On Apr 28, 2016 2:58 PM, "Brian Savage"  wrote:

> I am using protobufs on an ARM processor running Ubuntu Linux.  Everything
> seems to be working, except some integers are being padded with an extra
> byte in the LSW (least significant word).
>
> Take for example the raw message 0x10 0x01 0x48 0x32
>
> I'm passing an unsigned integer with a value of 50 decimal (0x32).
> Printing the raw data on both sides (the other side is RedHat on Intel)
> shows that the message matches up.  However, when I call the method to get
> the data from the proto message, it comes back as 0x32 0x00 (12800
> decimal).
>
> Anyone have an idea as to what is going on?  This happens with multiple
> messages.
>
> Background -  I obtained the protobuf source from github and cross
> compiled for the ARM.  As I said, there doesn't seem to be any problems
> other than the integers.  If the integer is large enough that it uses all
> 32 bits, it works OK.
>
> Thanks
>


Re: [protobuf] NULL character in serialized string

2015-07-21 Thread Christopher Smith
How are you determining the end of the encoded protocol buffer? What
language/data type are you using? It sounds to me like you are relying on null
termination (perhaps with a C char*?), which isn't going to work too well
with a binary structure. Decoding a null byte, particularly for an
integer field, shouldn't ever cause a problem with protocol buffers.

Having null bytes in the encoded bytes should be expected. In C terms,
assume the encoded data is an array of unsigned chars (so a null value is
of no particular significance).
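As a rough illustration (a sketch, assuming C++ and a made-up message type
MyMessage with an int32 field named count), the safe pattern is to carry the
encoded bytes together with an explicit length and never treat them as a
NUL-terminated C string:

    #include <string>

    bool RoundTrip() {
      MyMessage in;
      in.set_count(0);                 // 0 is encoded as a 0x00 byte on the wire
      std::string encoded;
      in.SerializeToString(&encoded);  // the buffer may legitimately contain '\0'

      MyMessage out;
      // Always pass the explicit size; strlen()/char* handling stops at '\0'.
      return out.ParseFromArray(encoded.data(), encoded.size());
    }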
On Jul 21, 2015 2:22 PM, Devesh Gupta drax...@gmail.com wrote:

 Hello,

 As per various forums, it is mentioned that a serialized string can have
 null characters in between.
 I just wanted to know how protobuf is able to find the end of the string
 being passed for deserialization.

 Actually, when I am deserializing the string it is failing. This
 happens when the int32 value passed for serialization is 0. If we
 pass any value above 0, it gets deserialized successfully.

 I have checked the serialized string's characters, and in the case of a 0
 value there is a null character representing it.



Re: [protobuf] Re: Why to reinvent the wheel ?

2013-03-28 Thread Christopher Smith
Not to flog a dead horse TOO much, but while researching some Thrift
related frustrations today, I stumbled across a somewhat old blog entry by
Lev Walkin, developer of the asn1c compiler.
http://lionet.info/asn1c/blog/2010/07/18/thrift-semantics/

I think he does a pretty good job pointing out some of the problems with
ASN.1, while pointing out mistakes some of the newer serialization
formats made where they failed to learn from ASN.1's history. Of course, I
imagine lots of people disagree about what is a mistake and what is a
design decision, but in those differences of opinion lies precisely the
reason why one would reinvent the wheel: one person's yet another wheel
is another person's train tracks.


On Mon, Mar 25, 2013 at 4:11 PM, Christopher Smith cbsm...@gmail.com wrote:

 On Mon, Mar 25, 2013 at 4:25 AM, Vic Devin vfn...@gmail.com wrote:

 Oliver,
 yes, Protobuf can do many other things than simply remote
 communication, but again my point is that this is all low-level stuff
 which is again in reinventing-the-wheel territory; if the software industry
 wishes to make a giant leap forward we should start building applications
 without even thinking about all these low-level details, RPC, serialization
 formats, etc.


 I very much doubt that any serialization framework *ever* created a giant
 leap forward. Sometimes it makes sense to revisit the plumbing and build a
 better wheel because the original design wasn't well suited to how it is
 currently used (ironically this has, in fact, happened several times with
 the wheel... nobody uses the original design). In the case of protocol
 buffers, there were lots of problems with existing frameworks. If you don't
 perceive any problems, you probably shouldn't bother using protocol
 buffers. Indeed, your perception of what protocol buffers are suggests a
 use case where they'd be a bit of a square peg to your round hole anyway.

 In fact thanks to high level programming languages we are able to forget
 the complicated modern CPU architectures which we would have to think about
 if we were stuck with programming in assembler!


 There is a certain philosophy that goes along that way. More than a few
 hundred times it has been demonstrated to be problematic, particularly when
 working on solutions that are unique in some capacity (sometimes just scale
 or efficiency requirements). Certainly it's not hard to talk to folks who
 have switched to protobufs from other serialization frameworks and realized
 significant wins. It's been a big enough deal that other projects have spun
 up trying to improve on the advantages of protobufs.


 Ideally I want to concentrate on the business logic, relying on the
 fact that I don't need to care about the rest.


 Ideally, I want hardware constraints and failures not to happen. We don't
 often get to live in an ideal world. I agree though, it'd be nice if it
 weren't ever an issue. I don't see how fixing the plumbing so it works
 better somehow gets in the way of other people living in a world where the
 plumbing is never an issue...


 The standard that defines remote communication is (or used to be!?) CORBA.


 Umm... no.

 Which version of CORBA would be that standard then? Would that be CORBA
 1.0 (hard to believe since it didn't even have a standardized serialization
 format for over the wire communications), Corba 2.0, which generally
 doesn't support encrypted transport and won't get through most firewalls?
 Or was that Corba 3, with its Component Model? Is that using IIOP
 (which ironically, doesn't work so well over large portions of the
 Internet), or HTIOP, or SSLIOP, or ZIOP? Presumably it is an implementation
 with a POA because without that most stuff doesn't work together at all?

 CORBA never really took over. It had a brief window of opportunity when
 Netscape integrated IIOP in to the browser, but that has long since passed.
 To this day NFS still talks ONC-RPC, and SMB is basically Microsoft's evil
 DCE RPC variant. I think there are still some ugly DCOM things floating
 around as well. If you talk to Facebook/Twitter/most other web services,
 you're mucking around with JSON or XML (over SOAP or REST), and of course
 there's all that WSDL out there. Heck, even Java programs that used RMI
 (mostly all dead and buried at this point), where CORBA compatibility was
 in theory just a configuration flag you set, generally eschewed using
 RMI-IIOP and instead went with JRMP whenever possible.

 More importantly though, lots of people don't even use protocol buffers
 for remote communications, but rather for storing data (arguably this was
 the primary use case at Google originally as well), and CORBA's solutions
 in that space were *anything* but broadly used and for most people would be
 solving the wrong problem (seriously, who would want to store web log data
 in something like ObjectStore?!).


 Using a standard is like talking the same language, so there is a bigger
 chance that we might better

Re: [protobuf] Optimizing protoc for Java

2012-11-28 Thread Christopher Smith
Interested.

--Chris

On Nov 19, 2012, at 4:07 AM, Ryan Fogarty ryan.foga...@gmail.com wrote:

 I have a repeated primitive field array optimization for the protoc-generated 
 Java source, but before I discuss I would like to gauge interest (and get 
 access to the Protocol Buffer Group).
 
 Thanks,
 Ryan
 



Re: [protobuf] How to implement packed=true option for a list of enum?

2012-08-30 Thread Christopher Smith
Same way as any other varint wiretype field.
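For example (a sketch in proto2 syntax with made-up names): the packed field is
declared like any other packed numeric field, and on the wire it becomes a
single length-delimited value whose payload is just the enum numbers written
back to back as varints.

    enum Color {
      RED = 1;
      GREEN = 2;
    }
    message Palette {
      // One key byte + one length varint, then each value as a plain varint.
      repeated Color colors = 1 [packed = true];
    }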

--Chris

On Aug 30, 2012, at 11:23 AM, Sean Nguyen sontran...@gmail.com wrote:

 Hi all,
 
 Does anybody know how to implement packed=true option for a list of enum?
 
 Thanks,
 
 Sean Nguyen



Re: [protobuf] implications of compiling protocol buffers without stl debugging info

2012-08-08 Thread Christopher Smith
On Wed, Aug 8, 2012 at 3:08 PM, Eric J. Holtman e...@holtmans.com wrote:
 On 8/8/2012 4:51 PM, Chris Morris wrote:
 I want to keep STL debugging *for the rest of my project*. This leads me to
 consider compiling the protocol buffers project without STL debugging info.

 What are the implications of this?

 Unless you are *very* careful, this is going to lead to
 problems:

Let me second this. Microsoft itself is very clear that if the
destructor doesn't do its cleanup on an STL container that was built
with debug features, bad things will happen.

I think the best you could do is build all your protocol buffer stuff
(including the base library) with _HAS_ITERATOR_DEBUGGING=0. Your
container types likely won't match, so you'll have to use std::copy
and such to move data between other parts of your code and protocol
buffers, but otherwise it might work.
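Something along these lines (a sketch; MyMessage and its repeated int32 field
ids are made-up names), copying element by element at the boundary instead of
handing container objects across it:

    #include <algorithm>
    #include <iterator>
    #include <vector>

    // Lives in the translation units built with _HAS_ITERATOR_DEBUGGING=0.
    void CopyIds(const MyMessage& msg, std::vector<int>* out) {
      out->clear();
      out->reserve(msg.ids_size());
      std::copy(msg.ids().begin(), msg.ids().end(), std::back_inserter(*out));
    }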

This would likely introduce as many bugs as it might fix, and the
debug stuff for STL containers has some other nasty side effects as
well (scoping rules for for loops change). I'd recommend instead just
doing without... maybe use a 3rd party checked iterator library in the
places you really want it.

-- 
Chris




Re: [protobuf] Re: Issue 401 in protobuf: Java generated message field getters should return null or throw an exception for uninitialized fields

2012-07-26 Thread Christopher Smith
Your doBar(Request) should be codegen'd by a plugin. That's the bug. You should 
only be coding against
doBar(Bar).
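In other words, something closer to this (a sketch based on the .proto below;
whether per-method request types are available depends on the RPC plugin you
use):

    service rpcService {
      rpc doFoo(Request.Foo) returns (Response);
      rpc doBar(Request.Bar) returns (Response);
      rpc doBaz(Request.Baz) returns (Response);
    }

With that, calling request.getBaz() inside doBar simply can't happen, because
the generated signature only hands you a Bar.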

--Chris

On Jul 25, 2012, at 3:33 AM, proto...@googlecode.com wrote:

 
 Comment #2 on issue 401 by martin.c...@gmail.com: Java generated message 
 field getters should return null or throw an exception for uninitialized 
 fields
 http://code.google.com/p/protobuf/issues/detail?id=401
 
 Well, I know about this usage, but even it is intended it is prone to 
 programming bugs that are very hard to discover during development. Please do 
 not close this issue, but consider an enhancement to add an option to 
 generate the code so that it will would fail or return null for uninitialized 
 field. E.g. suppose we have defined a Request message which has several 
 concrete Requests like Request.Foo, Request.Bar, Request.Baz, ...
 
 message Request {
message Foo {
...
}
 
message Bar {
...
}
 
message Baz {
...
}
 
optional Request.Foo foo = 1;
optional Request.Bar bar = 2;
optional Request.Baz baz = 3;
...
 }
 
 We build Request messages so that only one of the foo, bar, baz is 
 initialized at a time (union technique) and send it to the RPC service as an 
 argument for the specific RPC call. We have this service:
 
 service rpcService {
rpc doFoo(Request) returns (Response);
rpc doBar(Request) returns (Response);
rpc doBaz(Request) returns (Response);
 }
 
 It happened too many times that in the doBar(Request request) we called 
 request.getBaz()... instead of request.getBar()... without ever noticing it 
 in the development phase.
 
 



[protobuf] Some difficulties with protobuf on windows

2012-07-20 Thread Christopher Smith
I'm trying, for the first time, to get protocol buffers to work sanely
on Windows. It's proving to be harder than I expected.

Partly, it's my own fault: I'm building the project with VS11-beta,
which I'm sure nobody has bothered to make work (and for good reason,
it is right there in the name: beta). I'm also building for 64-bit.
I'm also trying to use a simple plugin I wrote in Python.

However, I thought it was worth sharing some of the fun. Discoveries so far:

1) There are some unfortunate places where std::make_pair<T,U> is
invoked with the template arguments specified, which kind of begs the
question, why not just use the std::pair constructor? This causes some
confusion for VS11-beta (which in one case is kind of sad). There is
one case where the template argument specification is clearly used to
provoke a conversion from char[] to std::string. Another case, I'm not
sure what the value is. I recommend the following changes (should make
compilers happier and will use move semantics automagically in the
first case):

784c784
<     proto_path_.push_back(make_pair<string, string>("", "."));
---
>     proto_path_.push_back(make_pair(string(""), string(".")));
913c913
<     proto_path_.push_back(make_pair<string, string>(virtual_path, disk_path));
---
>     proto_path_.push_back(make_pair(virtual_path, disk_path));

2) The gtest library fails miserably due to its crufty use of std::tuple.
This is a separate project and is likely the compiler's fault anyway, so
I won't go into details.

3) The extract_includes.bat script does not copy descriptor.proto
along with everything else. This seems like an oversight/mistake.

4) Plugins have to be exes. That kind of sucks if the plugin is in a
scripting language. I was looking at adding logic so that if you use
the --plugin flag, and the executable's name ends in ".cmd" or
".bat", it would be invoked "cmd.exe /c ..." style, but that seems to have a
bug in it I haven't addressed yet. Sigh... Windows does such a good job
of killing the wondrous portability of scripting languages. :-(

-- 
Chris




Re: [protobuf] Best way to organize collections?

2012-07-20 Thread Christopher Smith
Three points:

There isn't much value in your Attributes message, so DummyAttributes2 would be 
the way I'd go.

"List" is probably not an ideal name for a field, as it may overlap with other 
names depending on the programming language.

Per the style guide, field names should be like_this, not LikeThis. The 
compiler will adjust it to fit your local language. Consequently, the right 
answer really should be: dummy_attributes_2.
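Putting those together, one way the .proto might end up (a sketch; I've dropped
the wrapper message and renamed the fields, so treat the field numbers as a
fresh design):

    message Dummy
    {
        required string dummy_string = 1;
        repeated Attribute dummy_attributes = 2;
    }
    message Attribute
    {
        required string name = 1;
        required string value = 2;
    }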

--Chris

On Jul 20, 2012, at 6:13 PM, Ditat i...@ditat.net wrote:

 Let's say I have .proto like this:
  
 message Dummy
 {
 required Attributes DummyAttributes = 1;
 required string DummyString = 2;
 repeated Attribute DummyAttributes2 = 3;
 }
 message Attributes
 {
 repeated Attribute List = 1;
 }
 message Attribute
 {
 required string Name = 1;
 required string Value = 2;
 }
  
 What is better inside Dummy message? DummyAttributes or DummyAttributes2 ?



[protobuf] Re: Some difficulties with protobuf on windows

2012-07-20 Thread Christopher Smith
On Fri, Jul 20, 2012 at 5:41 PM, Christopher Smith cbsm...@gmail.com wrote:
 4) Plugins have to be exe's. That kind of sucks if the plugin is in a
 scripting language. I was looking at adding logic so that if you use
 the --plugin flag, and if the executable's name ends in .cmd or
 .bat, it would invoke cmd.exe /c ... style, but that seems to have a
 bug in it I haven't addressed yet. Sigh.. Windows does such a good job
 of killing the wondrous portability of scripting languages. :-(

I found the cause of my pain/suffering. I created an issue for the
change and included a patch to make it work.

http://code.google.com/p/protobuf/issues/detail?id=399

-- 
Chris




Re: [protobuf] Read custom binary file with inner conditions

2012-05-22 Thread Christopher Smith
Protobuf is a library for serializing in its own format, not other predefined 
binary formats. It is also by design fairly simple with limited integrity 
checking features intended to help with backwards and forwards compatibility.

I don't think it can help you with editing Diablo2 files.

--Chris

On May 21, 2012, at 1:15 AM, HarpyWar harpy...@gmail.com wrote:

 Hello, I want to make a program that can edit D2S files (Diablo 2
 character).
 There are several fields in the D2S format with conditions, and I can't
 find this functionality in the protobuf docs. For example, the Golem
 Item field exists only when the bHasGolem value equals 1 (see here
 http://www.ladderhall.com/ericjwin/html/udietooinfo/udietood2s.html)
 
 Is it possible to deserialize/serialize this data using protobuf?
 



[protobuf] Automated dependency generation?

2012-04-15 Thread Christopher Smith
I have been reworking my Makefiles to do automatic dependency
generation properly. I realized that with modern gcc and GNU make it gets
pretty simple to do this, but protocol buffers were my one stumbling
point, as they have dependencies that gcc can't extract. It occurred
to me that it ought to be trivial for the protoc compiler to do this
for you, but rather than try to hack that in and get a patch accepted,
I thought I'd just try out writing a plugin. It turned out to be
probably the simplest plugin anyone has ever written for protoc.

The code is up on gist. Take a look if you are interested:
https://gist.github.com/2396439

-- 
Chris




Re: [protobuf] Re: Odd problem with protobuf implementation in C++

2012-04-05 Thread Christopher Smith
This sounds like something *very* weird is going wrong in your runtime. What 
does the debugger show as the memory locations of the string fields? What about 
the memory locations the allocator provided for the string data?

I think if you make a stand alone unit test for this, you won't see this 
behaviour.

--Chris

On Apr 5, 2012, at 1:03 AM, G. geula.vainap...@mediamind.com wrote:

 Hi,
 
 Thanks for responding. This is exactly the way I access the fields.
 
 Code snippet:
 ...
    b.set_billableseat("Mediamind");
    b.set_category(5);
    b.set_clickthroughurl("http://www.ynet.co.il");
    b.set_creativeid("ab15a");
 ...
 I walked through this with the debugger. When billableseat is set,
 clickthroughUrl and creativeId both become "Mediamind" as well. The same
 happens with clickthroughUrl and creativeId - setting any one of them gives
 all 3 fields the same value. Only setting category didn't affect
 anything except category.
 
 Thanks,
 
 G.
 
 
 
 On Apr 4, 7:45 pm, Jeremiah Jordan jeremiah.jor...@gmail.com wrote:
 How are you setting the data?
 You should be using something like:
 bid.set_HtmlSnippet("Stuff");
 and
 std::string html = bid.HtmlSnippet();
 
 See: https://developers.google.com/protocol-buffers/docs/reference/cpp/goo...
 
 
 
 
 
 
 
 On Wednesday, April 4, 2012 7:25:15 AM UTC-5, G. wrote:
 
 Hi all,
 
 I am using protobuf 2.4.1, and I encountered a weird issue:
 
 I created the following .proto file:
 
 message Auction {
 // Bid request id
 required bytes Id = 1;
 optional bytes Ip = 2;
 required int32 adId = 3;
 required int32 adHeight = 4;
 required int32 adWidth = 5;
 optional string domain = 6;
 optional string country = 7;
 optional string region = 8;
 required string exchangeUserId = 9;
 optional string pageUrl = 10;
 optional int32 publisherId = 11;
 optional int32 timezoneOffset = 12;
 optional string userAgent = 13;
 required string identifier = 14;
 }
 
 message Bid {
 // Bid request Id
 required bytes Id = 1;
 required int32 processingTime = 2;
 required int32 adId = 3;
 required float bid = 4;
 required int32 advertiserId = 5;
 required string creativeId = 6;
 required string billableSeat = 7;
 required int32 category = 8;
 required int32 vendorType = 9;
 required int32 strategyId = 10;
 required string clickthroughUrl = 11;
 required string HtmlSnippet = 12;
 }
 
 It compiles fine with protoc.exe.
 
 However, when I tried assigning the fields, I noticed the following
 phenomenon: the fields id, billableseat and htmlsnippet in Bid
 structure share the same address! When one is assigned, so are the
 other two.
 
 What am I doing wrong? Has anyone encountered such a thing before?
 
 Thanks,
 
 G.
 



Re: [protobuf] Additional data types

2012-04-04 Thread Christopher Smith
AFAIK the answer is no. A lot of the value of protocol buffers derives from 
keeping their functionality simple. There are plenty of all-singing, all-dancing 
serialization frameworks already. ;-)

I think dates in particular are fraught with peril. I'd recommend against 
encoding them as strings. What I've done is encode all date/time data as 
int64s, with the value being milliseconds since the UTC epoch. Even that has 
complications, but it is a good enough approach.
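For illustration, a sketch in C++ (MyEvent and its int64 field created_ms are
made-up names; std::chrono plays the role boost's date_time types do in my
code):

    #include <chrono>
    #include <cstdint>

    // Write: milliseconds since the UTC epoch into a plain int64 field.
    void StampNow(MyEvent* msg) {
      int64_t now_ms = std::chrono::duration_cast<std::chrono::milliseconds>(
          std::chrono::system_clock::now().time_since_epoch()).count();
      msg->set_created_ms(now_ms);
    }

    // Read it back into a time_point on the other side.
    std::chrono::system_clock::time_point WhenCreated(const MyEvent& msg) {
      return std::chrono::system_clock::time_point(
          std::chrono::milliseconds(msg.created_ms()));
    }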

In theory, BigInteger could be encoded using the existing varint encoding, so 
you could write a module fairly easily, and of course once you can do that and 
encode floats BigDecimal is straightforward. Alternatively you could store the 
raw bytes of the BigDecimal in a raw field.

To make BigInteger a part of the standard protocol buffer definition, there's a 
lot more work involved, and a price to be paid. The challenge is having a 
consistent, tested, efficient mechanism for handling this in the plethora of 
languages that protocol buffers support. Without that, you undermine the 
ability of protocol buffers to always be parsed consistently everywhere, which 
is a very important feature. This is a big undertaking, particularly given that 
some languages don't have a standard equivalent type. Given that it is a data 
type needed so much less often, you can see why it likely doesn't make a lot of 
sense to put it in the standard implementation/language.

--Chris

On Apr 4, 2012, at 12:37 PM, jhakim jawaid.ha...@codestreet.com wrote:

 Any plans to provide out-of-the-box for commonly used data types such
 as Date (encoded as String) and BigDecimal/BigInteger types? Seems
 this would be of interest to a lot of users.
 



Re: [protobuf] Additional data types

2012-04-04 Thread Christopher Smith
Nothing prevents you from making a module available for everyone's benefit. If 
it is broadly useful, it will undoubtedly be universally adopted.

--Chris

P.S.: What is a decimal type?

On Apr 4, 2012, at 2:21 PM, Jawaid Hakim jawaid.ha...@codestreet.com wrote:

 Date and decimal types are ubiquitous and in wide use.  Language specific 
 bindings could easily be created - this is exactly what we do in some other 
 open source projects that I contribute to. The way I envision it, protocol 
 buffers would provide 'date' and 'decimal' types - protoc compiler would 
 compile these into language specific data types (e.g. java.util.Date for Java 
 and DateTime for C#).
 
 Jawaid Hakim
 Chief Technology Officer
 CodeStreet LLC
 646 442 2804
 www.codestreet.com
 
 
 
 
 -Original Message-
 From: Alexandru Turc [mailto:alex.t...@gmail.com] 
 Sent: Wednesday, April 04, 2012 5:09 PM
 To: Jawaid Hakim
 Cc: Protocol Buffers
 Subject: Re: [protobuf] Additional data types
 
 
 proto files are mapped to many languages, Date and BigDecimal are java 
 specific. 
 
 On Apr 4, 2012, at 9:37 AM, jhakim wrote:
 
 Any plans to provide out-of-the-box for commonly used data types such
 as Date (encoded as String) and BigDecimal/BigInteger types? Seems
 this would be of interest to a lot of users.
 



Re: [protobuf] Additional data types

2012-04-04 Thread Christopher Smith
On Apr 4, 2012, at 2:54 PM, Jawaid Hakim jawaid.ha...@codestreet.com wrote:

 My group builds applications using multiple languages, including Java and 
 C#, so a simple int64 for date representation does not work. 

That there isn't a simple way to do it is a pretty nasty strike against having 
a standard implementation.

I'm surprised though that int64 wouldn't suffice. Any language that supports 
more than a couple of popular OS platforms is going to need to have some logic 
somewhere for moving back and forth between whatever its preferred date/time 
objects are and something that looks an awful lot like an int64, and usually 
it's easily available.

So far I've done this with C++ (using boost's date time objects), Java, C#, 
Python, JavaScript, and I think Perl once too; it hasn't needed more than a few 
lines of code for any of them (which is really saying something in the case of 
Java). Have I unwittingly made a bug, or do your complications come from a 
different scope?




Re: [protobuf] Message thread safety in Java

2012-02-20 Thread Christopher Smith
Message objects *don't* have mutators and are conceptually a copy of the 
relevant builder object.

--Chris

On Feb 20, 2012, at 10:22 AM, Tom Swirly t...@swirly.com wrote:

 The documentation says it's immutable: 
 http://code.google.com/apis/protocolbuffers/docs/reference/java-generated.html#message
  and this code is heavily used in production, so you can bank on that.
 
 The only way I can see that this would be accomplished would be by returning 
 a copy of the underlying protocol buffer, wrapped in something without 
 mutators.  Copying protocol buffers is quite cheap and this wouldn't require 
 volatile or any locks to work. But I don't have access to code right here, 
 right now to check this...
 
 



Re: [protobuf] Build tagged versions of protobuf?

2011-10-22 Thread Christopher Smith
You could also just specify the static library name and/or use the static link 
flag. Not sure why you'd solve this via a rename.
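For example (untested sketches; exact flags depend on your toolchain), either
of these avoids touching the library files at all:

    # Name the archive explicitly instead of using -l:
    g++ myprog.o /usr/local/lib/libprotobuf-lite.a -o myprog

    # Or toggle static linking just for that one library (GNU ld):
    g++ myprog.o -Wl,-Bstatic -lprotobuf-lite -Wl,-Bdynamic -o myprog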

--Chris

On Oct 22, 2011, at 10:43 AM, JonathonS thejunk...@gmail.com wrote:

 Hi all,
 
 is there a way to have protobuf build the libraries with tags? (like
 boost does)
 
 For example...
 - the static debug version of protobuf lite will be named:  protobuf-lite-sd.a
 - the dynamic debug version of protobuf lite will be named: protobuf-lite-d.so
 - the static version of protobuf lite will be named: protobuf-lite-s.a
 etc.
 
 The issue I am currently having is that since protobuf names their
 libraries with all the same name (just the extension is different),
 gcc always wants to link the dynamic version of protobuf first instead
 of the static version due to gcc's linking priorities of searching for
 .so first, then .a.
 
 My linker command will be:  -lprotobuf-lite , which links to the .so
 first if one exists, and then links to the .a.  One way for me to get
 around this, is to either manually rename the libraries myself, or
 just remove the .so file, but this is a bit inconvenient.
 
 Thanks,
 J
 



Re: [protobuf] C++0x enum class

2011-08-24 Thread Christopher Smith
You can always modify the compiler with a plugin.

--Chris

On Tue, Aug 23, 2011 at 12:36 AM, Christopher Head hea...@gmail.com wrote:
 Hi,
 Does anyone know if Protobuf's C++ output is ever going to support
 enum classes? I'd like to be able to define the elements of my enum in
 just one place (i.e. in the message definition file) and then use that
 definition throughout my program, but I'd like it to be defined as an
 enum class rather than an old-style non-type-safe enum.

 Chris






-- 
Chris




Re: [protobuf] C/C++ to java

2011-08-01 Thread Christopher Smith
I'm going to assume you aren't trying to use protobufs as a JNI workaround, 
and rather want them as an RPC solution.

I am not aware of a tool that can go from C declarations to protobuf ones. In a 
lot of ways, the syntaxes are similar, so it isn't much work to convert from C. 
That said, there are important differences that aid in portability and language 
neutrality, so it is time well spent. You may find you have somewhat different 
declarations after you have been through the exercise.
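As a rough illustration of how direct the translation usually is (made-up
example; the field numbers and the required/optional choices are where the real
design work happens):

    /* existing C declaration */
    struct SensorReading {
        int32_t sensor_id;
        double  value;
        char    label[32];
    };

    // hand-written .proto equivalent
    message SensorReading {
        required int32  sensor_id = 1;
        required double value     = 2;
        optional string label     = 3;
    }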

The other thing is you need to select an rpc framework to work with:  
http://code.google.com/p/protobuf/wiki/ThirdPartyAddOns#RPC_Implementations

--Chris

On Jul 30, 2011, at 6:27 AM, Nav navni...@gmail.com wrote:

 I have a C/C++ header file with some structures and functions in it. I
 would like to generate a .proto file from it so that i could use my c
 library in Java using protocol buffers. Is there any script or command
 to do so?
 Thx
 



Re: [protobuf] Re: New types?

2011-07-08 Thread Christopher Smith
On Fri, Jul 8, 2011 at 9:46 AM, Eric Hopper omnifari...@gmail.com wrote:
 I guess. This is an interesting and general problem. Practically every
 system like protobuf needs to solve it. Python itself, for example,
 solves it for pickle by allowing you to write custom methods for your
 classes to reduce them to well known types like tuples.

That's a very different system, and a language specific solution. A
lot of the success of protobufs is due to keeping the feature set very
slim. Type aliasing adds a fair bit of complexity and really doesn't
add much: you can always have common message types with locked in
fields and code which knows how to transform those messages in to
whatever representation your internal runtime has.

Truth is, solutions like you are describing will have a lot of
language-specific issues, and I think it'd be hard to make a case for
all that added effort. For many languages, you don't need hooks in the
library/generated code in order to handle this problem... for others you
need all kinds of work. At some point, you add all the "interesting
and general" problems and protobufs start to look like ASN.1 or XML.
;-)

 I don't think the custom translation can be avoided. But I do think it
 can be better integrated into the system.

Modules seem like the logical way to do custom translations. Not sure
what is wrong with that?

 Protobuf's integer type can already represent integers of arbitrary
 precision, it's just that not every language has an arbitrary
 precision integer type. My idea would solve this problem by requiring
 you to specify the (for example) C++ type to use when deserializing a
 large integer. If you didn't, the protobuf compiler would generate an
 error.

In C++, you can accomplish this simply by overloading the conversion
operator for whatever type you want that integer cast to, no? Yes, it
requires an intermediate state, but already having a mechanism for
custom transformations is going to cost you all kinds of performance
optimization opportunities, so I don't feel there is much lost there.
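A sketch of what I mean (names invented; here the wire representation is the
int64 millisecond timestamp discussed earlier in the thread):

    #include <chrono>
    #include <cstdint>

    // The generated message keeps a plain int64; this thin wrapper does the
    // aliasing, so application code can treat the field as a time_point.
    struct Timestamp {
      int64_t millis_since_epoch;
      operator std::chrono::system_clock::time_point() const {
        return std::chrono::system_clock::time_point(
            std::chrono::milliseconds(millis_since_epoch));
      }
    };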

-- 
Chris




Re: [protobuf] Re: Exceptions when reading protobuf messages in Java which were created in C++

2011-07-06 Thread Christopher Smith
Basic rule: if the memory is used outside the lifetime of the function call, 
you don't want it on the stack. Async_write very much requires the memory to be 
around later.
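One common pattern (a sketch, assuming boost::asio and a generated message type
MyMessage; error handling and length-prefix framing trimmed) is to put the
serialized bytes behind a shared_ptr that the completion handler holds onto:

    #include <boost/asio.hpp>
    #include <memory>
    #include <string>

    void SendMessage(boost::asio::ip::tcp::socket& socket, const MyMessage& msg) {
      auto data = std::make_shared<std::string>();
      msg.SerializeToString(data.get());
      boost::asio::async_write(
          socket, boost::asio::buffer(*data),
          [data](const boost::system::error_code& ec, std::size_t /*bytes*/) {
            // 'data' is captured here, so the buffer outlives SendMessage().
          });
    }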

--Chris

On Jul 6, 2011, at 10:10 AM, platzhirsch konrad.rei...@googlemail.com wrote:

 I find it difficult to check this with so many messages. This approach
 was suggested by some people already; I don't know how I would go about
 it.
 
 On Jul 6, 6:56 pm, Jason Hsueh jas...@google.com wrote:
 I'm not familiar with the boost libraries you're using, but the use of
 async_write and the stack-allocated streambuf looks suspect. If nothing
 jumps out there, I would first check that the data read from the socket in
 Java exactly matches the data on the C++ side.
 
 On Sat, Jul 2, 2011 at 8:55 AM, platzhirsch 
 konrad.rei...@googlemail.com wrote:
 
 
 
 
 
 
 
 Some other exceptions I receive as well:
 
 Protocol message tag had invalid wire type.
 Protocol message end-group tag did not match expected tag.
 While parsing a protocol message, the input ended unexpectedly in the
 middle of a field.  This could mean either than the input has been
 truncated or that an embedded message misreported its own length.
 



Re: [protobuf] New types?

2011-06-30 Thread Christopher Smith
You could always extend the compiler, but I bet you could get away with a 
simple preprocessor that aliases types and represents those larger integers as 
raw bytes.

--Chris

On Jun 30, 2011, at 7:30 AM, Eric Hopper omnifari...@gmail.com wrote:

 I'm building an application around protocol buffers. The types in my
 application can be mapped to types that protocol buffers supports, so
 I can sort of on-the-fly translate. For example, I have a timestamp
 field that can be translated to a 64 bit unsigned integer. But it
 would be really nice to do this within protocol buffers.
 
 Additionally, there are some types that protocol buffers supports
 internally, but has no language types for. For example, I would like
 the ability to work with integers that were larger than 2^64-1.
 
 Is there a way to use options and hook the compiler to generate your
 own types for fields? Is there a way to hook the validation code so
 you can make types for integers larger than 2^64-1?
 
 Thanks,
 --
 Eric Hopper
 



Re: [protobuf] Re: Serialization of primitive types

2011-06-16 Thread Christopher Smith
I think Gabor wants to avoid the overhead of implementing all that
additional bookkeeping, as it'd slow down development. He wants something that
would effectively generate a protobuf descriptor so that it stays
consistent with changes in the Java code.

I would suggest looking at the protostuff project:

http://code.google.com/p/protostuff/

I think it has all that is needed to achieve the goals Gabor is looking for.

--Chris

2011/6/16 Miguel Muñoz swingguy1...@yahoo.com:
 I agree with Marc. When things get complicated, it's a good idea to
 separate your tasks. It seems like your java class, which generates
 some of the data based on other data, is one issue, and your
 serialization is a separate issue. (I know it would be nice to just
 make that class serializable, but that may be where you make things
 complicated.)

 When I want to serialize my classes with protobufs, I create a
 separate protobuf object to just handle serialization. Then I create a
 utility class that transfers data between my protobuf object and my
 java class. Then it's easy to add a constructor to my java class that
 takes a protobuf object and defers the work to the utility class.

 When I transfer data using protobufs, I don't convert to the protobuf
 format until the last possible moment before sending, and I
 immediately convert to the java class on receiving data. That lets me
 put my protobuf objects behind a facade, so I don't need to know the
 serialization details.

 -- Miguel Muñoz


 On Jun 15, 7:07 am, gabor.dicso gabor.di...@gmail.com wrote:
 Hi all,

 I would like to be able to serialize primitive types platform-
 independently. I have hand-written Java data classes and I want to
 serialize their primitive fields using a cross-platform framework.
 These classes can not be generated, they must be written by hand,
 additional code is generated based upon them. Also, serializing the
 object as a whole isn't an option either, because the fields sometimes
 have to be processed before serializing their values. I have to
 serialize the fields separately. It must be made cross-platform
 because the values will be stored in a database and they may be read
 from other platforms. Creating wrapper PB-objects for each primitive
 type is an overhead I must avoid because the operation will be done
 very frequently and with large amounts of data.

 I found that Protocol Buffers addresses cross-platform serialization
 of objects, but I could not figure out how to use it as a
 serialization framework for primitive types (without having
 created .proto descriptors). Is it possible to use PB as a cross-
 platform serializer-deserializer framework for primitive types?
 Thanks,

 Gabor Dicso






-- 
Chris




Re: [protobuf] Best way to deserialize

2011-04-11 Thread Christopher Smith
The best thing is, when encoding, to group messages by type and preface each 
group with a type name and a FileDescriptorSet, which will allow you to decode 
the rest using a DynamicMessage (see the notes on self-describing messages on 
the wiki). As usual, use coded stream encoding with length prefixing for the 
FileDescriptorSet, for the group, and for each message in the group.
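A sketch of the write side of that framing (the read side mirrors it with
ReadVarint32 plus PushLimit/PopLimit):

    #include <google/protobuf/io/coded_stream.h>
    #include <google/protobuf/message_lite.h>
    #include <cstdint>

    // Writes one length-prefixed message; the same helper is used for the
    // FileDescriptorSet header, for each group, and for each message in a group.
    bool WriteDelimited(const google::protobuf::MessageLite& msg,
                        google::protobuf::io::CodedOutputStream* out) {
      out->WriteVarint32(static_cast<uint32_t>(msg.ByteSize()));
      return msg.SerializeToCodedStream(out);
    }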

--Chris

On Apr 11, 2011, at 6:25 PM, yaroslav chinskiy yar...@gmail.com wrote:

 Hi,
 
 I will have a heterogeneous set of protocol buffers.
 
 I will not know the types and the number until runtime.
 I am planning to create a factory which will convert a byte[] to a protocol
 buffer object.
 
 What is the best practice? How can I figure out the protocol buffer
 out of the byte array?
 
 Thank you!
 



Re: [protobuf] Genereating .proto from the existing java models

2011-03-18 Thread Christopher Smith
Are these POJOs or objects generated by protoc? If the former, I would say you 
are SOL. If the latter, you can ask the object for its descriptors and actually 
dump the protocol buffer version of the .proto, at which point reconstructing 
the .proto file is fairly straightforward.

--Chris

On Mar 18, 2011, at 2:13 AM, ProtoBuffU sumant...@gmail.com wrote:

 Does anyone know how to generate a .proto file from existing Java
 objects?
 



Re: [protobuf] Re: omitting tag numbers

2010-11-16 Thread Christopher Smith
On Tue, Nov 16, 2010 at 7:28 PM, Kenton Varda ken...@google.com wrote:

 On Tue, Nov 9, 2010 at 10:42 PM, Christopher Smith cbsm...@gmail.com wrote:

 This aspect could be mostly mitigated by integrating a metadata header
 into files. For systems with this kind of approach, look at Avro and Hessian.


 Problems with that:
 1) Protobufs are routinely used to encode small messages of just a few
 bytes.  Metadata would almost certainly be larger than the actual messages
 in such cases.
 2) This metadata would add an extra layer of indirection into the parsing
 process which would probably make it much slower than it is today.
 3) Interpreting the metadata itself to build that table would add
 additional time and memory overhead.  Presumably this would have to involve
 looking up field names in hash maps -- expensive operations compared to the
 things the protobuf parser does today.


Sorry, wasn't meaning to suggest that changes be made to protobuf. Mostly
just meaning that if you want that, there are other solutions that are
a better fit. I think Avro in particular has a solution that mitigates
drawbacks 1-3, at the expense of some additional complexity.

You can hack this into a protobuf solution though. You just encode the
FileDescriptorSet into your file header. Then when you start a scan, you
read it in, find out the field numbers that correspond to the field names
you want, and then parse the protobufs as before. The key thing is that the
overhead is paid only once per file (which presumably has tons of small
messages) and that, after reading the header, you transform the parse/query
into exactly what you'd have had if you used the field numbers to start with.
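A sketch of that lookup step (error handling omitted; the message and field
names come from whatever the reader wants to query):

    #include <google/protobuf/descriptor.h>
    #include <google/protobuf/descriptor.pb.h>
    #include <string>

    // Build a DescriptorPool from the FileDescriptorSet found in the file
    // header, then resolve a field name to the tag number used on the wire.
    int FieldNumberFor(const google::protobuf::FileDescriptorSet& fds,
                       const std::string& message_name,
                       const std::string& field_name) {
      google::protobuf::DescriptorPool pool;
      for (int i = 0; i < fds.file_size(); ++i) {
        pool.BuildFile(fds.file(i));
      }
      const google::protobuf::Descriptor* d =
          pool.FindMessageTypeByName(message_name);
      const google::protobuf::FieldDescriptor* f = d->FindFieldByName(field_name);
      return f->number();
    }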

Honestly, for me the win with the field numbers tends to be with long term
forward and backward compatibility.

-- 
Chris




Re: [protobuf] Re: Why to reinvent the wheel ?

2010-11-09 Thread Christopher Smith
On Tue, Nov 9, 2010 at 6:15 AM, Kalki70 kalki...@gmail.com wrote:

 On Nov 9, 2:59 am, Kenton Varda ken...@google.com wrote:
  OK, I looked into this again (something I do once every few years when
  someone points it out).
 
  ASN.1 *by default* has no extensibility, but you can use tags, as I see
 you
  have done in your example.  This should not be an option.  Everything
 should
  be extensible by default, because people are very bad at predicting
 whether
  they will need to extend something later.

 You can extend it even without using tags. I used tags to show a more
 similar encoding as Protobuf.


Without tags it is not extensible in the same sense as protocol buffers.

  The bigger problem with ASN.1, though, is that it is way
 over-complicated.
   It has way too many primitive types.  It has options that are not
 needed.
   The encoding, even though it is binary, is much larger than protocol
  buffers'.  The definition syntax looks nothing like modern programming
  languages.  And worse of all, it's very hard to find good ASN.1
  documentation on the web.
 

 You saw on my example that syntax is quite similar to that of
 protobuf. Yes, it CAN be very complicated, but it doesn't need to be.
 You can use it in a simpler way. You are not forced to use all
 primitive types.


You are looking at it merely from the perspective of someone wishing to use
ASN.1, not someone implementing it. The problem is the complexity of
implementing ASN.1 in itself brings with it a number of shortcomings.


 The encoding can be shorter or bigger, depending on
 the enconding rules used. PER is a good example of short encoding, if
 length is important in a specific project.


PER's encoding of ints is a great example of ASN.1's disadvantages.

Most of the compactness in PER severely limits extensibility, as it relies
on the decoder having a complete knowledge of the encoded data structure.
Even in such cases, if you have fields which normally have small values
(2^21 or less) but occasionally may have larger values (and this is a pretty
common scenario), the protocol buffer encoding mechanism is going to be much
more compact. Even outside of that case, the sheer number of types supported
by ASN.1 requires that in order for PER encodings to be extensible, the
preamble for fields must take up far more space than it does with protocol
buffers.
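
To make the varint point concrete, a small Java check (field number 1 is just
an assumption for illustration; the sizes include the one-byte tag):

    import com.google.protobuf.CodedOutputStream;

    // Small values stay small; larger values only pay for the bytes they need.
    int tiny   = CodedOutputStream.computeUInt32Size(1, 50);       // 2 bytes
    int medium = CodedOutputStream.computeUInt32Size(1, 1 << 20);  // 4 bytes
    int large  = CodedOutputStream.computeUInt32Size(1, 1 << 28);  // 6 bytes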


 And the best part is that all these encodings are STANDARD. Why to
 create a propietary implementation if there is a standard?


I think this question has already been answered, but it is worth pointing
out that the fact that the marketplace of ideas has produced Hessian, Avro,
Thrift, etc., suggests there is a real need the standard was not meeting.


 It is like microsoft using their propietary formats for offiice
 documents, instead on open standards.


No, it actually is quite different. The initial implementation of PB was
meant for encoding data that was not to be shared with outside parties (and
we are all glad that that data isn't going to be shared). Secondly, the PB
implementation is far, far simpler than the ASN.1 standard. Finally, Google
provides their complete implementation of an encoder/decoder as open source.


  What if tomorrow Microsoft says : Oh, I need something simpler than
  ASN.1, so we will create a different model: And then we will have a
  different version of protobuf. And like this, many companies could
  develop their own implementations.


This is the reality now.


  It is also hard to draw a fair comparison without identifying a
 particular
  implementation of ASN.1 to compare against.  Most implementations I've
 seen
  are rudimentary at best.  They might generate some basic code, but they
  don't offer things like descriptors and reflection.

 Well, Google, with all their resources, could have, instead of


When protocol buffers were developed, "with all their resources" wasn't
nearly as impressive sounding as it is now. The reality is that Google had
very limited resources, and more importantly it would have wasted them
without realizing any advantage (and certainly realizing several
disadvantages) for its business.


 creating something like ASN.1, but different, put some effort
 developing some apis, like those from protobuf, but for ASN.1. They
 could have supported maybe a subset of full ASN.1, but they would
 still be using a standard, and it would be easier to communicate with
 existing systems that support ASN.1


I think you are assuming that being able to communicate with existing ASN.1
systems would be one of the goals. That's a pretty huge assumption. But hey,
let's assume that for a moment.

There are, last I checked, a half dozen encoding formats for ASN.1. Let's
say you implemented just one (PER). There are two variants of PER (aligned
and not aligned). Even if you restrict yourself to one variant, you still
have ~18 different field types to handle. Even if you restrict yourself to a
subset of about 5 that represent functionality inside protocol buffers, you
have range encodings 

Re: [protobuf] Re: Why to reinvent the wheel ?

2010-11-09 Thread Christopher Smith
On Tue, Nov 9, 2010 at 6:21 AM, Kalki70 kalki...@gmail.com wrote:

 On Nov 9, 10:13 am, multijon multi...@gmail.com wrote:
  As a side note, the company I worked at used ASN.1 for five years to
  encode all of its product's communication messages (Using PER
  encoding), with what was supposed to be a highly optimized
  implementation of ASN.1.
 
  One of my last projects in the company was to try and convert our
  encoding method (and the underlying data structure) from ASN.1 to
  Protobuf. A project that was estimated to be long and tiring turned
  out to be rather easy, eliminating plenty of unnecessary (in protobuf,
  but necessary in ASN.1) memory allocations, thus both speeding
  performance and decreasing the memory footprint of our product by
  50-70% (!).

 Again I must insist about this. ASN.1 doesn't use memory allocations.


Yes, but the implementations do. Try getting an ASN.1 implementation as
efficient as protocol buffers. It takes a lot more effort than implementing
protocol buffers from scratch. That's part of the advantage.


 There are some very good, like from OSS Nokalva.


First, they provide about a dozen different products for ASN.1, which by
itself says a lot about ASN.1's complexity. Secondly, the tool isn't
available as open source. Additionally, the solution is so cheap that they
don't list pricing (I'm trying to remember the pricing the last time I
looked at it, but it escapes me). Finally, the last time I tested it, the
encode/decode wasn't nearly as fast in C++, let alone Java. There isn't even
an implementation for Python or a variety of other languages that have very
fast and fully compatible implementations of protocol buffers. Those are
some huge advantages for protocol buffers in my mind, despite OSS having
devoted far more resources to tackling the problem than everyone
collectively has on protocol buffers.

-- 
Chris

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] Re: Why to reinvent the wheel ?

2010-11-09 Thread Christopher Smith
On Tue, Nov 9, 2010 at 6:44 AM, Kalki70 kalki...@gmail.com wrote:

 Oh, I just found out that you are the developer. It seems I am not the
 only one who thinks you reinvented the wheel :


 http://google-opensource.blogspot.com/2008/07/protocol-buffers-googles-data.html


Yes, this is not a new line of thinking.


 As someone mentioned there :

 The apparent complexity of ASN.1 is largely due to its flexibility -
 if you're using only the sort of functionality that pbuffer gives you,
 it would be pretty much the same, I would think.


I think what you are failing to appreciate is that that flexibility in and
of itself imposes a huge toll. Think of C vs. C++.

-- 
Chris

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] Re: omitting tag numbers

2010-11-09 Thread Christopher Smith
On Mon, Oct 25, 2010 at 4:11 PM, Henner Zeller henner.zel...@googlemail.com
 wrote:

 On Mon, Oct 25, 2010 at 16:10, maninder batth batth.manin...@gmail.com
 wrote:
  I disagree. You could encode field name in the binary. Then at de-
  serialization, you can read the field descriptor and reconstruct the
  field. There is absolutely no need for tags. They are indeed
  cumbersome.

 If you include the field name, then your throw out part of the
 advantages of protocol buffers out of the window: speed and compact
 binary encoding.


This aspect could be mostly mitigated by integrating a metadata header into
files. For systems with this kind of an approach, look at Avro & Hessian.

-- 
Chris

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] WSDL Vs PB

2010-11-09 Thread Christopher Smith
On Tue, Nov 9, 2010 at 7:56 AM, maninder batth batth.manin...@gmail.comwrote:

 In typical WS-* webservice,  WSDL describes a service interface,
 abstracts from underlying communication protocol and serialization and
 deserialization as well as service implementation platform.
 Where does PB fit in this picture? Is the .proto file equivalent to
 WSDL? Or should i view it as simply serialization and deserialization
 description file ?


The .proto could serve this role, or you could use a FileDescriptorSet PB.

-- 
Chris

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] Why to reinvent the wheel ?

2010-11-08 Thread Christopher Smith
On Mon, Nov 8, 2010 at 9:59 PM, Kenton Varda ken...@google.com wrote:

 The bigger problem with ASN.1, though, is that it is way over-complicated.


THIS


   It has way too many primitive types.  It has options that are not needed.
  The encoding, even though it is binary, is much larger than protocol
 buffers'.


Actually, the PER encoding isn't too bad, although it doesn't encode ints
using varint style encoding, which tends to help with most data sets.


  The definition syntax looks nothing like modern programming languages.
  And worse of all, it's very hard to find good ASN.1 documentation on the
 web.


Yup, this one too. On the plus side, you can find the standards well
defined, which helps when building independent implementations.


 It is also hard to draw a fair comparison without identifying a particular
 implementation of ASN.1 to compare against.  Most implementations I've seen
 are rudimentary at best.  They might generate some basic code, but they
 don't offer things like descriptors and reflection.


Complexity yields rudimentary implementations.


 So yeah.  Basically, Protocol Buffers is a simpler, cleaner, smaller,
 faster, more robust, and easier-to-understand ASN.1.


:-)

-- 
Chris

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] ParseFromArray -- in Java

2010-10-25 Thread Christopher Smith
Might as well use a new object. Creating a new object is, if anything, likely 
less overhead.
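
A sketch of the pattern being suggested, with MyMessage and process()
standing in for whatever generated type and handler you actually have:

    // Just parse into a fresh object each time; in Java the allocation itself
    // is cheap, and build() hands back the object the builder was filling in,
    // so reusing a builder would not save the interesting work anyway.
    for (byte[] payload : payloads) {
      MyMessage msg = MyMessage.parseFrom(payload);
      process(msg);
    }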

--Chris

On Oct 25, 2010, at 1:52 PM, ury ury.se...@gmail.com wrote:

 Hi!
 
 Is it possible to reuse the same Protobuf object when unserializing a
 large number of objects in Java using the equivalent  of the C++
 ParseFromArray  ? i.e.  Does the Java implementation has the Clear()
 method ?
 
 Thanks!
 
 -- 
 You received this message because you are subscribed to the Google Groups 
 Protocol Buffers group.
 To post to this group, send email to proto...@googlegroups.com.
 To unsubscribe from this group, send email to 
 protobuf+unsubscr...@googlegroups.com.
 For more options, visit this group at 
 http://groups.google.com/group/protobuf?hl=en.
 

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] Re: can i use protocol buffer to encrypt and send a file

2010-10-18 Thread Christopher Smith
On Sun, Oct 17, 2010 at 10:01 PM, nit nithin.shubhan...@gmail.com wrote:


 On Oct 14, 4:51 pm, Adam Vartanian flo...@google.com wrote:
 I have a file which i need to encrypt and send so can i use
   protocol buffer for this buffer? please do reply me.
 
  Protocol buffers don't provide any built-in encryption or anything
  like that, no, all they provide is a serialization format.  You could
  use protocol buffers to send a file you encrypted via some other
  method, but I don't think that's what you're asking.
 
  - Adam


 Hi Adam,
  Thanks a lot for your reply. But I saw in the google code website
 that encoding will be done in Base 128 Varint. So what i understood by
 default was it will be done to all. Is it right?


Base 128 Varint is an encoding scheme, not an encryption scheme.

-- 
Chris

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



[protobuf] Re: What's the point of Protocol Buffers?

2010-07-24 Thread Christopher Smith
There is also the other end of the spectrum, where I am at. With a
large Hadoop cluster and terabytes of data, the efficient storage and
zippy parsing of protobufs is a huge deal.

In many ways, protobufs allow you to do what XML promised, but much
more efficiently. The other way to look at them is they are ASN.1
reduced to the simplest useful feature set.

--Chris

On Jul 23, 4:43 pm, Timothy Parez timothypa...@gmail.com wrote:
 Also note,

 while your computer might have 4 cores running at 2.5Ghz the hardware I'm
 talking about
 has 1 core, runs at 100Mhz and that's it... processing XML on devices like
 that... a real pain in the ...

 But I have to admit, when I write computer to computer software, I go REST
 or SOAP all the way

 On Sat, Jul 24, 2010 at 1:40 AM, Timothy Parez timothypa...@gmail.comwrote:



  Hi,

  The reason we use it is because we don't just develop software but also
  hardware solutions.
  Hardware solutions which are connected through GPRS or even RS232
  connections.

  GPRS is slow and in most cases you pay for the amount of data you send,
  so we have to keep the packages as small as  possible.

  RS232 doesn't work well with large packets, so again size is very
  important.

  Web Services, REST, SOAP, ... they are all very verbose... too
  expensive/large for our needs.

  If you need data to be as small as possible, protocol buffers are a good
  option.

  Timothy

  On Wed, Jul 21, 2010 at 12:57 PM, Tim Acheson tim.ache...@gmail.comwrote:

  I generally create web services using WCF or ASP.NET MVC. I don't get
  the point of Protocol Buffers. Am I missing something?

  Out of the box, WCF web services and ASP.NET MVC actions serialise my
  objects to JSON or XML, using the serialisation libraries provided by
  the framework. I don't need to do anything to achieve encoding
  structured data in an efficient yet extensible format -- I just
  define my objects as normal and the .NET framework does everything for
  me.

  I don't need to write any code to do the serialisation, either. I just
  define the return type of the web method in my WCF project, or define
  an ASP.NET MVC Action that returns the object. The framework does the
  rest.

  Also, I rarely come across a web service that returns anything other
  than strings, 32-bit integers and booleans. If I did, I'd probably
  question the architecture.

  Perhaps somebody could explain why I would want or need to use
  Protocol Buffers?

  Thanks! :)

  --
  You received this message because you are subscribed to the Google Groups
  Protocol Buffers group.
  To post to this group, send email to proto...@googlegroups.com.
  To unsubscribe from this group, send email to
  protobuf+unsubscr...@googlegroups.comprotobuf%2bunsubscr...@googlegroups.c
   om
  .
  For more options, visit this group at
 http://groups.google.com/group/protobuf?hl=en.

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] Re: What's the point of Protocol Buffers?

2010-07-24 Thread Christopher Smith
On Sat, Jul 24, 2010 at 3:22 PM, Oliver Jowett oliver.jow...@gmail.comwrote:

 Christopher Smith wrote:
  There is also the other end of the spectrum, where I am at. With a
  large Hadoop cluster and terabytes of data, the efficient storage and
  zippy parsing of protobufs is a huge deal.
 
  In many ways, protobufs allow you to do what XML promised, but much
  more efficiently. The other way to look at them is they are ASN.1
  reduced to the simplest useful feature set.

 Amusingly enough, we use protocol buffers to transport ASN.1-encoded
 data (SS7 TCAP messages). The protobuf API is far better than the API
 produced by the commercial ASN.1 compiler we use.


Yes, one of the nice things about simplicity is it makes it easier to do the
few things you do do well. Sometimes we developers tend to forget this. ;-)

-- 
Chris

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] Java UTF-8 encoding/decoding: possible performance improvements

2010-05-18 Thread Christopher Smith
That seems simple enough and likely to produce a net win often enough.

--Chris

On May 17, 2010, at 9:33 PM, Kenton Varda ken...@google.com wrote:

 What if you did a fast scan of the bytes first to see if any are non-ASCII?  
 Maybe only do this fast scan if the data is short enough to fit in L1 cache?
 
 On Mon, May 17, 2010 at 7:59 PM, Christopher Smith cbsm...@gmail.com wrote:
 This does somewhat suggest that it might be worthwhile specifically
 tagging a field as ASCII only. There are enough cases of this that it
 could be a huge win.
 
 
 On 5/17/10, Evan Jones ev...@mit.edu wrote:
  On May 17, 2010, at 15:38 , Kenton Varda wrote:
  I see.  So in fact your code is quite possibly slower in non-ASCII
  cases?  In fact, it sounds like having even one non-ASCII character
  would force extra copies to occur, which I would guess would defeat
  the benefit, but we'd need benchmarks to tell for sure.
 
  Yes. I've been playing with this a bit in my spare time since the last
  email, but I don't have any results I'm happy with yet. Rough notes:
 
  * Encoding is (quite a bit?) faster than String.getBytes() if you
  assume one byte per character.
  * If you guess the number bytes per character poorly and have to do
  multiple allocations and copies, the regular Java version will win. If
  you get it right (even if you first guess 1 byte per character) it
  looks like it can be slightly faster or on par with the Java version.
  * Re-using a temporary byte[] for string encoding may be faster than
  String.getBytes(), which effectively allocates a temporary byte[] each
  time.
 
 
  I'm going to try to rework my code with a slightly different policy:
 
  a) Assume 1 byte per character and attempt the encode. If we run out
  of space:
  b) Use a shared temporary buffer and continue the encode. If we run
  out of space:
  c) Allocate a worst case 4 byte per character buffer and finish the
  encode.
 
 
  This should be much better than the JDK version for ASCII, a bit
  better for short strings that fit in the shared temporary buffer,
  and not significantly worse for the rest, but I'll need to test it to
  be sure.
 
  This is sort of just a fun experiment for me at this point, so who
  knows when I may get around to actually finishing this.
 
  Evan
 
  --
  Evan Jones
  http://evanjones.ca/
 
  --
  You received this message because you are subscribed to the Google Groups
  Protocol Buffers group.
  To post to this group, send email to proto...@googlegroups.com.
  To unsubscribe from this group, send email to
  protobuf+unsubscr...@googlegroups.com.
  For more options, visit this group at
  http://groups.google.com/group/protobuf?hl=en.
 
 
 
 --
 Sent from my mobile device
 
 Chris
 

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] EnumValueDescriptor doesn't provide toString()?

2010-05-17 Thread Christopher Smith
I grok the problem now. This is the only descriptor that is also a value.
There should probably be a method/visitor specifically for getting the value
string of an object, one that isn't implemented by the descriptors.
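
A hypothetical helper in that spirit (not part of the library), which returns
a printable value string for whatever getField() handed back:

    import com.google.protobuf.Descriptors.EnumValueDescriptor;

    // Special-case enum values, which come back as descriptors rather than
    // as objects with a useful toString().
    static String valueString(Object fieldValue) {
      if (fieldValue instanceof EnumValueDescriptor) {
        return ((EnumValueDescriptor) fieldValue).getFullName();
      }
      return fieldValue.toString();
    }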

--Chris

On May 17, 2010, at 12:35 PM, Kenton Varda ken...@google.com wrote:

 Right, because none of the other field value types are descriptors.  I see 
 your point -- since getField() returns an Object, it would certainly be nice 
 to be able to call toString() on it without knowing the type.  But, it's also 
 important that EnumValueDescriptor be consistent with other descriptor 
 classes, so we want to be careful not to mess up that consistency.
 
 Instead of calling toString(), you could call TextFormat.printFieldToString() 
 to get a string representation of the field, although it will include the 
 field name.
 
 On Mon, May 10, 2010 at 9:17 PM, Christopher Smith cbsm...@gmail.com wrote:
 Actually, toString() seems to work for me for every other value I get from a 
 dynamic message *except* enums.
 
 --Chris
 
 On May 10, 2010, at 8:32 PM, Kenton Varda ken...@google.com wrote:
 
 I don't think we should add toString() to any of the descriptor classes 
 unless we are going to implement it for *all* of them in some consistent 
 way.  If we fill them in ad-hoc then they may be inconsistent, and we may 
 not be able to change them to make them consistent without breaking users.
 
 On Mon, May 10, 2010 at 9:49 AM, Christopher Smith cbsm...@gmail.com wrote:
 I noticed EnumValueDescriptor uses the default toString() method. Why not 
 override it to call getFullName()?
 
 --Chris
 
 --
 You received this message because you are subscribed to the Google Groups 
 Protocol Buffers group.
 To post to this group, send email to proto...@googlegroups.com.
 To unsubscribe from this group, send email to 
 protobuf+unsubscr...@googlegroups.com.
 For more options, visit this group at 
 http://groups.google.com/group/protobuf?hl=en.
 
 
 

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] Java UTF-8 encoding/decoding: possible performance improvements

2010-05-17 Thread Christopher Smith
This does somewhat suggest that it might be worthwhile specifically
tagging a field as ASCII only. There are enough cases of this that it
could be a huge win.
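
The shape of that fast path, as a minimal Java sketch (not the library's
actual code): if every char is below 0x80 it maps to exactly one UTF-8 byte,
so the general-purpose encoder can be skipped entirely.

    import java.nio.charset.Charset;

    static final Charset UTF8 = Charset.forName("UTF-8");

    static byte[] encodeMaybeAscii(String s) {
      for (int i = 0; i < s.length(); i++) {
        if (s.charAt(i) >= 0x80) {
          return s.getBytes(UTF8);  // fall back to the normal encoder
        }
      }
      byte[] out = new byte[s.length()];
      for (int i = 0; i < s.length(); i++) {
        out[i] = (byte) s.charAt(i);  // ASCII: one char == one byte
      }
      return out;
    }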


On 5/17/10, Evan Jones ev...@mit.edu wrote:
 On May 17, 2010, at 15:38 , Kenton Varda wrote:
 I see.  So in fact your code is quite possibly slower in non-ASCII
 cases?  In fact, it sounds like having even one non-ASCII character
 would force extra copies to occur, which I would guess would defeat
 the benefit, but we'd need benchmarks to tell for sure.

 Yes. I've been playing with this a bit in my spare time since the last
 email, but I don't have any results I'm happy with yet. Rough notes:

 * Encoding is (quite a bit?) faster than String.getBytes() if you
 assume one byte per character.
 * If you guess the number bytes per character poorly and have to do
 multiple allocations and copies, the regular Java version will win. If
 you get it right (even if you first guess 1 byte per character) it
 looks like it can be slightly faster or on par with the Java version.
 * Re-using a temporary byte[] for string encoding may be faster than
 String.getBytes(), which effectively allocates a temporary byte[] each
 time.


 I'm going to try to rework my code with a slightly different policy:

 a) Assume 1 byte per character and attempt the encode. If we run out
 of space:
 b) Use a shared temporary buffer and continue the encode. If we run
 out of space:
 c) Allocate a worst case 4 byte per character buffer and finish the
 encode.


 This should be much better than the JDK version for ASCII, a bit
 better for short strings that fit in the shared temporary buffer,
 and not significantly worse for the rest, but I'll need to test it to
 be sure.

 This is sort of just a fun experiment for me at this point, so who
 knows when I may get around to actually finishing this.

 Evan

 --
 Evan Jones
 http://evanjones.ca/

 --
 You received this message because you are subscribed to the Google Groups
 Protocol Buffers group.
 To post to this group, send email to proto...@googlegroups.com.
 To unsubscribe from this group, send email to
 protobuf+unsubscr...@googlegroups.com.
 For more options, visit this group at
 http://groups.google.com/group/protobuf?hl=en.



-- 
Sent from my mobile device

Chris

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] Deadlock problems with protobuf static initialization in Java

2010-05-15 Thread Christopher Smith
Never seen it before... and the Java code is pretty extensively used. Surely 
someone would have hit this before.

--Chris

On May 15, 2010, at 9:03 PM, Igor Gatis igorga...@gmail.com wrote:

 Have anyone experienced deadlock problems related to Java protobuf generated 
 messages static initialization?
 
 My multithreaded app seems to be stuck around internalBuildGeneratedFileFrom 
 method. Workaround so far was to move first reference to one of my generated 
 classes to out side of a synchronized block. I'm wondering whether protobuf 
 initialization is deadlock proof/free.
 
 -Gatis
 -- 
 You received this message because you are subscribed to the Google Groups 
 Protocol Buffers group.
 To post to this group, send email to proto...@googlegroups.com.
 To unsubscribe from this group, send email to 
 protobuf+unsubscr...@googlegroups.com.
 For more options, visit this group at 
 http://groups.google.com/group/protobuf?hl=en.

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



[protobuf] EnumValueDescriptor doesn't provide toString()?

2010-05-10 Thread Christopher Smith
I noticed EnumValueDescriptor uses the default toString() method. Why not 
override it to call getFullName()?

--Chris

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] EnumValueDescriptor doesn't provide toString()?

2010-05-10 Thread Christopher Smith
Actually, toString() seems to work for me for every other value I get from a 
dynamic message *except* enums.

--Chris

On May 10, 2010, at 8:32 PM, Kenton Varda ken...@google.com wrote:

 I don't think we should add toString() to any of the descriptor classes 
 unless we are going to implement it for *all* of them in some consistent way. 
  If we fill them in ad-hoc then they may be inconsistent, and we may not be 
 able to change them to make them consistent without breaking users.
 
 On Mon, May 10, 2010 at 9:49 AM, Christopher Smith cbsm...@gmail.com wrote:
 I noticed EnumValueDescriptor uses the default toString() method. Why not 
 override it to call getFullName()?
 
 --Chris
 
 --
 You received this message because you are subscribed to the Google Groups 
 Protocol Buffers group.
 To post to this group, send email to proto...@googlegroups.com.
 To unsubscribe from this group, send email to 
 protobuf+unsubscr...@googlegroups.com.
 For more options, visit this group at 
 http://groups.google.com/group/protobuf?hl=en.
 
 

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] Re: How to fed PB's generated serialized file to map reduce framework

2010-04-23 Thread Christopher Smith
PB serialization is deterministic, but not necessarily canonical depending on 
your canonicalization rules. I imagine that can cause issues. In particular, 
fields with default values look different depending on whether the default 
value was actually set, and of course the ordering of repeated fields is not 
enforced. You also might have versioning issues (a new field might not be 
relevant if in the context of the sort, said field is not defined). This can be 
addressed by some effort to canonicalize prior to serialization, but perhaps 
that is the issue. 
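
A tiny illustration of the repeated-field point, using a hypothetical message
Keys with a "repeated int32 id" field:

    // Same logical contents, different byte order on the wire.
    byte[] a = Keys.newBuilder().addId(1).addId(2).build().toByteArray();
    byte[] b = Keys.newBuilder().addId(2).addId(1).build().toByteArray();
    // java.util.Arrays.equals(a, b) is false, even if the application treats
    // the two as equivalent -- a raw-byte comparator would sort them apart.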

--Chris

On Apr 23, 2010, at 12:00 PM, Kenton Varda ken...@google.com wrote:

 If you are assuming that the serialization is canonical, can't you just 
 compare the raw bytes?
 
 On Fri, Apr 23, 2010 at 6:08 AM, Owen O'Malley omal...@apache.org wrote:
 On Apr 22, 8:33 pm, stuti awasthi stutic...@gmail.com wrote:
 
  I wanted to pass the Protocol Buffer generated serialized file
  directly to map reduce.
 
 I actually have a patch for Hadoop that does this. When my work
 load on security calms down, I'll clean it up and post it on Hadoop's
 jira.
 
 The one spot that Protocol Buffers doesn't give me what I need is
 in defining a RawComparator to support sorting Protocol Buffer keys.
 For those of you not in Hadoop, that means I need to be able to
 implement:
 
 int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2)
 
 for serialized Messages. The best approach that I can currently see
 is to walk through the Message's fields via getDescriptorForType
 and use the field's getType to compare the next field in each of the
 keys. It would have to assume the key's fields were in the sorted
 order, but that seems like a reasonable assumption for a single
 MapReduce job. Am I missing something? Is there already code
 that does this, in an Apache license friendly project?
 
 -- Owen
 
 
 --
 You received this message because you are subscribed to the Google Groups 
 Protocol Buffers group.
 To post to this group, send email to proto...@googlegroups.com.
 To unsubscribe from this group, send email to 
 protobuf+unsubscr...@googlegroups.com.
 For more options, visit this group at 
 http://groups.google.com/group/protobuf?hl=en.
 
 
 -- 
 You received this message because you are subscribed to the Google Groups 
 Protocol Buffers group.
 To post to this group, send email to proto...@googlegroups.com.
 To unsubscribe from this group, send email to 
 protobuf+unsubscr...@googlegroups.com.
 For more options, visit this group at 
 http://groups.google.com/group/protobuf?hl=en.

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] Re: Usage of Protocol Buffer with map reduce framework

2010-04-19 Thread Christopher Smith
We do stuff very much like this. Generally we try to encode the PB's as 
sequence files so that they are splittable. If that doesn't make sense, then we 
use non-splittable coded streams.

--Chris

On Apr 19, 2010, at 9:22 PM, stuti awasthi stutic...@gmail.com wrote:

 
 Thanks Jason,
 
 Yes map reduce works with the raw bytes of data. I was planning to
 have log analysis over hadoop using map reduce. Suppose I have ever
 increasing log files and I want to serialize its data and then pass
 over the Hadoop map reduce to get it analyzed. If this is the use case
 then do we have any mechanism through which we can feed the serialized
 data into the map reduce?
 
 Bye :)
 
 On Apr 19, 10:32 pm, Jason Hsueh jas...@google.com wrote:
 As far as I recall, mapreduce just works with raw bytes. If you want to use
 protocol buffers within map reduce, you just need to use the serialization
 routines to convert from the raw bytes to your proto.
 
 
 
 On Mon, Apr 19, 2010 at 4:06 AM, stuti awasthi stutic...@gmail.com wrote:
 Hi all,
 
 I just started looking at the Protocol Buffer for serialization of
 data structure. I wanted to use this with the map reduce framework. I
 searched on web but do not get any specific pointers on how to do it.
 
 If any body can please tell me how can we use PB with the map reduce
 framework. Im new to this concept of distributed serialization.
 
 Any pointers will be great.. Thanks in advance.
 
 Bye
 
 --
 You received this message because you are subscribed to the Google Groups
 Protocol Buffers group.
 To post to this group, send email to proto...@googlegroups.com.
 To unsubscribe from this group, send email to
 protobuf+unsubscr...@googlegroups.comprotobuf%2bunsubscr...@googlegroups.com
 .
 For more options, visit this group at
 http://groups.google.com/group/protobuf?hl=en.
 
 --
 You received this message because you are subscribed to the Google Groups 
 Protocol Buffers group.
 To post to this group, send email to proto...@googlegroups.com.
 To unsubscribe from this group, send email to 
 protobuf+unsubscr...@googlegroups.com.
 For more options, visit this group 
 athttp://groups.google.com/group/protobuf?hl=en.
 
 -- 
 You received this message because you are subscribed to the Google Groups 
 Protocol Buffers group.
 To post to this group, send email to proto...@googlegroups.com.
 To unsubscribe from this group, send email to 
 protobuf+unsubscr...@googlegroups.com.
 For more options, visit this group at 
 http://groups.google.com/group/protobuf?hl=en.
 

-- 
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] Re: 2.3.0 released

2010-01-11 Thread Christopher Smith
I hate to quibble on this, but strictly speaking:

 for (int i = 0; i < some_vector.size(); i++)

is not perfectly valid unless you have verified that some_vector.size() <
static_cast<size_t>(std::numeric_limits<int>::max());

This would be broken in cases of a very large vector (possible with a
vector<unsigned char> on 32-bit Linux or that abomination vector<bool>
almost anywhere), or if you had a platform with sizeof(int) <
sizeof(size_t), which is surprisingly common (for example, on Solaris:
http://docsun.cites.uiuc.edu/sun_docs/C/solaris_9/SUNWdev/SOL64TRANS/p6.html#CHAPTER3-TBL-3
).

Sometimes, compilers warn about really dangerous things.

--Chris

On Mon, Jan 11, 2010 at 10:35 AM, Kenton Varda ken...@google.com wrote:

 We get a lot of complaints about warnings in our code.  When the warnings
 occur in generated code, I generally fix them (because generated code
 generally must be compiled using your project's flags), but for warnings in
 protobuf code my answer is that you need to include the protobuf headers as
 system headers so that warnings are ignored.  Different projects make
 different choices about what warnings to enable on their code, so it seems
 unreasonable to expect that every one of your dependencies enable at least
 all of the warnings that you enable.  The sign comparison warnings are
 particularly annoying because they warn about every line that looks like:
   for (int i = 0; i < some_vector.size(); i++)
 which are very common and perfectly valid.  Therefore we compile with
 -fno-sign-compare.

 That said, this patch seems harmless, so I'll submit it.  But I won't be
 doing a new release just for this, and new warnings could easily be
 introduced before the next release.


 On Sun, Jan 10, 2010 at 3:13 AM, edan eda...@gmail.com wrote:

 It looks like a good workaround - thanks for the info.
 I will wait and see if Kenton is planning to fix this, then decide my next
 steps.
 --edan


 On Sun, Jan 10, 2010 at 10:59 AM, Monty Taylor mord...@inaugust.comwrote:



 edan wrote:
  I happily upgraded to 2.3.0 - I always like to take the latest and
 greatest.
  Unfortunately, and I think for the first time ever while upgrading
  protobuf, I ran into a problem!
  We compile our code with -Werror, and this bombed out on a header
 file:

 We build with errors on in our project too - our solution to this has
 been to:

 a) install the headers into a system location, at which point gcc will
 not issue warnings for them. It looks like you did this in the context
 of your /devfs location - perhaps you need to change some system configs
 to completely understand that location as a chroot?

 b) If they aren't in a system location, include them via -isystem rather
 than -I, which will have the same effect.

  cc1plus: warnings being treated as errors
  ../../../devfs/usr/include/google/protobuf/io/coded_stream.h: In member
  function bool
 
 google::protobuf::io::CodedInputStream::ReadLittleEndian32(google::protobuf::uint32*):
  ../../../devfs/usr/include/google/protobuf/io/coded_stream.h:776:
  warning: comparison between signed and unsigned integer expressions
  ../../../devfs/usr/include/google/protobuf/io/coded_stream.h: In member
  function bool
 
 google::protobuf::io::CodedInputStream::ReadLittleEndian64(google::protobuf::uint64*):
  ../../../devfs/usr/include/google/protobuf/io/coded_stream.h:791:
  warning: comparison between signed and unsigned integer expressions
 
 
  My patch to fix this was:
 
  
 
 //depot/project/zenith/ports/protobuf/std/build/src/google/protobuf/io/coded_stream.h#2
  (ktext) -
 
 //depot/project/zenith/ports/protobuf/std/build/src/google/protobuf/io/coded_stream.h#3
  (ktext)  content
  776c776
  <     if (GOOGLE_PREDICT_TRUE(BufferSize() >= sizeof(*value))) {
  ---
  >     if (GOOGLE_PREDICT_TRUE(BufferSize() >=
  >       static_cast<int>(sizeof(*value)))) {
  791c791
  <     if (GOOGLE_PREDICT_TRUE(BufferSize() >= sizeof(*value))) {
  ---
  >     if (GOOGLE_PREDICT_TRUE(BufferSize() >=
  >       static_cast<int>(sizeof(*value)))) {
 
  Any chance you can patch this and re-release?  I'd really like to have
  un-patched code in our product, but I can't use 2.3.0 without this
 patch.




 --
 You received this message because you are subscribed to the Google Groups
 Protocol Buffers group.
 To post to this group, send email to proto...@googlegroups.com.
 To unsubscribe from this group, send email to
 protobuf+unsubscr...@googlegroups.comprotobuf%2bunsubscr...@googlegroups.com
 .
 For more options, visit this group at
 http://groups.google.com/group/protobuf?hl=en.




-- 
Chris
-- 

You received this message because you are subscribed to the Google Groups "Protocol Buffers" group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] Re: Protocol Buffers using Lzip

2009-12-10 Thread Christopher Smith
One compression algo that I thought would be particularly useful with PB's
would be LZO. It lines up nicely with PB's goals of being fast and compact.
Have you thought about allowing an integrated LZO stream?

--Chris

On Wed, Dec 9, 2009 at 12:21 PM, Kenton Varda ken...@google.com wrote:

 Thanks for writing this code!  I'm sure someone will find it useful.

 That said, I'm wary of adding random stuff to the protobuf library.  gzip
 made sense because practically everyone has zlib, but lzlib is less
 universal.  Also, if we have lzip support built-in, should we also support
 bzip, rar, etc.?

 Also note that libprotobuf is already much larger (in terms of binary
 footprint) than it should be, so making it bigger is a tricky proposition.

 And finally, on a non-technical note, adding any code to the protobuf
 distribution puts maintenance work on me, and I'm overloaded as it is.

 Usually I recommend that people set up a googlecode project to host code
 like this, but lzip_stream may be a bit small to warrant that.  Maybe a
 project whose goal is to provide protobuf adaptors for many different
 compression formats?  Or a project for hosting random protobuf-related
 utility code in general?

 On Tue, Dec 8, 2009 at 4:17 AM, Jacob Rief jacob.r...@gmail.com wrote:

 Hello Brian, hello Kenton, hello list,
 as an alternative to GzipInputStream and GzipOutputStream I have
 written a compression and an uncompression stream class which are
 stackable into Protocol Buffers streams. They are named
 LzipInputStream and LzipOutputStream and use the Lempel-Ziv-Markov
 chain algorithm, as implemented by LZIP
 http://www.nongnu.org/lzip/lzip.html

 An advantage for using Lzip instead of Gzip is, that Lzip supports
 multi member compression. So one can jump into the stream at any
 position, forward up to the next synchronization boundary and start
 reading from there.
 Using the default compression level, Lzip has a better compression
 ratio at the cost of being slower than Gzip, but when Lzip is used
 with a low compression level, speed and output size of Lzip are
 comparable to that of Gzip.

 I would like to donate these classes to the ProtoBuf software
 repository. They will be released under an OSS license, compatible to
 LZIP and Google's. Could someone please check them and tell me in what
 kind of repository I can publish them. In Google's license agreements
 there is a passage telling: Neither the name of Google Inc. nor the
 names of its contributors may be used to endorse or promote products
 derived from this software without specific prior written permission.
 Since I have to use the name google in the C++ namespace of
 LzipIn/OutputStream, hereby I ask for permission to do so.

 Comments are appreciated,
 Jacob


  --
 You received this message because you are subscribed to the Google Groups
 Protocol Buffers group.
 To post to this group, send email to proto...@googlegroups.com.
 To unsubscribe from this group, send email to
 protobuf+unsubscr...@googlegroups.comprotobuf%2bunsubscr...@googlegroups.com
 .
 For more options, visit this group at
 http://groups.google.com/group/protobuf?hl=en.




-- 
Chris

--

You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.




[protobuf] FileDescriptorSet mania

2009-11-29 Thread Christopher Smith
Given that there is no Java version of protoc, it has been tempting to see
how far one can get with just using FileDescriptorSets and dynamic
messages. I ended up building this short function for iteratively adding all
the dependencies of a particular Message into a FileDescriptorSet. While it
wasn't too much pain, it was a tad annoying, and I am wondering why this
kind of functionality isn't built into the Java API. Here's what I ended up
with:

/**
 * Get the entire descriptor set needed to describe a message in
 * dependency order.
 *
 * @param aMessage the message to inspect
 * @return a complete descriptor set containing all the elements needed
 *         to parse this message
 */
static FileDescriptorSet getDescriptorSetFor(final Message aMessage) {
    final FileDescriptor file = aMessage.getDescriptorForType().getFile();
    final List<FileDescriptor> fileDescriptors = new ArrayList<FileDescriptor>();
    fileDescriptors.add(file);

    // Don't use an iter as we intend to keep looping until the list
    // stops growing
    for (int i = 0; i < fileDescriptors.size(); ++i) {
        final FileDescriptor nextElement = fileDescriptors.get(i);
        for (FileDescriptor dependency : nextElement.getDependencies()) {
            if (!fileDescriptors.contains(dependency)) {
                fileDescriptors.add(dependency);
            }
        }
    }

    final FileDescriptorSet.Builder builder = FileDescriptorSet.newBuilder();
    for (FileDescriptor descriptor : fileDescriptors) {
        builder.addFile(descriptor.toProto());
    }
    return builder.build();
}

Is this really the right way to handle this, and if so, any chance we can
get this built into the protobuf library?

-- 
Chris

--

You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.




Re: serialize message to UDP socket

2009-09-19 Thread Christopher Smith
Doesn't the UDP packet header effectively provide that length prefix for
you?
--Chris

On Fri, Sep 18, 2009 at 12:26 PM, jayt0...@gmail.com jayt0...@gmail.comwrote:


 One other thing I wanted to say was that I chose to use
 CodedOutputStream to send
 data because ultimately I have to manually encode a length prefix in
 front of my PB message.
 With the C++ environment, I understand that this is the only way to do
 this (ugh is right; I am sure this is a common problem with using PB
 over sockets that remain in use).
 I am fully aware that there are methods to serialize directly from the
 object but those will not serve my ultimate aim of getting a length
 prefix ahead of the data bytes.

 Thanks

 Jay

 On Sep 18, 12:19 pm, jayt0...@gmail.com jayt0...@gmail.com wrote:
  Hello all,
 
  I am having trouble figuring out how to serialize data over a socket
  utilizing UDP protocol.  I am in C++ environment.  When writing to the
  socket without protocol buffers, I use the standard sendto() socket
  call which allows me to specify the port and IP address of the
  intended receiver of my UDP message.  When trying to send a protocol
  buffers message, this seems to be the recommended strategy on the
  google docs:
 
  ZeroCopyOutputStream* raw_output = new FileOutputStream(sock);
  CodedOutputStream* coded_output = new CodedOutputStream(raw_output);
  coded_output->WriteRaw(send_data, strlen(send_data));
 
  There is no way to specify what the port and IP address is here,
  analogous to when using the standard sendto() socket writing call.  So
  my message never gets received by the intended recipient on the
  network.  I am aware that this is a raw message, not a PB message.
  Getting this raw message over the network is a first step in
  accomplishing the ultimate goal of getting the PB message over the
  network.
 
  Is there a way to get all of the bytes of a serialized PB message into
  raw form and then send them with sendto()?
 
  Any ideas? Thanks for any help.
 
  Jay
 



-- 
Chris

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to protobuf@googlegroups.com
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en
-~--~~~~--~~--~--~---



Re: java string literal too long when initializing java.lang.String descriptorData

2009-07-24 Thread Christopher Smith

Heh, I remember complaining about this to James when they first
published the byte code spec. It was annoying then and continues to be
annoying today.


On 7/24/09, Kenton Varda ken...@google.com wrote:
 How annoying.  I'll make sure this or something like it gets into the next
 release -- which I'm going to try to push next week.

 On Wed, Jul 22, 2009 at 8:36 AM, anonymous eric.pe...@hp.com wrote:


 Hello,

 I was not able to compile a Java file generated by protoc 2.1.0 from a
 rather big .proto file.
 It seems I have hit the upper limit for a Java string literal
 (65535???).

 I slightly modified src/google/protobuf/compiler/java/java_file.cc so
 that static initialization is performed from
 an array of literal strings in the case CEscape(file_data).size() >
 65535.

 Is this a real problem, or am I missing something ?

 Here is the patch:

 diff -r -u protobuf-2.1.0/src/google/protobuf/compiler/java/
 java_file.cc protobuf-2.1.0.new/src/google/protobuf/compiler/java/
 java_file.cc
 --- protobuf-2.1.0/src/google/protobuf/compiler/java/java_file.cc
 2009-05-13 16:36:30.0 -0400
 +++ protobuf-2.1.0.new/src/google/protobuf/compiler/java/java_file.cc
 2009-07-22 10:37:28.0 -0400
 @@ -207,6 +207,9 @@
   // This makes huge bytecode files and can easily hit the compiler's
 internal
   // code size limits (error code to large).  String literals are
 apparently
   // embedded raw, which is what we want.
 +  // In the case the FileDescriptorProto is too big for fitting into
 a string
 +  // literal, first creating ain array of string literals, then
 concatenating
 +  // them into the final FileDescriptorProto string.
   FileDescriptorProto file_proto;
   file_-CopyTo(file_proto);
   string file_data;
 @@ -218,22 +221,51 @@
   return descriptor;\n
 }\n
 private static com.google.protobuf.Descriptors.FileDescriptor\n
 -descriptor;\n
 -static {\n
 -  java.lang.String descriptorData =\n);
 -  printer-Indent();
 -  printer-Indent();
 +descriptor;\n);

 -  // Only write 40 bytes per line.
   static const int kBytesPerLine = 40;
 -  for (int i = 0; i  file_data.size(); i += kBytesPerLine) {
 -if (i  0) printer-Print( +\n);
 -printer-Print(\$data$\,
 -  data, CEscape(file_data.substr(i, kBytesPerLine)));
 -  }
 -  printer-Print(;\n);

 -  printer-Outdent();
 +  // Limit for a Java literal string is 65535
 +  bool stringTooLong = (CEscape(file_data).size()  65535);
 +
 +  if (stringTooLong) {
 +printer-Print(static {\n
 +java.lang.String descriptorDataArray[] = {\n);
 +printer-Indent();
 +printer-Indent();
 +
 +// Only write 40 bytes per line.
 +for (int i = 0; i  file_data.size(); i += kBytesPerLine) {
 +  if (i  0) printer-Print(,\n);
 +  printer-Print(\$data$\,
 +data, CEscape(file_data.substr(i, kBytesPerLine)));
 +}
 +printer-Outdent();
 +printer-Print(\n
 +};\n\n);
 +printer-Print(java.lang.String descriptorData = \\;\n);
 +printer-Print(for (String data : descriptorDataArray) {\n);
 +printer-Indent();
 +printer-Print(descriptorData += data;\n);
 +printer-Outdent();
 +printer-Print(}\n\n);
 +  } else {
 +printer-Print(static {\n
 +java.lang.String descriptorData =\n);
 +printer-Indent();
 +printer-Indent();
 +
 +// Only write 40 bytes per line.
 +static const int kBytesPerLine = 40;
 +for (int i = 0; i  file_data.size(); i += kBytesPerLine) {
 +  if (i  0) printer-Print( +\n);
 +printer-Print(\$data$\,
 +data, CEscape(file_data.substr(i, kBytesPerLine)));
 +  }
 +printer-Print(;\n);
 +
 +printer-Outdent();
 +  }

   //
 -
   // Create the InternalDescriptorAssigner.

 


 


-- 
Sent from my mobile device

Chris

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to protobuf@googlegroups.com
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en
-~--~~~~--~~--~--~---



Re: Java deserialization - any best practices for performances?

2009-07-24 Thread Christopher Smith

The best way to think of it is:

Builder : Java Message :: C++ Message : const C++ Message

As far as performance goes, it is a common mistake to equate C/C++
heap memory allocation costs with Java heap allocation costs. In the common
case, allocations in Java are just a few instructions... comparable to
stack allocations in C/C++. What normally gets you in Java is the
initialization cost, and in this particular scenario there is no way
around that.

If you are worried, you could benchmark the difference between
constantly allocating builders as you go vs. starting with an array of
N builders (allocating the array would be done outside of the
benchmark). I am sure it will prove enlightening.
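
A rough outline of such a benchmark (MyMessage, payloads and consume() are
placeholders; warm-up and repeated runs are omitted):

    long start = System.nanoTime();
    for (byte[] payload : payloads) {
      MyMessage.Builder builder = MyMessage.newBuilder();  // fresh each time
      builder.mergeFrom(payload);
      consume(builder.build());
    }
    long nanosWithFreshBuilders = System.nanoTime() - start;
    // Compare against a variant that takes pre-created builders from an array;
    // the gap is mostly initialization and parsing, not allocation.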


On 7/24/09, Kenton Varda ken...@google.com wrote:
 On Thu, Jul 23, 2009 at 7:15 PM, alopecoid alopec...@gmail.com wrote:

 Hmm... that strikes me as strange. I understand that the Message
 objects are immutable, but the Builders are as well? I thought that
 they would work more along the lines of String and StringBuilder,
 where String is obviously immutable and StringBuilder is mutable/
 reusable.


 The point is that it's the Message object that contains all the stuff
 allocated by the Builder, and therefore none of that stuff can actually be
 reused.  (When you call build(), nothing is copied -- it just returns the
 object that it has been working on.)  So reusing the builder itself is kind
 of useless, because it's just a trivial object containing one pointer (to
 the message object it is working on constructing).


 But while we're on the subject, I have been looking for some rough
 benchmarks comparing the performance of Protocol Buffers in Java
 versus C++. Do you (the collective you) have any [rough] idea as to
 how they compare performance wise? I am thinking more in terms of
 batch-style processing (disk I/O, parsing centric) rather than RPC
 centric usage patterns. Any experiences you can share would be great.


 I have some benchmarks that IIRC show that Java parsing and serialization is
 roughly half the speed of C++.  As I recall a lot of the speed difference is
 from UTF-8 decoding/encoding -- in C++ we just leave the bytes encoded, but
 in Java we need to decode them in order to construct standard String
 objects.

 I've been planning to release these benchmarks publicly but it will take
 some work and there's a lot of higher-priority stuff to do.  :/  (I think
 Jon Skeet did get the Java side of the benchmarks into SVN but there's no
 C++ equivalent yet.)

 


-- 
Sent from my mobile device

Chris

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to protobuf@googlegroups.com
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en
-~--~~~~--~~--~--~---



Heuristic for getting a particular message type from a FileDescriptorSet

2009-07-09 Thread Christopher Smith
I'm trying to do some work using dynamic message, working from a
FileDescriptorSet produced by protoc. The heuristic I've come up with is
something like:
Parse the binary into a FileDescriptorSet
for each FileDescriptorProto in the set
create the FileDescriptor from the FileDescriptorProto
invoke FindMessageTypeByName on the FileDescriptor
if I got a non-null answer, break from the loop
If I have no Descriptor at this point, FAIL
Create a DynamicMessageFactory.
Get a Message by invoking GetPrototype on the message factory, while passing
the Descriptor

YEAH! We can now use the Message to parse new messages.

This seems fairly involved. Am I doing the right thing or is there an easier
way to do it? If not, does it make sense to maybe add methods to
FileDescriptorSet to simplify this (something like:
FileDescriptorSet.GetMessageByTypeName(String,DynamicMessageFactory))?

-- 
Chris

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to protobuf@googlegroups.com
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en
-~--~~~~--~~--~--~---



Re: Thread-safe messages

2009-06-29 Thread Christopher Smith

I'd recommend using an atomic swap to do your updates. So you create
your new version of the PB locally, and then swap it into the memory
location that is visible to all the other threads. The only real
downside is you stress the heap more, and that is probably
cheaper/simpler (particularly if you want the updates to be
transactional) than using extensive locking.
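
A minimal Java sketch of the atomic-swap approach, assuming a generated
message type Config with a setTimeoutMs field (both just placeholders):

    import java.util.concurrent.atomic.AtomicReference;

    final AtomicReference<Config> current =
        new AtomicReference<Config>(Config.getDefaultInstance());

    // Updater: build a whole new immutable message, then publish it.
    Config updated = current.get().toBuilder().setTimeoutMs(500).build();
    current.set(updated);

    // Readers: grab a consistent snapshot; the Message itself never mutates.
    Config snapshot = current.get();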

--Chris

On Mon, 2009-06-29 at 10:07 -0700, Jes wrote:
 I forgot to mention that we are generating C++ code in the project.
 
 Jes
 
 On 29 jun, 19:01, Jes damealpi...@hotmail.com wrote:
  Hi everybody,
 
  we are working on a distributed environment that uses PB, where
  different threads will access to the contents of messages that can be
  updated at any moment through the network.
 
  I wonder if there is an easy way to transform the (derived) Messages
  into a thread-safe version. Maybe the rough solution could be to
  include a Mutex in the Message class and a MutexLock on each method of
  the generated pb.h and pb.cc classes, but perhaps there are issues
  that can break the safety of this approach (such as existing friends
  or similar).
 
  Could you have any suggestion on this? :-)
 
  Thanks in advance!
 
  Jes
  





Re: PB's vs ASN.1

2009-06-23 Thread Christopher Smith

No, but in short the advantages over ASN.1 can be summed up as:
simpler, and for most cases more efficient.


On 6/23/09, Jon M jonme...@yahoo.com wrote:

 Hello,

 The system I am currently working on uses ASN.1 at the heart of the
 client/server communication. I am evaluating PB's for another part of
 the system that hasn't been implemented yet and was curious if anyone
 can point me to any articles/blogs comparing and contrasting PB's and
 ASN.1?

 Thanks,
 Jon
 


-- 
Sent from my mobile device

Chris




Re: 'Streaming' messages (say over a socket)

2009-06-15 Thread Christopher Smith
The normal way to do it is to send each Entity as a separate message.
CodedInput/OutputStream is handy for that kind of thing.
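
For example, a minimal sketch of that length-prefixed framing (assuming a
reasonably current protobuf release and an already-open file descriptor;
Entity is the generated class from the .proto quoted below, and error
handling is trimmed):

#include <cstdint>
#include <vector>
#include <google/protobuf/io/coded_stream.h>
#include <google/protobuf/io/zero_copy_stream_impl.h>
// plus the header generated from the quoted .proto, which defines Entity

using google::protobuf::io::CodedInputStream;
using google::protobuf::io::CodedOutputStream;
using google::protobuf::io::FileInputStream;
using google::protobuf::io::FileOutputStream;

// Frame one Entity onto an open stream: a varint length prefix, then the
// message bytes. Call this as each Entity is composed instead of building
// one giant Entities message in memory.
void WriteOneEntity(CodedOutputStream* out, const Entity& e) {
  out->WriteVarint32(static_cast<uint32_t>(e.ByteSizeLong()));
  e.SerializeToCodedStream(out);
}

// On the receiving side, read the records back one at a time. PushLimit
// keeps each parse from reading past the end of its own record.
bool ReadEntities(int fd, std::vector<Entity>* entities) {
  FileInputStream raw(fd);
  CodedInputStream in(&raw);
  uint32_t size;
  while (in.ReadVarint32(&size)) {
    CodedInputStream::Limit limit = in.PushLimit(size);
    Entity e;
    if (!e.ParseFromCodedStream(&in)) return false;
    in.PopLimit(limit);
    entities->push_back(e);
  }
  return true;
}

// Sender side, e.g.:
//   FileOutputStream raw(fd);
//   CodedOutputStream out(&raw);
//   while (/* more rows */) { Entity e; /* fill it in */ WriteOneEntity(&out, e); }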

--Chris

On Sun, Jun 14, 2009 at 4:14 PM, Alex Black a...@alexblack.ca wrote:


 Is there a way to start sending a message before it's fully composed?

 Say we have messages like this:

 message Entity
 {
     required int32 id = 1;
     required string name = 2;
 }

 message Entities
 {
     repeated Entity entity = 1;
 }

 If we're sending a message Entities with 1,000 Entity objects in it,
 is there a way to avoid composing the entire message in memory,
 serializing it, and then sending it out?

 I'd like to avoid allocating RAM for the entire message, and just send
 it out as I compose it...

 thx,

 - Alex
 



-- 
Chris




Re: Communication methods

2009-05-24 Thread Christopher Smith
http://code.google.com/p/protobuf/wiki/RPCImplementations

On Sun, May 24, 2009 at 12:42 PM, SyRenity stas.os...@gmail.com wrote:


 Hi.

 Sorry if this was already asked in the past, but does Protocol Buffers
 provide some IPC communication methods?

 Or do they just provide the data encapsulation, and I should use any method I
 want?

 Regards.
 



-- 
Chris
