[protobuf] size of protoc byte array

2018-05-17 Thread hiyo828
Hello:

For the Protocol Buffers Java implementation, is there an official or 
unofficial document or benchmark comparing the size of a serialized protobuf 
byte array with the raw JSON string? And is there one covering the performance 
of converting a JSON string to byte[] and from byte[] back to a JSON string? 

Thanks.

-- 
You received this message because you are subscribed to the Google Groups 
"Protocol Buffers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to protobuf+unsubscr...@googlegroups.com.
To post to this group, send email to protobuf@googlegroups.com.
Visit this group at https://groups.google.com/group/protobuf.
For more options, visit https://groups.google.com/d/optout.


Re: [protobuf] Making message reader strict, rejecting non-canonical messages

2018-05-17 Thread Dmitry Timofeev

I see, thank you very much for the explanation!


On 16.05.18 21:44, Feng Xiao wrote:



On Wed, May 16, 2018 at 7:39 AM Dmitry Timofeev wrote:


Hi,

I am considering whether Protocol Buffers can be used in an
application that requires a canonical representation of messages
coming from an external source.

The encoding and proto3 guides [1, 2] include several requirements
for a parser that make it accept non-canonical data (this list is
probably not exhaustive):
  - Message fields may appear in any order
  - There might be multiple instances of the same
/non-repeated/ field
  - A message may contain unknown fields
  - Default values of primitives may appear on the wire
  - Map entries may appear in any order
  - Repeated fields of primitives may be packed or unpacked.

1. Is there any natural way to extend the parser with checks of
canonical form?

No.

By "natural" I mean a compiler and/or runtime plugin, something
that does not require a fork of the project.

2. If not, does such an optional feature make sense in Protocol
Buffers? Would you accept an option that makes the generated
reader code 'strict', rejecting non-canonical representations,
and, consequently, not forward-compatible?

Also no here. Compatibility is one of the main reasons to use protobuf 
because it allows you to evolve your protocol without breaking anyone 
in a complex system. If you don't need compatibility at all (i.e., you 
will never change your protocol), using a C++ struct will be much more 
performant than protobuf because you can skip the whole 
parsing/serialization cost.


There is a way to mimic the behavior you want though:
1. parse the input data into a proto message
2. check whether the proto message has any unknown fields; if so, report an error
3. serialize the proto message using deterministic serialization 
(https://github.com/google/protobuf/blob/master/src/google/protobuf/io/coded_stream.h#L842)
4. compare the serialized data against the input data; if they match, 
the input data is in the "canonical form"; if not, report an error


It will incur an additional serialization cost, but can get you close 
enough to the canonical form.
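
A minimal sketch of the four steps above, assuming the codec is passed in as
plain callables. The names `is_canonical`, `parse`, `serialize` and
`has_unknown_fields` are hypothetical and not part of any protobuf API. To keep
the sketch runnable without the protobuf runtime, the demo swaps in JSON with
sorted keys as a stand-in codec; with real protobuf, `parse` would be the
generated parser and `serialize` the runtime's deterministic serialization.

```python
import json


def is_canonical(data, parse, serialize, has_unknown_fields):
    """Return True if `data` survives a parse / re-serialize round trip unchanged."""
    message = parse(data)                  # step 1: parse the input
    if has_unknown_fields(message):        # step 2: reject unknown fields
        return False
    return serialize(message) == data      # steps 3-4: re-serialize and compare


# Stand-in codec for the demo only (JSON, NOT protobuf).
KNOWN_KEYS = {"id", "name"}                # assumption: the "schema" of the demo


def parse(data):
    return json.loads(data.decode("utf-8"))


def serialize(message):
    # Sorted keys and fixed separators play the role of deterministic output.
    return json.dumps(message, sort_keys=True, separators=(",", ":")).encode("utf-8")


def has_unknown_fields(message):
    return not set(message) <= KNOWN_KEYS


def check(data):
    return is_canonical(data, parse, serialize, has_unknown_fields)


print(check(b'{"id":1,"name":"a"}'))   # True: already in canonical form
print(check(b'{"name":"a","id":1}'))   # False: fields out of order
print(check(b'{"id":1,"x":2}'))        # False: unknown field
```

As noted above, the comparison costs one extra serialization per message.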



Thanks,
Dmitry

[1] https://developers.google.com/protocol-buffers/docs/encoding
[2] https://developers.google.com/protocol-buffers/docs/proto3



Re: [protobuf] Protobuf3 InvalidProtocolBufferException with some strings

2018-05-17 Thread Marc Gravell
(mainly for the list) see also the stackoverflow question:
https://stackoverflow.com/q/50387660/23354

On Thu, 17 May 2018 at 10:42, Alexey Vishnyakov wrote:

> Hello
>
> We are using protobuf v3 to transfer messages from a C# client to a Java
> server over HTTP.
>
> The message proto looks like this:
>
> message CLIENT_MESSAGE {
> string message = 1;
> }
>
> Both client and server use UTF-8 character encoding for strings.
>
> Everything is fine when we use short string values like "abc", but
> when we try to transfer a string with 198 characters in it, we get an
> exception:
>
>
> com.google.protobuf.InvalidProtocolBufferException:
> While parsing a protocol message, the input ended unexpectedly in the
> middle of a field. This could mean either that the input has been
> truncated or that an embedded message misreported its own length.
>
>
> We even compared the byte arrays containing the protobuf data, but didn't
> find a solution.
> For the "aaa" string, the byte array starts with these bytes:
>
> 10 3 97 97 97
>
>
> Where 10 is the protobuf field number, 3 is the string length, and 97 97 97
> is "aaa".
>
> For string
>
>
>> "aa"
>
>
> which contains 198 characters in it, byte array starts with this:
>
> 10 198 1 97 97 97
>
>
> Where 10 is the protobuf field number, 198 is the string length, and 1 seems
> to be some kind of string identifier, or what?
>
>
> And why can't protobuf parse this message?
>
> I have already spent almost a day looking for a solution to this problem;
> any help is appreciated.
>
> Thanks in advance.


-- 
Regards,

Marc



[protobuf] Protobuf3 InvalidProtocolBufferException with some strings

2018-05-17 Thread Alexey Vishnyakov
Hello

We are using protobuf v3 to transfer messages from a C# client to a Java 
server over HTTP.

The message proto looks like this:

message CLIENT_MESSAGE {
string message = 1;
}

Both client and server use UTF-8 character encoding for strings.

Everything is fine when we use short string values like "abc", but 
when we try to transfer a string with 198 characters in it, we get an 
exception:

  
   com.google.protobuf.InvalidProtocolBufferException: 
While parsing a protocol message, the input ended unexpectedly in the 
middle of a field. This could mean either that the input has been truncated 
or that an embedded message misreported its own length.


We even compared the byte arrays containing the protobuf data, but didn't 
find a solution.
For the "aaa" string, the byte array starts with these bytes:

10 3 97 97 97


Where 10 is the protobuf field number, 3 is the string length, and 97 97 97 
is "aaa".

For string 

"aa"

 
which contains 198 characters in it, byte array starts with this:

10 198 1 97 97 97


Where 10 is the protobuf field number, 198 is the string length, and 1 seems 
to be some kind of string identifier, or what?


And why can't protobuf parse this message?

I have already spent almost a day looking for a solution to this problem; 
any help is appreciated.
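
For reference, a minimal sketch of how these leading bytes decode under the
wire format in the protobuf encoding documentation. The first byte is a tag
that packs the field number and wire type together: 10 = (1 << 3) | 2, i.e.
field 1 with wire type 2 (length-delimited). The length that follows is a
base-128 varint, so `198 1` reads as a single two-byte length of 198, not a
length followed by an identifier. The helper names `read_varint` and
`describe_prefix` are illustrative and not part of any protobuf library, and
the sketch only decodes the prefix; it does not by itself explain the
exception.

```python
def read_varint(buf, pos=0):
    """Decode one base-128 varint from buf at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift   # low 7 bits carry the payload
        if not (byte & 0x80):              # high bit clear: last byte of the varint
            return result, pos
        shift += 7


def describe_prefix(buf):
    """Return (field_number, wire_type, length, payload_offset) of the first field."""
    tag, pos = read_varint(buf, 0)
    field_number, wire_type = tag >> 3, tag & 0x07
    if wire_type != 2:
        raise ValueError("first field is not length-delimited")
    length, pos = read_varint(buf, pos)
    return field_number, wire_type, length, pos


print(describe_prefix(bytes([10, 3, 97, 97, 97])))       # (1, 2, 3, 2)
print(describe_prefix(bytes([10, 198, 1, 97, 97, 97])))  # (1, 2, 198, 3)
```

The second result shows the payload starting at offset 3 with a declared
length of 198: (198 & 0x7F) + (1 << 7) = 70 + 128.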

Thanks in advance.
