Status: New
Owner: liuj...@google.com
Labels: Type-Defect Priority-Medium
New issue 494 by r...@rdna.ru: In Python API
google.protobuf.text_format.Merge() fails if ``text'' param contains
semicolon(s).
http://code.google.com/p/protobuf/issues/detail?id=494
What steps will reproduce the problem?
1. Create a simple .proto file and compile it:
19:47 0 user@host:~/pb_bug>>cat reproduce.proto
package reproduce;
message Person {
  optional int32 id = 1;
  optional string name = 2;
}
message People {
  repeated Person person = 1;
}
19:47 0 user@host:~/pb_bug>>protoc --python_out=. reproduce.proto
19:47 0 user@host:~/pb_bug>>
2. Create a simple Python script:
19:48 0 user@host:~/pb_bug>>cat reproduce.py
#!/usr/bin/env python

import sys
import reproduce_pb2
from google.protobuf.text_format import Merge

ppl_msg = reproduce_pb2.People()
with open(sys.argv[1]) as pt_fd:
    Merge(pt_fd.read(), ppl_msg)
print ppl_msg
3. Create two ASCII files, one without semicolons and one with semicolons:
19:48 0 user@host:~/pb_bug>>cat without_semicolon.pb.txt
person {
  id: 3
  name: "foo"
}
19:48 0 user@host:~/pb_bug>>cat with_semicolon.pb.txt
person {
  id: 3;
  name: "foo";
}
4. Pass the content of each file from step 3 as the ``text'' parameter of
google.protobuf.text_format.Merge():
19:48 0 user@host:~/pb_bug>>./reproduce.py without_semicolon.pb.txt
person {
  id: 3
  name: "foo"
}
19:48 0 user@host:~/pb_bug>>./reproduce.py with_semicolon.pb.txt
Traceback (most recent call last):
  File "./reproduce.py", line 9, in <module>
    Merge(pt_fd.read(), ppl_msg)
  File "/skynet/python/lib/python2.6/site-packages/protobuf-2.3.0-py2.6.egg/google/protobuf/text_format.py", line 138, in Merge
    _MergeField(tokenizer, message)
  File "/skynet/python/lib/python2.6/site-packages/protobuf-2.3.0-py2.6.egg/google/protobuf/text_format.py", line 216, in _MergeField
    _MergeField(tokenizer, sub_message)
  File "/skynet/python/lib/python2.6/site-packages/protobuf-2.3.0-py2.6.egg/google/protobuf/text_format.py", line 172, in _MergeField
    name = tokenizer.ConsumeIdentifier()
  File "/skynet/python/lib/python2.6/site-packages/protobuf-2.3.0-py2.6.egg/google/protobuf/text_format.py", line 406, in ConsumeIdentifier
    raise self._ParseError('Expected identifier.')
google.protobuf.text_format.ParseError: 3:3 : Expected identifier.
19:49 1 user@host:~/pb_bug>>echo $?
1
What is the expected output? What do you see instead?
I expected the file with semicolons to be parsed successfully, but the
parser fails. As far as I can see, the C++ API does not fail on the same input.
What version of the product are you using? On what operating system?
protobuf 2.4.1 / 2.5.0 on FreeBSD 10.0-CURRENT / Linux 3.2.0-25-server.
Please provide any additional information below.
The issue can be fixed by the attached patch. The patch uses the same approach
as the C++ API: it does not fail on ``;'' and ``,'' symbols after a field.
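To show the idea in isolation, here is a minimal sketch of a parser that
treats a trailing ``;'' or ``,'' after a field as optional and skips it. This
is a toy written for illustration only: it handles just integer and
quoted-string values and nested messages, it is not the attached patch, and it
is not the code in text_format.py. The names tokenize, parse_scalar,
parse_fields and parse are hypothetical.

```python
import re

# Hypothetical toy tokenizer (not the protobuf one): identifiers, integers,
# double-quoted strings, and single punctuation characters.
_TOKEN_RE = re.compile(
    r'\s*([A-Za-z_][A-Za-z0-9_]*|-?\d+|"(?:[^"\\]|\\.)*"|[{}:;,])')


def tokenize(text):
    """Split text-format input into a flat list of token strings."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        match = _TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError('Unexpected character at offset %d' % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_scalar(token):
    """Convert a token to an int or a str (escape sequences are not decoded)."""
    if re.match(r'-?\d+$', token):
        return int(token)
    if token.startswith('"'):
        return token[1:-1]
    raise ValueError('Expected value, got %r' % token)


def parse_fields(tokens, pos=0, nested=False):
    """Parse fields up to end of input, or up to the closing '}' if nested."""
    fields = []
    while pos < len(tokens) and tokens[pos] != '}':
        name = tokens[pos]
        if not re.match(r'[A-Za-z_]', name):
            raise ValueError('Expected identifier, got %r' % name)
        pos += 1
        if pos < len(tokens) and tokens[pos] == ':':
            pos += 1
        if pos < len(tokens) and tokens[pos] == '{':
            value, pos = parse_fields(tokens, pos + 1, nested=True)
        elif pos < len(tokens):
            value = parse_scalar(tokens[pos])
            pos += 1
        else:
            raise ValueError('Expected value after %r' % name)
        # The point of this sketch: a single ';' or ',' after a field is
        # optional and is skipped instead of being treated as an error.
        if pos < len(tokens) and tokens[pos] in (';', ','):
            pos += 1
        fields.append((name, value))
    if nested:
        if pos >= len(tokens):
            raise ValueError("Expected '}'")
        pos += 1  # consume the closing '}'
    elif pos < len(tokens):
        raise ValueError("Unexpected '}'")
    return fields, pos


def parse(text):
    """Return a list of (name, value) pairs; nested messages become lists."""
    return parse_fields(tokenize(text))[0]
```

With this sketch, the contents of without_semicolon.pb.txt and
with_semicolon.pb.txt from step 3 parse to the same structure.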
Attachments:
patch-python_google_protobuf_text_format.py 518 bytes