Allow readFieldBegin() to pass back the field name instead of the field id
--------------------------------------------------------------------------
Key: THRIFT-1477
URL: https://issues.apache.org/jira/browse/THRIFT-1477
Project: Thrift
Issue Type: Improvement
Components: Java - Compiler
Reporter: Benjy Weinberger
Priority: Minor
[Apologies if this has been addressed in another issue. I couldn't find
anything relevant on JIRA or the mailing list archives.]
Background: I'm implementing a BSON protocol in order to write Thrift messages
to MongoDB. (Technically, the protocol generates the object representation that
the MongoDB driver expects rather than writing raw BSON directly to the
transport, but that's an unimportant detail here.)
BSON, like JSON, naturally uses human-readable string field names.
When reading, the generated Thrift code (at least in Java) requires that
readFieldBegin() pass back a TField with the id field set, and it ignores the
name field. The ids must therefore appear in the stream. It's possible to
contort these protocols to use ids instead of human-readable names (as
TJSONProtocol does), but that doesn't help with existing BSON or JSON data
that we're trying to back-port into Thrift schemata.
However, the generated read() method already knows how to map names to ids. So
I propose allowing a TProtocol's readFieldBegin() method to pass back a TField
with the name set and no id set (indicated, say, by id==-1), and let the read()
method figure out the id to then switch on.
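
To make the proposal concrete, here is a rough sketch of the lookup that read()
would perform before its switch. It is not the actual generated code:
FieldHeader is a hypothetical stand-in for TField, UserField is a hypothetical
two-field table in the spirit of a generated field enum, and NO_ID is the -1
sentinel suggested above.

```java
// Hypothetical stand-in for TField, reduced to the two members used here.
final class FieldHeader {
    static final short NO_ID = -1; // sentinel from the proposal: "id not set"
    final String name;
    final short id;

    FieldHeader(String name, short id) {
        this.name = name;
        this.id = id;
    }
}

// Hypothetical field table for a struct with two fields.
enum UserField {
    NAME((short) 1, "name"),
    AGE((short) 2, "age");

    final short id;
    final String fieldName;

    UserField(short id, String fieldName) {
        this.id = id;
        this.fieldName = fieldName;
    }

    // Returns the matching field, or null if the name is unknown (or null).
    static UserField findByName(String name) {
        for (UserField field : values()) {
            if (field.fieldName.equals(name)) return field;
        }
        return null;
    }
}

final class FieldIdResolver {
    // Returns the id that read() should switch on, or NO_ID for an unknown field.
    static short resolveId(FieldHeader header) {
        if (header.id != FieldHeader.NO_ID) {
            return header.id; // the protocol supplied an id: today's behaviour
        }
        UserField field = UserField.findByName(header.name);
        return field == null ? FieldHeader.NO_ID : field.id;
    }
}
```

With this in place, a protocol that only knows names would pass back
new FieldHeader("age", FieldHeader.NO_ID), and read() would switch on the
result of FieldIdResolver.resolveId(...), falling through to its usual
skip-the-field branch when the result is NO_ID.
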
In some cases we could also allow the TField to omit the type information,
which, again, is not naturally present in JSON. (BSON does embed type
information, but its type system does not align fully with Thrift's, so it
can't be used without further context). If the field is unknown, the only use
for the type is for skipping the field value. But protocols like JSON and BSON
can skip fields without this type information, since fields are delimited in
the protocol in a type-independent way.
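
As an illustration of type-independent skipping, here is a rough sketch for
JSON text. JsonValueSkipper is a hypothetical helper, not part of Thrift; it
assumes well-formed input and that pos points at the first character of the
value. It finds the end of the value purely from the syntax (quotes, brackets,
commas), without knowing the Thrift type.

```java
final class JsonValueSkipper {
    // Returns the index just past the JSON value that starts at pos.
    static int skipValue(String json, int pos) {
        int depth = 0; // nesting depth relative to the value being skipped
        int i = pos;
        while (i < json.length()) {
            char c = json.charAt(i);
            if (c == '"') {
                i = skipString(json, i);
                if (depth == 0) return i; // the value itself was a string
            } else if (c == '{' || c == '[') {
                depth++;
                i++;
            } else if (c == '}' || c == ']') {
                if (depth == 0) return i; // a scalar ended at the enclosing close
                depth--;
                i++;
                if (depth == 0) return i; // closed the container being skipped
            } else if (c == ',' && depth == 0) {
                return i; // a scalar ended at the separator
            } else {
                i++;
            }
        }
        return i;
    }

    // Returns the index just past the closing quote of the string at pos.
    private static int skipString(String json, int pos) {
        int i = pos + 1;
        while (i < json.length()) {
            char c = json.charAt(i);
            if (c == '\\') {
                i += 2; // skip the escaped character
            } else if (c == '"') {
                return i + 1;
            } else {
                i++;
            }
        }
        throw new IllegalArgumentException("unterminated string at " + pos);
    }
}
```

Brackets inside string literals are ignored because strings are consumed as a
unit, so a value such as {"x":[1,"}"]} is skipped correctly.
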
Basically, what I propose is that readFieldBegin() be allowed to pass back just
an id or just a name (and, for some protocols, no type information), since that
is all read() needs in order to figure out how to read or skip the field.
I'm wondering what the Thrift elders think of this. Has it been discussed?
Thanks!
PS: This does have a downside. If Thrift were to implement a pass-through
feature for unrecognized fields (so that new messages read with old protocol
versions serialize back out with no loss), we wouldn't be able to preserve
fields for which we only had a name and no id. Or rather, we wouldn't be able
to write them out to a protocol that requires ids, like the binary protocols.
However, this feature doesn't exist anyway, and I don't know if it's on the
roadmap.