On 03 Apr 2009, at 17:21, Chad Walters wrote:

In addition, he wants to have it all work dynamically so that, for example, a Python script used in Hadoop Streaming can read the header and pull fields out of records in the stream without needing the generated bindings.

FWIW, in OSIS I do something similar: Thrift serialization without generated bindings, and in fact without any Thrift IDL at all. I use the Thrift protocol (and the Python library) to serialize objects of 'my' type, inferring all required object model information from a different model description format: Python classes that inherit from a certain base class and define fields of specific types (think Django models).

Embedding model definition metadata in an object (list) should be trivial using a similar method.

The Thrift-related code is at [1].
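
To illustrate the idea of inferring the object model from Python classes rather than from Thrift IDL, here is a minimal sketch. It is not the OSIS code: the names Field, String, Integer, Model, field_spec and to_records are all made up for this example, and the Thrift type names ("string", "i32") are only used as labels. A real serializer would walk the resulting records and call the matching write methods on a Thrift protocol object.

```python
# Hypothetical sketch: none of these names come from OSIS or from Thrift.


class Field:
    """Base class for typed field descriptors declared on a model."""
    type_name = None


class String(Field):
    type_name = "string"


class Integer(Field):
    type_name = "i32"


class Model:
    """Base class; subclasses declare their fields as class attributes."""

    def __init__(self, **values):
        # Instance attributes shadow the class-level Field descriptors.
        for name, value in values.items():
            setattr(self, name, value)

    @classmethod
    def field_spec(cls):
        """Return (field_id, name, type_name) for every declared field."""
        # Sort by name so the assigned field ids are stable between runs.
        names = sorted(
            name for name, attr in vars(cls).items() if isinstance(attr, Field)
        )
        return [
            (field_id, name, getattr(cls, name).type_name)
            for field_id, name in enumerate(names, start=1)
        ]


class Person(Model):
    name = String()
    age = Integer()


def to_records(obj):
    """Pair each declared field with the value the object holds for it."""
    return [
        (field_id, name, type_name, getattr(obj, name))
        for field_id, name, type_name in type(obj).field_spec()
    ]


records = to_records(Person(name="Ada", age=36))
```

Assigning field ids by sorted name is just one way to keep them stable without an IDL file; an explicit id per field would be safer once models start to evolve.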

OSIS also contains some (rather trivial) code for this whole 'describe the data model once, then send several data rows' system. The base input is a tuple of two items: the first is a list of all 'columns', where every item is itself a list defining the column name and data type; the second is a list of lists, where every sublist is one 'document'. See [2].
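
As a rough sketch of that layout (not the code at [2]), the following shows how a consumer could turn such a tuple back into one dictionary per document. The function name iter_documents, the sample data, and the type labels "string" and "int" are assumptions made for the example.

```python
# Hypothetical sketch of the (columns, rows) layout described above.


def iter_documents(view):
    """Yield one dict per row, keyed by the column names in the header."""
    columns, rows = view
    names = [name for name, _data_type in columns]
    for row in rows:
        if len(row) != len(names):
            raise ValueError(
                "row has %d values, expected %d" % (len(row), len(names))
            )
        yield dict(zip(names, row))


view = (
    # Header: one [column name, data type] pair per column.
    [["name", "string"], ["age", "int"]],
    # Body: one sublist per 'document', values in column order.
    [["Ada", 36], ["Linus", 39]],
)

documents = list(iter_documents(view))
```

Because the column description travels with the rows, the reader needs no generated bindings: it only relies on the header being the first item of the tuple.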

FWIW, maybe some ideas might be reusable (yes, that whole codebase needs more documentation).

Nicolas

[1] http://osisproject.org/browser/osis/model/serializers/_thrift.py
[2] http://osisproject.org/browser/osis/client/view.py
