To generate a file with a subset of fields you can specify a 'reader'
schema that contains only the desired fields. For example, if you
have a schema like:
{"type":"record","name":"Event","fields":[
{"name":"id","type":"int"},
{"name":"url","type":"string"},
{"name":"props","type":{"type":"array","items":{"type":"record","name":"Property","fields":[
{"name":"key","type":"int"},
{"name":"value","type":"string"}
]}}}]}
And you only want the ids and property values, then you can specify
the following as the reader schema when you create your GenericDatumReader:
{"type":"record","name":"Event","fields":[
{"name":"id","type":"int"},
{"name":"props","type":{"type":"array","items":{"type":"record","name":"Property","fields":[
{"name":"value","type":"string"}
]}}}]}
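In Java this might look roughly like the sketch below. It is only an
illustration, not tested against your data: the class name ProjectionSketch,
its helper methods, and the sample values are made up for the example, and it
writes a one-record file in memory so that it runs on its own. The calls it
relies on are Schema.Parser, GenericDatumReader, GenericDatumWriter,
DataFileWriter and DataFileStream. The file reader supplies the writer's
schema from the file header, so you only pass the reader schema yourself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

public class ProjectionSketch {
  // Full schema the data was written with.
  static final String WRITER_JSON =
      "{\"type\":\"record\",\"name\":\"Event\",\"fields\":["
      + "{\"name\":\"id\",\"type\":\"int\"},"
      + "{\"name\":\"url\",\"type\":\"string\"},"
      + "{\"name\":\"props\",\"type\":{\"type\":\"array\",\"items\":"
      + "{\"type\":\"record\",\"name\":\"Property\",\"fields\":["
      + "{\"name\":\"key\",\"type\":\"int\"},"
      + "{\"name\":\"value\",\"type\":\"string\"}]}}}]}";

  // Reader schema: same record names, only the fields we want back.
  static final String READER_JSON =
      "{\"type\":\"record\",\"name\":\"Event\",\"fields\":["
      + "{\"name\":\"id\",\"type\":\"int\"},"
      + "{\"name\":\"props\",\"type\":{\"type\":\"array\",\"items\":"
      + "{\"type\":\"record\",\"name\":\"Property\",\"fields\":["
      + "{\"name\":\"value\",\"type\":\"string\"}]}}}]}";

  // Writes one made-up Event, using the full schema, to an in-memory data file.
  static byte[] writeSample(Schema writer) throws IOException {
    Schema propSchema = writer.getField("props").schema().getElementType();
    GenericRecord prop = new GenericData.Record(propSchema);
    prop.put("key", 7);
    prop.put("value", "blue");
    List<GenericRecord> props = new ArrayList<>();
    props.add(prop);

    GenericRecord event = new GenericData.Record(writer);
    event.put("id", 1);
    event.put("url", "http://example.com/");
    event.put("props", props);

    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (DataFileWriter<GenericRecord> w =
        new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(writer))) {
      w.create(writer, out);
      w.append(event);
    }
    return out.toByteArray();
  }

  // Reads the data file back, resolving each record against the reader schema.
  static List<GenericRecord> readProjected(byte[] data, Schema reader) throws IOException {
    GenericDatumReader<GenericRecord> datumReader = new GenericDatumReader<>(reader);
    List<GenericRecord> result = new ArrayList<>();
    try (DataFileStream<GenericRecord> in =
        new DataFileStream<>(new ByteArrayInputStream(data), datumReader)) {
      for (GenericRecord r : in) {
        result.add(r);
      }
    }
    return result;
  }

  public static void main(String[] args) throws IOException {
    // Separate parsers, since both schemas define the names Event and Property.
    Schema writer = new Schema.Parser().parse(WRITER_JSON);
    Schema reader = new Schema.Parser().parse(READER_JSON);
    for (GenericRecord r : readProjected(writeSample(writer), reader)) {
      System.out.println(r); // only id and props[].value remain
    }
  }
}
```

For a file on disk you would presumably pass the same datumReader to
new DataFileReader<GenericRecord>(file, datumReader) instead. Note that the
record names in the reader schema (Event, Property) need to match those in the
file, and any field you add to the reader schema that is not in the file needs
a default value.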
Perhaps we should add a --schema parameter to the tojson command-line
tool that does this?
Doug
On Fri, Mar 14, 2014 at 1:30 AM, Saravanan Nagarajan
<[email protected]> wrote:
> Hi,
>
> I successfully converted the JSON file to Avro format, and I am able to
> view it as JSON using the Avro tools.
>
> But now I am trying to show only selected fields from the JSON file using a
> Java program. I am able to select specific columns from a SIMPLE JSON
> file.
>
> In the case of a complex JSON file, I am not able to select columns.
>
> For example:
>
> Assume Employee records contain a complex column with department details. Now
> I need to generate JSON from Avro with a few columns from employee and a few
> columns from department.
>
> My program printed the selected columns from the employee table, but was not
> able to select from the department columns. I used GenericDatumReader to read
> the Avro file.
>
> Please let me know if you have any suggestions.
>
> If you need the program, I can share it in a separate mail.
>
> Thanks,
> Saravanan
>