Hi Doug,

My name is Cui Peng. I would like to implement the tool for converting data between JSON/XML/CSV and Avro data files, as you described in the GSoC 2010 idea list. I checked out the Avro source code and studied its design and architecture, which gave me a rough idea of how to build the tool. I describe it below and look forward to your advice :-)

I think there are two main parts to the work:

1. Read/write JSON/XML/CSV records from/to Avro data files

There are two steps.

Step one: read/write JSON/XML/CSV records from/to Avro datums

For JSON: Avro already supplies ParsingDecoder and JsonGenerator, so we can use these two classes to convert between Avro datums and JSON data.

For XML: I must extend the abstract ParsingDecoder and build an XMLDecoder class that parses data from an XML file and converts it to Avro datums. An XMLGenerator class that writes Avro datums out to an XML data file is also necessary. This part needs some XML parsing; Apache Xerces may be a good choice, and fortunately I am familiar with it.

For CSV: Likewise, I must build a CSVDecoder to convert CSV data to Avro datums and a CSVGenerator class to convert Avro datums to CSV files. This part needs some CSV handling, and I think Apache Commons CSV can help here.

Step two: read/write Avro datums from/to Avro data files

Avro already implements this, so it will not cost me much time or energy.

2. A Swing-based command-line tool

This tool will execute commands, collect data from user input, and so on. The first part gives us conversion support between JSON/XML/CSV data files and Avro data files; after that, we should build the command-line tool and design its command system.

1). The tool will have three modes (JSON, XML, and CSV), and a special command will switch the working mode.
2). The tool will support two data input modes: from the keyboard or from an existing data file.
3). Commands take a command-and-argument form. For example, "input -f" means import data from an existing data file, and "input -k" means give the user a graphical data input area where data can be typed in from the keyboard.
4). A function for formatting the data output.
5). If an exception occurs, it will be shown in the tool.

That is all. If you have any ideas, please let me know.

Thank you and best regards
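
P.S. To make the command system in part 2 a little more concrete, here is a rough sketch of the command parsing only. It is not a design I am committed to. It uses plain Java with no Avro calls; the class name CommandSketch, the "mode" command name, and the default JSON mode are placeholders made up for illustration. Only "input -f" and "input -k" come from the proposal itself.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class CommandSketch {
    /** Working modes named in the proposal. */
    enum Mode { JSON, XML, CSV }

    /** A parsed command: its name plus any arguments that followed it. */
    static final class Command {
        final String name;
        final List<String> args;

        Command(String name, List<String> args) {
            this.name = name;
            this.args = args;
        }
    }

    // Assumption: the tool starts in JSON mode.
    private Mode mode = Mode.JSON;

    /** Splits a line such as "input -f" into a command name and its arguments. */
    static Command parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens[0].isEmpty()) {
            throw new IllegalArgumentException("empty command");
        }
        return new Command(tokens[0].toLowerCase(Locale.ROOT),
                Arrays.asList(tokens).subList(1, tokens.length));
    }

    /** Runs one command and returns the message the tool would display. */
    String execute(String line) {
        try {
            Command cmd = parse(line);
            switch (cmd.name) {
                case "mode": // hypothetical name for the mode-switching command
                    if (cmd.args.isEmpty()) {
                        throw new IllegalArgumentException("mode needs an argument");
                    }
                    mode = Mode.valueOf(cmd.args.get(0).toUpperCase(Locale.ROOT));
                    return "mode set to " + mode;
                case "input":
                    if (cmd.args.contains("-f")) {
                        return "read " + mode + " records from a file";
                    }
                    if (cmd.args.contains("-k")) {
                        return "read " + mode + " records from the keyboard";
                    }
                    throw new IllegalArgumentException("input needs -f or -k");
                default:
                    throw new IllegalArgumentException("unknown command: " + cmd.name);
            }
        } catch (IllegalArgumentException e) {
            // Item 5): errors are reported in the tool instead of ending it.
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        CommandSketch tool = new CommandSketch();
        System.out.println(tool.execute("mode csv"));
        System.out.println(tool.execute("input -f"));
        System.out.println(tool.execute("bogus"));
    }
}
```

In this sketch every failure is turned into a message string, so the Swing front end would only need to display whatever execute returns. The real commands would of course call the decoders and generators from part 1 instead of returning placeholder text.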