We used MapReduce for ETL, storing the results in Avro files that are then loaded into Hive/Impala for querying.
Now we are trying to migrate to Spark, but I haven't found a way to write the resulting RDD to Avro files. Is there a way to do this? If not, why doesn't Spark support Avro as well as MapReduce does, and are there plans to add it? Alternatively, what is the recommended way to output Spark results with a schema? Plain text doesn't seem like a good choice.
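For context, here is the kind of thing I was hoping would work, going through the Hadoop new-API `AvroKeyOutputFormat` the same way our MapReduce jobs do. This is an untested sketch: the schema, the `resultRdd` of `(String, Long)` pairs, and the output path are all placeholders.

```scala
// Untested sketch: write an RDD as Avro via the Hadoop new-API output format.
// Assumes a SparkContext `sc` and an RDD[(String, Long)] called `resultRdd`.
import org.apache.avro.Schema
import org.apache.avro.generic.{GenericData, GenericRecord}
import org.apache.avro.mapred.AvroKey
import org.apache.avro.mapreduce.{AvroJob, AvroKeyOutputFormat}
import org.apache.hadoop.io.NullWritable
import org.apache.hadoop.mapreduce.Job

// Keep the schema as a String so it can be shipped to executors;
// Avro's Schema class is not serializable.
val schemaString =
  """{"type":"record","name":"Result","fields":[
    |  {"name":"id","type":"string"},
    |  {"name":"count","type":"long"}
    |]}""".stripMargin

val job = Job.getInstance()
AvroJob.setOutputKeySchema(job, new Schema.Parser().parse(schemaString))

// Wrap each record in AvroKey, with NullWritable as the value,
// mirroring the MapReduce convention for Avro output.
val avroRecords = resultRdd.mapPartitions { iter =>
  val schema = new Schema.Parser().parse(schemaString)
  iter.map { case (id, count) =>
    val rec: GenericRecord = new GenericData.Record(schema)
    rec.put("id", id)
    rec.put("count", count)
    (new AvroKey[GenericRecord](rec), NullWritable.get())
  }
}

avroRecords.saveAsNewAPIHadoopFile(
  "/placeholder/output/path",
  classOf[AvroKey[GenericRecord]],
  classOf[NullWritable],
  classOf[AvroKeyOutputFormat[GenericRecord]],
  job.getConfiguration)
```

If this is the intended route, it feels heavier than the MapReduce equivalent, which is partly why I'm asking whether there is a more direct supported way.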