> But when I tried using Spark Streaming I could not find a way to store the
> data with the Avro schema information. The closest that I got was to create
> a DataFrame using the JSON RDDs and store them as Parquet. Here the Parquet
> files had a Spark-specific schema in their footer.
Does this cause a problem? The Spark-specific schema is just extra information that we use to store metadata that Parquet doesn't directly support, but I would still expect other systems to be able to read the files.
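If you want to check what actually ends up in the footer, here is a rough sketch using only the Python standard library. It assumes the standard Parquet layout (the footer is followed by a 4-byte little-endian length and the `PAR1` magic) and that Spark writes its schema under the key `org.apache.spark.sql.parquet.row.metadata`. The function names are made up for illustration, and this is only a byte-level search of the raw footer, not a real Thrift decode of the metadata.

```python
import io
import struct

MAGIC = b"PAR1"
# Assumed key under which Spark SQL stores its own schema (as JSON) in the
# footer's key-value metadata.
SPARK_KEY = b"org.apache.spark.sql.parquet.row.metadata"


def read_footer(stream):
    """Return the raw, still Thrift-encoded footer bytes of a Parquet file.

    `stream` must be a seekable binary file object.
    """
    stream.seek(0, io.SEEK_END)
    size = stream.tell()
    # Smallest possible file: leading magic + footer length + trailing magic.
    if size < 12:
        raise ValueError("too small to be a Parquet file")
    stream.seek(size - 8)
    tail = stream.read(8)
    if tail[4:] != MAGIC:
        raise ValueError("missing PAR1 magic at end of file")
    (footer_len,) = struct.unpack("<I", tail[:4])
    if footer_len > size - 12:
        raise ValueError("footer length is larger than the file")
    stream.seek(size - 8 - footer_len)
    return stream.read(footer_len)


def has_spark_schema(stream):
    """True if the Spark schema key appears anywhere in the footer bytes."""
    return SPARK_KEY in read_footer(stream)
```

You would call it as `has_spark_schema(open(path, "rb"))` on one of the part files Spark wrote. A reader that does not know the key should simply see it as one more key-value entry and skip it, which is why I would not expect it to get in the way.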