Hi,

On Sun, Feb 26, 2023 at 9:03 PM David Lacalle Castillo <
[email protected]> wrote:

> Good afternoon,
>
> I have some parquet data that was created using Avro Parquet Writer of
> Java. This parquet includes the Avro Schema inside the key
> parquet.avro.schema of the metadata, I want to convert this parquet data
> back to Avro using this schema and programmed in Rust. I have tried the
> following code, but I couldn't get this working:
>
>
>     let mut inputFile  = File::open("test.parquet").unwrap();
>     let builder = ParquetRecordBatchReaderBuilder::try_new(inputFile).unwrap();
>
>     let avroSchema = builder.schema().metadata.get("parquet.avro.schema").unwrap();
>
>     println!("Schema: {avroSchema}");
>     let avroSchema = Schema::parse_str(avroSchema).unwrap();
>

This is the actual Avro schema, the one the rows should be validated against.


>
>     let mut reader = builder.build().unwrap();
>
>
>     let jsonTemp = File::create("file.json").unwrap();
>     // let buf: Vec<String> = Vec::new();
>     let mut jsonWriter = arrow_json::LineDelimitedWriter::new(jsonTemp);
>     for row in reader.into_iter() {
>         jsonWriter.write(row.unwrap()).unwrap();
>         jsonWriter.finish().unwrap();
>     }
>
>     let avroFile = File::create("res.avro").unwrap();
>     let mut avroWriter = Writer::new(&avroSchema, avroFile);
>
>     let jsonTemp = File::open("file.json").unwrap();
>     for row in serde_json::Deserializer::from_reader(jsonTemp).into_iter() {
>         let v: serde_json::Value = row.unwrap();
>         let avroSchema = Schema::parse(&v).unwrap();
>

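This looks wrong: &v is a row (record), not a schema, so it should not be
passed to Schema::parse(). It also shadows the avroSchema parsed earlier from
parquet.avro.schema, which is the schema each row should be validated against.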
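
To make the point concrete, here is a minimal, library-independent sketch of
the intended control flow: the schema is obtained once, outside the loop, and
every row is checked against that same schema. RecordSchema, FieldType, Datum,
Row, validate() and example_valid_count() are hypothetical stand-ins invented
for this illustration; they are not the apache_avro API, and a real Avro
schema has many more types than the two modelled here.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-ins for illustration only; NOT the apache_avro types.
#[derive(Debug, Clone, PartialEq)]
enum FieldType {
    Long,
    Text,
}

#[derive(Debug, Clone, PartialEq)]
enum Datum {
    Long(i64),
    Text(String),
}

// A "schema" describes the expected fields; it is not a row of data.
struct RecordSchema {
    fields: Vec<(String, FieldType)>,
}

// A "row" carries values keyed by field name.
type Row = BTreeMap<String, Datum>;

// A row is valid if it has exactly the schema's fields with matching types.
fn validate(row: &Row, schema: &RecordSchema) -> bool {
    row.len() == schema.fields.len()
        && schema.fields.iter().all(|(name, ty)| {
            matches!(
                (row.get(name), ty),
                (Some(Datum::Long(_)), FieldType::Long)
                    | (Some(Datum::Text(_)), FieldType::Text)
            )
        })
}

// Helper that builds a well-formed row for the example schema.
fn row(id: i64, name: &str) -> Row {
    let mut r = Row::new();
    r.insert("id".to_string(), Datum::Long(id));
    r.insert("name".to_string(), Datum::Text(name.to_string()));
    r
}

// The schema is built once, before the loop; rows never replace it.
fn example_valid_count() -> usize {
    let schema = RecordSchema {
        fields: vec![
            ("id".to_string(), FieldType::Long),
            ("name".to_string(), FieldType::Text),
        ],
    };
    let mut bad = row(2, "b");
    bad.insert("name".to_string(), Datum::Long(0)); // wrong type for "name"
    let rows = vec![row(1, "a"), bad];
    rows.iter().filter(|r| validate(r, &schema)).count()
}
```

In the quoted code the equivalent change would be to drop the Schema::parse(&v)
line inside the loop and keep using the avroSchema parsed before it.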



>         let v: apache_avro::types::Value = v.into();
>
>         println!("Valid: {}", v.validate(&avroSchema));
>         //avroWriter.append(v).unwrap();
>     }
>
>     avroWriter.flush().unwrap();
>
> Any idea or advice? Has anyone tried to do the same?
>

Please share a minimal demo application that we can use to debug the problem,
e.g. a GitHub project.



>
> Thanks in advance!
>
> Best regards,
> David
>
