Hi Salim,

AvroSchemaConverter belongs to the Apache Parquet project (parquet-mr), not to
Apache Avro, so please contact d...@parquet.apache.org instead:
https://github.com/apache/parquet-mr/blob/master/parquet-avro/src/main/java/org/apache/parquet/avro/AvroSchemaConverter.java
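
In the meantime, one possible workaround is to rescale the value yourself. The
sketch below is only an illustration. It assumes the long you get back is the
unscaled decimal value (825 in your example) and that you know the scale (2 in
your example) from the decimal annotation in the Parquet schema. DecimalRescale
and fromUnscaled are hypothetical names, not part of Avro or parquet-avro.

```java
import java.math.BigDecimal;

// Hypothetical helper for illustration; not part of Avro or parquet-avro.
final class DecimalRescale {
    private DecimalRescale() {}

    // Treats the raw INT32/INT64 value as the unscaled value of a decimal
    // and applies the scale taken from the Parquet decimal annotation.
    static BigDecimal fromUnscaled(long unscaled, int scale) {
        return BigDecimal.valueOf(unscaled, scale);
    }

    public static void main(String[] args) {
        // Assumed inputs: unscaled value 825 with scale 2, as in the example.
        System.out.println(fromUnscaled(825L, 2)); // prints 8.25
    }
}
```

An int32 value widens to long without loss, so the same helper should cover
both primitive types.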

Martin

On Tue, Dec 12, 2023 at 6:30 PM Salim Memon <salim.me...@capitalone.com>
wrote:

> Morning Martin,
>
> The library we are using to parse the Parquet file is
> org.apache.parquet-hadoop-1.12.0 (ParquetFileReader.java), and to convert
> the Parquet schema to an Avro schema we are using
> org.apache.parquet-avro-1.12.0 (AvroSchemaConverter.java). Here is the code
> snippet doing the work.
>
> // Locate a Parquet file whose footer carries the schema (our own helper).
> Path schemaPath =
>     HadoopParquetUtils.getFirstFullParquetPath(hadoopFilePath, configuration);
> // Open the file and read the Parquet schema from its footer.
> ParquetFileReader r =
>     ParquetFileReader.open(HadoopInputFile.fromPath(schemaPath, configuration));
> MessageType messageType = r.getFooter().getFileMetaData().getSchema();
> // Convert the Parquet schema to an Avro schema.
> AvroSchemaConverter converter = new AvroSchemaConverter(configuration);
> Schema schema = converter.convert(messageType);
>
> Best,
>
> Salim Memon
> Cell: (832) 314 5518
>
>
>
> On Tue, Dec 12, 2023 at 3:00 AM Martin Grigorov <mgrigo...@apache.org>
> wrote:
>
>> Hi Salim,
>>
>> Could you please give more details about the Avro tool/library you use?
>> I have the feeling you use some third-party library that is not supported
>> by the Apache Avro team.
>>
>> Martin
>>
>> On Mon, Dec 11, 2023 at 9:47 PM Salim Memon
>> <salim.me...@capitalone.com.invalid> wrote:
>>
>>> Hi Devs,
>>>
>>> We are currently running into an issue with a Parquet schema read from
>>> the footer of the file. A field is annotated with the logical type decimal
>>> (with a precision and scale) and has the optional primitive type int64 or
>>> int32. When we pass this schema through the Avro converter, the field
>>> comes back as a long, because the first check within the converter looks
>>> at the primitive type, and so the decimal scale is lost.
>>>
>>> e.g. 8.25 -> 825
>>>
>>> Attached are screenshots of the MessageType (parquet schema) and the
>>> output of the Avro converter. Is there anything I can do to retain the
>>> precision?
>>>
>>> Parquet-Avro version: 1.12.0
>>> Language: Java
>>> AvroReadSupport.READ_INT96_AS_FIXED, true
>>>
>>> Best,
>>>
>>> Salim Memon
>>> ------------------------------
>>>
>>> The information contained in this e-mail may be confidential and/or
>>> proprietary to Capital One and/or its affiliates and may only be used
>>> solely in performance of work or services for Capital One. The information
>>> transmitted herewith is intended only for use by the individual or entity
>>> to which it is addressed. If the reader of this message is not the intended
>>> recipient, you are hereby notified that any review, retransmission,
>>> dissemination, distribution, copying or other use of, or taking of any
>>> action in reliance upon this information is strictly prohibited. If you
>>> have received this communication in error, please contact the sender and
>>> delete the material from your computer.
