[ https://issues.apache.org/jira/browse/AVRO-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Fokko Driesprong resolved AVRO-2203.
------------------------------------
    Resolution: Cannot Reproduce

Can't open the link provided. Please resubmit the issue when it still persists with Avro 1.8.2

> avro module in python generates different bytes while writing file to local
> storage and s3
> -------------------------------------------------------------------------------------------
>
>                 Key: AVRO-2203
>                 URL: https://issues.apache.org/jira/browse/AVRO-2203
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: python
>    Affects Versions: 1.8.0
>         Environment: S3. UNIX, HDFS, python
>            Reporter: Vinuthna
>            Priority: Blocker
>
> Hi,
> I am trying to convert a csv file to avro format and store it on S3 storage
> using python. During this process, I see that there is data loss in the file
> written to s3 storage. This is confirmed by converting the avro file on local
> storage and avro file on s3 storage to json format by comparing the content
> and total number of lines present in each file.
> A deep investigation into this issue shows that avro data generated while
> writing to local storage is not exactly same as the avro data generated while
> writing to s3 storage.
> I suspect issue is in getting a writer object using DatumWriter.
> writer = avro.datafile.DataFileWriter(<fileobject>, avro.io.DatumWriter(),
> schema)
> Exact code is present in git hub link below-
> https://github.com/mpenkov/smart_open/blob/209/integration-tests/test_209.py
> Could you please help solve this issue?
>
> Thanks
> Vinuthna
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
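
The check described in the report (converting both Avro files to JSON and comparing content and line counts) can be sketched with the standard library alone. This is only an illustration, not the reporter's code: the function names `canonical_records` and `compare_dumps` are hypothetical, and it assumes each dump has one JSON record per line. Comparing decoded records is likely more informative than comparing raw bytes, since an Avro object container file embeds a randomly generated 16-byte sync marker, so two writes of the same records are not expected to be byte-identical.

```python
from collections import Counter
import json


def canonical_records(lines):
    """Count records in an iterable of JSON lines, ignoring key order and blank lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if line:
            # Re-serialize with sorted keys so field order does not matter.
            counts[json.dumps(json.loads(line), sort_keys=True)] += 1
    return counts


def compare_dumps(local_lines, remote_lines):
    """Compare two JSON-lines dumps (hypothetical helper, e.g. local vs. S3 copy)."""
    local = canonical_records(local_lines)
    remote = canonical_records(remote_lines)
    return {
        "local_count": sum(local.values()),
        "remote_count": sum(remote.values()),
        # Multiset differences: records present on one side but not the other.
        "missing_remotely": sorted((local - remote).elements()),
        "extra_remotely": sorted((remote - local).elements()),
    }
```

If `missing_remotely` only ever contains the last records of the file, one thing worth checking (an assumption, not something the report confirms) is whether the `DataFileWriter` and the underlying file object were both closed before the files were compared.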