[ 
https://issues.apache.org/jira/browse/BEAM-6748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16787140#comment-16787140
 ] 

Valentyn Tymofieiev commented on BEAM-6748:
-------------------------------------------

The block size is hardcoded in the avro library, and the value differs between 
the Python 2 and Python 3 implementations:

[https://github.com/apache/avro/blob/f173ae8d690b5f90e8cc5899b654762a9d11e17d/lang/py/src/avro/datafile.py#L39]

[https://github.com/apache/avro/blob/f173ae8d690b5f90e8cc5899b654762a9d11e17d/lang/py3/avro/datafile.py#L57]
 

Something similar probably happens in the fastavro library.
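
As a rough illustration of why a different block size changes the split point 
report, here is a minimal sketch. It is not the avro or Beam code: 
count_blocks and last_block_report are hypothetical helpers, the record size 
and sync intervals are made-up numbers chosen only to reproduce the shape of 
the failure, and the sketch assumes that while the last block is being read 
the tracker reports (blocks already consumed, 1).

```python
def count_blocks(num_records, record_bytes, sync_interval_bytes):
    """Estimate how many blocks a writer emits (hypothetical model).

    Assumes the writer flushes a block as soon as the buffered bytes
    reach the sync interval, and flushes any remainder on close.
    """
    blocks = 0
    buffered = 0
    for _ in range(num_records):
        buffered += record_bytes
        if buffered >= sync_interval_bytes:
            blocks += 1
            buffered = 0
    if buffered > 0:
        blocks += 1  # final partial block written on close
    return blocks


def last_block_report(num_blocks):
    """Assumed report while reading the last block: (consumed, remaining)."""
    return (num_blocks - 1, 1)


# Made-up numbers, not the real avro constants.
large_interval = count_blocks(1200, 10, 4000)
small_interval = count_blocks(1200, 10, 1100)
print(large_interval, last_block_report(large_interval))
print(small_interval, last_block_report(small_interval))
```

Under these assumptions the same data yields a different number of blocks, so 
a test that hardcodes the expected report for one block size would fail for 
the other.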

> Splitting logic in Avro IO tests behaves unexpectedly in Python 3
> -----------------------------------------------------------------
>
>                 Key: BEAM-6748
>                 URL: https://issues.apache.org/jira/browse/BEAM-6748
>             Project: Beam
>          Issue Type: Sub-task
>          Components: sdk-py-core
>            Reporter: Valentyn Tymofieiev
>            Assignee: Valentyn Tymofieiev
>            Priority: Major
>
> *apache_beam.io.avroio_test.TestAvro.test_split_points*
> *apache_beam.io.avroio_test.TestFastAvro.test_split_points*
> fail with:
>  
> {code:java}
> Traceback (most recent call last):
>  File "/home/robbe/workspace/beam/sdks/python/apache_beam/io/avroio_test.py", line 308, in test_split_points
>  self.assertEquals(split_points_report[-10:], [(2, 1)] * 10)
> AssertionError: Lists differ: [(10, 1), (10, 1), (10, 1), (10, 1), (10, 1[42 chars], 1)] != [(2, 1), (2, 1), (2, 1), (2, 1), (2, 1), (2[32 chars], 1)]
> First differing element 0:
> (10, 1)
> (2, 1)
> + [(2, 1), (2, 1), (2, 1), (2, 1), (2, 1), (2, 1), (2, 1), (2, 1), (2, 1), (2, 1)]
> - [(10, 1),
> - (10, 1),
> - (10, 1),
> - (10, 1),
> - (10, 1),
> - (10, 1),
> - (10, 1),
> - (10, 1),
> - (10, 1),
> - (10, 1)] {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
