[ 
https://issues.apache.org/jira/browse/BEAM-5315?focusedWorklogId=168067&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-168067
 ]

ASF GitHub Bot logged work on BEAM-5315:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 21/Nov/18 00:32
            Start Date: 21/Nov/18 00:32
    Worklog Time Spent: 10m 
      Work Description: aaltay closed pull request #6925: [BEAM-5315] Partially 
port IO: avro schema parsing and codecs
URL: https://github.com/apache/beam/pull/6925
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/python/apache_beam/io/avroio.py 
b/sdks/python/apache_beam/io/avroio.py
index 79c2505df4a..b648ad205ba 100644
--- a/sdks/python/apache_beam/io/avroio.py
+++ b/sdks/python/apache_beam/io/avroio.py
@@ -330,14 +330,14 @@ def offset(self):
 
   @staticmethod
   def _decompress_bytes(data, codec):
-    if codec == 'null':
+    if codec == b'null':
       return data
-    elif codec == 'deflate':
+    elif codec == b'deflate':
       # zlib.MAX_WBITS is the window size. '-' sign indicates that this is
       # raw data (without headers). See zlib and Avro documentations for more
       # details.
       return zlib.decompress(data, -zlib.MAX_WBITS)
-    elif codec == 'snappy':
+    elif codec == b'snappy':
       # Snappy is an optional avro codec.
       # See Snappy and Avro documentation for more details.
       try:
@@ -360,8 +360,10 @@ def num_records(self):
   def records(self):
     decoder = avroio.BinaryDecoder(
         io.BytesIO(self._decompressed_block_bytes))
-    reader = avroio.DatumReader(
-        writers_schema=self._schema, readers_schema=self._schema)
+
+    writer_schema = self._schema
+    reader_schema = self._schema
+    reader = avroio.DatumReader(writer_schema, reader_schema)
 
     current_record = 0
     while current_record < self._num_records:
diff --git a/sdks/python/apache_beam/io/filesystemio_test.py 
b/sdks/python/apache_beam/io/filesystemio_test.py
index f11e1c45623..0a038812b02 100644
--- a/sdks/python/apache_beam/io/filesystemio_test.py
+++ b/sdks/python/apache_beam/io/filesystemio_test.py
@@ -177,7 +177,7 @@ def _read_and_verify(self, stream, expected, buffer_size):
       data_list.append(data)
       bytes_read += len(data)
       self.assertEqual(stream.tell(), bytes_read)
-    self.assertEqual(''.join(data_list), expected)
+    self.assertEqual(b''.join(data_list), expected)
 
   def test_pipe_stream(self):
     block_sizes = list(4**i for i in range(0, 12))
diff --git a/sdks/python/apache_beam/io/tfrecordio_test.py 
b/sdks/python/apache_beam/io/tfrecordio_test.py
index 6421fa5c591..07480b22519 100644
--- a/sdks/python/apache_beam/io/tfrecordio_test.py
+++ b/sdks/python/apache_beam/io/tfrecordio_test.py
@@ -405,6 +405,9 @@ def test_process_auto(self):
         assert_that(result, equal_to(['foo', 'bar']))
 
 
[email protected](sys.version_info[0] == 3,
+                 'This test still needs to be fixed on Python 3'
+                 'TODO: BEAM-5623 - several IO tests hang indefinitely')
 class TestEnd2EndWriteAndRead(unittest.TestCase):
 
   def create_inputs(self):


 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 168067)
    Time Spent: 7h  (was: 6h 50m)

> Finish Python 3 porting for io module
> -------------------------------------
>
>                 Key: BEAM-5315
>                 URL: https://issues.apache.org/jira/browse/BEAM-5315
>             Project: Beam
>          Issue Type: Sub-task
>          Components: sdk-py-core
>            Reporter: Robbe
>            Assignee: Simon
>            Priority: Major
>          Time Spent: 7h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to