kamilwu commented on a change in pull request #12927:
URL: https://github.com/apache/beam/pull/12927#discussion_r498217598



##########
File path: sdks/python/apache_beam/io/avroio.py
##########
@@ -627,11 +627,19 @@ def write_record(self, writer, value):
     writer.append(value)
 
 
+class _FastAvroWriter(Writer):
+  """An adapter class which exposes a file handle so that it can be closed
+  by the sink. """
+  def __init__(self, file_handle, schema, codec):
+    super(_FastAvroWriter, self).__init__(file_handle, schema, codec)

Review comment:
       > Was it a unit test failure that occurred locally?
   
   Yes.
   
   > I wonder why our non-cython test suite does not catch it.
   
   I think it doesn't really depend on whether cython is installed or not. `pip 
install fastavro` downloads a wheel built on the specific platform, which 
already contains libraries that can be imported. For example, on my Linux 
machine:
   
   ```
   pip install --force-reinstall fastavro==0.23.6
   
   Collecting fastavro==0.23.6
     Using cached fastavro-0.23.6-cp37-cp37m-manylinux2010_x86_64.whl (1.4 MB)
   ```
   
   ```
   ls site-packages/fastavro | grep write 
   
   json_write.py
   _logical_writers.cpython-37m-x86_64-linux-gnu.so
   logical_writers.py
   _logical_writers_py.py
   _write.cpython-37m-x86_64-linux-gnu.so   <-- this is the library I'm talking 
about
   write.py
   _write_py.py
   ```
   
   I'm pretty sure the same happens on Jenkins and on our users' machines. So 
how did I catch the bug? That `.so` library links with `libpython3.x.so`:
   
   ```
   ldd site-packages/fastavro/_write.cpython-37m-x86_64-linux-gnu.so
   
   linux-vdso.so.1 (0x00007ffd7abf7000)
   libpython3.7m.so.1.0 => not found
   libc.so.6 => /lib64/libc.so.6 (0x00007f8c548e3000)
   /lib64/ld-linux-x86-64.so.2 (0x00007f8c54d07000)
   ```
   which I didn't have on my machine due to misconfiguration (I already know 
it's pyenv fault, but this is a different story). That's why the first `import` 
statement failed, `ImportError` was raised, and pure Python version was used 
instead.
   
   To sum up, it is a problem, but the chances that someone will encounter this 
are pretty low. Anyway, I think it still makes sense to file a JIRA and test 
the workaround that you proposed. 
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to