[jira] [Commented] (BEAM-5623) Several IO tests hang indefinitely during execution on Python 3.

2018-10-12 Thread Simon (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16648050#comment-16648050
 ] 

Simon commented on BEAM-5623:
-

I found that we can reproduce the 'hanging test' behaviour in otherwise 
working IO tests by replacing str literals with bytes literals (bytes being 
the 'standard' string type on Python 2). So a str/bytes mismatch does not 
necessarily result in an error; it can also make the test hang.

For example, the change

f.write('\n'.join(..)) -> f.write(b'\n'.join(..))

fixes a failure in one test, but makes another test hang.
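
To illustrate the str/bytes distinction behind the example above, here is a 
minimal sketch that uses in-memory buffers from the standard library instead 
of the actual test files. The helper name try_write is hypothetical and not 
part of Beam. It only shows that Python 3 rejects a mismatched write with a 
TypeError; it does not reproduce the hang itself.

```python
import io


def try_write(buffer, payload):
    """Write payload to buffer; return None on success, else the error text."""
    try:
        buffer.write(payload)
        return None
    except TypeError as exc:
        return str(exc)


lines = ['first', 'second']

# The same content joined as text and as bytes.
text_payload = '\n'.join(lines)
bytes_payload = b'\n'.join(line.encode('utf-8') for line in lines)

# Matching payload and buffer types: both writes succeed.
text_ok = try_write(io.StringIO(), text_payload)
bytes_ok = try_write(io.BytesIO(), bytes_payload)

# Mismatched types: Python 3 raises TypeError in both directions.
text_into_binary = try_write(io.BytesIO(), text_payload)
bytes_into_text = try_write(io.StringIO(), bytes_payload)

print(text_ok, bytes_ok)
print(text_into_binary)
print(bytes_into_text)
```

On Python 2 the str payload is already a byte string, which is presumably why 
the same test code behaves differently there.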

> Several IO tests hang indefinitely during execution on Python 3.
> 
>
> Key: BEAM-5623
> URL: https://issues.apache.org/jira/browse/BEAM-5623
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-py-core
> Reporter: Valentyn Tymofieiev
> Priority: Major
> Time Spent: 50m
> Remaining Estimate: 0h
>
> test_read_empty_single_file_no_eol_gzip 
> (apache_beam.io.textio_test.TextSourceTest) 
> Also several test cases in tfrecordio_test, for example:
> test_process_auto (apache_beam.io.tfrecordio_test.TestReadAllFromTFRecord)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5628) Several VcfIO tests fail in Python 3 with TypeError: cannot use a string pattern on a bytes-like object

2018-10-12 Thread Simon (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16647969#comment-16647969
 ] 

Simon commented on BEAM-5628:
-

This error can be traced back to the _create_generator function (io/vcfio.py, 
line 318), where it is noted that PyVCF makes explicit str() calls when 
parsing INFO fields, which fail on UTF-8 decoded strings. For this reason, 
the line is encoded back to UTF-8 in the Python 2 version. 

Removing the encoding step makes some tests hang, so there is a chance this 
is related to BEAM-5623.
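
The TypeError in the quoted traceback can be reproduced outside Beam with the 
standard re module. This is a minimal sketch: the tab pattern and the sample 
line are hypothetical stand-ins for PyVCF's _row_pattern and a VCF record, not 
the library's actual values.

```python
import re

# Hypothetical stand-in for PyVCF's _row_pattern: a pattern compiled from str.
row_pattern = re.compile('\t')

# A line that was encoded back to UTF-8 bytes before reaching the parser.
encoded_line = '20\t14370\trs6054257\n'.encode('utf-8')

# On Python 3, a str pattern cannot be applied to a bytes object.
try:
    row_pattern.split(encoded_line.rstrip())
    error = None
except TypeError as exc:
    error = str(exc)

# Decoding first gives the pattern the str input it expects.
fields = row_pattern.split(encoded_line.decode('utf-8').rstrip())

print(error)
print(fields)
```

Whether the line should be decoded before it is handed to PyVCF on Python 3 
is exactly the open question here, since removing the encoding step is what 
makes some tests hang.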

Does anyone have additional insights?

> Several VcfIO tests fail in Python 3 with  TypeError: cannot use a string 
> pattern on a bytes-like object
> 
>
> Key: BEAM-5628
> URL: https://issues.apache.org/jira/browse/BEAM-5628
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-py-core
> Reporter: Valentyn Tymofieiev
> Assignee: Simon
> Priority: Major
>
> ERROR: test_read_after_splitting (apache_beam.io.vcfio_test.VcfSourceTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/vcfio_test.py",
>  line 336, in test_read_after_splitting
>     split_records.extend(source_test_utils.read_from_source(*source_info))
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/source_test_utils.py",
>  line 101, in read_from_source
>     for value in reader:
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/vcfio.py",
>  line 264, in read_records
>     for line in record_iterator:
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/vcfio.py",
>  line 330, in __next__
>     record = next(self._vcf_reader)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/target/.tox/py3/lib/python3.5/site-packages/vcf/parser.py",
>  line 543, in __next__
>     row = self._row_pattern.split(line.rstrip())
> TypeError: cannot use a string pattern on a bytes-like object





[jira] [Assigned] (BEAM-5628) Several VcfIO tests fail in Python 3 with TypeError: cannot use a string pattern on a bytes-like object

2018-10-12 Thread Simon (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon reassigned BEAM-5628:
---

Assignee: Simon

> Several VcfIO tests fail in Python 3 with  TypeError: cannot use a string 
> pattern on a bytes-like object
> 
>
> Key: BEAM-5628
> URL: https://issues.apache.org/jira/browse/BEAM-5628
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-py-core
> Reporter: Valentyn Tymofieiev
> Assignee: Simon
> Priority: Major
>
> ERROR: test_read_after_splitting (apache_beam.io.vcfio_test.VcfSourceTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/vcfio_test.py",
>  line 336, in test_read_after_splitting
>     split_records.extend(source_test_utils.read_from_source(*source_info))
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/source_test_utils.py",
>  line 101, in read_from_source
>     for value in reader:
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/vcfio.py",
>  line 264, in read_records
>     for line in record_iterator:
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/vcfio.py",
>  line 330, in __next__
>     record = next(self._vcf_reader)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/target/.tox/py3/lib/python3.5/site-packages/vcf/parser.py",
>  line 543, in __next__
>     row = self._row_pattern.split(line.rstrip())
> TypeError: cannot use a string pattern on a bytes-like object





[jira] [Resolved] (BEAM-5624) Avro IO does not work with avro-python3 package out-of-the-box on Python 3, several tests fail with AttributeError (module 'avro.schema' has no attribute 'parse')

2018-10-12 Thread Simon (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon resolved BEAM-5624.
-
   Resolution: Fixed
Fix Version/s: 2.9.0

> Avro IO does not work with avro-python3 package out-of-the-box on Python 3, 
> several tests fail with AttributeError (module 'avro.schema' has no attribute 
> 'parse') 
> ---
>
> Key: BEAM-5624
> URL: https://issues.apache.org/jira/browse/BEAM-5624
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-py-core
> Reporter: Valentyn Tymofieiev
> Assignee: Simon
> Priority: Major
> Fix For: 2.9.0
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> ==
> ERROR: Failure: AttributeError (module 'avro.schema' has no attribute 'parse')
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/target/.tox/py3/lib/python3.5/site-packages/nose/failure.py",
>  line 39, in runTest
> raise self.exc_val.with_traceback(self.tb)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/target/.tox/py3/lib/python3.5/site-packages/nose/loader.py",
>  line 418, in loadTestsFromName
> addr.filename, addr.module)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/target/.tox/py3/lib/python3.5/site-packages/nose/importer.py",
>  line 47, in importFromPath
> return self.importFromDir(dir_path, fqname)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/target/.tox/py3/lib/python3.5/site-packages/nose/importer.py",
>  line 94, in importFromDir
> mod = load_module(part_fqname, fh, filename, desc)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/target/.tox/py3/lib/python3.5/imp.py",
>  line 234, in load_module
> return load_source(name, filename, file)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/target/.tox/py3/lib/python3.5/imp.py",
>  line 172, in load_source
> module = _load(spec)
>   File "<frozen importlib._bootstrap>", line 693, in _load
>   File "<frozen importlib._bootstrap>", line 673, in _load_unlocked
>   File "<frozen importlib._bootstrap_external>", line 673, in exec_module
>   File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/avroio_test.py",
>  line 54, in <module>
> class TestAvro(unittest.TestCase):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/avroio_test.py",
>  line 89, in TestAvro
> SCHEMA = avro.schema.parse('''
> AttributeError: module 'avro.schema' has no attribute 'parse'
> Note that we use a different implementation of the avro package (avro or 
> avro-python3) depending on the Python version. We are also evaluating a 
> potential replacement of avro with fastavro.
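
The AttributeError above arises because the two packages spell the schema 
parser differently. Below is a minimal sketch of a version-agnostic lookup. 
It assumes (this thread does not confirm it) that the Python 2 avro package 
exposes avro.schema.parse while avro-python3 exposes avro.schema.Parse. The 
helper name get_schema_parser is hypothetical, and the fake modules stand in 
for the real packages so the snippet runs without avro installed. It is not 
necessarily how the fix in 2.9.0 was made.

```python
import types


def get_schema_parser(schema_module):
    """Return the schema-parsing callable under whichever name exists."""
    for name in ('parse', 'Parse'):
        parser = getattr(schema_module, name, None)
        if callable(parser):
            return parser
    raise AttributeError('no schema parser found on %r' % (schema_module,))


# Stand-ins for avro.schema in the two packages (assumed spellings).
py2_style = types.SimpleNamespace(parse=lambda text: ('parsed', text))
py3_style = types.SimpleNamespace(Parse=lambda text: ('parsed', text))

schema_text = '{"type": "string"}'
result_py2 = get_schema_parser(py2_style)(schema_text)
result_py3 = get_schema_parser(py3_style)(schema_text)

print(result_py2 == result_py3)
```

With the real packages, the call would presumably be 
get_schema_parser(avro.schema)(schema_text) in place of the stand-ins.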





[jira] [Commented] (BEAM-5321) Finish Python 3 porting for transforms module

2018-10-03 Thread Simon (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636640#comment-16636640
 ] 

Simon commented on BEAM-5321:
-

Hi [~altay], would it be possible to assign this task to [~Juta]? Thank you!

> Finish Python 3 porting for transforms module
> -
>
> Key: BEAM-5321
> URL: https://issues.apache.org/jira/browse/BEAM-5321
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-py-core
> Reporter: Robbe
> Priority: Major
>






[jira] [Commented] (BEAM-5322) Finish Python 3 porting for typehints module

2018-10-01 Thread Simon (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16634109#comment-16634109
 ] 

Simon commented on BEAM-5322:
-

Hi, I will pick up typehints where Robbe left off. Nice to meet you all!

> Finish Python 3 porting for typehints module
> 
>
> Key: BEAM-5322
> URL: https://issues.apache.org/jira/browse/BEAM-5322
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-py-core
> Reporter: Robbe
> Assignee: Robbe
> Priority: Major
>



