[jira] [Closed] (BEAM-4752) Import error in apache_beam.internal.pickler: "'module' object has no attribute 'dill'"

2018-07-10 Thread Barry Hart (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barry Hart closed BEAM-4752.

   Resolution: Invalid
Fix Version/s: 2.4.0

> Import error in apache_beam.internal.pickler: "'module' object has no 
> attribute 'dill'"
> ---
>
>              Key: BEAM-4752
>              URL: https://issues.apache.org/jira/browse/BEAM-4752
>          Project: Beam
>       Issue Type: Bug
>       Components: sdk-py-core
> Affects Versions: 2.4.0
>      Environment: CentOS Linux release 7.4.1708
>                   Python 2.7.13
>         Reporter: Barry Hart
>         Assignee: Ahmet Altay
>         Priority: Major
>          Fix For: 2.4.0
>
>
> I'm seeing the following error (stack trace below). I looked at the module 
> structure of the {{dill}} library, and it does not have a {{dill}} submodule 
> (although it *does* have a {{_dill}} submodule). I think the correct way to 
> reference {{Pickler}} is simply {{dill.Pickler}}.
> {noformat}
> Traceback (most recent call last):
>   File "script/beam_run_model.py", line 29, in <module>
>     import apache_beam as beam
>   File "/usr/local/pyenv/versions/2.7.13/lib/python2.7/site-packages/apache_beam/__init__.py", line 84, in <module>
>     import apache_beam.internal.pickler
>   File "/usr/local/pyenv/versions/2.7.13/lib/python2.7/site-packages/apache_beam/internal/pickler.py", line 107, in <module>
>     dill.dill.Pickler.dispatch[type])
> AttributeError: 'module' object has no attribute 'dill'
> {noformat}
> Oddly, I have successfully used Beam 2.4.0 in the past with this version of 
> Dill.  ¯_(ツ)_/¯



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-4752) Import error in apache_beam.internal.pickler: "'module' object has no attribute 'dill'"

2018-07-10 Thread Barry Hart (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539141#comment-16539141
 ] 

Barry Hart commented on BEAM-4752:
--

Apparently something odd happened when I was installing local Python libraries. 
I suspect I got a newer version of {{dill}}, as newer versions have 
{{_dill.py}} but not {{dill.py}}.

Closing...
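
One way to check a suspicion like this is to look at which submodules the installed package actually exposes. The sketch below is illustrative only: {{describe_layout}} is a hypothetical helper, not part of dill or Beam, and the stand-in modules merely imitate the two layouts mentioned above.

```python
import types


def describe_layout(pkg, names=("dill", "_dill")):
    """Report which of the given attribute names exist on a package.

    Hypothetical diagnostic helper; not part of dill or Beam.
    """
    return {name: hasattr(pkg, name) for name in names}


# Stand-ins that imitate the two layouts described in this thread.
older = types.ModuleType("dill")
older.dill = types.ModuleType("dill.dill")

newer = types.ModuleType("dill")
newer._dill = types.ModuleType("dill._dill")

print(describe_layout(older))
print(describe_layout(newer))
```

In a real environment, {{import dill}} followed by {{describe_layout(dill)}} and a look at {{dill.__version__}} would show which layout is installed.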



[jira] [Commented] (BEAM-2810) Consider a faster Avro library in Python

2018-07-10 Thread Barry Hart (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539079#comment-16539079
 ] 

Barry Hart commented on BEAM-2810:
--

I am a fairly frequent contributor to the fastavro library. I'm happy to try 
and help if it needs some tweaks.

FWIW, the library is pretty mature and works well for our project. It gets 
small changes from time to time, but generally nothing major. Probably the 
last big changes were in late 2017, when we did some Cython work that made 
reads about 30% faster and writes about 2x faster.

> Consider a faster Avro library in Python
> 
>
>                Key: BEAM-2810
>                URL: https://issues.apache.org/jira/browse/BEAM-2810
>            Project: Beam
>         Issue Type: Bug
>         Components: sdk-py-core
>           Reporter: Eugene Kirpichov
>           Assignee: Ryan Williams
>           Priority: Major
>         Time Spent: 5h 40m
> Remaining Estimate: 0h
>
> https://stackoverflow.com/questions/45870789/bottleneck-on-data-source
> Seems like this job is reading Avro files (exported by BigQuery) at about 2 
> MB/s.
> We use the standard Python "avro" library which is apparently known to be 
> very slow (10x+ slower than Java) 
> http://apache-avro.679487.n3.nabble.com/Avro-decode-very-slow-in-Python-td4034422.html,
>  and there are alternatives e.g. https://pypi.python.org/pypi/fastavro/





[jira] [Commented] (BEAM-4752) Import error in apache_beam.internal.pickler: "'module' object has no attribute 'dill'"

2018-07-10 Thread Barry Hart (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539075#comment-16539075
 ] 

Barry Hart commented on BEAM-4752:
--

I can work around the problem by doing the following before {{import 
apache_beam}}.

 
{code:python}
import dill
if not hasattr(dill, 'dill'):
    dill.dill = dill._dill
{code}
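
The same workaround can be wrapped in a small function so that it is safe to call more than once and does nothing when neither layout is present. This is only a sketch: {{alias_legacy_submodule}} is a hypothetical name, the stand-in module imitates a dill that ships only {{_dill}}, and the alias hides a version mismatch rather than fixing it.

```python
import types


def alias_legacy_submodule(pkg, legacy="dill", current="_dill"):
    """Expose pkg.<current> under the name pkg.<legacy> if the latter is missing.

    Returns True if an alias was added, False otherwise.
    Hypothetical helper; not part of dill or Beam.
    """
    if hasattr(pkg, legacy) or not hasattr(pkg, current):
        return False
    setattr(pkg, legacy, getattr(pkg, current))
    return True


# Stand-in for a dill package that only ships the `_dill` submodule.
fake_dill = types.ModuleType("dill")
fake_dill._dill = types.ModuleType("dill._dill")

added = alias_legacy_submodule(fake_dill)
```

In a real script this would be {{import dill}} followed by {{alias_legacy_submodule(dill)}}, placed before {{import apache_beam}}.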



[jira] [Updated] (BEAM-4752) Import error in apache_beam.internal.pickler: "'module' object has no attribute 'dill'"

2018-07-10 Thread Barry Hart (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barry Hart updated BEAM-4752:
-
Description: 
I'm seeing the following error (stack trace below). I looked at the module 
structure of the {{dill}} library, and it does not have a {{dill}} submodule 
(although it *does* have a {{_dill}} submodule). I think the correct way to 
reference {{Pickler}} is simply {{dill.Pickler}}.
{noformat}
Traceback (most recent call last):
  File "script/beam_run_model.py", line 29, in <module>
    import apache_beam as beam
  File "/usr/local/pyenv/versions/2.7.13/lib/python2.7/site-packages/apache_beam/__init__.py", line 84, in <module>
    import apache_beam.internal.pickler
  File "/usr/local/pyenv/versions/2.7.13/lib/python2.7/site-packages/apache_beam/internal/pickler.py", line 107, in <module>
    dill.dill.Pickler.dispatch[type])
AttributeError: 'module' object has no attribute 'dill'
{noformat}
Oddly, I have successfully used Beam 2.4.0 in the past with this version of 
Dill.  ¯_(ツ)_/¯

  was:
I'm seeing the following error. The {{dill}} library has no {{dill}} submodule. 
I think the correct way to reference {{Pickler}} is simply {{dill.Pickler}}.
{noformat}
Traceback (most recent call last):
  File "script/beam_run_model.py", line 29, in <module>
    import apache_beam as beam
  File "/usr/local/pyenv/versions/2.7.13/lib/python2.7/site-packages/apache_beam/__init__.py", line 84, in <module>
    import apache_beam.internal.pickler
  File "/usr/local/pyenv/versions/2.7.13/lib/python2.7/site-packages/apache_beam/internal/pickler.py", line 107, in <module>
    dill.dill.Pickler.dispatch[type])
AttributeError: 'module' object has no attribute 'dill'
{noformat}
Oddly, I have successfully used Beam 2.4.0 in the past with this version of 
Dill.  ¯\_(ツ)_/¯




[jira] [Created] (BEAM-4752) Import error in apache_beam.internal.pickler: "'module' object has no attribute 'dill'"

2018-07-10 Thread Barry Hart (JIRA)
Barry Hart created BEAM-4752:


         Summary: Import error in apache_beam.internal.pickler: "'module' object has no attribute 'dill'"
             Key: BEAM-4752
             URL: https://issues.apache.org/jira/browse/BEAM-4752
         Project: Beam
      Issue Type: Bug
      Components: sdk-py-core
Affects Versions: 2.4.0
     Environment: CentOS Linux release 7.4.1708
                  Python 2.7.13
        Reporter: Barry Hart
        Assignee: Ahmet Altay


I'm seeing the following error. The {{dill}} library has no {{dill}} submodule. 
I think the correct way to reference {{Pickler}} is simply {{dill.Pickler}}.
{noformat}
Traceback (most recent call last):
  File "script/beam_run_model.py", line 29, in <module>
    import apache_beam as beam
  File "/usr/local/pyenv/versions/2.7.13/lib/python2.7/site-packages/apache_beam/__init__.py", line 84, in <module>
    import apache_beam.internal.pickler
  File "/usr/local/pyenv/versions/2.7.13/lib/python2.7/site-packages/apache_beam/internal/pickler.py", line 107, in <module>
    dill.dill.Pickler.dispatch[type])
AttributeError: 'module' object has no attribute 'dill'
{noformat}
Oddly, I have successfully used Beam 2.4.0 in the past with this version of 
Dill.  ¯\_(ツ)_/¯


