While following the quickstart at
http://pythonhosted.org/airflow/start.html (inside a fresh virtualenv,
under Ubuntu 14.04), I get the following error:
(venv)brian@cfprov:~/airflow$ airflow initdb
[2016-05-31 17:02:25,939] {__init__.py:36} INFO - Using executor SequentialExecutor
[2016-05-31 17:02:26,049] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
[2016-05-31 17:02:26,114] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
DB: sqlite:////home/brian/airflow/airflow.db
[2016-05-31 17:02:26,361] {db.py:222} INFO - Creating tables
INFO [alembic.runtime.migration] Context impl SQLiteImpl.
INFO [alembic.runtime.migration] Will assume non-transactional DDL.
INFO [alembic.runtime.migration] Running upgrade -> e3a246e0dc1, current schema
INFO [alembic.runtime.migration] Running upgrade e3a246e0dc1 -> 1507a7289a2f, create is_encrypted
/home/brian/airflow/venv/local/lib/python2.7/site-packages/alembic/util/messaging.py:69: UserWarning: Skipping unsupported ALTER for creation of implicit constraint
  warnings.warn(msg)
INFO [alembic.runtime.migration] Running upgrade 1507a7289a2f -> 13eb55f81627, maintain history for compatibility with earlier migrations
INFO [alembic.runtime.migration] Running upgrade 13eb55f81627 -> 338e90f54d61, More logging into task_isntance
INFO [alembic.runtime.migration] Running upgrade 338e90f54d61 -> 52d714495f0, job_id indices
INFO [alembic.runtime.migration] Running upgrade 52d714495f0 -> 502898887f84, Adding extra to Log
INFO [alembic.runtime.migration] Running upgrade 502898887f84 -> 1b38cef5b76e, add dagrun
INFO [alembic.runtime.migration] Running upgrade 1b38cef5b76e -> 2e541a1dcfed, task_duration
INFO [alembic.runtime.migration] Running upgrade 2e541a1dcfed -> 40e67319e3a9, dagrun_config
INFO [alembic.runtime.migration] Running upgrade 40e67319e3a9 -> 561833c1c74b, add password column to user
INFO [alembic.runtime.migration] Running upgrade 561833c1c74b -> 4446e08588, dagrun start end
INFO [alembic.runtime.migration] Running upgrade 4446e08588 -> bbc73705a13e, Add notification_sent column to sla_miss
INFO [alembic.runtime.migration] Running upgrade bbc73705a13e -> bba5a7cfc896, Add a column to track the encryption state of the 'Extra' field in connection
INFO [alembic.runtime.migration] Running upgrade bba5a7cfc896 -> 1968acfc09e3, add is_encrypted column to variable table
INFO [alembic.runtime.migration] Running upgrade 1968acfc09e3 -> 2e82aab8ef20, rename user table
ERROR [airflow.models.DagBag] Failed to import: /home/brian/airflow/venv/local/lib/python2.7/site-packages/airflow/example_dags/example_twitter_dag.py
Traceback (most recent call last):
  File "/home/brian/airflow/venv/local/lib/python2.7/site-packages/airflow/models.py", line 247, in process_file
    m = imp.load_source(mod_name, filepath)
  File "/home/brian/airflow/venv/local/lib/python2.7/site-packages/airflow/example_dags/example_twitter_dag.py", line 26, in <module>
    from airflow.operators import BashOperator, HiveOperator, PythonOperator
ImportError: cannot import name HiveOperator
Done.
(venv)brian@cfprov:~/airflow$
Airflow *does* appear to work (the web UI comes up), but I also get the
same error logged for the execution of the example_bash_operator code:
[2016-05-31 17:04:16,451] {models.py:154} INFO - Filling up the DagBag from /home/brian/airflow/dags/example_dags/example_bash_operator.py
[2016-05-31 17:04:16,452] {models.py:250} ERROR - Failed to import: /home/brian/airflow/venv/local/lib/python2.7/site-packages/airflow/example_dags/example_twitter_dag.py
Traceback (most recent call last):
  File "/home/brian/airflow/venv/local/lib/python2.7/site-packages/airflow/models.py", line 247, in process_file
    m = imp.load_source(mod_name, filepath)
  File "/home/brian/airflow/venv/local/lib/python2.7/site-packages/airflow/example_dags/example_twitter_dag.py", line 26, in <module>
    from airflow.operators import BashOperator, HiveOperator, PythonOperator
ImportError: cannot import name HiveOperator
If I try the import directly from the Python prompt:
>>> from airflow.operators.hive_operator import HiveOperator
[2016-05-31 17:24:59,316] {__init__.py:36} INFO - Using executor SequentialExecutor
[2016-05-31 17:24:59,427] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
[2016-05-31 17:24:59,484] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/brian/airflow/venv/local/lib/python2.7/site-packages/airflow/operators/hive_operator.py", line 4, in <module>
    from airflow.hooks import HiveCliHook
ImportError: cannot import name HiveCliHook
And again, one level further down:
>>> from airflow.hooks.hive_hooks import HiveCliHook
[2016-05-31 17:27:11,716] {__init__.py:36} INFO - Using executor SequentialExecutor
[2016-05-31 17:27:11,824] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
[2016-05-31 17:27:11,872] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/brian/airflow/venv/local/lib/python2.7/site-packages/airflow/hooks/hive_hooks.py", line 19, in <module>
    import unicodecsv as csv
ImportError: No module named unicodecsv
Aha. The real error (a missing module) has been hidden by import magic.
After "pip install unicodecsv" it seems happy:
>>> from airflow.operators import HiveOperator
[2016-05-31 17:28:15,828] {__init__.py:36} INFO - Using executor SequentialExecutor
[2016-05-31 17:28:15,936] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
[2016-05-31 17:28:15,984] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
>>>
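To illustrate how the error gets hidden: I have not checked exactly how
airflow/operators/__init__.py does it, but the behaviour above is consistent
with a package that imports its submodules inside a try/except and silently
skips any that fail. The sketch below is only that, a sketch under that
assumption. The function name import_attrs_quietly and the module name
no_such_module_for_demo are made up for the example and are not Airflow code.

```python
import importlib


def import_attrs_quietly(namespace, module_attrs):
    """Copy the named attributes from each module into `namespace`,
    silently skipping any module that fails to import."""
    for module_name, attr_names in module_attrs.items():
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # The underlying cause (for example a missing third-party
            # package) is discarded here, so the caller never sees it.
            continue
        for name in attr_names:
            namespace[name] = getattr(module, name)


namespace = {}
import_attrs_quietly(namespace, {
    "json": ["dumps"],                       # importable, so 'dumps' is exported
    "no_such_module_for_demo": ["Missing"],  # hypothetical name; fails silently
})
exported = sorted(namespace)
```

With this pattern, a later "from package import Missing" fails with "cannot
import name Missing" and gives no hint of the original ImportError. Importing
the submodule directly, as I did above, is what brings the real cause to the
surface.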
So: I think the "airflow" package is missing a dependency on "unicodecsv".
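For anyone hitting the same thing, here is a small check for whether a module
is importable in the current virtualenv. It is a generic sketch, not part of
Airflow; the helper name missing_modules is my own, and
no_such_module_for_demo is a made-up name used only to exercise the failure
path.

```python
import importlib


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# 'json' is in the standard library; the second name is hypothetical.
result = missing_modules(["json", "no_such_module_for_demo"])
```

In my case, missing_modules(["unicodecsv"]) would have returned
["unicodecsv"] before the "pip install unicodecsv" and an empty list
afterwards.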
Regards,
Brian Candler.
P.S. Side problem: I wanted to create an issue on JIRA but couldn't see
how. I have an Apache JIRA account (username "candlerb"). Under the
"Create" menu it offers me two choices:
* New JIRA Project # really, I can do this?!
* Create Service Desk Request # whatever that is
But all I wanted to do was create an issue. Am I missing something
obvious? Or are only project developers allowed to raise issues?