Kengo Seki created AIRFLOW-5730:
-----------------------------------

             Summary: Enable get_pandas_df on Druid and Pinot DbApiHooks
                 Key: AIRFLOW-5730
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5730
             Project: Apache Airflow
          Issue Type: Improvement
          Components: hooks
    Affects Versions: 1.10.5
            Reporter: Kengo Seki
            Assignee: Kengo Seki


Currently, DruidDbApiHook and PinotDbApiHook disable their {{get_pandas_df}} 
methods by raising {{NotImplementedError}}.
But they actually work as inherited from DbApiHook, as follows:

{code}
$ git diff
diff --git a/airflow/contrib/hooks/pinot_hook.py 
b/airflow/contrib/hooks/pinot_hook.py
index e617f8e9b..0864b3584 100644
--- a/airflow/contrib/hooks/pinot_hook.py
+++ b/airflow/contrib/hooks/pinot_hook.py
@@ -90,8 +90,5 @@ class PinotDbApiHook(DbApiHook):
     def set_autocommit(self, conn, autocommit):
         raise NotImplementedError()
 
-    def get_pandas_df(self, sql, parameters=None):
-        raise NotImplementedError()
-
     def insert_rows(self, table, rows, target_fields=None, commit_every=1000):
         raise NotImplementedError()
diff --git a/airflow/hooks/druid_hook.py b/airflow/hooks/druid_hook.py
index c3cd3cd71..e2e20f1ec 100644
--- a/airflow/hooks/druid_hook.py
+++ b/airflow/hooks/druid_hook.py
@@ -158,8 +158,5 @@ class DruidDbApiHook(DbApiHook):
     def set_autocommit(self, conn, autocommit):
         raise NotImplementedError()
 
-    def get_pandas_df(self, sql, parameters=None):
-        raise NotImplementedError()
-
     def insert_rows(self, table, rows, target_fields=None, commit_every=1000):
         raise NotImplementedError()
{code}

{code:title=Druid example}
$ airflow connections list

(snip)

├────────────────────────────────┼─────────────────────────────┼───────────────────────────┼────────┼────────────────┼──────────────────────┼────────────────────────────────┤
│ 'druid_broker_default'         │ 'druid-broker'              │ 'localhost'    
           │ 8082   │ False          │ True                 │ 
'gAAAAABdrxvt...M1ideRO8233QG' │
╘════════════════════════════════╧═════════════════════════════╧═══════════════════════════╧════════╧════════════════╧══════════════════════╧════════════════════════════════╛

$ ipython

(snip)

In [2]: from airflow.hooks.druid_hook import DruidDbApiHook                     
                                                                                
             

In [3]: DruidDbApiHook().get_pandas_df("SELECT * FROM wikipedia WHERE sum_delta 
> %(num)d", {"num": 2000})                                                      
             
[2019-10-23 23:28:18,606] {base_hook.py:89} INFO - Using connection to: id: 
druid_broker_default. Host: localhost, Port: 8082, Schema: None, Login: None, 
Password: None, extra: {'schema': 'http', 'endpoint': '/druid/v2/sql'}
[2019-10-23 23:28:18,607] {druid_hook.py:140} INFO - Get the connection to 
druid broker on localhost using user None
Out[3]: 
                       __time        channel cityName                           
                 comment  ...  sum_deleted sum_delta sum_metroCode              
      user
0    2015-09-12T00:00:00.000Z  #en.wikipedia           Archiving case from 
[[Wikipedia:Sockpuppet inv...  ...            0      3360             0         
          Bbb23
1    2015-09-12T00:00:00.000Z  #ja.wikipedia           
[[Special:Contributions/119.224.209.170|119.22...  ...            0      6853   
          0                 Kkairri
2    2015-09-12T01:00:00.000Z  #en.wikipedia                                    
         /* Hong Kong */  ...            0      4500             0              
   Bertaut
3    2015-09-12T01:00:00.000Z  #en.wikipedia           Archiving 1 
discussion(s) from [[User talk:New...  ...            0      3599             0 
 Lowercase sigmabot III
4    2015-09-12T01:00:00.000Z  #en.wikipedia           [[WP:AES|←]]Created page 
with '{{Infobox wildf...  ...            0     13335             0              
    Orygun
..                        ...            ...      ...                           
                     ...  ...          ...       ...           ...              
       ...
851  2015-09-12T23:00:00.000Z  #pt.wikipedia                Bem-vindo (usando 
[[WP:H|Huggle]])  (3.1.16)  ...            0      2588             0            
    Mobyduck
852  2015-09-12T23:00:00.000Z  #pt.wikipedia           adição de informação, 
renovação de conteúdos e...  ...            0      3666             0           
Templarius 01
853  2015-09-12T23:00:00.000Z  #ru.wikipedia           [[ВП:←|←]] Новая 
страница: «{{редактирую|~~~~|...  ...            0      6766             0      
           Dulamas
854  2015-09-12T23:00:00.000Z  #ru.wikipedia     Tver  [[ВП:×|отмена]] правки 
73302711 участника [[Sp...  ...            0      9302             0            
94.241.56.71
855  2015-09-12T23:00:00.000Z  #sr.wikipedia           Нова страница: 
[[Датотека:US Open.svg|десно|20...  ...            0     38443             0    
           Самарџија

[856 rows x 21 columns]
{code}

{code:title=Pinot example}
$ airflow connections list

(snip)

├────────────────────────────────┼─────────────────────────────┼───────────────────────────┼────────┼────────────────┼──────────────────────┼────────────────────────────────┤
│ 'pinot_broker_default'         │ 'pinot_broker_conn_id'      │ 'localhost'    
           │ 8000   │ False          │ True                 │ 
'gAAAAABdrxRj...Afd51PZY94nfa' │
├────────────────────────────────┼─────────────────────────────┼───────────────────────────┼────────┼────────────────┼──────────────────────┼────────────────────────────────┤

$ ipython

(snip)

In [2]: from airflow.contrib.hooks.pinot_hook import PinotDbApiHook

In [3]: PinotDbApiHook().get_pandas_df("select sum('runs') from baseballStats 
where yearID>=%(num)d group by playerName", {"num": 2000})
[2019-10-23 23:31:06,058] {base_hook.py:89} INFO - Using connection to: id: 
pinot_broker_default. Host: localhost, Port: 8000, Schema: None, Login: None, 
Password: None, extra: {'endpoint': '/query', 'schema': 'http'}
[2019-10-23 23:31:06,059] {pinot_hook.py:48} INFO - Get the connection to pinot 
broker on localhost
select sum('runs') from baseballStats where yearID>=2000 group by playerName
Out[3]: 
           playerName    sum_runs
0              Adrian  1820.00000
1        Jose Antonio  1692.00000
2              Rafael  1565.00000
3       Brian Michael  1500.00000
4        Jose Alberto  1426.00000
5  Alexander Emmanuel  1426.00000
6     Derek Sanderson  1390.00000
7              Carlos  1314.00000
8        Johnny David  1300.00000
9              Ichiro  1261.00000
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to