[ 
https://issues.apache.org/jira/browse/AIRFLOW-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151924#comment-16151924
 ] 

Siddharth commented on AIRFLOW-1560:
------------------------------------

This PR adds Airflow integration with AWS DynamoDB.

Currently there is no hook for interacting with DynamoDB, either for reading 
or for writing items (single or batch insertions). As a first use case, we 
want to push data into DynamoDB from Airflow jobs that are scheduled daily: 
the idea is to read aggregates from S3 and write them to DynamoDB, with the 
write job running every day. The first step is to create a DynamoDB hook 
(which this PR does); the next step is to create an operator that moves data 
from S3 to DynamoDB.

I noticed that Airflow currently has AWS_HOOK (the parent hook for connecting 
to AWS using credentials stored in configs). It has a function for connecting 
to AWS objects through the Client API 
(http://boto3.readthedocs.io/en/latest/reference/services/dynamodb.html#client), 
which is specific to EMR_HOOK. For inserting data, however, we can use the 
DynamoDB Resource API 
(http://boto3.readthedocs.io/en/latest/reference/services/dynamodb.html#service-resource), 
which provides higher-level abstractions for inserting data into DynamoDB. A 
natural question is what the difference between a client and a resource is, 
and why one would use one or the other: "Resources are higher-level 
abstraction than the raw, low-level calls made by service clients. They can't 
do anything the clients can't do, but in many cases they are nicer to use. The 
downside is that they don't always support 100% of the features of a service." 
(http://boto3.readthedocs.io/en/latest/guide/resources.html)
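
As a rough illustration of the Resource API route for batch inserts, here is a 
minimal sketch. The helper names get_table and write_batch_items, the default 
region, and the table and item values in the usage note are assumptions made 
for illustration and are not taken from the PR; boto3.resource, Table and 
batch_writer are the underlying boto3 calls.

```python
def get_table(table_name, region_name="us-east-1"):
    """Return a DynamoDB Table object via the boto3 Resource API.

    Hypothetical helper; the region default is an assumption.
    """
    # Imported lazily so this module can be loaded without boto3 installed.
    import boto3

    dynamodb = boto3.resource("dynamodb", region_name=region_name)
    return dynamodb.Table(table_name)


def write_batch_items(table, items):
    """Insert an iterable of item dicts into the given table.

    batch_writer() buffers the puts and sends them to DynamoDB in batches,
    so the caller does not have to split the items into chunks.
    Returns the number of items handed to the batch writer.
    """
    count = 0
    with table.batch_writer() as batch:
        for item in items:
            batch.put_item(Item=item)
            count += 1
    return count
```

With AWS credentials configured, a call might look like 
write_batch_items(get_table("my_table"), [{"id": "1", "total": 42}]), where 
"my_table" and the item keys are placeholders. Passing the table in as an 
argument keeps the write logic independent of how the connection is created, 
which is one possible way to fit it behind a hook.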


> Add AWS DynamoDB hook for inserting batch items
> -----------------------------------------------
>
>                 Key: AIRFLOW-1560
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1560
>             Project: Apache Airflow
>          Issue Type: New Feature
>          Components: aws, boto3, hooks
>            Reporter: Siddharth
>            Assignee: Siddharth
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
