[jira] [Commented] (FALCON-36) Ability to ingest data from databases

Ajay Yadava (JIRA) Mon, 03 Aug 2015 21:01:19 -0700

    [ 
https://issues.apache.org/jira/browse/FALCON-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653027#comment-14653027
 ]


Ajay Yadava commented on FALCON-36:
-----------------------------------

[~me.venkatr] I understand your point. We can support the various 
functionalities by having different types of datasources instead of separate 
top-level entities. As an analogy, Oozie has several different types of 
actions, and each action has a lot of different capabilities and parameters. 
However, they are all still actions. Similarly, we can have different types of 
datasources and accommodate their various parameters, but we shouldn't have 
them as top-level entities. 

Another example of such behaviour is feed lifecycles. If we went by the common 
denominator of capabilities, import and retention are completely different, 
but we look at each of them as just another lifecycle of the feed.

The common denominator is not in capabilities but in what they represent: they 
are all sources of data, and you can import data from them. This seemingly 
pedantic difference is very important IMHO because it simplifies a lot of 
things. It becomes easy to build a new feature and make it available to all 
the datasources. Otherwise, it will be very confusing to have both streaming 
feeds and Kafka entities.
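
To make the idea concrete, here is a minimal sketch of what I mean. All names 
in it (DatasourceType, Datasource, describe_import, the parameter keys) are 
hypothetical and only for illustration; this is not Falcon's actual schema or 
API. The point is that there is a single entity kind whose type selects the 
parameters, so a feature written once applies to every type.

```python
from dataclasses import dataclass, field
from enum import Enum


class DatasourceType(Enum):
    # Hypothetical type names, for illustration only.
    DATABASE = "database"
    KAFKA = "kafka"


@dataclass
class Datasource:
    """A single top-level entity; the type decides which params are relevant."""
    name: str
    type: DatasourceType
    # Type-specific parameters live here instead of in separate entity kinds.
    params: dict = field(default_factory=dict)


def describe_import(source: Datasource, feed: str) -> str:
    """A feature written once against Datasource works for every type."""
    return (
        f"import into feed '{feed}' "
        f"from {source.type.value} datasource '{source.name}'"
    )


# Two datasources of different types, handled by the same code path.
db = Datasource("orders-db", DatasourceType.DATABASE, {"driver": "jdbc"})
stream = Datasource("clicks", DatasourceType.KAFKA, {"topic": "clicks"})

for src in (db, stream):
    print(describe_import(src, "raw-events"))
```

Adding another type in this sketch means adding one enum member and its 
parameters; describe_import does not change.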

> Ability to ingest data from databases
> -------------------------------------
>
>                 Key: FALCON-36
>                 URL: https://issues.apache.org/jira/browse/FALCON-36
>             Project: Falcon
>          Issue Type: Improvement
>          Components: acquisition
>    Affects Versions: 0.3
>            Reporter: Venkatesh Seetharam
>            Assignee: Venkat Ramachandran
>         Attachments: FALCON-36.patch, FALCON-36.patch.2, 
> FALCON-36.rebase.patch, FALCON-36.review.patch, Falcon Data Ingestion - 
> Proposal.docx, falcon-36.xsd.patch.1
>
>
> Attempt to address data import from RDBMS into Hadoop and export of data from 
> Hadoop into RDBMS. The plan is to use Sqoop 1.x to materialize data motion 
> from/to RDBMS to/from HDFS. Hive will not be integrated in the first pass 
> until Falcon has a first-class integration with HCatalog.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
