[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15378897#comment-15378897
 ] 

ASF GitHub Bot commented on APEXMALHAR-2066:
--------------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/apex-malhar/pull/282


> Add jdbc poller input operator
> ------------------------------
>
>                 Key: APEXMALHAR-2066
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2066
>             Project: Apache Apex Malhar
>          Issue Type: Task
>            Reporter: Ashwin Chandra Putta
>            Assignee: devendra tagare
>
> Create a JDBC poller input operator that has the following features.
> 1. Poll from an external JDBC store asynchronously in the input operator.
> 2. Polling frequency and batch size should be configurable.
> 3. Should be idempotent.
> 4. Should be partitionable.
> 5. Should be batch + polling capable.
> Assumptions for idempotency and partitioning:
> 1. The user needs to provide tableName, dbConnection, setEmitColumnList and
> a look-up key.
> 2. Optionally, batchSize, pollInterval, look-up key and a where clause can be
> given.
> 3. This operator uses static partitioning to arrive at range queries for
> exactly-once reads.
> It creates a configured number of non-polling static partitions for
> fetching the existing data in the table, plus a single additional
> partition for polling newly added data.
> 4. The assumption is that there is an ordered column from which range
> queries can be formed.
> The *key* column, on which polling is based, is any column that has
> ever-increasing values and supports greater-than and less-than
> operations in SQL.
> 5. If an emitColumnList is provided, please ensure that the keyColumn is the
> first column in the list.
> 6. Range queries are formed using the JdbcMetaDataUtility. Output: a
> comma-separated list of the emit columns, e.g. columnA,columnB,columnC
> 7. Only newly added data with increasing ids will be fetched by the
>    polling JDBC partition.
> Per window, the first and the last key processed are saved using the
> FSWindowDataManager as (<lowerBound,upperBound>,operatorId,windowId). This
> (lowerBound,upperBound) pair is then used for recovery. The queries are
> constructed using the JDBCMetaDataUtility.
> JDBCMetaDataUtility
> A utility class used to retrieve the metadata for a given unique key of a SQL
> table. This class emits range queries based on the given primary index.
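
The partitioning scheme described in the issue (a configured number of static range partitions plus one polling partition) can be illustrated with a minimal sketch. This is not the code from the pull request: the class RangeQuerySketch, the method buildQueries and all of its parameters are hypothetical names chosen for illustration, and the sketch assumes a numeric key column whose current minimum and maximum are already known.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, NOT part of Apex Malhar. It only illustrates how a key
// range could be split into static range queries plus one polling query.
class RangeQuerySketch {

    // Returns one query per static partition (half-open ranges [lower, upper))
    // followed by a single open-ended query for the polling partition.
    // Assumes table/column names are trusted and the key range does not
    // overflow a long; real code should bind values via PreparedStatement.
    static List<String> buildQueries(String table, String keyColumn, String emitColumns,
                                     long minKey, long maxKey, int staticPartitions) {
        if (staticPartitions < 1 || maxKey < minKey) {
            throw new IllegalArgumentException("need at least one partition and minKey <= maxKey");
        }
        List<String> queries = new ArrayList<>();
        long span = maxKey - minKey + 1;
        long size = (span + staticPartitions - 1) / staticPartitions; // ceiling division
        String select = "SELECT " + emitColumns + " FROM " + table + " WHERE " + keyColumn;

        for (int i = 0; i < staticPartitions; i++) {
            long lower = minKey + i * size;
            if (lower > maxKey) {
                break; // fewer keys than partitions: remaining ranges would be empty
            }
            long upper = Math.min(lower + size, maxKey + 1);
            queries.add(select + " >= " + lower + " AND " + keyColumn + " < " + upper
                    + " ORDER BY " + keyColumn);
        }

        // Polling partition: only rows added after the snapshot, i.e. keys above maxKey.
        queries.add(select + " > " + maxKey + " ORDER BY " + keyColumn);
        return queries;
    }
}
```

Calling RangeQuerySketch.buildQueries("t", "id", "id,name", 1, 100, 2) would yield two static queries covering ids 1 to 50 and 51 to 100, followed by a polling query for ids greater than 100. Because the ranges are half-open and contiguous, no row is read by two partitions, which is the property the exactly-once assumption in the issue relies on.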



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
