[ 
https://issues.apache.org/jira/browse/BEAM-1323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16046824#comment-16046824
 ] 

Stephen Sisk commented on BEAM-1323:
------------------------------------

Can you elaborate on what exactly the SplittingFn would do? What parameters 
does it take, what does it return, how would JdbcIO use it, and for which 
databases do we think it would work well?

> Add parallelism/splitting in JdbcIO
> -----------------------------------
>
>                 Key: BEAM-1323
>                 URL: https://issues.apache.org/jira/browse/BEAM-1323
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-extensions
>            Reporter: Jean-Baptiste Onofré
>            Assignee: Jean-Baptiste Onofré
>
> Currently, the JDBC IO is basically a {{DoFn}} executed with a {{ParDo}}, so 
> parallelism is limited: the read runs on a single executor.
> We could instead create several JDBC {{BoundedSource}}s, each reading a subset 
> of the original SQL query's results (for instance using row id paging, or any 
> other splitting/limit scheme we can derive from the original query), similar 
> to what Sqoop does.
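
The row id paging idea in the description can be illustrated with a minimal sketch. It is plain Java with no Beam dependency, and it is not the JdbcIO implementation or a proposed API. The names RangeSplitSketch, IdRange, split and toQuery are hypothetical, and the sketch assumes the table has a numeric id column whose minimum and maximum are already known (for example from a MIN/MAX query). Each generated query could then be read by a separate source or worker.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: splits a numeric id range into
// sub-ranges and builds one SQL query per sub-range.
final class RangeSplitSketch {

    /** Half-open id range [lower, upper). */
    static final class IdRange {
        final long lower;
        final long upper;

        IdRange(long lower, long upper) {
            this.lower = lower;
            this.upper = upper;
        }
    }

    /**
     * Splits the inclusive range [minId, maxId] into at most numSplits
     * contiguous, non-overlapping half-open ranges.
     * Assumes maxId is well below Long.MAX_VALUE (no overflow handling).
     */
    static List<IdRange> split(long minId, long maxId, int numSplits) {
        if (numSplits < 1 || maxId < minId) {
            throw new IllegalArgumentException("invalid range or split count");
        }
        long total = maxId - minId + 1;
        // Ceiling division so that numSplits ranges cover every id.
        long step = Math.max(1, (total + numSplits - 1) / numSplits);
        List<IdRange> ranges = new ArrayList<>();
        for (long lower = minId; lower <= maxId; lower += step) {
            long upper = Math.min(lower + step, maxId + 1);
            ranges.add(new IdRange(lower, upper));
        }
        return ranges;
    }

    /**
     * Builds the query for one sub-range. The table and column names are
     * assumed to be trusted identifiers, not user input.
     */
    static String toQuery(String table, String idColumn, IdRange r) {
        return "SELECT * FROM " + table
            + " WHERE " + idColumn + " >= " + r.lower
            + " AND " + idColumn + " < " + r.upper;
    }

    public static void main(String[] args) {
        for (IdRange r : split(1, 10, 3)) {
            System.out.println(toQuery("orders", "id", r));
        }
    }
}
```

A range-based split like this only balances work well when ids are roughly evenly distributed. Sparse or skewed id columns would produce uneven sub-queries, which is one reason the choice of splitting strategy depends on the database and the data.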



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
