[ 
https://issues.apache.org/jira/browse/SQOOP-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662525#comment-14662525
 ] 

Jarek Jarcec Cecho commented on SQOOP-2464:
-------------------------------------------

The original reason why we decided to re-create all the workflow objects 
from scratch for each of the callbacks is that we don't want connector 
developers to start depending on instance reuse, as we might want to move the 
callbacks to a different process in the future (perhaps even running on a 
different machine). 
We're already taking advantage of that in the 
[{{Destroyer}}|https://github.com/apache/sqoop/blob/sqoop2/connector/connector-sdk/src/main/java/org/apache/sqoop/job/etl/Destroyer.java]
 class, where each of the callbacks is actually called from different machines 
when using the default mapreduce execution engine.

I however feel that initializing the connector and getting the schema will 
always belong to the "initialization" phase, which has to be done from a single 
process. Hence I'm supportive of changing the semantics as suggested. We should, 
however, add tests that ensure that object reuse for those two methods is 
done correctly, and document this behavior in our developer guide.
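
To make the suggested semantics concrete, here is a rough sketch of what 
reusing one initializer instance across both calls could look like, together 
with the kind of check a test might make. Everything in it is hypothetical and 
simplified for illustration: the {{Initializer}} interface, 
{{CountingInitializer}} and {{initializeAndGetSchema}} are stand-ins, not the 
real connector SDK or JobManager signatures.

```java
public class InitializerReuseSketch {

    // Hypothetical, simplified stand-in for a connector initializer.
    // The real Sqoop SDK class has different method signatures.
    interface Initializer {
        void initialize(String config);
        String getSchema(String config);
    }

    // Illustrative implementation that counts how many instances were
    // created and remembers whether initialize() opened a resource.
    static class CountingInitializer implements Initializer {
        static int instancesCreated = 0;
        private boolean connectionOpen = false;

        CountingInitializer() {
            instancesCreated++;
        }

        @Override
        public void initialize(String config) {
            // Stands in for opening a database connection.
            connectionOpen = true;
        }

        @Override
        public String getSchema(String config) {
            // Only works if the same instance was initialized earlier.
            if (!connectionOpen) {
                throw new IllegalStateException(
                    "initialize() was not called on this instance");
            }
            return "schema-for-" + config;
        }
    }

    // Assumed shape of the changed behavior: both callbacks are invoked
    // on the same instance instead of creating a new one for each call.
    static String initializeAndGetSchema(Initializer initializer, String config) {
        initializer.initialize(config);
        return initializer.getSchema(config);
    }

    public static void main(String[] args) {
        Initializer initializer = new CountingInitializer();
        String schema = initializeAndGetSchema(initializer, "jobA");
        System.out.println(schema + " instances=" + CountingInitializer.instancesCreated);
    }
}
```

A test along these lines would fail if a second instance were created for 
getSchema, since that instance would never have seen the initialize call.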

> Initializer object is not reused when calling getSchema
> -------------------------------------------------------
>
>                 Key: SQOOP-2464
>                 URL: https://issues.apache.org/jira/browse/SQOOP-2464
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 1.99.6
>            Reporter: David Robson
>
> In JobManager there are two methods which are called one after the other - 
> "initializeConnector" and "getSchemaForConnector". Both of these methods do 
> the same thing as their first step - create a new instance of the initializer 
> class. If the same instance of the initializer were shared, the class could 
> keep resources open (such as a connection to the database) and not have to 
> re-establish the connection. This might mean a close method needs to be added 
> to the initializers, as otherwise getSchema would need to close any 
> resources opened in the initialize call - which might seem a bit confusing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
