arekusuri opened a new pull request #2868: GOBBLIN-1025: Add retry for 
PK-Chunking iterator
URL: https://github.com/apache/incubator-gobblin/pull/2868
 
 
   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I 
have checked off all the steps below!
   
   
   ### JIRA
   https://issues.apache.org/jira/browse/GOBBLIN-1025
   
   
   ### Description
   In the SFDC connector, there is a class called `ResultIterator` (I will 
rename it to `SalesforceRecordIterator`).
   It is currently used only by PK-Chunking. It encapsulates fetching a list 
of result files into a single record iterator.
   
   However, `csvReader.nextRecord()` may throw a network IO exception. We 
should retry in this case.
   
   When a network IO exception happens after a result file has been partly 
fetched, we are in a special situation: the first part of the file has already 
been fetched locally, but the rest of the file is still on the data source. 
   We need to
   1. reopen the file stream
   2. skip all the records that we have already fetched, moving the cursor to 
the first record that we haven't fetched yet.
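
The two steps above can be sketched roughly as follows. This is only an illustration of the reopen-and-skip idea, not the code in this PR: `RetryingLineReader`, `ReaderSupplier`, and `maxRetries` are hypothetical names, and the sketch assumes one record per line, which real CSV with quoted newlines does not guarantee.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical sketch of retry with reopen-and-skip; not the actual ResultIterator. */
final class RetryingLineReader {

  /** Hypothetical hook that opens the result file stream from its beginning. */
  interface ReaderSupplier {
    BufferedReader open() throws IOException;
  }

  private final ReaderSupplier supplier;
  private final int maxRetries;
  private BufferedReader reader;    // null means "must (re)open before reading"
  private long recordsFetched = 0;  // records already returned to the caller

  RetryingLineReader(ReaderSupplier supplier, int maxRetries) {
    if (maxRetries < 0) {
      throw new IllegalArgumentException("maxRetries must be >= 0");
    }
    this.supplier = supplier;
    this.maxRetries = maxRetries;
  }

  /** Returns the next record, or null at end of file. */
  String nextRecord() {
    IOException last = null;
    for (int attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        if (reader == null) {
          // Step 1: reopen the file stream.
          reader = supplier.open();
          // Step 2: skip the records that were already fetched.
          for (long i = 0; i < recordsFetched; i++) {
            reader.readLine();
          }
        }
        String record = reader.readLine();
        if (record != null) {
          recordsFetched++;
        }
        return record;
      } catch (IOException e) {
        // Drop the broken stream so the next attempt reopens it.
        last = e;
        closeQuietly();
      }
    }
    throw new UncheckedIOException("giving up after " + maxRetries + " retries", last);
  }

  private void closeQuietly() {
    if (reader != null) {
      try {
        reader.close();
      } catch (IOException ignored) {
        // The stream is already considered broken; nothing more to do.
      }
      reader = null;
    }
  }
}
```

Counting returned records rather than bytes keeps the skip logic independent of how the underlying stream buffers data, at the cost of re-reading the already fetched part of the file on every retry.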
   
   ### Tests
   
https://ltx1-holdemaz05.grid.linkedin.com:8443/executor?execid=21956300&job=salesforce_task_full&attempt=0
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
