autumnust opened a new pull request #3265:
URL: https://github.com/apache/gobblin/pull/3265


   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I 
have checked off all the steps below!
   
   
   ### JIRA
   - [ ] My PR addresses the following [Gobblin 
JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references 
them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
       - https://issues.apache.org/jira/browse/GOBBLIN-XXX
   
   
   ### Description
   - [ ] Here are some details about my PR, including screenshots (if 
applicable):
   The test in `TestSingleTask` has been failing when executed on CI. The 
failure is also reproducible by running: 
   ```./gradlew :gobblin-cluster:test --tests 
org.apache.gobblin.cluster.TestSingleTask``` 
   
   The real problem is more complicated, and a proper fix is out of scope for 
this PR. The issue can be described as follows: 
   - Without overriding the `shutdown` method in the `dummyExtractor`, calling 
it throws an exception. Note that this code path is only executed when 
`shutdownRequested` is set (which only happens when a `Task` is being 
cancelled), as written in `Task.java#510`, not through the regular shutdown of 
the extractor. 
   - Before the change in GOBBLIN-1416 the failure did not surface because the 
sync-barrier was placed in the wrong place. I will expand on this in more 
detail later. In short, from the log, it seems the cancel call and the run call 
were running sequentially. (See log Paragraph I at the bottom.)
   - After GOBBLIN-1416 this test failed. The reason is that 
`org.apache.gobblin.runtime.GobblinMultiTaskAttempt#run #167` now returns as a 
result of the cancel, so the real cancel code path is executed and the 
`shutdown` method of the `dummyExtractor` is called. However, the default 
implementation in the `Extractor` interface throws an error, which eventually 
blows up the task. Supporting evidence for this sequence is that the fork 
status was printed as "RUNNING" in the failing tests, which means the return 
from `org.apache.gobblin.runtime.GobblinMultiTaskAttempt#run #167` doesn't 
handle the state of the fork within the cancelled task. 
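   
   To make the first and third bullets concrete, here is a minimal sketch of 
the failure mode and of the workaround. It does not use the real Gobblin 
classes: `SketchExtractor`, `DefaultShutdownExtractor`, `DummyExtractorSketch` 
and `CancelPathSketch` are hypothetical stand-ins, and the exception type is an 
assumption chosen to keep the example self-contained. 

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Simplified stand-in for an extractor interface; NOT the real Gobblin API. */
interface SketchExtractor {
    String readRecord();

    /** Assumed default: shutdown is unsupported unless a subclass overrides it. */
    default void shutdown() {
        throw new UnsupportedOperationException(
            getClass().getName() + " does not support shutdown");
    }
}

/** Test extractor that relies on the throwing default. */
class DefaultShutdownExtractor implements SketchExtractor {
    @Override
    public String readRecord() {
        return null; // no records; only the shutdown behaviour matters here
    }
}

/** Test extractor that overrides shutdown so the cancel path cannot throw. */
class DummyExtractorSketch implements SketchExtractor {
    private final AtomicBoolean shutdownCalled = new AtomicBoolean(false);

    @Override
    public String readRecord() {
        return null;
    }

    @Override
    public void shutdown() {
        shutdownCalled.set(true); // just record the call
    }

    boolean isShutdownCalled() {
        return shutdownCalled.get();
    }
}

/** Hypothetical task body: shutdown() is only reached when a cancel was requested. */
class CancelPathSketch {
    static boolean runTask(SketchExtractor extractor, boolean shutdownRequested) {
        try {
            if (shutdownRequested) {
                extractor.shutdown();
            }
            return true;  // task finished cleanly
        } catch (UnsupportedOperationException e) {
            return false; // the default shutdown blew up the task
        }
    }
}
```

   With the throwing default, the task only fails when the cancel path is 
actually taken, which matches the observation that the problem stayed hidden 
while cancel and run did not overlap. 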
   
   The follow-up to this PR should address: 
   - The sync-barrier 
`org.apache.gobblin.cluster.SingleTask#_taskAttemptBuilt` should synchronize 
the creation of `List<Task>` in the taskAttempt with the cancel call (instead 
of the creation of `taskAttempt` with the cancel call). GOBBLIN-1416 was trying 
to solve this problem in a different way. I believe the two could be unified.
   - The handling of fork state when a task is being shut down is probably 
something that should be fixed as well.
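   
   As a rough illustration of the first follow-up item, the sketch below 
places the barrier after the task list has been created, so a cancel that 
arrives early waits for the tasks to exist before shutting them down. This is 
not the `SingleTask` implementation: `TaskBarrierSketch` and its members are 
hypothetical, and a `CountDownLatch` is used here purely as an assumption for 
brevity. 

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: cancel() may not proceed until run() has created its tasks. */
class TaskBarrierSketch {
    // Barrier released only after the task list exists (not merely the attempt object).
    private final CountDownLatch tasksCreated = new CountDownLatch(1);
    private final List<String> tasks = new ArrayList<>();
    private final List<String> events = new ArrayList<>();

    void run() {
        synchronized (this) {
            tasks.add("randomTask");
            events.add("tasks-created");
        }
        tasksCreated.countDown(); // release any cancel() that is waiting
    }

    /** Returns false if the tasks were not created within the timeout. */
    boolean cancel(long timeoutMillis) throws InterruptedException {
        if (!tasksCreated.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
            return false;
        }
        synchronized (this) {
            events.add("shutdown:" + tasks.size());
        }
        return true;
    }

    synchronized List<String> events() {
        return new ArrayList<>(events);
    }

    /** Issues cancel first, then run, and returns the observed event order. */
    static List<String> demo() {
        TaskBarrierSketch sketch = new TaskBarrierSketch();
        Thread canceller = new Thread(() -> {
            try {
                sketch.cancel(5_000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        canceller.start(); // cancel is requested first...
        sketch.run();      // ...but blocks until the tasks have been created
        try {
            canceller.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sketch.events();
    }
}
```

   In this arrangement the shutdown always sees the created task, regardless 
of which thread starts first. 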
   
   
   
   Paragraph I 
   ```2021-04-19 16:23:09 PDT INFO  [pool-32-thread-1] 
org.apache.gobblin.cluster.SingleTask 225 - Task cancelled: Shutdown starting 
for tasks with jobId: testJob
       2021-04-19 16:23:09 PDT INFO  [pool-32-thread-1] 
org.apache.gobblin.runtime.GobblinMultiTaskAttempt 255 - Shutting down tasks
       2021-04-19 16:23:09 PDT INFO  [pool-32-thread-1] 
org.apache.gobblin.cluster.SingleTask 227 - Task cancelled: Shutdown complete 
for tasks with jobId: testJob
       2021-04-19 16:23:09 PDT INFO  [pool-32-thread-2] 
org.apache.gobblin.runtime.GobblinMultiTaskAttempt$2 510 - Task creation 
attempt 1
       2021-04-19 16:23:09 PDT WARN  [pool-32-thread-2] 
org.apache.gobblin.metrics.MetricContext$Builder 714 - MetricContext with 
specified name already exists, appending UUID to the given name: 
2f98aa98-1c81-4770-85e9-7731ee3afff1
       2021-04-19 16:23:09 PDT INFO  [pool-32-thread-2] 
org.apache.gobblin.runtime.TaskExecutor 259 - Submitting task randomTask
       2021-04-19 16:23:09 PDT WARN  [TaskExecutor-0] 
org.apache.gobblin.runtime.Task 362 - Synchronous task execution model is 
deprecated. Please consider using stream model.
       2021-04-19 16:23:09 PDT INFO  [pool-32-thread-2] 
org.apache.gobblin.runtime.GobblinMultiTaskAttempt 167 - Waiting for submitted 
tasks of job testJob to complete in container ...
       2021-04-19 16:23:09 PDT INFO  [pool-32-thread-2] 
org.apache.gobblin.runtime.GobblinMultiTaskAttempt 175 - 1 out of 1 tasks of 
job testJob are running in container
       2021-04-19 16:23:09 PDT INFO  [TaskExecutor-0] 
org.apache.gobblin.runtime.TaskExecutor 280 - Submitting fork 0 of task 
randomTask
       2021-04-19 16:23:09 PDT INFO  [TaskExecutor-0] 
org.apache.gobblin.runtime.Task 460 - Task mode streaming = false
       2021-04-19 16:23:09 PDT INFO  [ForkExecutor-0] 
org.apache.gobblin.runtime.TaskContext 375 - Found configured writer builder as 
org.apache.gobblin.cluster.InMemoryWuSingleTask$DummyDataWriterBuilder``` 
   
   
   ### Tests
   Running `./gradlew :gobblin-cluster:test --tests 
org.apache.gobblin.cluster.TestSingleTask` now passes, and the log output looks 
correct after setting `testLogging.showStandardStreams = true`.
   
   ### Commits
   - [ ] My commits all reference JIRA issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
       1. Subject is separated from body by a blank line
       2. Subject is limited to 50 characters
       3. Subject does not end with a period
       4. Subject uses the imperative mood ("add", not "adding")
       5. Body wraps at 72 characters
       6. Body explains "what" and "why", not "how"
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

