[ 
https://issues.apache.org/jira/browse/GOBBLIN-1437?focusedWorklogId=589969&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-589969
 ]

ASF GitHub Bot logged work on GOBBLIN-1437:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 27/Apr/21 19:16
            Start Date: 27/Apr/21 19:16
    Worklog Time Spent: 10m 
      Work Description: aplex commented on a change in pull request #3268:
URL: https://github.com/apache/gobblin/pull/3268#discussion_r621508901



##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/job_monitor/KafkaJobMonitor.java
##########
@@ -100,17 +100,29 @@ protected void shutdownMetrics()
   @Override
   protected void processMessage(DecodeableKafkaRecord<byte[],byte[]> message) {
     try {
-      Collection<Either<JobSpec, URI>> parsedCollection = parseJobSpec(message.getValue());
-      for (Either<JobSpec, URI> parsedMessage : parsedCollection) {
-        if (parsedMessage instanceof Either.Left) {
-          this.newSpecs.inc();
-          this.jobCatalog.put(((Either.Left<JobSpec, URI>) parsedMessage).getLeft());
-        } else if (parsedMessage instanceof Either.Right) {
-          this.removedSpecs.inc();
-          URI jobSpecUri = ((Either.Right<JobSpec, URI>) parsedMessage).getRight();
-          this.jobCatalog.remove(jobSpecUri, true);
-          // Delete the job state if it is a delete spec request
-          deleteStateStore(jobSpecUri);
+      Collection<JobSpec> parsedCollection = parseJobSpec(message.getValue());
+      for (JobSpec parsedMessage : parsedCollection) {
+        SpecExecutor.Verb verb = SpecExecutor.Verb.valueOf(parsedMessage.getMetadata().get(SpecExecutor.VERB_KEY));
+
+        switch (verb) {
+          case ADD:
+          case UPDATE:
+          case UNKNOWN: // unknown are considered as add request to maintain backward compatibility

Review comment:
       Treating all unknown messages as "add" seems risky. I'm thinking about the case when we add a new verb, messages are generated for it, and then we need to roll back GaaS. Will the old GaaS see those messages as "unknown" and treat them as "add"?
   
   Also, is this backward compatibility with a recent version of the service, or with some ancient one? If we still need this compatibility, does it make sense to log something, so we'll at least see that this branch is executed?
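   
   To make the logging idea concrete, here is a rough standalone sketch of the fallback-plus-warning I have in mind (the enum, class, and logger names below are placeholders of mine, not the actual SpecExecutor or KafkaJobMonitor types):
   
      import org.slf4j.Logger;
      import org.slf4j.LoggerFactory;
   
      // Placeholder class, not the real KafkaJobMonitor; Verb stands in for SpecExecutor.Verb.
      public class VerbFallbackSketch {
        private static final Logger LOG = LoggerFactory.getLogger(VerbFallbackSketch.class);
   
        enum Verb { ADD, UPDATE, DELETE, UNKNOWN }
   
        // Resolve the verb defensively and leave a trace whenever we fall back, so a
        // rolled-back GaaS still tells us it is seeing verbs it does not understand.
        static Verb resolveVerb(String rawVerb, String jobUri) {
          try {
            return Verb.valueOf(rawVerb);
          } catch (IllegalArgumentException | NullPointerException e) {
            LOG.warn("Unknown verb '{}' on job spec {}, treating it as ADD for backward compatibility",
                rawVerb, jobUri);
            return Verb.UNKNOWN;
          }
        }
      }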

##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/job_catalog/NonObservingFSJobCatalog.java
##########
@@ -105,6 +105,14 @@ public synchronized void remove(URI jobURI) {
       remove(jobURI, false);
   }
 
+  /**
+   * Removes a job from job catalog.
+   * If it is a flow delete request from service side, then alwaysTriggerListeners should be set to true so that a
+   * running job will also be cancelled. If it is a remove request because a job finishes or has been submitted to helix,
+   * then alwaysTriggerListeners should be set to false so that job cancellation is not triggered.
+   * @param jobURI job uri
+   * @param alwaysTriggerListeners if it is true, a running job will be cancelled.

Review comment:
       This architecture looks a bit strange to me, and it looks like this API ambiguity was the source of the original bug.
   
   Since removal from the catalog does not always mean cancelling the job, should this job cancellation be moved out of the catalog code and be done explicitly by the callers? E.g. the caller can first cancel the job, if they want to, and then remove it from the catalog.
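   
   As a standalone sketch of the caller-driven alternative I'm describing (the interfaces below are placeholders of mine, not the real Gobblin types), cancellation becomes an explicit, separate step and the catalog's remove() never has to guess the caller's intent:
   
      import java.net.URI;
   
      // Placeholder interfaces, not the real Gobblin JobCatalog / runtime types.
      interface SimpleJobCatalog {
        void remove(URI jobUri);            // only removes the spec, never cancels anything
      }
   
      interface JobRunner {
        void cancelIfRunning(URI jobUri);   // cancels a running execution, no-op otherwise
      }
   
      class FlowDeleteSketch {
        private final SimpleJobCatalog catalog;
        private final JobRunner runner;
   
        FlowDeleteSketch(SimpleJobCatalog catalog, JobRunner runner) {
          this.catalog = catalog;
          this.runner = runner;
        }
   
        // Service-side flow delete: the caller explicitly cancels first, then removes.
        void deleteFlow(URI jobUri) {
          runner.cancelIfRunning(jobUri);
          catalog.remove(jobUri);
        }
   
        // Cleanup after a job finishes: no cancellation wanted, just remove the spec.
        void onJobFinished(URI jobUri) {
          catalog.remove(jobUri);
        }
      }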

##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/job_monitor/KafkaJobMonitor.java
##########
@@ -100,17 +100,29 @@ protected void shutdownMetrics()
   @Override
   protected void processMessage(DecodeableKafkaRecord<byte[],byte[]> message) {
     try {
-      Collection<Either<JobSpec, URI>> parsedCollection = parseJobSpec(message.getValue());
-      for (Either<JobSpec, URI> parsedMessage : parsedCollection) {
-        if (parsedMessage instanceof Either.Left) {
-          this.newSpecs.inc();
-          this.jobCatalog.put(((Either.Left<JobSpec, URI>) parsedMessage).getLeft());
-        } else if (parsedMessage instanceof Either.Right) {
-          this.removedSpecs.inc();
-          URI jobSpecUri = ((Either.Right<JobSpec, URI>) parsedMessage).getRight();
-          this.jobCatalog.remove(jobSpecUri, true);
-          // Delete the job state if it is a delete spec request
-          deleteStateStore(jobSpecUri);
+      Collection<JobSpec> parsedCollection = parseJobSpec(message.getValue());
+      for (JobSpec parsedMessage : parsedCollection) {
+        SpecExecutor.Verb verb = SpecExecutor.Verb.valueOf(parsedMessage.getMetadata().get(SpecExecutor.VERB_KEY));
+
+        switch (verb) {
+          case ADD:
+          case UPDATE:
+          case UNKNOWN: // unknown are considered as add request to maintain backward compatibility
+            this.newSpecs.inc();
+            this.jobCatalog.put(parsedMessage);
+            break;
+          case DELETE:
+            this.removedSpecs.inc();
+            URI jobSpecUri = parsedMessage.getUri();
+            this.jobCatalog.remove(jobSpecUri, true);
+            // Delete the job state if it is a delete spec request
+            deleteStateStore(jobSpecUri);

Review comment:
       What will happen if the service process fails after removal of the job, but before the state store is deleted? Will the state store eventually be deleted/cleaned up?
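   
   To make the question concrete, here is a standalone sketch (placeholder interfaces of mine, not the real classes) of the at-least-once behavior I would expect if the Kafka offset is only committed after both steps succeed:
   
      import java.net.URI;
   
      // Placeholder interfaces, not the real Gobblin types.
      class DeleteSpecSketch {
   
        interface SimpleJobCatalog { void remove(URI jobUri, boolean triggerListeners); }
        interface StateStoreCleaner { void deleteStateStore(URI jobUri); }  // assumed no-op if already deleted
        interface OffsetCommitter { void commit(); }
   
        // If the process dies between the two calls, the offset was never committed, so the
        // DELETE record is replayed on restart and both idempotent steps run again.
        static void handleDelete(URI jobSpecUri, SimpleJobCatalog catalog,
            StateStoreCleaner cleaner, OffsetCommitter offsets) {
          catalog.remove(jobSpecUri, true);       // idempotent: removing an absent spec is fine
          cleaner.deleteStateStore(jobSpecUri);
          offsets.commit();                       // acknowledge the Kafka record only after both steps
        }
      }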




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 589969)
    Time Spent: 20m  (was: 10m)

> Segregate FlowConfigs/Delete and FlowExecutions/Delete functions #3268
> ----------------------------------------------------------------------
>
>                 Key: GOBBLIN-1437
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1437
>             Project: Apache Gobblin
>          Issue Type: Bug
>            Reporter: Arjun Singh Bora
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> there is some mixup of work between FlowConfigs/Delete and FlowExecutions/Delete which needs to be fixed
> FlowConfigs/Delete
> 1) is not cancelling the currently running execution on the cluster (i.e. when KafkaSpecProducer is used)
> 2) is not deleting the state store when DagManager is used and KafkaSpecProducer is used
> FlowExecutions/Delete
> 1) is not calling SpecProducer.cancel()
> 2) is deleting the state store when KafkaSpecProducer is used, because the cluster thinks it is a DELETE request



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
