[ 
https://issues.apache.org/jira/browse/GOBBLIN-1944?focusedWorklogId=888152&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-888152
 ]

ASF GitHub Bot logged work on GOBBLIN-1944:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 31/Oct/23 21:47
            Start Date: 31/Oct/23 21:47
    Worklog Time Spent: 10m 
      Work Description: phet commented on code in PR #3815:
URL: https://github.com/apache/gobblin/pull/3815#discussion_r1378195042


##########
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/loadgen/launcher/GenArbitraryLoadJobLauncher.java:
##########
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.gobblin.temporal.loadgen.launcher;
+
+import io.temporal.client.WorkflowOptions;
+import java.util.List;
+import java.util.Optional;
+import java.util.Properties;
+import java.util.concurrent.ConcurrentHashMap;
+import lombok.extern.slf4j.Slf4j;
+import org.apache.gobblin.annotation.Alpha;
+import org.apache.gobblin.metrics.Tag;
+import org.apache.gobblin.runtime.JobLauncher;
+import org.apache.gobblin.source.workunit.WorkUnit;
+import org.apache.gobblin.temporal.cluster.GobblinTemporalTaskRunner;
+import org.apache.gobblin.temporal.joblauncher.GobblinTemporalJobLauncher;
+import org.apache.gobblin.temporal.joblauncher.GobblinTemporalJobScheduler;
+import org.apache.gobblin.temporal.loadgen.work.IllustrationItem;
+import org.apache.gobblin.temporal.loadgen.work.SimpleGeneratedWorkload;
+import org.apache.gobblin.temporal.util.nesting.work.WorkflowAddr;
+import org.apache.gobblin.temporal.util.nesting.work.Workload;
+import org.apache.gobblin.temporal.util.nesting.workflow.NestingExecWorkflow;
+import org.apache.hadoop.fs.Path;
+
+import static org.apache.gobblin.temporal.GobblinTemporalConfigurationKeys.GOBBLIN_TEMPORAL_JOB_LAUNCHER_ARG_PREFIX;
+
+
+/**
+ * A {@link JobLauncher} for the initial triggering of a Temporal workflow that generates arbitrary load of many
+ * activities nested beneath a single subsuming super-workflow.  see: {@link NestingExecWorkflow}
+ *
+ * <p>
+ *   This class is instantiated by the {@link GobblinTemporalJobScheduler#buildJobLauncher(Properties)} on every job submission to launch the Gobblin job.
+ *   The actual task execution happens in the {@link GobblinTemporalTaskRunner}, usually in a different process.
+ * </p>
+ */
+@Alpha
+@Slf4j
+public class GenArbitraryLoadJobLauncher extends GobblinTemporalJobLauncher {
+  public static String GOBBLIN_TEMPORAL_JOB_LAUNCHER_ARG_NUM_ACTIVITIES = GOBBLIN_TEMPORAL_JOB_LAUNCHER_ARG_PREFIX + "num.activities";
+  public static String GOBBLIN_TEMPORAL_JOB_LAUNCHER_ARG_MAX_BRANCHES_PER_TREE = GOBBLIN_TEMPORAL_JOB_LAUNCHER_ARG_PREFIX + "max.branches.per.tree";
+  public static String GOBBLIN_TEMPORAL_JOB_LAUNCHER_ARG_MAX_SUB_TREES_PER_TREE = GOBBLIN_TEMPORAL_JOB_LAUNCHER_ARG_PREFIX + "max.sub.trees.per.tree";
+
+  public GenArbitraryLoadJobLauncher(
+      Properties jobProps,
+      Path appWorkDir,
+      List<? extends Tag<?>> metadataTags,
+      ConcurrentHashMap<String, Boolean> runningMap
+  ) throws Exception {
+    super(jobProps, appWorkDir, metadataTags, runningMap);
+  }
+
+  @Override
+  public void submitJob(List<WorkUnit> workunits) {
+    int numActivities = Integer.valueOf(this.jobProps.getProperty(GOBBLIN_TEMPORAL_JOB_LAUNCHER_ARG_NUM_ACTIVITIES, "<<not-set>>"));
+    int maxBranchesPerTree = Integer.valueOf(this.jobProps.getProperty(GOBBLIN_TEMPORAL_JOB_LAUNCHER_ARG_MAX_BRANCHES_PER_TREE, "<<not-set>>"));
+    int maxSubTreesPerTree = Integer.valueOf(this.jobProps.getProperty(GOBBLIN_TEMPORAL_JOB_LAUNCHER_ARG_MAX_SUB_TREES_PER_TREE, "<<not-set>>"));

Review Comment:
   that's a reasonable Q--let's jointly consider...
   
   I actually shied away from either implementing or seeking out a general arg 
processing solution, since it seemed overkill at this stage.  the ideal would 
be to catch any `NumberFormatException` and produce a nice error message naming 
the property involved as well as the value that couldn't be parsed.
   
   still, even our widely-used `ConfigUtils::getInt` doesn't do that.  its 
SotA leaves the end-user to look at the exception stack trace for the line 
numbers, then read the code to figure out which named property was involved.
   
   this impl continues in that tradition, which I agree is sub-optimal in the 
grand scheme, but nonetheless stays consistent with how other potential 
property value violations would surface.
   
   how do you weigh the balance of effort to improvement?
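
   for concreteness, here is a minimal sketch of the "ideal" described above: catch the `NumberFormatException` and rethrow with a message naming both the property and its unparseable value.  the class `PropertyArgs` and method `getRequiredInt` are hypothetical names for illustration only; they are neither part of this PR nor an existing Gobblin utility.

```java
import java.util.Properties;

/** Hypothetical helper (not part of Gobblin): parses an int property with a descriptive error. */
final class PropertyArgs {
  private PropertyArgs() {}

  /**
   * Returns the int value stored under {@code key}.
   * Throws IllegalArgumentException naming the property (and its value, if present)
   * when the property is missing or cannot be parsed as an int.
   */
  static int getRequiredInt(Properties props, String key) {
    String raw = props.getProperty(key);
    if (raw == null) {
      throw new IllegalArgumentException("Required property '" + key + "' is not set");
    }
    try {
      return Integer.parseInt(raw.trim());
    } catch (NumberFormatException e) {
      // Keep the original exception as the cause, but surface the key and raw value up front.
      throw new IllegalArgumentException(
          "Property '" + key + "' has value '" + raw + "', which is not a valid integer", e);
    }
  }
}
```

   with something like this, a property set to `abc` would fail with a message that names the key and `abc` directly, rather than requiring a walk from the stack trace back to the code.  whether that is worth the effort is the same trade-off raised above.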





Issue Time Tracking
-------------------

    Worklog Id:     (was: 888152)
    Time Spent: 40m  (was: 0.5h)

> Create Gobblin-Temporal load-generator to de-risk Temporal's scaling
> --------------------------------------------------------------------
>
>                 Key: GOBBLIN-1944
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1944
>             Project: Apache Gobblin
>          Issue Type: Improvement
>          Components: gobblin-core
>            Reporter: Kip Kohn
>            Assignee: Abhishek Tiwari
>            Priority: Major
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> To de-risk Temporal as an effective platform for executing Gobblin jobs, we 
> must validate that temporal workflows can scale-up to a Big Data number of 
> activities. To perform such load testing, build a load-generator for creating 
> nesting workflows that ultimately subsume whatever configurable number of 
> activities.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
