[
https://issues.apache.org/jira/browse/GOBBLIN-1944?focusedWorklogId=888151&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-888151
]
ASF GitHub Bot logged work on GOBBLIN-1944:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 31/Oct/23 21:46
Start Date: 31/Oct/23 21:46
Worklog Time Spent: 10m
Work Description: phet commented on code in PR #3815:
URL: https://github.com/apache/gobblin/pull/3815#discussion_r1378195042
##########
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/loadgen/launcher/GenArbitraryLoadJobLauncher.java:
##########
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.gobblin.temporal.loadgen.launcher;
+
+import io.temporal.client.WorkflowOptions;
+import java.util.List;
+import java.util.Optional;
+import java.util.Properties;
+import java.util.concurrent.ConcurrentHashMap;
+import lombok.extern.slf4j.Slf4j;
+import org.apache.gobblin.annotation.Alpha;
+import org.apache.gobblin.metrics.Tag;
+import org.apache.gobblin.runtime.JobLauncher;
+import org.apache.gobblin.source.workunit.WorkUnit;
+import org.apache.gobblin.temporal.cluster.GobblinTemporalTaskRunner;
+import org.apache.gobblin.temporal.joblauncher.GobblinTemporalJobLauncher;
+import org.apache.gobblin.temporal.joblauncher.GobblinTemporalJobScheduler;
+import org.apache.gobblin.temporal.loadgen.work.IllustrationItem;
+import org.apache.gobblin.temporal.loadgen.work.SimpleGeneratedWorkload;
+import org.apache.gobblin.temporal.util.nesting.work.WorkflowAddr;
+import org.apache.gobblin.temporal.util.nesting.work.Workload;
+import org.apache.gobblin.temporal.util.nesting.workflow.NestingExecWorkflow;
+import org.apache.hadoop.fs.Path;
+
+import static org.apache.gobblin.temporal.GobblinTemporalConfigurationKeys.GOBBLIN_TEMPORAL_JOB_LAUNCHER_ARG_PREFIX;
+
+
+/**
+ * A {@link JobLauncher} for the initial triggering of a Temporal workflow that generates arbitrary load of many
+ * activities nested beneath a single subsuming super-workflow. see: {@link NestingExecWorkflow}
+ *
+ * <p>
+ * This class is instantiated by the {@link GobblinTemporalJobScheduler#buildJobLauncher(Properties)} on every job submission to launch the Gobblin job.
+ * The actual task execution happens in the {@link GobblinTemporalTaskRunner}, usually in a different process.
+ * </p>
+ */
+@Alpha
+@Slf4j
+public class GenArbitraryLoadJobLauncher extends GobblinTemporalJobLauncher {
+ public static String GOBBLIN_TEMPORAL_JOB_LAUNCHER_ARG_NUM_ACTIVITIES = GOBBLIN_TEMPORAL_JOB_LAUNCHER_ARG_PREFIX + "num.activities";
+ public static String GOBBLIN_TEMPORAL_JOB_LAUNCHER_ARG_MAX_BRANCHES_PER_TREE = GOBBLIN_TEMPORAL_JOB_LAUNCHER_ARG_PREFIX + "max.branches.per.tree";
+ public static String GOBBLIN_TEMPORAL_JOB_LAUNCHER_ARG_MAX_SUB_TREES_PER_TREE = GOBBLIN_TEMPORAL_JOB_LAUNCHER_ARG_PREFIX + "max.sub.trees.per.tree";
+
+ public GenArbitraryLoadJobLauncher(
+ Properties jobProps,
+ Path appWorkDir,
+ List<? extends Tag<?>> metadataTags,
+ ConcurrentHashMap<String, Boolean> runningMap
+ ) throws Exception {
+ super(jobProps, appWorkDir, metadataTags, runningMap);
+ }
+
+ @Override
+ public void submitJob(List<WorkUnit> workunits) {
+ int numActivities = Integer.valueOf(this.jobProps.getProperty(GOBBLIN_TEMPORAL_JOB_LAUNCHER_ARG_NUM_ACTIVITIES, "<<not-set>>"));
+ int maxBranchesPerTree = Integer.valueOf(this.jobProps.getProperty(GOBBLIN_TEMPORAL_JOB_LAUNCHER_ARG_MAX_BRANCHES_PER_TREE, "<<not-set>>"));
+ int maxSubTreesPerTree = Integer.valueOf(this.jobProps.getProperty(GOBBLIN_TEMPORAL_JOB_LAUNCHER_ARG_MAX_SUB_TREES_PER_TREE, "<<not-set>>"));
Review Comment:
that's a reasonable Q--let's jointly consider...
I actually shied away from either implementing or seeking out a general arg
processing solution, since it seemed overkill at this stage. the ideal would
be to catch any `NumberFormatException` and produce a clear error message naming
the property involved as well as the value that couldn't be parsed.
still, even our widely-used `ConfigUtils::getInt` doesn't do that. its
state of the art leaves the end-user to find the line numbers in the exception
stack trace and then read the code to figure out which named property was
involved.
this impl continues in that tradition, which I agree is sub-optimal in the
grand scheme, but it nonetheless stays consistent with how other potential
property value violations would surface.
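for reference, the "ideal" described above could look roughly like the sketch below. it is illustrative only: `PropertyParsing` and `getRequiredInt` are hypothetical names, not existing Gobblin APIs, and the choice of `IllegalArgumentException` is an assumption rather than an established project convention.

```java
import java.util.Properties;

/** Hypothetical helper (not part of Gobblin) sketching property-aware int parsing. */
final class PropertyParsing {
  private PropertyParsing() {}

  /**
   * Reads a required int property, failing with a message that names both
   * the property key and the offending value.
   */
  static int getRequiredInt(Properties props, String key) {
    String raw = props.getProperty(key);
    if (raw == null) {
      // Report a missing property by name instead of parsing a placeholder default.
      throw new IllegalArgumentException("Missing required property '" + key + "'");
    }
    try {
      return Integer.parseInt(raw.trim());
    } catch (NumberFormatException e) {
      // Keep the original exception as the cause so the stack trace is still available.
      throw new IllegalArgumentException(
          "Property '" + key + "' has value '" + raw + "', which cannot be parsed as an int", e);
    }
  }
}
```

with something along these lines, `submitJob` could call `PropertyParsing.getRequiredInt(this.jobProps, GOBBLIN_TEMPORAL_JOB_LAUNCHER_ARG_NUM_ACTIVITIES)` and the end-user would see the property name directly in the error, without consulting the stack trace.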
Issue Time Tracking
-------------------
Worklog Id: (was: 888151)
Time Spent: 0.5h (was: 20m)
> Create Gobblin-Temporal load-generator to de-risk Temporal's scaling
> --------------------------------------------------------------------
>
> Key: GOBBLIN-1944
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1944
> Project: Apache Gobblin
> Issue Type: Improvement
> Components: gobblin-core
> Reporter: Kip Kohn
> Assignee: Abhishek Tiwari
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> To de-risk Temporal as an effective platform for executing Gobblin jobs, we
> must validate that temporal workflows can scale-up to a Big Data number of
> activities. To perform such load testing, build a load-generator for creating
> nesting workflows that ultimately subsume whatever configurable number of
> activities.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)