[ https://issues.apache.org/jira/browse/GOBBLIN-1923?focusedWorklogId=883233&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-883233 ]
ASF GitHub Bot logged work on GOBBLIN-1923:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 03/Oct/23 22:45
Start Date: 03/Oct/23 22:45
Worklog Time Spent: 10m
Work Description: phet commented on code in PR #3792:
URL: https://github.com/apache/gobblin/pull/3792#discussion_r1344827785
##########
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/api/MysqlMultiActiveLeaseArbiter.java:
##########
@@ -204,6 +212,14 @@ public MysqlMultiActiveLeaseArbiter(Config config) throws IOException {
}
initializeConstantsTable();
+ Thread retentionThread = new Thread(new Runnable() {
Review Comment:
Rather than a sleeping/blocking thread, how about a scheduled thread pool
executor that takes this `Runnable` and an execution interval?
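
To illustrate the suggestion, here is a minimal sketch using `java.util.concurrent.ScheduledExecutorService`. The class name `RetentionSchedulerSketch`, the method `scheduleRetention`, and the thread name are hypothetical and not taken from the PR; the retention task itself is passed in as a plain `Runnable`.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; names are illustrative and not part of the PR. */
public class RetentionSchedulerSketch {

  /**
   * Runs retentionTask repeatedly on a single daemon thread, waiting
   * `interval` between the end of one run and the start of the next.
   * The caller owns the returned executor and should shut it down.
   */
  public static ScheduledExecutorService scheduleRetention(
      Runnable retentionTask, long interval, TimeUnit unit) {
    ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor(runnable -> {
          Thread thread = new Thread(runnable, "lease-arbiter-retention");
          thread.setDaemon(true); // do not keep the JVM alive for cleanup work
          return thread;
        });

    // An exception escaping a periodic task cancels all later runs,
    // so catch and report failures instead of letting them propagate.
    Runnable guarded = () -> {
      try {
        retentionTask.run();
      } catch (RuntimeException e) {
        System.err.println("Retention run failed: " + e);
      }
    };

    executor.scheduleWithFixedDelay(guarded, 0, interval, unit);
    return executor;
  }
}
```

Under these assumptions, a call such as `scheduleRetention(task, 6, TimeUnit.HOURS)` would replace the `while (true)` loop and the `Thread.sleep` call, and the interval carries its unit with it.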
##########
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/api/MysqlMultiActiveLeaseArbiter.java:
##########
@@ -110,9 +111,13 @@ protected interface CheckedFunction<T, R> {
private static final String CREATE_LEASE_ARBITER_TABLE_STATEMENT = "CREATE TABLE IF NOT EXISTS %s ("
+ "flow_group varchar(" + ServiceConfigKeys.MAX_FLOW_GROUP_LENGTH + ") NOT NULL, flow_name varchar("
+ ServiceConfigKeys.MAX_FLOW_GROUP_LENGTH + ") NOT NULL, " + " flow_action varchar(100) NOT NULL, "
- + "event_timestamp TIMESTAMP(3) DEFAULT CURRENT_TIMESTAMP(3), "
- + "lease_acquisition_timestamp TIMESTAMP(3) NULL DEFAULT NULL, "
+ + "event_timestamp TIMESTAMP NOT NULL, "
+ + "lease_acquisition_timestamp TIMESTAMP NULL, "
Review Comment:
As for migrating this schema: will it require manual intervention, either a
`drop` or an `alter table`?
##########
gobblin-api/src/main/java/org/apache/gobblin/configuration/ConfigurationKeys.java:
##########
@@ -101,6 +101,8 @@ public class ConfigurationKeys {
public static final String DEFAULT_MULTI_ACTIVE_SCHEDULER_CONSTANTS_DB_TABLE = "gobblin_multi_active_scheduler_constants_store";
public static final String SCHEDULER_LEASE_DETERMINATION_STORE_DB_TABLE_KEY = MYSQL_LEASE_ARBITER_PREFIX + ".schedulerLeaseArbiter.store.db.table";
public static final String DEFAULT_SCHEDULER_LEASE_DETERMINATION_STORE_DB_TABLE = "gobblin_scheduler_lease_determination_store";
+ public static final String SCHEDULER_LEASE_DETERMINATION_TABLE_RETENTION_PERIOD_MILLIS_KEY = MYSQL_LEASE_ARBITER_PREFIX + ".retentionPeriodMillis";
+ public static final int DEFAULT_SCHEDULER_LEASE_DETERMINATION_TABLE_RETENTION_PERIOD_MILLIS = 500000;
Review Comment:
500 seconds? That seems way too low, at least for debugging. I'd look at
something more like 72 hours.
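
For scale, 500000 ms is a little over 8 minutes, while 72 hours is 259,200,000 ms (which still fits in an `int`). A minimal sketch of deriving the default from a time unit instead of a raw literal follows; the class `RetentionPeriodSketch`, its constant, and the helper `retentionCutoffMillis` are hypothetical names, not from the PR.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; names are illustrative and not part of the PR. */
public class RetentionPeriodSketch {

  // Assumed default of 72 hours, written so the unit is visible at the definition.
  public static final long DEFAULT_RETENTION_PERIOD_MILLIS = TimeUnit.HOURS.toMillis(72);

  /**
   * Returns the epoch-millis cutoff: rows whose event_timestamp is older
   * than this value would be eligible for deletion.
   */
  public static long retentionCutoffMillis(long nowMillis, long retentionPeriodMillis) {
    return nowMillis - retentionPeriodMillis;
  }
}
```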
##########
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/api/MysqlMultiActiveLeaseArbiter.java:
##########
@@ -221,6 +237,31 @@ private void initializeConstantsTable() throws IOException {
}, true);
}
+ /**
+ * Periodically deletes all rows in the table with event_timestamp older than the retention period defined by config.
+ */
+ private void runRetentionOnArbitrationTable() {
+ while (true) {
+ try {
+ Thread.sleep(10000);
Review Comment:
Tip (this applies equally if you use a scheduled thread pool executor): set
the sleep in time-based values (such as 60) rather than a raw number like
10000. Also, 10s looks WAY too frequent, given that a lease may itself last
for minutes! Maybe we try this every six hours (four times a day)?
Issue Time Tracking
-------------------
Worklog Id: (was: 883233)
Time Spent: 0.5h (was: 20m)
> Add retention thread for lease arbiter table
> --------------------------------------------
>
> Key: GOBBLIN-1923
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1923
> Project: Apache Gobblin
> Issue Type: Bug
> Components: gobblin-service
> Reporter: Urmi Mustafi
> Assignee: Abhishek Tiwari
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Add retention to lease arbiter table so it does not grow unbounded. The table
> can be as large as O(number of flows) which may grow so large that
> reading/writing from this table becomes time consuming and slows down our
> throughput of obtaining and evaluating leases for launching flows.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)