ibessonov commented on code in PR #1720:
URL: https://github.com/apache/ignite-3/pull/1720#discussion_r1121785748
##########
modules/table/src/main/java/org/apache/ignite/internal/table/distributed/TableManager.java:
##########
@@ -453,9 +459,10 @@ public TableManager(
assignmentsSwitchRebalanceListener =
createAssignmentsSwitchRebalanceListener();
}
- /** {@inheritDoc} */
@Override
public void start() {
+ mvGc = new MvGc(nodeName, tablesCfg.gcThreads().value());
Review Comment:
Why don't you instantiate it in the constructor? This is the only reason why
it's volatile, right?
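
A minimal sketch of the suggestion, assuming the thread count can already be read when the constructor runs (that may not hold for `tablesCfg.gcThreads().value()` in the real code). `ComponentWithGc` and its members are hypothetical stand-ins, not the actual `TableManager` API:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for a component such as TableManager.
class ComponentWithGc {
    // Assigned exactly once in the constructor, so the field can be final.
    // Final fields are safely published, so volatile is not needed.
    private final ExecutorService gcExecutor;

    ComponentWithGc(int gcThreads) {
        // Assumption: the configured thread count is known at construction time.
        gcExecutor = Executors.newFixedThreadPool(gcThreads);
    }

    void start() {
        // Nothing is instantiated here anymore; start() only begins using the field.
    }

    boolean isGcRunning() {
        return !gcExecutor.isShutdown();
    }

    void stop() {
        gcExecutor.shutdownNow();
    }
}
```

If the value really is only available in `start()`, the field has to stay non-final, and that would be worth a comment next to the `volatile` modifier.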
##########
modules/table/src/main/java/org/apache/ignite/internal/table/distributed/TableManager.java:
##########
@@ -1246,12 +1279,16 @@ private void dropTableLocally(long causalityToken,
String name, UUID tblId, List
try {
int partitions = assignment.size();
+ CompletableFuture<?>[] removeStorageFromGcFutures = new CompletableFuture<?>[partitions];
+
for (int p = 0; p < partitions; p++) {
TablePartitionId replicationGroupId = new TablePartitionId(tblId, p);
raftMgr.stopRaftNodes(replicationGroupId);
replicaMgr.stopReplica(replicationGroupId);
+
+ removeStorageFromGcFutures[p] = mvGc.removeStorage(replicationGroupId);
Review Comment:
So, you're doing it after you stop the replica.
At what step do we close the storage itself? Are you sure that the order is
correct?
##########
modules/table/src/main/java/org/apache/ignite/internal/table/distributed/TableManager.java:
##########
@@ -1065,6 +1079,25 @@ private void cleanUpTablesResources(Map<UUID, TableImpl> tables) {
}
}
});
+
+ CompletableFuture<Void> removeFromGcFuture = mvGc.removeStorage(replicationGroupId);
+
+ stopping.add(() -> {
+ try {
+ // Should be done fairly quickly.
+ removeFromGcFuture.join();
+ } catch (Exception e) {
+ if (!exception.compareAndSet(null, e)) {
Review Comment:
Oh Jesus, it's all copy-pasted!
Could you please extract exception handling into a method?
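
A possible shape for such a helper, as a sketch only. The quoted hunk is cut off after the `compareAndSet` call, so the `addSuppressed` branch below is an assumption about what the copy-pasted blocks do, and `StopErrors.recordFailure` is a hypothetical name:

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical helper that replaces the repeated catch-block bodies.
class StopErrors {
    /**
     * Records the first failure in the holder. Any later failure is attached
     * to the first one as a suppressed exception (assumed behaviour).
     */
    static void recordFailure(AtomicReference<Exception> holder, Exception e) {
        if (!holder.compareAndSet(null, e)) {
            holder.get().addSuppressed(e);
        }
    }
}
```

Each `stopping.add(...)` lambda would then shrink to a `try` block whose `catch` calls `recordFailure(exception, e)`.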
##########
modules/table/src/main/java/org/apache/ignite/internal/table/distributed/gc/GcStorageHandler.java:
##########
@@ -0,0 +1,35 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.ignite.internal.table.distributed.gc;
+
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.atomic.AtomicReference;
+import org.apache.ignite.internal.table.distributed.StorageUpdateHandler;
+
+/**
+ * Container for handling storage by the garbage collector.
+ */
+class GcStorageHandler {
+ final StorageUpdateHandler storageUpdateHandler;
+
+ final AtomicReference<CompletableFuture<Void>> gcInProgressFuture = new AtomicReference<>();
Review Comment:
Can you please document fields as well?
##########
modules/table/src/main/java/org/apache/ignite/internal/table/distributed/gc/MvGc.java:
##########
@@ -0,0 +1,249 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.ignite.internal.table.distributed.gc;
+
+import static java.util.concurrent.CompletableFuture.completedFuture;
+import static
org.apache.ignite.internal.util.IgniteUtils.shutdownAndAwaitTermination;
+
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.ConcurrentMap;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.ThreadPoolExecutor;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.concurrent.atomic.AtomicReference;
+import java.util.function.Supplier;
+import org.apache.ignite.internal.close.ManuallyCloseable;
+import org.apache.ignite.internal.hlc.HybridTimestamp;
+import org.apache.ignite.internal.logger.IgniteLogger;
+import org.apache.ignite.internal.logger.Loggers;
+import org.apache.ignite.internal.storage.MvPartitionStorage;
+import org.apache.ignite.internal.table.distributed.StorageUpdateHandler;
+import
org.apache.ignite.internal.table.distributed.replicator.TablePartitionId;
+import org.apache.ignite.internal.thread.NamedThreadFactory;
+import org.apache.ignite.internal.util.IgniteSpinBusyLock;
+import org.apache.ignite.lang.ErrorGroups.Gc;
+import org.apache.ignite.lang.IgniteInternalException;
+
+/**
+ * Garbage collector for multi-versioned storages and their indexes in the
background.
+ *
+ * @see MvPartitionStorage#pollForVacuum(HybridTimestamp)
+ */
+public class MvGc implements ManuallyCloseable {
+ private static final IgniteLogger LOG = Loggers.forClass(MvGc.class);
+
+ /** GC batch size for the storage. */
+ static final int GC_BUTCH_SIZE = 5;
+
+ /** Garbage collection thread pool. */
+ private final ExecutorService executor;
+
+ /** Prevents double closing. */
+ private final AtomicBoolean closeGuard = new AtomicBoolean();
+
+ /** Busy lock to close synchronously. */
+ private final IgniteSpinBusyLock busyLock = new IgniteSpinBusyLock();
+
+ /** Low watermark. */
+ private final AtomicReference<HybridTimestamp> lowWatermarkReference = new
AtomicReference<>();
+
+ /** Storage handler by table partition ID for which garbage will be
collected. */
+ private final ConcurrentMap<TablePartitionId, GcStorageHandler>
storageHandlerByPartitionId = new ConcurrentHashMap<>();
+
+ /**
+ * Constructor.
+ *
+ * @param nodeName Node name.
+ * @param threadCount Number of garbage collector threads.
+ */
+ public MvGc(String nodeName, int threadCount) {
+ assert threadCount > 0 : threadCount;
+
+ executor = new ThreadPoolExecutor(
+ threadCount,
+ threadCount,
+ 30,
+ TimeUnit.SECONDS,
+ new LinkedBlockingQueue<>(),
+ new NamedThreadFactory(nodeName, LOG)
+ );
+ }
+
+ /**
+ * Adds storage for background garbage collection when updating a low
watermark.
+ *
+ * @param tablePartitionId Table partition ID.
+ * @param storageUpdateHandler Storage update handler.
+ * @throws IgniteInternalException with {@link Gc#CLOSED_ERR} If the
garbage collector is closed.
+ */
+ public void addStorage(TablePartitionId tablePartitionId,
StorageUpdateHandler storageUpdateHandler) {
+ inBusyLock(() -> {
+ GcStorageHandler previous =
storageHandlerByPartitionId.putIfAbsent(
Review Comment:
How can `previous` be non-null?
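
For context, `putIfAbsent` returns a non-null value only when the key is already mapped, that is, when the same partition is added twice without a removal in between. If that is never expected, one option is to fail fast instead of silently ignoring it. A sketch under that assumption, with `StorageRegistry` as a hypothetical stand-in (`Integer` replaces `TablePartitionId`, `String` replaces the handler):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical registry illustrating the putIfAbsent contract.
class StorageRegistry {
    private final ConcurrentMap<Integer, String> handlers = new ConcurrentHashMap<>();

    /** Registers a handler and fails fast if the partition is already registered. */
    void add(int partitionId, String handler) {
        // putIfAbsent returns the existing value, or null if there was no mapping.
        String previous = handlers.putIfAbsent(partitionId, handler);

        if (previous != null) {
            throw new IllegalStateException("Storage already registered: " + partitionId);
        }
    }

    /** Returns true if a handler was registered for the partition. */
    boolean remove(int partitionId) {
        return handlers.remove(partitionId) != null;
    }
}
```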
##########
modules/table/src/main/java/org/apache/ignite/internal/table/distributed/gc/MvGc.java:
##########
@@ -0,0 +1,249 @@
+package org.apache.ignite.internal.table.distributed.gc;
+
+import static java.util.concurrent.CompletableFuture.completedFuture;
+import static
org.apache.ignite.internal.util.IgniteUtils.shutdownAndAwaitTermination;
+
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.ConcurrentMap;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.ThreadPoolExecutor;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.concurrent.atomic.AtomicReference;
+import java.util.function.Supplier;
+import org.apache.ignite.internal.close.ManuallyCloseable;
+import org.apache.ignite.internal.hlc.HybridTimestamp;
+import org.apache.ignite.internal.logger.IgniteLogger;
+import org.apache.ignite.internal.logger.Loggers;
+import org.apache.ignite.internal.storage.MvPartitionStorage;
+import org.apache.ignite.internal.table.distributed.StorageUpdateHandler;
+import
org.apache.ignite.internal.table.distributed.replicator.TablePartitionId;
+import org.apache.ignite.internal.thread.NamedThreadFactory;
+import org.apache.ignite.internal.util.IgniteSpinBusyLock;
+import org.apache.ignite.lang.ErrorGroups.Gc;
+import org.apache.ignite.lang.IgniteInternalException;
+
+/**
+ * Garbage collector for multi-versioned storages and their indexes in the
background.
+ *
+ * @see MvPartitionStorage#pollForVacuum(HybridTimestamp)
+ */
+public class MvGc implements ManuallyCloseable {
+ private static final IgniteLogger LOG = Loggers.forClass(MvGc.class);
+
+ /** GC batch size for the storage. */
+ static final int GC_BUTCH_SIZE = 5;
+
+ /** Garbage collection thread pool. */
+ private final ExecutorService executor;
+
+ /** Prevents double closing. */
+ private final AtomicBoolean closeGuard = new AtomicBoolean();
+
+ /** Busy lock to close synchronously. */
+ private final IgniteSpinBusyLock busyLock = new IgniteSpinBusyLock();
+
+ /** Low watermark. */
+ private final AtomicReference<HybridTimestamp> lowWatermarkReference = new
AtomicReference<>();
+
+ /** Storage handler by table partition ID for which garbage will be
collected. */
+ private final ConcurrentMap<TablePartitionId, GcStorageHandler>
storageHandlerByPartitionId = new ConcurrentHashMap<>();
+
+ /**
+ * Constructor.
+ *
+ * @param nodeName Node name.
+ * @param threadCount Number of garbage collector threads.
+ */
+ public MvGc(String nodeName, int threadCount) {
+ assert threadCount > 0 : threadCount;
+
+ executor = new ThreadPoolExecutor(
+ threadCount,
+ threadCount,
+ 30,
+ TimeUnit.SECONDS,
+ new LinkedBlockingQueue<>(),
+ new NamedThreadFactory(nodeName, LOG)
+ );
+ }
+
+ /**
+ * Adds storage for background garbage collection when updating a low
watermark.
+ *
+ * @param tablePartitionId Table partition ID.
+ * @param storageUpdateHandler Storage update handler.
+ * @throws IgniteInternalException with {@link Gc#CLOSED_ERR} If the
garbage collector is closed.
+ */
+ public void addStorage(TablePartitionId tablePartitionId,
StorageUpdateHandler storageUpdateHandler) {
+ inBusyLock(() -> {
+ GcStorageHandler previous =
storageHandlerByPartitionId.putIfAbsent(
+ tablePartitionId,
+ new GcStorageHandler(storageUpdateHandler)
+ );
+
+ if (previous == null && lowWatermarkReference.get() != null) {
+ scheduleGcForStorage(tablePartitionId);
+ }
+ });
+ }
+
+ /**
+ * Removes storage for background garbage collection and completes the
garbage collection for it.
+ *
+ * <p>Should be called before rebalancing/closing/destroying the storage.
+ *
+ * @param tablePartitionId Table partition ID.
+ * @return Storage garbage collection completion future.
+ * @throws IgniteInternalException with {@link Gc#CLOSED_ERR} If the
garbage collector is closed.
+ */
+ public CompletableFuture<Void> removeStorage(TablePartitionId
tablePartitionId) {
+ return inBusyLock(() -> {
+ GcStorageHandler removed =
storageHandlerByPartitionId.remove(tablePartitionId);
+
+ if (removed == null) {
+ return completedFuture(null);
+ }
+
+ CompletableFuture<Void> gcInProgressFuture =
removed.gcInProgressFuture.get();
+
+ return gcInProgressFuture == null ? completedFuture(null) :
gcInProgressFuture;
+ });
+ }
+
+ /**
+ * Updates the new watermark only if it is larger than the current low
watermark.
+ *
+ * <p>If the update is successful, it will schedule a new garbage
collection for all storages.
+ *
+ * @param newLwm New low watermark.
+ * @throws IgniteInternalException with {@link Gc#CLOSED_ERR} If the
garbage collector is closed.
+ */
+ public void updateLowWatermark(HybridTimestamp newLwm) {
+ inBusyLock(() -> {
+ HybridTimestamp updatedLwm =
lowWatermarkReference.updateAndGet(currentLwm -> {
+ if (currentLwm == null) {
+ return newLwm;
+ }
+
+ // Update only if the new one is greater than the current one.
+ return newLwm.compareTo(currentLwm) > 0 ? newLwm : currentLwm;
+ });
+
+ // If the new watermark is smaller than the current one or has
been updated in parallel, then we do nothing.
+ if (updatedLwm != newLwm) {
+ return;
+ }
+
+ executor.submit(() -> inBusyLock(this::initNewGcBusy));
+ });
+ }
+
+ @Override
+ public void close() throws Exception {
+ if (!closeGuard.compareAndSet(false, true)) {
+ return;
+ }
+
+ busyLock.block();
+
+ shutdownAndAwaitTermination(executor, 10, TimeUnit.SECONDS);
+ }
+
+ private void initNewGcBusy() {
+
storageHandlerByPartitionId.keySet().forEach(this::scheduleGcForStorage);
+ }
+
+ private void scheduleGcForStorage(TablePartitionId tablePartitionId) {
+ executor.submit(() -> inBusyLock(() -> {
+ GcStorageHandler storageHandler = storageHandlerByPartitionId.compute(tablePartitionId, (id, gcStorageHandler) -> {
+ if (gcStorageHandler == null) {
+ // Storage has been removed from garbage collection.
+ return gcStorageHandler;
Review Comment:
`return null` would be easier to read
##########
modules/table/src/main/java/org/apache/ignite/internal/table/distributed/gc/GcStorageHandler.java:
##########
@@ -0,0 +1,35 @@
+package org.apache.ignite.internal.table.distributed.gc;
+
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.atomic.AtomicReference;
+import org.apache.ignite.internal.table.distributed.StorageUpdateHandler;
+
+/**
+ * Container for handling storage by the garbage collector.
+ */
+class GcStorageHandler {
+ final StorageUpdateHandler storageUpdateHandler;
+
+ final AtomicReference<CompletableFuture<Void>> gcInProgressFuture = new AtomicReference<>();
Review Comment:
Also, why did you decide to go with default (package-private) field visibility
instead of encapsulating the logic inside methods, as is done almost
everywhere else?
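
To illustrate, an encapsulated and documented variant could look roughly like the sketch below. The class name, the method names, and the generic parameter `S` (standing in for `StorageUpdateHandler`) are all hypothetical:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical encapsulated variant of the container; S stands in for StorageUpdateHandler. */
class EncapsulatedGcHandler<S> {
    /** Handler used to vacuum the storage. */
    private final S storageUpdateHandler;

    /** Future of the garbage collection run in progress, or null when idle. */
    private final AtomicReference<CompletableFuture<Void>> gcInProgress = new AtomicReference<>();

    EncapsulatedGcHandler(S storageUpdateHandler) {
        this.storageUpdateHandler = storageUpdateHandler;
    }

    S storageUpdateHandler() {
        return storageUpdateHandler;
    }

    /** Starts a run; returns its future, or null if a run is already in progress. */
    CompletableFuture<Void> tryStartGc() {
        CompletableFuture<Void> future = new CompletableFuture<>();

        return gcInProgress.compareAndSet(null, future) ? future : null;
    }

    /** Marks the run as finished; returns false if the future is not the current run. */
    boolean finishGc(CompletableFuture<Void> future) {
        return gcInProgress.compareAndSet(future, null);
    }

    /** Returns the future of the current run, or a completed future when idle. */
    CompletableFuture<Void> gcCompletion() {
        CompletableFuture<Void> future = gcInProgress.get();

        return future == null ? CompletableFuture.completedFuture(null) : future;
    }
}
```

With this shape, `MvGc` never touches the `AtomicReference` directly, so the CAS protocol lives in one place.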
##########
modules/table/src/main/java/org/apache/ignite/internal/table/distributed/TableManager.java:
##########
@@ -2260,10 +2300,12 @@ protected void handleChangeStableAssignmentEvent(WatchEvent evt) {
// TODO: IGNITE-18703 Destroy raft log and meta
// Should be done fairly quickly.
- allOf(
- internalTable.storage().destroyPartition(partitionId),
- runAsync(() -> internalTable.txStateStorage().destroyTxStateStorage(partitionId), ioExecutor)
- ).join();
+ mvGc.removeStorage(tablePartitionId)
+ .thenCompose(unused -> allOf(
+ internalTable.storage().destroyPartition(partitionId),
Review Comment:
I hope one day we will refactor all this spaghetti code and extract proper
abstractions :(
##########
modules/table/src/main/java/org/apache/ignite/internal/table/distributed/gc/MvGc.java:
##########
@@ -0,0 +1,249 @@
+package org.apache.ignite.internal.table.distributed.gc;
+
+import static java.util.concurrent.CompletableFuture.completedFuture;
+import static
org.apache.ignite.internal.util.IgniteUtils.shutdownAndAwaitTermination;
+
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.ConcurrentMap;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.ThreadPoolExecutor;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.concurrent.atomic.AtomicReference;
+import java.util.function.Supplier;
+import org.apache.ignite.internal.close.ManuallyCloseable;
+import org.apache.ignite.internal.hlc.HybridTimestamp;
+import org.apache.ignite.internal.logger.IgniteLogger;
+import org.apache.ignite.internal.logger.Loggers;
+import org.apache.ignite.internal.storage.MvPartitionStorage;
+import org.apache.ignite.internal.table.distributed.StorageUpdateHandler;
+import
org.apache.ignite.internal.table.distributed.replicator.TablePartitionId;
+import org.apache.ignite.internal.thread.NamedThreadFactory;
+import org.apache.ignite.internal.util.IgniteSpinBusyLock;
+import org.apache.ignite.lang.ErrorGroups.Gc;
+import org.apache.ignite.lang.IgniteInternalException;
+
+/**
+ * Garbage collector for multi-versioned storages and their indexes in the
background.
+ *
+ * @see MvPartitionStorage#pollForVacuum(HybridTimestamp)
+ */
+public class MvGc implements ManuallyCloseable {
+ private static final IgniteLogger LOG = Loggers.forClass(MvGc.class);
+
+ /** GC batch size for the storage. */
+ static final int GC_BUTCH_SIZE = 5;
+
+ /** Garbage collection thread pool. */
+ private final ExecutorService executor;
+
+ /** Prevents double closing. */
+ private final AtomicBoolean closeGuard = new AtomicBoolean();
+
+ /** Busy lock to close synchronously. */
+ private final IgniteSpinBusyLock busyLock = new IgniteSpinBusyLock();
+
+ /** Low watermark. */
+ private final AtomicReference<HybridTimestamp> lowWatermarkReference = new
AtomicReference<>();
+
+ /** Storage handler by table partition ID for which garbage will be
collected. */
+ private final ConcurrentMap<TablePartitionId, GcStorageHandler>
storageHandlerByPartitionId = new ConcurrentHashMap<>();
+
+ /**
+ * Constructor.
+ *
+ * @param nodeName Node name.
+ * @param threadCount Number of garbage collector threads.
+ */
+ public MvGc(String nodeName, int threadCount) {
+ assert threadCount > 0 : threadCount;
+
+ executor = new ThreadPoolExecutor(
+ threadCount,
+ threadCount,
+ 30,
+ TimeUnit.SECONDS,
+ new LinkedBlockingQueue<>(),
+ new NamedThreadFactory(nodeName, LOG)
+ );
+ }
+
+ /**
+ * Adds storage for background garbage collection when updating a low
watermark.
+ *
+ * @param tablePartitionId Table partition ID.
+ * @param storageUpdateHandler Storage update handler.
+ * @throws IgniteInternalException with {@link Gc#CLOSED_ERR} If the
garbage collector is closed.
+ */
+ public void addStorage(TablePartitionId tablePartitionId,
StorageUpdateHandler storageUpdateHandler) {
+ inBusyLock(() -> {
+ GcStorageHandler previous =
storageHandlerByPartitionId.putIfAbsent(
+ tablePartitionId,
+ new GcStorageHandler(storageUpdateHandler)
+ );
+
+ if (previous == null && lowWatermarkReference.get() != null) {
+ scheduleGcForStorage(tablePartitionId);
+ }
+ });
+ }
+
+ /**
+ * Removes storage for background garbage collection and completes the
garbage collection for it.
+ *
+ * <p>Should be called before rebalancing/closing/destroying the storage.
+ *
+ * @param tablePartitionId Table partition ID.
+ * @return Storage garbage collection completion future.
+ * @throws IgniteInternalException with {@link Gc#CLOSED_ERR} If the
garbage collector is closed.
+ */
+ public CompletableFuture<Void> removeStorage(TablePartitionId
tablePartitionId) {
+ return inBusyLock(() -> {
+ GcStorageHandler removed =
storageHandlerByPartitionId.remove(tablePartitionId);
+
+ if (removed == null) {
+ return completedFuture(null);
+ }
+
+ CompletableFuture<Void> gcInProgressFuture =
removed.gcInProgressFuture.get();
+
+ return gcInProgressFuture == null ? completedFuture(null) :
gcInProgressFuture;
+ });
+ }
+
+ /**
+ * Updates the new watermark only if it is larger than the current low
watermark.
+ *
+ * <p>If the update is successful, it will schedule a new garbage
collection for all storages.
+ *
+ * @param newLwm New low watermark.
+ * @throws IgniteInternalException with {@link Gc#CLOSED_ERR} If the
garbage collector is closed.
+ */
+ public void updateLowWatermark(HybridTimestamp newLwm) {
+ inBusyLock(() -> {
+ HybridTimestamp updatedLwm =
lowWatermarkReference.updateAndGet(currentLwm -> {
+ if (currentLwm == null) {
+ return newLwm;
+ }
+
+ // Update only if the new one is greater than the current one.
+ return newLwm.compareTo(currentLwm) > 0 ? newLwm : currentLwm;
+ });
+
+ // If the new watermark is smaller than the current one or has
been updated in parallel, then we do nothing.
+ if (updatedLwm != newLwm) {
+ return;
+ }
+
+ executor.submit(() -> inBusyLock(this::initNewGcBusy));
+ });
+ }
+
+ @Override
+ public void close() throws Exception {
+ if (!closeGuard.compareAndSet(false, true)) {
+ return;
+ }
+
+ busyLock.block();
+
+ shutdownAndAwaitTermination(executor, 10, TimeUnit.SECONDS);
+ }
+
+ private void initNewGcBusy() {
+
storageHandlerByPartitionId.keySet().forEach(this::scheduleGcForStorage);
+ }
+
+ private void scheduleGcForStorage(TablePartitionId tablePartitionId) {
+ executor.submit(() -> inBusyLock(() -> {
+ GcStorageHandler storageHandler =
storageHandlerByPartitionId.compute(tablePartitionId, (id, gcStorageHandler) ->
{
+ if (gcStorageHandler == null) {
+ // Storage has been removed from garbage collection.
+ return gcStorageHandler;
+ }
+
+ boolean casResult =
gcStorageHandler.gcInProgressFuture.compareAndSet(null, new
CompletableFuture<>());
+
+ assert casResult : tablePartitionId;
+
+ return gcStorageHandler;
+ });
+
+ if (storageHandler == null) {
+ // Storage has been removed from garbage collection.
+ return;
+ }
+
+ CompletableFuture<Void> future =
storageHandler.gcInProgressFuture.get();
+
+ assert future != null : tablePartitionId;
+
+ try {
+ boolean scheduleGcForStorageAgain = true;
+
+ for (int i = 0; i < GC_BUTCH_SIZE &&
scheduleGcForStorageAgain; i++) {
+ HybridTimestamp lowWatermark = lowWatermarkReference.get();
+
+ assert lowWatermark != null : tablePartitionId;
+
+ // If storage has been deleted or there is no garbage,
then for now we will stop collecting garbage for this storage.
+ if
(!storageHandlerByPartitionId.containsKey(tablePartitionId)
+ ||
!storageHandler.storageUpdateHandler.vacuum(lowWatermark)) {
+ scheduleGcForStorageAgain = false;
+ }
+ }
+
+ if (scheduleGcForStorageAgain) {
+ scheduleGcForStorage(tablePartitionId);
+ }
+
+ future.complete(null);
+ } catch (Throwable t) {
+ future.completeExceptionally(t);
+ } finally {
+ boolean casResult = storageHandler.gcInProgressFuture.compareAndSet(future, null);
Review Comment:
You have a race between this and the previous CAS. Please fix it, and be
careful.
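
One possible reading of the race: the task re-submits itself inside the `try` block, so with more than one GC thread the new task can run `compareAndSet(null, ...)` before this `finally` block has reset the reference, and the assertion fires. The sketch below shows one way to avoid that, under two assumptions: a failed first CAS is treated as "already running, skip", and the caller re-submits only after the slot has been cleared. `SingleRunGuard` is a hypothetical class, not part of the PR:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BooleanSupplier;

// Hypothetical guard that allows at most one run per storage at a time.
class SingleRunGuard {
    private final AtomicReference<CompletableFuture<Void>> inProgress = new AtomicReference<>();

    /**
     * Runs one batch unless another run is in progress.
     *
     * @return true if the batch reported more garbage and the caller should re-submit.
     */
    boolean runBatch(BooleanSupplier batch) {
        CompletableFuture<Void> future = new CompletableFuture<>();

        if (!inProgress.compareAndSet(null, future)) {
            // Another run owns the slot: skip instead of asserting.
            return false;
        }

        boolean more = false;

        try {
            more = batch.getAsBoolean();
        } catch (Throwable t) {
            future.completeExceptionally(t);
        } finally {
            // Clear the slot before anyone can re-submit.
            inProgress.compareAndSet(future, null);

            // No-op if the future was already completed exceptionally.
            future.complete(null);
        }

        // Re-submission happens in the caller, after the slot is free.
        return more;
    }
}
```

The trade-off is that a skipped run may leave garbage behind until the next schedule, so whether skipping is acceptable depends on how `updateLowWatermark` is expected to interact with a run in progress.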
##########
modules/table/src/main/java/org/apache/ignite/internal/table/distributed/gc/MvGc.java:
##########
@@ -0,0 +1,249 @@
+package org.apache.ignite.internal.table.distributed.gc;
+
+import static java.util.concurrent.CompletableFuture.completedFuture;
+import static
org.apache.ignite.internal.util.IgniteUtils.shutdownAndAwaitTermination;
+
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.ConcurrentMap;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.ThreadPoolExecutor;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.concurrent.atomic.AtomicReference;
+import java.util.function.Supplier;
+import org.apache.ignite.internal.close.ManuallyCloseable;
+import org.apache.ignite.internal.hlc.HybridTimestamp;
+import org.apache.ignite.internal.logger.IgniteLogger;
+import org.apache.ignite.internal.logger.Loggers;
+import org.apache.ignite.internal.storage.MvPartitionStorage;
+import org.apache.ignite.internal.table.distributed.StorageUpdateHandler;
+import
org.apache.ignite.internal.table.distributed.replicator.TablePartitionId;
+import org.apache.ignite.internal.thread.NamedThreadFactory;
+import org.apache.ignite.internal.util.IgniteSpinBusyLock;
+import org.apache.ignite.lang.ErrorGroups.Gc;
+import org.apache.ignite.lang.IgniteInternalException;
+
+/**
+ * Garbage collector for multi-versioned storages and their indexes in the
background.
+ *
+ * @see MvPartitionStorage#pollForVacuum(HybridTimestamp)
+ */
+public class MvGc implements ManuallyCloseable {
+ private static final IgniteLogger LOG = Loggers.forClass(MvGc.class);
+
+ /** GC batch size for the storage. */
+ static final int GC_BUTCH_SIZE = 5;
+
+ /** Garbage collection thread pool. */
+ private final ExecutorService executor;
+
+ /** Prevents double closing. */
+ private final AtomicBoolean closeGuard = new AtomicBoolean();
+
+ /** Busy lock to close synchronously. */
+ private final IgniteSpinBusyLock busyLock = new IgniteSpinBusyLock();
+
+ /** Low watermark. */
+ private final AtomicReference<HybridTimestamp> lowWatermarkReference = new
AtomicReference<>();
+
+ /** Storage handler by table partition ID for which garbage will be
collected. */
+ private final ConcurrentMap<TablePartitionId, GcStorageHandler>
storageHandlerByPartitionId = new ConcurrentHashMap<>();
+
+ /**
+ * Constructor.
+ *
+ * @param nodeName Node name.
+ * @param threadCount Number of garbage collector threads.
+ */
+ public MvGc(String nodeName, int threadCount) {
+ assert threadCount > 0 : threadCount;
+
+ executor = new ThreadPoolExecutor(
+ threadCount,
+ threadCount,
+ 30,
+ TimeUnit.SECONDS,
+ new LinkedBlockingQueue<>(),
+ new NamedThreadFactory(nodeName, LOG)
+ );
+ }
+
+ /**
+ * Adds storage for background garbage collection when updating a low
watermark.
+ *
+ * @param tablePartitionId Table partition ID.
+ * @param storageUpdateHandler Storage update handler.
+ * @throws IgniteInternalException with {@link Gc#CLOSED_ERR} If the
garbage collector is closed.
+ */
+ public void addStorage(TablePartitionId tablePartitionId,
StorageUpdateHandler storageUpdateHandler) {
+ inBusyLock(() -> {
+ GcStorageHandler previous =
storageHandlerByPartitionId.putIfAbsent(
+ tablePartitionId,
+ new GcStorageHandler(storageUpdateHandler)
+ );
+
+ if (previous == null && lowWatermarkReference.get() != null) {
+ scheduleGcForStorage(tablePartitionId);
+ }
+ });
+ }
+
+ /**
+ * Removes storage for background garbage collection and completes the
garbage collection for it.
+ *
+ * <p>Should be called before rebalancing/closing/destroying the storage.
+ *
+ * @param tablePartitionId Table partition ID.
+ * @return Storage garbage collection completion future.
+ * @throws IgniteInternalException with {@link Gc#CLOSED_ERR} If the
garbage collector is closed.
+ */
+ public CompletableFuture<Void> removeStorage(TablePartitionId
tablePartitionId) {
+ return inBusyLock(() -> {
+ GcStorageHandler removed =
storageHandlerByPartitionId.remove(tablePartitionId);
+
+ if (removed == null) {
+ return completedFuture(null);
+ }
+
+ CompletableFuture<Void> gcInProgressFuture =
removed.gcInProgressFuture.get();
+
+ return gcInProgressFuture == null ? completedFuture(null) :
gcInProgressFuture;
+ });
+ }
+
+ /**
+ * Updates the new watermark only if it is larger than the current low
watermark.
+ *
+ * <p>If the update is successful, it will schedule a new garbage
collection for all storages.
+ *
+ * @param newLwm New low watermark.
+ * @throws IgniteInternalException with {@link Gc#CLOSED_ERR} If the
garbage collector is closed.
+ */
+ public void updateLowWatermark(HybridTimestamp newLwm) {
+ inBusyLock(() -> {
+ HybridTimestamp updatedLwm =
lowWatermarkReference.updateAndGet(currentLwm -> {
+ if (currentLwm == null) {
+ return newLwm;
+ }
+
+ // Update only if the new one is greater than the current one.
+ return newLwm.compareTo(currentLwm) > 0 ? newLwm : currentLwm;
+ });
+
+ // If the new watermark is smaller than the current one or has
been updated in parallel, then we do nothing.
+ if (updatedLwm != newLwm) {
+ return;
+ }
+
+ executor.submit(() -> inBusyLock(this::initNewGcBusy));
+ });
+ }
+
+ @Override
+ public void close() throws Exception {
+ if (!closeGuard.compareAndSet(false, true)) {
+ return;
+ }
+
+ busyLock.block();
+
+ shutdownAndAwaitTermination(executor, 10, TimeUnit.SECONDS);
+ }
+
+ private void initNewGcBusy() {
+
storageHandlerByPartitionId.keySet().forEach(this::scheduleGcForStorage);
+ }
+
+ private void scheduleGcForStorage(TablePartitionId tablePartitionId) {
+ executor.submit(() -> inBusyLock(() -> {
+ GcStorageHandler storageHandler =
storageHandlerByPartitionId.compute(tablePartitionId, (id, gcStorageHandler) ->
{
+ if (gcStorageHandler == null) {
+ // Storage has been removed from garbage collection.
+ return gcStorageHandler;
+ }
+
+ boolean casResult =
gcStorageHandler.gcInProgressFuture.compareAndSet(null, new
CompletableFuture<>());
+
+ assert casResult : tablePartitionId;
+
+ return gcStorageHandler;
+ });
+
+ if (storageHandler == null) {
+ // Storage has been removed from garbage collection.
+ return;
+ }
+
+ CompletableFuture<Void> future =
storageHandler.gcInProgressFuture.get();
+
+ assert future != null : tablePartitionId;
+
+ try {
+ boolean scheduleGcForStorageAgain = true;
+
+ for (int i = 0; i < GC_BUTCH_SIZE &&
scheduleGcForStorageAgain; i++) {
+ HybridTimestamp lowWatermark = lowWatermarkReference.get();
+
+ assert lowWatermark != null : tablePartitionId;
+
+ // If storage has been deleted or there is no garbage,
then for now we will stop collecting garbage for this storage.
+ if
(!storageHandlerByPartitionId.containsKey(tablePartitionId)
+ ||
!storageHandler.storageUpdateHandler.vacuum(lowWatermark)) {
+ scheduleGcForStorageAgain = false;
Review Comment:
You can just return from the method right away instead of having an
additional flag. The fewer variables you have, the simpler your code is,
generally speaking.
By the way, the future can be completed in the finally block, which is another
way to simplify the code. Just don't forget that it may already have been
completed exceptionally by that time.
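
A sketch of both points, with hypothetical names (`BatchLoop`, `vacuumBatch`) and a `BooleanSupplier` standing in for the "storage still registered and `vacuum(lowWatermark)` returned true" check:

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.BooleanSupplier;

// Hypothetical extraction of the batch loop without the extra flag.
class BatchLoop {
    static final int BATCH_SIZE = 5;

    /**
     * Vacuums up to BATCH_SIZE times.
     *
     * @return true if the whole batch was used and the caller may schedule another one.
     */
    static boolean vacuumBatch(BooleanSupplier vacuumOnce, CompletableFuture<Void> future) {
        try {
            for (int i = 0; i < BATCH_SIZE; i++) {
                if (!vacuumOnce.getAsBoolean()) {
                    // Storage removed or no garbage left: return right away.
                    return false;
                }
            }

            return true;
        } catch (Throwable t) {
            future.completeExceptionally(t);

            return false;
        } finally {
            // complete() does nothing if the future was already completed exceptionally.
            future.complete(null);
        }
    }
}
```

`CompletableFuture.complete` only takes effect on a future that is not yet completed, which is why the unconditional call in `finally` does not overwrite the exceptional result.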