zhipeng93 commented on code in PR #97: URL: https://github.com/apache/flink-ml/pull/97#discussion_r878863091
##########
flink-ml-iteration/src/main/java/org/apache/flink/iteration/datacache/nonkeyed/MemorySegmentWriter.java:
##########
@@ -0,0 +1,114 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.iteration.datacache.nonkeyed;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.api.common.typeutils.TypeSerializer;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.runtime.memory.MemoryManager;
+import org.apache.flink.runtime.memory.MemoryReservationException;
+import org.apache.flink.util.Preconditions;
+
+import org.openjdk.jol.info.GraphLayout;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+
+/** A class that writes cache data to memory segments. */
+@Internal
+public class MemorySegmentWriter<T> implements SegmentWriter<T> {
+    private final MemoryManager memoryManager;
+
+    private final Path path;
+
+    private final List<T> cache;
+
+    private final TypeSerializer<T> serializer;
+
+    private long inMemorySize;
+
+    private int count;
+
+    private long reservedMemorySize;
+
+    public MemorySegmentWriter(
+            Path path, MemoryManager memoryManager, TypeSerializer<T> serializer, long expectedSize)
+            throws MemoryReservationException {
+        this.serializer = serializer;
+        Preconditions.checkNotNull(memoryManager);
+        this.path = path;
+        this.cache = new ArrayList<>();
+        this.inMemorySize = 0L;
+        this.count = 0;
+        this.memoryManager = memoryManager;
+
+        if (expectedSize > 0) {
+            memoryManager.reserveMemory(this.path, expectedSize);
+        }
+        this.reservedMemorySize = expectedSize;
+    }
+
+    @Override
+    public boolean addRecord(T record) {

Review Comment:
After digging into how `MemoryManager` is used [1][2], the way it is used here does not seem appropriate to me. This snippet caches the records on the Java heap, but tries to reserve memory from off-heap memory.

If I understand [1][2] correctly:
- When using `MemoryManager` to manipulate managed memory, we are mostly dealing with off-heap memory.
- The managed memory for each operator is fixed once the job graph is generated, i.e., it is not dynamically allocated.
- The usage of managed memory should be declared to the job graph explicitly before it is used by the operator. Otherwise it may lead to OOM when deployed in a container.

As I see it, there are basically two options to cache the data:
- Cache it in `task heap` (i.e., in a `list`): this is simple and easy to implement, but the downside is that we cannot control the size of the cached elements `statically`, and the program may not be robust: `task heap` is shared across the JVM and we have no idea how others are using the JVM heap memory.
- Cache it `off-heap` (for example, using managed memory): in this case, we need to declare the managed memory usage to the job graph via `Transformation#declareManagedMemoryUseCaseAtOperatorScope` or `Transformation#declareManagedMemoryUseCaseAtSlotScope`, and get the fraction of the managed memory as in [3].

I would suggest going with option 2, but this needs more discussion with the runtime folks.

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-53%3A+Fine+Grained+Operator+Resource+Management
[2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-141%3A+Intra-Slot+Managed+Memory+Sharing
[3] https://github.com/apache/flink/blob/18a967f8ad7b22c2942e227fb84f08f552660b5a/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/operators/sort/SortOperator.java#L79
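
To make the difference between the two options concrete, here is a minimal plain-Java sketch of a cache whose size is fixed up front. It is not Flink code and does not use `MemoryManager`: the class `FixedBudgetOffHeapCache` and all of its methods are hypothetical names, and `ByteBuffer.allocateDirect` is only a stand-in for a managed-memory budget that would be declared on the job graph. The point it illustrates is that `addRecord` can refuse a record once the declared budget is used up, instead of growing a heap `List` without bound.

```java
import java.nio.ByteBuffer;

/**
 * Hypothetical fixed-budget off-heap cache (illustration only, not part of Flink or this PR).
 * Records are assumed to be already serialized to byte arrays.
 */
final class FixedBudgetOffHeapCache {
    // Direct (off-heap) buffer allocated once; its capacity is the whole budget.
    private final ByteBuffer buffer;
    private int count;

    FixedBudgetOffHeapCache(int budgetBytes) {
        this.buffer = ByteBuffer.allocateDirect(budgetBytes);
    }

    /** Appends a length-prefixed record; returns false if it does not fit in the remaining budget. */
    boolean addRecord(byte[] serialized) {
        int needed = Integer.BYTES + serialized.length;
        if (buffer.remaining() < needed) {
            return false; // budget exhausted: the caller must spill or stop caching
        }
        buffer.putInt(serialized.length);
        buffer.put(serialized);
        count++;
        return true;
    }

    int getCount() {
        return count;
    }

    int usedBytes() {
        return buffer.position();
    }

    /** Returns a copy of the record at the given index by scanning the length prefixes. */
    byte[] getRecord(int index) {
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("index " + index + ", count " + count);
        }
        ByteBuffer view = buffer.duplicate();
        view.flip(); // read from 0 up to the current write position
        for (int i = 0; i < index; i++) {
            int length = view.getInt();
            view.position(view.position() + length); // skip this record's payload
        }
        byte[] out = new byte[view.getInt()];
        view.get(out);
        return out;
    }
}

class FixedBudgetOffHeapCacheDemo {
    public static void main(String[] args) {
        FixedBudgetOffHeapCache cache = new FixedBudgetOffHeapCache(16);
        // An 8-byte record needs 12 bytes including its length prefix, so only one fits in 16.
        System.out.println(cache.addRecord(new byte[8]));
        System.out.println(cache.addRecord(new byte[8]));
    }
}
```

With a heap `List<T>` there is no such natural point at which to refuse a record; in the real operator the budget would come from the declared managed-memory fraction rather than a constructor argument.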
