zhipeng93 commented on code in PR #237:
URL: https://github.com/apache/flink-ml/pull/237#discussion_r1209245957


##########
flink-ml-lib/src/main/java/org/apache/flink/ml/common/updater/ModelUpdater.java:
##########
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.common.updater;
+
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.runtime.state.StateInitializationContext;
+import org.apache.flink.runtime.state.StateSnapshotContext;
+
+import java.io.Serializable;
+import java.util.Iterator;
+
+/**
+ * A model updater that could be used to handle push/pull request from workers.
+ *
+ * <p>Note that model updater should also ensure that model data is robust to failures.
+ */
+public interface ModelUpdater extends Serializable {
+
+    /** Initialize the model data. */
+    void open(long startFeatureIndex, long endFeatureIndex);
+
+    /** Applies the push to update the model data, e.g., using gradient to update model. */
+    void handlePush(long[] keys, double[] values);
+
+    /** Applies the pull and return the retrieved model data. */
+    double[] handlePull(long[] keys);

Review Comment:
   In this PR, we propose using two types of roles to describe the iterative
machine learning training process, following the idea of parameter servers.
   - WorkerOp stores the training data and performs only local computation.
When it needs to access model parameters, which requires distributed
communication, it communicates with ServerOp via the `push/pull` primitives.
A `push/pull` could carry sparse key-value pairs or dense values. Currently
only sparse key-value pairs are supported.
   - ServerOp stores the model parameters and provides access to them for
WorkerOps.
   - Subtasks of WorkerOp cannot talk to each other, and neither can subtasks
of ServerOp.
   
   `handlePush` and `handlePull` are the two operations with which the server
answers requests from workers.
   The naming follows `push/pull`. It is possible for `handlePull` to be called
on keys that have previously been updated by `handlePush`, but this is not
required.
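
   To make the server side concrete, here is a minimal sketch of what an
updater with these three methods might look like. It is not the implementation
in this PR: the class name `ModelUpdaterSketch`, the `learningRate` field, the
plain SGD step, and the assumption that `endFeatureIndex` is exclusive are all
hypothetical, and the sketch deliberately does not implement the Flink
interface or any state snapshotting so that it stays self-contained.

```java
import java.util.Arrays;

/** Hypothetical in-memory updater mirroring open/handlePush/handlePull from the diff. */
final class ModelUpdaterSketch {
    private final double learningRate; // assumption: not part of the interface in the diff
    private long startFeatureIndex;
    private double[] model;

    ModelUpdaterSketch(double learningRate) {
        this.learningRate = learningRate;
    }

    /** Allocates the model slice owned by this server; end index assumed exclusive. */
    void open(long startFeatureIndex, long endFeatureIndex) {
        this.startFeatureIndex = startFeatureIndex;
        this.model = new double[(int) (endFeatureIndex - startFeatureIndex)];
    }

    /** Treats the pushed values as gradients and applies one SGD step per key. */
    void handlePush(long[] keys, double[] values) {
        for (int i = 0; i < keys.length; i++) {
            model[(int) (keys[i] - startFeatureIndex)] -= learningRate * values[i];
        }
    }

    /** Returns the current parameter for each requested key, in request order. */
    double[] handlePull(long[] keys) {
        double[] result = new double[keys.length];
        for (int i = 0; i < keys.length; i++) {
            result[i] = model[(int) (keys[i] - startFeatureIndex)];
        }
        return result;
    }

    public static void main(String[] args) {
        ModelUpdaterSketch server = new ModelUpdaterSketch(0.5);
        server.open(10L, 20L);
        // A worker pushes sparse gradients for keys 10 and 15.
        server.handlePush(new long[] {10L, 15L}, new double[] {2.0, -4.0});
        // A later pull may cover pushed keys (15) and untouched keys (12) alike.
        System.out.println(Arrays.toString(server.handlePull(new long[] {15L, 12L})));
        // prints "[2.0, 0.0]"
    }
}
```

   The pull in `main` covers one key that was pushed and one that was not,
which matches the point above that pulled keys need not have been pushed.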



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
