min-guk commented on PR #2126: URL: https://github.com/apache/systemds/pull/2126#issuecomment-2409139899
I am submitting this question through the PR to provide better context by including the relevant code. Currently, only FED using CP INST works correctly, while FED using SP INST encounters errors. The detailed implementation of `federatedRollFunction` discussed in the previous meeting is as follows: 1. For non-partitioned `fedData`, the modified `range` and the same `varID` as the input are registered in `fedMap`. 2. For partitioned `fedData`, `fedRange` is modified, and two `fedRequests` are called with two different `varIDs`. 3. `fedRequest` calls `EXEC_INST` instead of `EXEC_UDF`, reusing the already implemented instructions. (Not implemented yet, future work) #### Question 1. Although step 2 was implemented, the result of `FederationUtils.getResults(ffr)` in the roll function differed from the final result checked by `readBlobFromFederated()` in the test. This is because, in the federated roll function, the `fedData` in `fedMap` uses different `varIDs`, but the `requestFederatedData` function used by `readBlobFromFederated()` requests `fedData` using the same single `_ID`. In most cases, `fedData` belonging to the `fedMap` uses the single `varID`, so it seems to have been implemented this way previously. Therefore, I modified the `requestFederatedData` function as follows to perform requests based on the `varID` of each `fedData`. Since this function is used in multiple places, I would like to ask if it’s okay to modify it this way. Additionally, if this modification is correct, should other functions like `forEachParallel` be modified similarly? ```java public List<Pair<FederatedRange, Future<FederatedResponse>>> requestFederatedData() { // FederatedRequest request = new FederatedRequest(RequestType.GET_VAR, _ID); // previous for(Pair<FederatedRange, FederatedData> e : _fedMap) FederatedRequest request = new FederatedRequest(RequestType.GET_VAR, e.getValue().getVarID()); readResponses.add(Pair.of(e.getKey(), e.getValue().executeFederatedOperation(request))); return readResponses; } ``` #### Question 2. The current issue with FED using SP INST is that it fails to read `fedData` corresponding to step 1 in `readBlobFromFederated()`. However, if the `fedData` in step 1 is simply copied as output using a output `varID`, it is successfully read. Since both CP and SP proceed using the same `EXEC_UDF` in the `processInstruction()` function, it seems likely that there is a task outside of `processInstruction()` that requests the fedWorker to clean the input `varID` data. I would like to confirm if this assumption is correct, and whether there are any additional considerations specifically for SP. Below are the `fedMap`, `readResponse`, and error messages that I checked with the debugger when running readBlobFromFederated().  -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@systemds.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org