HeartSaVioR commented on a change in pull request #30827:
URL: https://github.com/apache/spark/pull/30827#discussion_r545482658
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreCoordinator.scala
##########
@@ -122,13 +125,20 @@ private class StateStoreCoordinator(override val rpcEnv: RpcEnv)
    extends ThreadSafeRpcEndpoint with Logging {
  private val instances = new mutable.HashMap[StateStoreProviderId, ExecutorCacheTaskLocation]
-  override def receive: PartialFunction[Any, Unit] = {
-    case ReportActiveInstance(id, host, executorId) =>
+  override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
+    case ReportActiveInstance(id, host, executorId, otherProviderIds) =>
Review comment:
Given the semantics of the request, it feels more natural for this request to return the provider ids that are assigned to this `ExecutorCacheTaskLocation`, as in: "Here are the state store IDs you're registered as active for." That keeps the request to a single responsibility, the way we do for functions. As it stands, this change folds two different request semantics into one, and the return value no longer relates to the original meaning of the request.
That said, I agree this change is better performance-wise, since it avoids iterating over `instances`. Let's hear what others think.
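To make the suggestion concrete, here is a minimal sketch of the proposed single-responsibility semantics: the reply to `ReportActiveInstance` carries only the provider ids already registered as active on the reporting executor. The types here (`ProviderId`, `TaskLocation`) and the method shape are simplified stand-ins for the real Spark internals (`StateStoreProviderId`, `ExecutorCacheTaskLocation`, the RPC endpoint), not the actual implementation in the PR.

```scala
// Hypothetical coordinator sketch; types simplified from Spark's internals.
object CoordinatorSketch {
  // Stand-in for StateStoreProviderId.
  case class ProviderId(operatorId: Long, partitionId: Int)
  // Stand-in for ExecutorCacheTaskLocation.
  case class TaskLocation(host: String, executorId: String)

  private val instances =
    scala.collection.mutable.HashMap.empty[ProviderId, TaskLocation]

  // Register the reporting instance, then reply with every provider id
  // currently assigned to the same executor. The request does one thing,
  // and the reply directly answers "what am I registered as active for?".
  def reportActiveInstance(
      id: ProviderId,
      host: String,
      executorId: String): Seq[ProviderId] = {
    val location = TaskLocation(host, executorId)
    instances.put(id, location)
    instances.collect {
      case (pid, loc) if loc == location => pid
    }.toSeq
  }
}
```

Note this still avoids the per-request full scan only if the reverse lookup is indexed; as written it iterates `instances` once per report, which is the performance trade-off discussed above.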
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]