HeartSaVioR commented on code in PR #40163:
URL: https://github.com/apache/spark/pull/40163#discussion_r1117774538
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStore.scala:
##########
@@ -533,11 +539,22 @@ object StateStore extends Logging {
}
}
- val provider = loadedProviders.getOrElseUpdate(
- storeProviderId,
- StateStoreProvider.createAndInit(
- storeProviderId, keySchema, valueSchema, numColsPrefixKey, storeConf, hadoopConf)
- )
+ // SPARK-42567 - Track load time for state store provider and log warning if takes longer
+ // than 2s.
+ val (provider, loadTimeMs) = Utils.timeTakenMs {
+ loadedProviders.getOrElseUpdate(
+ storeProviderId,
+ StateStoreProvider.createAndInit(
+ storeProviderId, keySchema, valueSchema, numColsPrefixKey, storeConf, hadoopConf)
+ )
+ }
+
+ if (loadTimeMs > 2000L) {
+ logWarning(s"Took too long to load state store provider with " +
Review Comment:
nit: maybe it's not necessary to call out that it took too long? We have a
similar informative warning log message in FileStreamSource, like below:
```
if (listingTimeMs > 2000) {
// Output a warning when listing files uses more than 2 seconds.
logWarning(s"Listed ${files.size} file(s) in $listingTimeMs ms")
} else {
logTrace(s"Listed ${files.size} file(s) in $listingTimeMs ms")
}
```
Probably we can just replace "too long" with the elapsed time. The point of
the message is "how long it took", and right now that number sits at the end
of a fairly long line of text.
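
For illustration, a rough, self-contained sketch of what I have in mind. It
does not depend on Spark: `timeTakenMs` here is a local stand-in for
`Utils.timeTakenMs`, and `LoadTimeLoggingSketch`, `loadMessage`, `logLine`,
and the exact wording are hypothetical names made up for this example. The
trace-level branch is borrowed from the FileStreamSource pattern above and is
an assumption, not something the PR currently does.
```scala
import java.util.concurrent.TimeUnit

object LoadTimeLoggingSketch {
  // Local stand-in for Spark's Utils.timeTakenMs:
  // evaluates the body once and returns (result, elapsed milliseconds).
  def timeTakenMs[T](body: => T): (T, Long) = {
    val startNs = System.nanoTime()
    val result = body
    val elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNs)
    (result, math.max(0L, elapsedMs))
  }

  // Hypothetical wording: state the elapsed time up front instead of "too long".
  def loadMessage(providerId: String, loadTimeMs: Long): String =
    s"Loaded state store provider $providerId in $loadTimeMs ms"

  // Picks the log level from the threshold and returns (level, message).
  // A real implementation would call logWarning / logTrace instead.
  def logLine(
      providerId: String,
      loadTimeMs: Long,
      thresholdMs: Long = 2000L): (String, String) = {
    val level = if (loadTimeMs > thresholdMs) "WARN" else "TRACE"
    (level, loadMessage(providerId, loadTimeMs))
  }
}
```
With this shape the message stays the same at both levels and only the level
changes with the threshold, so the elapsed time is easy to spot in either case.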
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]