gengliangwang commented on code in PR #39202:
URL: https://github.com/apache/spark/pull/39202#discussion_r1058673291


##########
core/src/main/scala/org/apache/spark/internal/config/History.scala:
##########
@@ -79,6 +79,21 @@ private[spark] object History {
     .stringConf
     .createOptional
 
+  object LocalStoreSerializer extends Enumeration {
+    val JSON, PROTOBUF = Value
+  }
+
+  val LOCAL_STORE_SERIALIZER = ConfigBuilder("spark.history.store.serializer")
+    .doc("Serializer for writing/reading in-memory UI objects to/from disk-based KV Store; " +
+      "JSON or PROTOBUF. JSON serializer is the only choice before Spark 3.4.0, thus it is the " +
+      "default value. PROTOBUF serializer is fast and compact, and it is the default " +
+      "serializer for disk-based KV store of live UI.")

Review Comment:
   SHS writes RocksDB files while replaying event logs. The default serializer
is JSON+GZIP, which is slower than the Protobuf serializer. Imagine that there
are 300GB of event logs to replay...
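
   For illustration only: assuming the config key and values land as proposed
in this hunk, opting the SHS into the Protobuf serializer might look like the
sketch below. The store path is a placeholder, not a recommended location.

```properties
# spark-defaults.conf for the Spark History Server (sketch)
# Directory for the disk-based KV store; placeholder path.
spark.history.store.path        /path/to/shs-store
# Key and value as proposed in this diff; JSON would remain the default.
spark.history.store.serializer  PROTOBUF
```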



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

