Copilot commented on code in PR #53625:
URL: https://github.com/apache/spark/pull/53625#discussion_r2649794845
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala:
##########
@@ -70,6 +71,11 @@ class QueryExecution(
val refreshPhaseEnabled: Boolean = true) extends Logging {
val id: Long = QueryExecution.nextExecutionId
+ val queryId: UUID = UUIDv7Generator.generate()
+
+ // Tracks how many times this QueryExecution has been executed.
+ // Used by SQLExecution to determine whether to use qplQueryId or generate a
new one.
Review Comment:
The comment refers to "qplQueryId" which appears to be a typo or outdated
term. Based on the code and PR description, this should likely be "queryId" for
clarity and consistency with the actual field name used throughout the codebase.
```suggestion
// Used by SQLExecution to determine whether to use queryId or generate a
new one.
```
##########
sql/core/src/main/scala/org/apache/spark/sql/util/UUIDv7Generator.scala:
##########
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.util
+
+import java.time.Instant
+import java.util.UUID
+
+import scala.util.Random
+
+object UUIDv7Generator {
+
+ /**
+ * Follow the UUIDv7 specification defined in
+ *
https://datatracker.ietf.org/doc/html/draft-peabody-dispatch-new-uuid-format#section-5.2
+ */
+
+ private val random = new Random()
+
+ /**
+ * Generate a UUIDv7 from the current time.
+ */
+ def generate(): UUID = {
+ val now = Instant.now()
+ generateFrom(now.toEpochMilli, now.getNano)
+ }
+
+ /**
+ * Deterministic UUIDv7 generation from epochMilli and nanos.
+ * Called by generate() and used for testing.
+ */
+ def generateFrom(epochMilli: Long, nano: Int): UUID = {
+ // 48 bits for timestamp
+ val timestampMs = epochMilli & 0xFFFFFFFFFFFFL
+
+ // 12 bits, avoid LSB as most HW clocks have resolution in range of 10-40
ns
+ val randA = (nano>> 4) & 0xFFF
Review Comment:
Missing space after the right shift operator. The code has `(nano>> 4)` but
should be `(nano >> 4)` for consistency with Scala conventions and the rest of
the codebase. You can see this convention in other files like
HashedRelation.scala which uses `(address >>> SIZE_BITS)` with spaces.
```suggestion
val randA = (nano >> 4) & 0xFFF
```
##########
sql/core/src/main/scala/org/apache/spark/sql/util/UUIDv7Generator.scala:
##########
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.util
+
+import java.time.Instant
+import java.util.UUID
+
+import scala.util.Random
+
+object UUIDv7Generator {
+
+ /**
+ * Follow the UUIDv7 specification defined in
+ *
https://datatracker.ietf.org/doc/html/draft-peabody-dispatch-new-uuid-format#section-5.2
+ */
+
+ private val random = new Random()
Review Comment:
The shared Random instance in the object is not thread-safe. Since
`generateFrom` can be called from multiple threads concurrently (via
SQLExecution.withNewExecutionId0), the shared Random instance may produce
non-unique UUIDs due to race conditions in Random.nextLong().
Consider using ThreadLocalRandom.current() instead of a shared Random
instance to ensure thread-safety. ThreadLocalRandom is the standard approach
for concurrent random number generation in Java/Scala and is used elsewhere in
the Spark codebase.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]