Re: [PR] [SPARK-47372][SS] Add support for range scan based key state encoder for use with state store provider [spark]

via GitHub Fri, 22 Mar 2024 23:05:03 -0700


anishshri-db commented on code in PR #45503:
URL: https://github.com/apache/spark/pull/45503#discussion_r1536573406



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBStateEncoder.scala:
##########
@@ -212,6 +212,12 @@ class PrefixKeyScanStateEncoder(
  * We cannot support variable sized fields given the UnsafeRow format which stores variable
  * sized fields as offset and length pointers to the actual values, thereby changing the required
  * ordering.
+ * Note that we also support "null" values being passed for these fixed size fields. We prepend
+ * a single byte to indicate whether the column value is null or not. We cannot change the
+ * nullability on the UnsafeRow itself as the expected ordering would change if non-first
+ * columns are marked as null. If the first col is null, those entries will appear last in
+ * the iterator. If non-first columns are null, ordering based on the previous columns will

Review Comment:
   Not exactly sure what you mean.
   
   For example, take a look at the test here:
https://github.com/apache/spark/pull/45503/files#diff-4c6b19c847c68e25ffa927df85efb4c79f388648d8c6242f1fe9f84cf09ec5ffR449
   
   Suppose we have the sample input as:
   ```
   Seq((931L, 10), (40L, null), (452300L, 1), (1L, 304), (100L, null))
   ```
   
   When we iterate, the long elements (the first column) should all come back in sorted order, even though some rows have a null second column.
   
   So we should get:
   ```
   Seq(1L, 40L, 100L, 931L, 452300L)
   ```
   
   So in this case the ordering on the first col still applies, which is what I was trying to convey.
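
To make the point concrete, here is a minimal standalone sketch of the idea. It is not the actual `RocksDBStateEncoder` code. The object name, the marker byte values (`0x00` for non-null, `0x01` for null) and the sign-bit flip are assumptions made for illustration. Each column is written as one null-indicator byte followed by its fixed-size big-endian value, and keys are compared as unsigned bytes, the way a byte-ordered store would compare them.

```scala
import java.nio.ByteBuffer

// Illustrative only: names and marker values below are hypothetical,
// not taken from the Spark code base.
object RangeScanOrderingSketch {
  type Key = (Option[Long], Option[Int])

  // Assumed marker bytes, chosen so that null entries sort after non-null ones.
  val NotNull: Byte = 0x00
  val IsNull: Byte = 0x01

  // 1 marker byte + 8 big-endian bytes. Flipping the sign bit makes
  // unsigned byte comparison agree with signed numeric order.
  def encodeLong(v: Option[Long]): Array[Byte] = {
    val buf = ByteBuffer.allocate(9) // ByteBuffer is big-endian by default
    v match {
      case Some(x) => buf.put(NotNull).putLong(x ^ Long.MinValue)
      case None    => buf.put(IsNull).putLong(0L)
    }
    buf.array()
  }

  // Same layout for an Int column: 1 marker byte + 4 big-endian bytes.
  def encodeInt(v: Option[Int]): Array[Byte] = {
    val buf = ByteBuffer.allocate(5)
    v match {
      case Some(x) => buf.put(NotNull).putInt(x ^ Int.MinValue)
      case None    => buf.put(IsNull).putInt(0)
    }
    buf.array()
  }

  def encodeKey(k: Key): Array[Byte] = encodeLong(k._1) ++ encodeInt(k._2)

  // Unsigned lexicographic comparison of two byte arrays.
  def compareBytes(a: Array[Byte], b: Array[Byte]): Int = {
    val n = math.min(a.length, b.length)
    var i = 0
    while (i < n) {
      val c = (a(i) & 0xff) - (b(i) & 0xff)
      if (c != 0) return c
      i += 1
    }
    a.length - b.length
  }

  // Sort rows by their encoded keys, as a range scan would return them.
  def sortRows(rows: Seq[Key]): Seq[Key] =
    rows.sortWith((a, b) => compareBytes(encodeKey(a), encodeKey(b)) < 0)

  // First-column values in iteration order (rows with a null first column dropped).
  def sortedFirstCol(rows: Seq[Key]): Seq[Long] = sortRows(rows).flatMap(_._1)

  // The sample input from this comment, with null written as None.
  val input: Seq[Key] = Seq(
    (Some(931L), Some(10)), (Some(40L), None), (Some(452300L), Some(1)),
    (Some(1L), Some(304)), (Some(100L), None))

  def main(args: Array[String]): Unit =
    println(sortedFirstCol(input)) // List(1, 40, 100, 931, 452300)
}
```

Because the first column's bytes come first in the key, a null in the second column only affects how rows with the same first-column value are ordered relative to each other. Under the assumed marker values, a row whose first column is null sorts after every row with a non-null first column.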
   



