justaparth commented on code in PR #43773:
URL: https://github.com/apache/spark/pull/43773#discussion_r1390779040
##########
connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/ProtobufDeserializer.scala:
##########
@@ -193,6 +193,11 @@ private[sql] class ProtobufDeserializer(
       case (INT, ShortType) =>
         (updater, ordinal, value) => updater.setShort(ordinal, value.asInstanceOf[Short])
+      case (INT, LongType) =>
+        (updater, ordinal, value) =>
+          updater.setLong(
+            ordinal,
+            Integer.toUnsignedLong(value.asInstanceOf[Int]))
Review Comment:
> It would be problematic when Spark has unsigned types. For the same reason, Parquet also doesn't support unsigned physical types for Spark.

Hey, I'm not sure I follow; do you mind explaining what you mean by this?
My goal here is to add an option that lets unsigned 32-bit and 64-bit integers coming from protobuf be represented in a type that can hold them without overflow. I actually modeled my code on how the Parquet code is written today, which I believe does this same thing by default:
https://github.com/justaparth/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala#L243-L270
https://github.com/justaparth/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala#L345-L351
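
For concreteness, here's a rough, illustrative sketch (not code from the PR) of why the `Integer.toUnsignedLong` widening in the diff above avoids overflow for a protobuf uint32:

```scala
// A protobuf uint32 value such as 4294967295 arrives on the JVM as the Int -1.
// Widening it through Integer.toUnsignedLong reinterprets the 32 bits as an
// unsigned value, which fits in Spark's LongType without overflow.
val raw: Int = -1                               // wire value 4294967295 decoded into a JVM Int
val widened: Long = Integer.toUnsignedLong(raw) // 4294967295L, the intended unsigned value
val naive: Long = raw.toLong                    // -1L, sign-extended, loses the unsigned meaning
assert(widened == 4294967295L && naive == -1L)
```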