advancedxy commented on PR #42236:
URL: https://github.com/apache/spark/pull/42236#issuecomment-1668006357

   > While I'm not certain if it's reasonable, I still want to point out that 
relocating the content of the `spark-protobuf` module may result in a poorer 
user experience: in order to use this SQL function, users have no choice but to 
relocate the content of the Java PB description files used in their business 
according to Spark's project rules (replace all `com.google.protobuf.` 
with `org.sparkproject.spark_protobuf.protobuf.` in their generated Java files; 
perhaps I misunderstood something?). Is this really user-friendly for existing 
data and businesses? Meanwhile, since the `spark-protobuf` module 
won't be packaged into the spark-client tarball, is it very risky to only publish 
unshaded jars? @rangadi @HyukjinKwon
   
   
   +1 for @LuciferYang's opinion. It's not reasonable that users have to 
relocate and repackage their own protobuf classes. Since the spark-protobuf jar 
is not included in the spark-client binary, I think a better solution would be 
to publish both unshaded and shaded jars. Users could simply pack their protobuf 
message classes into a jar and use the unshaded spark-protobuf jar as needed. 
The shaded spark-protobuf jar would be kept for compatibility and as a last resort.
   
   According to https://protobuf.dev/support/cross-version-runtime-guarantee/, 
I believe that as long as Spark uses a 3.x protobuf runtime at least as new as 
the version users generated their code with, user-compiled Java classes should 
run fine.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

