HyukjinKwon commented on PR #40525:
URL: https://github.com/apache/spark/pull/40525#issuecomment-1501261686

   > Is it so hard to add the dependency for grpc when using the pandas API?
   
   It's not super hard, but it's a bit odd to add this to pandas API on 
Spark alone. We should probably think about adding grpcio as a hard dependency 
for the whole PySpark project, but definitely not for pandas API on Spark alone.
   
   > What are we achieving with this? GRPC is a stable protocol and not a 
random library. It's available throughout all platforms.
   > What's the benefit of trying this pure approach?
   
   As things stand, we try to declare only the dependencies each module 
needs, so users don't have to install dependencies they won't use. In 
addition, adding the dependency breaks existing applications when they migrate 
from 3.4 to 3.5. This matters when PySpark is installed without `pip`, since 
nothing pulls in the new dependency automatically (and the non-`pip` 
distribution is actually the official release channel of Apache Spark).
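
   To illustrate the "only what the module needs" idea, here is a minimal 
sketch of an optional-dependency check that a module could run before using 
grpc. The helper names `is_available` and `require_optional_dependency` are 
hypothetical and are not PySpark APIs; the only assumption about grpcio is 
that it is imported as `grpc`.

```python
import importlib.util


def is_available(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


def require_optional_dependency(module_name: str, feature: str) -> None:
    """Raise ImportError with a hint if an optional module is missing.

    Hypothetical helper: `feature` is only used to build the error message.
    """
    if not is_available(module_name):
        raise ImportError(
            f"{feature} requires the optional package providing "
            f"'{module_name}'. Install it separately to use this feature."
        )


if __name__ == "__main__":
    # grpcio is imported as `grpc`; report whether it is present here.
    print("grpc available:", is_available("grpc"))
```

   With a check like this, only the code path that needs grpc fails when it 
is missing, and applications that never touch that path keep working after an 
upgrade.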
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

