bjornjorgensen opened a new pull request, #40716:
URL: https://github.com/apache/spark/pull/40716

   ### What changes were proposed in this pull request?
   
   Change `gRPC` to `grpcio` in the error message.
   This changes ONLY the printed message shown to users who haven't installed `gRPC`.
   
   ### Why are the changes needed?
   Users who don't have `gRPC` installed will get this error when starting Connect.
   
   ModuleNotFoundError                       Traceback (most recent call last)
   File /opt/spark/python/pyspark/sql/connect/utils.py:45, in require_minimum_grpc_version()
        44 try:
   ---> 45     import grpc
        46 except ImportError as error:
   
   ModuleNotFoundError: No module named 'grpc'
   
   The above exception was the direct cause of the following exception:
   
   ImportError                               Traceback (most recent call last)
   Cell In[1], line 11
         9 import pyarrow
        10 from pyspark import SparkConf, SparkContext
   ---> 11 from pyspark import pandas as ps
        12 from pyspark.sql import SparkSession
        13 from pyspark.sql.functions import col, concat, concat_ws, expr, lit, trim
   
   File /opt/spark/python/pyspark/pandas/__init__.py:59
        50     warnings.warn(
        51         "'PYARROW_IGNORE_TIMEZONE' environment variable was not set. It is required to "
        52         "set this environment variable to '1' in both driver and executor sides if you use "
      (...)
        55         "already launched."
        56     )
        57     os.environ["PYARROW_IGNORE_TIMEZONE"] = "1"
   ---> 59 from pyspark.pandas.frame import DataFrame
        60 from pyspark.pandas.indexes.base import Index
        61 from pyspark.pandas.indexes.category import CategoricalIndex
   
   File /opt/spark/python/pyspark/pandas/frame.py:88
        85 from pyspark.sql.window import Window
        87 from pyspark import pandas as ps  # For running doctests and reference resolution in PyCharm.
   ---> 88 from pyspark.pandas._typing import (
        89     Axis,
        90     DataFrameOrSeries,
        91     Dtype,
        92     Label,
        93     Name,
        94     Scalar,
        95     T,
        96     GenericColumn,
        97 )
        98 from pyspark.pandas.accessors import PandasOnSparkFrameMethods
        99 from pyspark.pandas.config import option_context, get_option
   
   File /opt/spark/python/pyspark/pandas/_typing.py:25
        22 from pandas.api.extensions import ExtensionDtype
        24 from pyspark.sql.column import Column as PySparkColumn
   ---> 25 from pyspark.sql.connect.column import Column as ConnectColumn
        26 from pyspark.sql.dataframe import DataFrame as PySparkDataFrame
        27 from pyspark.sql.connect.dataframe import DataFrame as ConnectDataFrame
   
   File /opt/spark/python/pyspark/sql/connect/column.py:19
         1 #
         2 # Licensed to the Apache Software Foundation (ASF) under one or more
         3 # contributor license agreements.  See the NOTICE file distributed with
      (...)
        15 # limitations under the License.
        16 #
        17 from pyspark.sql.connect.utils import check_dependencies
   ---> 19 check_dependencies(__name__)
        21 import datetime
        22 import decimal
   
   File /opt/spark/python/pyspark/sql/connect/utils.py:35, in check_dependencies(mod_name)
        33 require_minimum_pandas_version()
        34 require_minimum_pyarrow_version()
   ---> 35 require_minimum_grpc_version()
   
   File /opt/spark/python/pyspark/sql/connect/utils.py:47, in require_minimum_grpc_version()
        45     import grpc
        46 except ImportError as error:
   ---> 47     raise ImportError(
        48         "grpc >= %s must be installed; however, " "it was not found." % minimum_grpc_version
        49     ) from error
        50 if LooseVersion(grpc.__version__) < LooseVersion(minimum_grpc_version):
        51     raise ImportError(
        52         "gRPC >= %s must be installed; however, "
        53         "your version was %s." % (minimum_grpc_version, grpc.__version__)
        54     )
   
   ImportError: grpc >= 1.48.1 must be installed; however, it was not found.
   
   
   The last line suggests that a package named `grpc` is missing, so users are likely to try installing it by that name:
   
    `pip install grpc`
   
   Collecting grpc
     Downloading grpc-1.0.0.tar.gz (5.2 kB)
     Preparing metadata (setup.py) ... error
     error: subprocess-exited-with-error
     
     × python setup.py egg_info did not run successfully.
     │ exit code: 1
     ╰─> [6 lines of output]
         Traceback (most recent call last):
           File "<string>", line 2, in <module>
           File "<pip-setuptools-caller>", line 34, in <module>
           File "/tmp/pip-install-vp4d8s4c/grpc_c0f1992ad8f7456b8ac09ecbaeb81750/setup.py", line 33, in <module>
             raise RuntimeError(HINT)
         RuntimeError: Please install the official package with: pip install grpcio
         [end of output]
     
     note: This error originates from a subprocess, and is likely not a problem with pip.
   error: metadata-generation-failed
   
   × Encountered error while generating package metadata.
   ╰─> See above for output.
   
   note: This is an issue with the package mentioned above, not pip.
   hint: See above for details.
   Note: you may need to restart the kernel to use updated packages.
   
   The correct way to install it is `pip install grpcio`, so the error message should name `grpcio`.
   
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   
   ### How was this patch tested?
   Pass GitHub Actions (GA).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

