HyukjinKwon commented on code in PR #40853:
URL: https://github.com/apache/spark/pull/40853#discussion_r1173179658


##########
python/pyspark/sql/connect/client.py:
##########
@@ -269,18 +267,15 @@ def userAgent(self) -> str:
         user_agent : str
             The user_agent parameter specified in the connection string,
             or "_SPARK_CONNECT_PYTHON" when not specified.
+            The returned value will be percent encoded.
         """
         user_agent = self.params.get(ChannelBuilder.PARAM_USER_AGENT, "_SPARK_CONNECT_PYTHON")
-        allowed_chars = string.ascii_letters + string.punctuation
-        if len(user_agent) > 200:
-            raise SparkConnectException(
-                "'user_agent' parameter cannot exceed 200 characters in length"
-            )
-        if set(user_agent).difference(allowed_chars):
+        ua_len = len(urllib.parse.quote(user_agent))
+        if ua_len > 2048:
             raise SparkConnectException(
-                "Only alphanumeric and common punctuations are allowed for 'user_agent'"
+                f"'user_agent' parameter should not exceed 2048 characters, found {len} characters."
             )
-        return user_agent
+        return self.params.get(ChannelBuilder.PARAM_USER_AGENT, "_SPARK_CONNECT_PYTHON")

Review Comment:
   Why don't we just return `user_agent`?
   
   ```suggestion
           return user_agent
   ```
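
For context, here is a minimal standalone sketch of what the method would look like with the suggestion applied. It is not the PR's actual code: `checked_user_agent`, the plain `dict` of parameters, and the `"user_agent"` key are hypothetical stand-ins for `ChannelBuilder.userAgent`, `self.params`, and `ChannelBuilder.PARAM_USER_AGENT`, and `ValueError` stands in for `SparkConnectException`. The length check runs against the percent-encoded form, as in the diff, while the value returned is the already-looked-up `user_agent`.

```python
import urllib.parse

DEFAULT_USER_AGENT = "_SPARK_CONNECT_PYTHON"
MAX_ENCODED_LEN = 2048  # limit taken from the diff under review


def checked_user_agent(params: dict) -> str:
    """Hypothetical stand-in for ChannelBuilder.userAgent."""
    # Look the value up once and reuse it for both the check and the return.
    user_agent = params.get("user_agent", DEFAULT_USER_AGENT)
    # Measure the percent-encoded length, since that is what goes on the wire.
    ua_len = len(urllib.parse.quote(user_agent))
    if ua_len > MAX_ENCODED_LEN:
        # ValueError is an assumption here; the PR raises SparkConnectException.
        raise ValueError(
            f"'user_agent' parameter should not exceed {MAX_ENCODED_LEN} "
            f"characters, found {ua_len} characters."
        )
    return user_agent
```

Returning the local variable avoids a second `params.get` lookup with a duplicated default, which is the point of the suggestion above.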
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

