slfan1989 opened a new issue, #4232:
URL: https://github.com/apache/amoro/issues/4232

   ### Search before asking
   
   - [x] I have searched in the 
[issues](https://github.com/apache/amoro/issues?q=is%3Aissue) and found no 
similar issues.
   
   
   ### What would you like to be improved?
   
   ### Motivation
   
   `AmsThriftUrl` is used by clients to parse AMS thrift and ZooKeeper URLs. 
The current implementation has a few issues that can make URL parsing less 
robust:
   
   1. `parserThriftUrl()` lowercases the entire URL before parsing.
   2. `socketTimeout` query parameter parsing is duplicated.
   3. thrift URL validation is weak for missing host, missing port, or 
unsupported scheme.
   4. Some error messages are too generic, such as `invalid ams url`.
   
   Lowercasing the whole URL may unexpectedly change case-sensitive parts, such 
as catalog names or query parameter values.
   
   ### Current behavior
   
   For example:
   
   ```java
   AmsThriftUrl.parse("ThRiFt://LOCALHOST:1260/MyCatalog?socketTimeout=6000", 
null)
   ```
   
   currently lowercases the URL before parsing, so values such as host/catalog 
may be changed unexpectedly.
   
   Invalid URLs such as the following are not clearly rejected during parsing:
   
   ```
   http://127.0.0.1:1260/catalog
   thrift:///catalog
   thrift://127.0.0.1/catalog
   ```
   
   ### Expected behavior
   AmsThriftUrl should:
   
   - Preserve the original URL case for host, catalog name, and query values.
   - Treat URL schemes case-insensitively.
   - Validate that thrift URLs use the thrift scheme.
   - Validate that thrift URLs contain both host and port.
   - Parse `socketTimeout` consistently in thrift and ZooKeeper URL paths.
   - Provide clearer error messages for invalid URLs.
   
   ### Proposed changes
   - Avoid calling `toLowerCase()` on the whole URL.
   - Parse URL scheme with case-insensitive comparison.
   - Add explicit thrift URL validation for:
      - unsupported scheme
      - missing host
      - missing port
   - Extract shared socketTimeout query parsing logic.
   - Add unit tests for AmsThriftUrl.
   
   ### How should we improve?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [x] Yes I am willing to submit a PR!
   
   ### Subtasks
   
   _No response_
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to