xiangfu0 opened a new pull request, #14646: URL: https://github.com/apache/pinot/pull/14646
This PR adds URL functions covering extraction, encoding/decoding, and manipulation, for tasks that involve URL parsing and modification.

### URL Extraction Methods

- `protocol(String url)`: Extracts the protocol (scheme) from the URL.
- `domain(String url)`: Extracts the domain from the URL.
- `domainWithoutWWW(String url)`: Extracts the domain without the leading "www." if present.
- `topLevelDomain(String url)`: Extracts the top-level domain (TLD) from the URL.
- `firstSignificantSubdomain(String url)`: Extracts the first significant subdomain from the URL.
- `cutToFirstSignificantSubdomain(String url)`: Extracts the first significant subdomain and the top-level domain from the URL.
- `cutToFirstSignificantSubdomainWithWWW(String url)`: Returns the part of the domain that includes top-level subdomains up to the "first significant subdomain", without stripping "www.".
- `port(String url)`: Extracts the port from the URL.
- `path(String url)`: Extracts the path from the URL without the query string.
- `pathWithQuery(String url)`: Extracts the path from the URL with the query string.
- `queryString(String url)`: Extracts the query string without the initial question mark (`?`), excluding the fragment (`#`) and everything after it.
- `fragment(String url)`: Extracts the fragment identifier (without the hash symbol) from the URL.
- `queryStringAndFragment(String url)`: Extracts the query string and fragment identifier from the URL.
- `extractURLParameter(String url, String name)`: Extracts the value of a specific query parameter from the URL.
- `extractURLParameters(String url)`: Extracts all query parameters from the URL as an array of `name=value` pairs.
- `extractURLParameterNames(String url)`: Extracts all parameter names from the URL query string.
- `URLHierarchy(String url)`: Generates a hierarchy of URLs truncated at path and query separators.
- `URLPathHierarchy(String url)`: Generates a hierarchy of path elements from the URL, excluding the protocol and host.

### URL Manipulation Methods

- `urlEncode(String url)`: Encodes a string into a URL-safe format.
- `urlDecode(String url)`: Decodes a URL-encoded string.
- `encodeURLFormComponent(String url)`: Encodes the URL string following RFC 1866, with spaces encoded as `+`.
- `decodeURLFormComponent(String url)`: Decodes the URL string following RFC 1866, with `+` decoded as a space.
- `netloc(String url)`: Extracts the network locality (`username:password@host:port`) from the URL.
- `cutWWW(String url)`: Removes the leading "www." from a URL’s domain.
- `cutQueryString(String url)`: Removes the query string, including the question mark.
- `cutFragment(String url)`: Removes the fragment identifier, including the number sign.
- `cutQueryStringAndFragment(String url)`: Removes both the query string and fragment identifier.
- `cutURLParameter(String url, String name)`: Removes a specific query parameter from a URL.
- `cutURLParameters(String url, String[] names)`: Removes multiple specific query parameters from a URL.
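
To illustrate what a few of the listed functions compute, here is a minimal sketch built on the JDK's `java.net.URI`, `URLEncoder`, and `URLDecoder` (the `Charset` overloads need Java 10 or later). It is not the implementation in this PR: the class name `UrlFunctionsSketch`, the sample URL, and the edge-case choices (for example, returning an empty string when a parameter is missing) are assumptions made for the example.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch class; method names mirror the list above, behavior is assumed.
final class UrlFunctionsSketch {

  static String protocol(String url) {
    return URI.create(url).getScheme();
  }

  static String domain(String url) {
    return URI.create(url).getHost();
  }

  static String domainWithoutWWW(String url) {
    String host = domain(url);
    return host != null && host.startsWith("www.") ? host.substring(4) : host;
  }

  // java.net.URI reports -1 when the URL has no explicit port.
  static int port(String url) {
    return URI.create(url).getPort();
  }

  static String path(String url) {
    return URI.create(url).getRawPath();
  }

  // Query string without the leading '?' and without the fragment.
  static String queryString(String url) {
    return URI.create(url).getRawQuery();
  }

  // Fragment without the leading '#'.
  static String fragment(String url) {
    return URI.create(url).getRawFragment();
  }

  // Assumption: a missing parameter (or a URL with no query) yields "".
  static String extractURLParameter(String url, String name) {
    String query = queryString(url);
    if (query == null) {
      return "";
    }
    for (String pair : query.split("&")) {
      int eq = pair.indexOf('=');
      String key = eq >= 0 ? pair.substring(0, eq) : pair;
      if (key.equals(name)) {
        return eq >= 0 ? pair.substring(eq + 1) : "";
      }
    }
    return "";
  }

  // Form encoding: spaces become '+', reserved characters are percent-encoded.
  static String encodeURLFormComponent(String value) {
    return URLEncoder.encode(value, StandardCharsets.UTF_8);
  }

  static String decodeURLFormComponent(String value) {
    return URLDecoder.decode(value, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    String url = "https://www.example.com:8080/a/b?x=1&y=hello#top";
    System.out.println(protocol(url));                 // https
    System.out.println(domainWithoutWWW(url));         // example.com
    System.out.println(port(url));                     // 8080
    System.out.println(path(url));                     // /a/b
    System.out.println(queryString(url));              // x=1&y=hello
    System.out.println(extractURLParameter(url, "y")); // hello
    System.out.println(encodeURLFormComponent("a b&c")); // a+b%26c
  }
}
```

The sketch delegates parsing to `java.net.URI`, which rejects malformed input with an `IllegalArgumentException`; how the functions in this PR treat malformed URLs is not stated in the description above.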
