On Fri, 22 Sep 2023 08:53:07 GMT, Claes Redestad <[email protected]> wrote:

>> `URLEncoder` currently appends chars that need encoding into a 
>> `java.io.CharArrayWriter`, converts that to a `String`, uses 
>> `String::getBytes` to get the encoded bytes and then appends these bytes in 
>> an escaped manner to the output stream. This is somewhat inefficient.
>> 
>> This PR replaces the `CharArrayWriter` with a reusable `CharBuffer` + 
>> `ByteBuffer` pair. This allows us to encode to the output `StringBuilder` in 
>> small chunks, with greatly reduced allocation as a result.
>> 
>> The exact size of the buffers is an open question, but generally it seems 
>> that a tiny buffer wins by virtue of allocating less, and that the per-chunk 
>> overheads are relatively small.
>
> Claes Redestad has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Update src/java.base/share/classes/java/net/URLEncoder.java
>   
>   Co-authored-by: ExE Boss <[email protected]>

URLEncoder#DONT_NEED_ENCODING is a BitSet, but it is effectively a lookup 
table that only ever contains ASCII characters. Should we consider replacing 
it with two `long` bit masks, along these lines?


public class URLEncoder {
    // Bit i is set if char i (0 - 63) does not need encoding.
    static final long DONT_NEED_ENCODING_FLAGS_0;
    // Bit (i - 64) is set if char i (64 - 127) does not need encoding.
    static final long DONT_NEED_ENCODING_FLAGS_1;

    static {
        long flags0 = 0;
        flags0 |= 1L << ' '; // ASCII 32
        flags0 |= 1L << '*'; // ASCII 42
        flags0 |= 1L << '-'; // ASCII 45
        flags0 |= 1L << '.'; // ASCII 46

        // ASCII 48 - 57
        for (int i = '0'; i <= '9'; ++i) {
            flags0 |= 1L << i;
        }
        DONT_NEED_ENCODING_FLAGS_0 = flags0;

        long flags1 = 0;
        // ASCII 65 - 90
        for (int i = 'A'; i <= 'Z'; ++i) {
            flags1 |= 1L << (i - 64);
        }
        flags1 |= 1L << ('_' - 64); // ASCII 95
        // ASCII 97 - 122
        for (int i = 'a'; i <= 'z'; ++i) {
            flags1 |= 1L << (i - 64);
        }
        DONT_NEED_ENCODING_FLAGS_1 = flags1;
    }

    private static boolean dontNeedEncoding(char c) {
        int prefix = c >> 6;
        if (prefix > 1) {
            return false; // c >= 128 always needs encoding
        }
        long flags = prefix == 0 ? DONT_NEED_ENCODING_FLAGS_0
                                 : DONT_NEED_ENCODING_FLAGS_1;
        // A long shift only uses the low 6 bits of the shift count, so for
        // c in 64 - 127 this is the same as 1L << (c - 64).
        return (flags & (1L << c)) != 0;
    }
}
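
To sanity-check the idea, here is a rough standalone sketch that compares the 
two-long lookup above against a BitSet for every possible char. The class 
name `DontNeedEncodingCheck` and the helpers `referenceTable` and 
`countMismatches` are hypothetical names for illustration only, and the 
reference BitSet is rebuilt from the same character set as the code above; 
I am assuming that matches what `URLEncoder` registers today.

```java
import java.util.BitSet;

// Hypothetical standalone check, not part of the JDK.
final class DontNeedEncodingCheck {
    static final long FLAGS_0; // chars 0 - 63
    static final long FLAGS_1; // chars 64 - 127, stored at bit (c - 64)

    static {
        long flags0 = 0;
        for (char c : new char[] {' ', '*', '-', '.'}) {
            flags0 |= 1L << c;
        }
        for (int i = '0'; i <= '9'; i++) {
            flags0 |= 1L << i;
        }
        FLAGS_0 = flags0;

        long flags1 = 1L << ('_' - 64);
        for (int i = 'A'; i <= 'Z'; i++) {
            flags1 |= 1L << (i - 64);
        }
        for (int i = 'a'; i <= 'z'; i++) {
            flags1 |= 1L << (i - 64);
        }
        FLAGS_1 = flags1;
    }

    static boolean dontNeedEncoding(char c) {
        int prefix = c >> 6;
        if (prefix > 1) {
            return false; // outside ASCII
        }
        long flags = prefix == 0 ? FLAGS_0 : FLAGS_1;
        // Masking with 63 spells out what the long shift does implicitly.
        return (flags & (1L << (c & 63))) != 0;
    }

    // Assumed reference: the same character set, kept in a BitSet.
    static BitSet referenceTable() {
        BitSet bits = new BitSet(128);
        bits.set('a', 'z' + 1);
        bits.set('A', 'Z' + 1);
        bits.set('0', '9' + 1);
        bits.set(' ');
        bits.set('-');
        bits.set('_');
        bits.set('.');
        bits.set('*');
        return bits;
    }

    // Counts chars where the bit-mask lookup and the BitSet disagree.
    static int countMismatches() {
        BitSet reference = referenceTable();
        int mismatches = 0;
        for (int i = Character.MIN_VALUE; i <= Character.MAX_VALUE; i++) {
            if (dontNeedEncoding((char) i) != reference.get(i)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches());
    }
}
```

This only checks that the two lookups agree; whether the bit-mask version is 
actually faster would still need a JMH run.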

-------------

PR Comment: https://git.openjdk.org/jdk/pull/15865#issuecomment-1747842588
