    public String(ByteBuffer bytes, Charset cs);
    public String(ByteBuffer bytes, String csname);

I think these constructors make good sense. They avoid an extra copy to an intermediate byte[].
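
For illustration, here's a sketch of the status quo versus the proposal (the buffer name buf is mine, and the ByteBuffer constructor is of course the proposed API, not an existing one):

    // Today: decoding a ByteBuffer's contents means copying them
    // into an intermediate byte[] first.
    byte[] tmp = new byte[buf.remaining()];
    buf.get(tmp);
    String s1 = new String(tmp, StandardCharsets.UTF_8);

    // Proposed: decode the buffer's remaining bytes directly, no copy.
    String s2 = new String(buf, StandardCharsets.UTF_8);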

One issue (also mentioned by Stephen Colebourne) is whether we need the csname overload at all. Arguably it's redundant given the Charset overload, and it throws UnsupportedEncodingException, which is checked. On the other hand, the csname overload is apparently faster, since the decoder can be cached, and it's unclear when this can be remedied for the Charset case.

I could go either way on this one.

**

I'd also suggest adding a CharBuffer constructor:

    public String(CharBuffer cbuf)

This would be semantically equivalent to

    public String(char[] value, int offset, int count)

except that it would use the chars from the CharBuffer between the buffer's position and its limit.
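
To make the semantics concrete, here's a rough equivalence (a hypothetical sketch; whether the constructor would actually consume the buffer's position is a detail the spec would need to pin down):

    // new String(cbuf) would behave roughly like:
    char[] a = new char[cbuf.remaining()];
    cbuf.duplicate().get(a);   // chars from position to limit, leaving
                               // cbuf's own position untouched
    String s = new String(a, 0, a.length);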

**

Regarding the getBytes() overloads:

    public int getBytes(byte[] dst, int offset, Charset cs);
    public int getBytes(byte[] dst, int offset, String csname);
    public int getBytes(ByteBuffer bytes, Charset cs);
    public int getBytes(ByteBuffer bytes, String csname);

On 2/13/18, 12:41 AM, Alan Bateman wrote:
> These four methods encode as many characters as possible into the destination byte[] or buffer but don't give any indication that the destination didn't have enough space to encode the entire string. I thus worry they could be a hazard and result in buggy code. If there is insufficient space then the user of the API doesn't know how many characters were encoded so it's not easy to substring and call getBytes again to encode the remaining characters. There is also the issue of how to size the destination. What would you think about having them fail when there is insufficient space? If they do fail then there is a side effect that they will have written to the destination so that would need to be documented too.

I share Alan's concern here.

If the intent is to reuse a byte[] or a ByteBuffer, then there needs to be an effective way to handle the case where the provided array/buffer doesn't have enough room to receive the encoded string. A variety of ways of dealing with this have been mentioned: throwing an exception; returning a negative value to indicate failure, possibly also encoding the number of bytes written; or even allocating a fresh array or buffer of the proper size and returning that instead. In each case the caller would have to check the return value and take care to handle all the cases properly. This is likely to be fairly error-prone.
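
To illustrate the hazard, here's what caller code might look like under one of the suggested conventions, say "negative return value means insufficient space" (entirely hypothetical; none of this is settled API):

    int n = s.getBytes(dst, 0, StandardCharsets.UTF_8);
    if (n < 0) {
        // Destination too small -- but how many bytes were written? How
        // much input was consumed? How big should the retry buffer be?
        // Every caller has to repeat this guesswork.
        dst = new byte[dst.length * 2];   // a guess
        n = s.getBytes(dst, 0, StandardCharsets.UTF_8);
    }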

This also raises the question in my mind of what these getBytes() methods are intended for.

On the one hand, they might be useful for a caller that manages its own memory allocation and reuses arrays/buffers. If so, then intermediate results from partial processing need to be handled properly. If the destination fills up, there needs to be a way to report how much of the input was consumed, so that a subsequent operation can pick up where the previous one left off. (This was one of David Lloyd's points.) If there is sufficient room in the destination, there needs to be a way to report that, along with how much space remains in the destination. One could contemplate adding all this information to the API. This eventually leads to

    CharsetEncoder.encode(CharBuffer in, ByteBuffer out, boolean endOfInput)

which has all the necessary partial progress state in the buffers.
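
For comparison, here's a minimal encode loop over the existing CharsetEncoder API; the buffers' positions carry all of the partial-progress state, and consume() is a hypothetical sink for each filled buffer:

    // Uses java.nio.{ByteBuffer,CharBuffer} and java.nio.charset.*.
    static void encodeAll(String s, ByteBuffer out)
            throws CharacterCodingException {
        CharsetEncoder enc = StandardCharsets.UTF_8.newEncoder();
        CharBuffer in = CharBuffer.wrap(s);
        CoderResult cr;
        do {                          // encode, draining 'out' as it fills
            cr = enc.encode(in, out, true);
            if (cr.isOverflow()) { out.flip(); consume(out); out.clear(); }
            else if (cr.isError()) cr.throwException();
        } while (!cr.isUnderflow());
        do {                          // flush any internal encoder state
            cr = enc.flush(out);
            if (cr.isOverflow()) { out.flip(); consume(out); out.clear(); }
        } while (!cr.isUnderflow());
        out.flip();
        consume(out);                 // hand off the final batch of bytes
    }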

On the other hand, maybe these APIs are intended as conveniences. I'd observe that String already has this method:

    public byte[] getBytes(Charset charset)

which returns the encoded bytes in a newly allocated array of exactly the right size. This is pretty convenient. It doesn't let the caller reuse a destination array or buffer, but that's precisely what brings in all the partial-result edge cases.
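
For the pure convenience case, that's hard to beat:

    byte[] b = s.getBytes(StandardCharsets.UTF_8);   // exact size, no edge cases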

The bottom line is that I'm not entirely sure what these new getBytes() overloads are for. Maybe I've missed a use case where they work well; if so, perhaps somebody can describe it.

s'marks

