It seems rather odd to me to introduce a code path into
sb.append(cs,int,int) that allocates memory just to reach an intrinsic that
only sometimes makes things run faster. As you observed, the performance
tradeoffs aren't obvious.

Instead, if we want to optimize sb.append(cs,int,int), maybe we should just go
ahead and do that directly, possibly by adding or rearranging the intrinsics.
I've filed JDK-8217675 to cover this. Jim Laskey said he might be able to look
at it.

https://bugs.openjdk.java.net/browse/JDK-8217675
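
For reference, here is a rough sketch of the two kinds of code path being
compared. This is not the actual patch; the class and method names are made up
for illustration, and I'm assuming the allocating path materializes a String
from the range before appending. Both produce the same result; the difference
is only in the intermediate allocation and in which append overload ends up
doing the copy.

```java
// Hypothetical illustration only; names are not from the patch under review.
class AppendPaths {

    // Direct path: StringBuilder copies the requested range itself,
    // with no intermediate object created by the caller.
    static String direct(CharSequence cs, int start, int end) {
        StringBuilder sb = new StringBuilder();
        sb.append(cs, start, end);
        return sb.toString();
    }

    // Allocating path (assumed shape): build a String for the range first,
    // then hand it to append(String). The subSequence/toString calls may
    // allocate before any copying into the builder happens.
    static String viaString(CharSequence cs, int start, int end) {
        StringBuilder sb = new StringBuilder();
        sb.append(cs.subSequence(start, end).toString());
        return sb.toString();
    }
}
```

Whether the second form wins likely depends on the CharSequence
implementation and the length of the range, which is part of why the
tradeoffs aren't obvious.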
s'marks