On 11/28/2015 08:19 PM, Vitaly Davidovich wrote:
Is that a valid example given StringBuilder isn't threadsafe to begin
with? If the SB instance is shared among threads that perform writes
to it without external synchronization, it's a bug in that code. Am I
missing something?
That "bug" might be intentional. Produced to circumvent security check.
String is trusted to be immutable. For example here:
public FileInputStream(File file) throws FileNotFoundException {
String name = (file != null ? file.getPath() : null);
SecurityManager security = System.getSecurityManager();
if (security != null) {
security.checkRead(name);
}
if (name == null) {
throw new NullPointerException();
}
if (file.isInvalid()) {
throw new FileNotFoundException("Invalid file path");
}
fd = new FileDescriptor();
fd.attach(this);
path = name;
open(name);
}
...the trust is that it doesn't change between calls to:
security.checkRead(name);
and:
open(name);
You'd have to ensure that the returned String is stable and
effectively immutable though, which means the underlying byte[] has to
be released by SB right before it's mutated if it's shared, which I
think Ivan's patch is doing (from a quick glance).
But there's a race in code (even if 'shared' and 'value' were volatile),
which I think is impossible to fix in StringBuilder without introducing
locking.
Sequence of operations (referring to quoted example below):
- Thread 2 executes unshareValue() 1st while 'value' is not shared yet,
which does nothing.
- Thread 1 sets shared = true and creates a String with 'value' and
returns it
- Thread 2 proceeds with mutation of 'value' which is now shared with
String in thread 1
It would be possible to create a StringBuilder-like class (even a
subclass of AbstractStringBuilder) that would be safe to use and would
share the array with String with minimal overhead for mutative
operations, but such class would not be compatible with StringBuilder.
For example, using standard trick that limits the use of an instance to
the thread that constructed it:
public class SingleThreadedStringBuilder extends AbstractStringBuilder {
private Thread constructingThread = Thread.currentThread();
// and then in each mutative operation...
@Override
public SingleThreadedStringBuilder append(CharSequence s) {
if (Thread.currentThread() != constructingThread) throw new
IllegalStateException();
super.append(s);
return this;
}
// it could even be a one-shot object, so it could become unusable
after a call to toString()
@Override
public String toString() {
if (Thread.currentThread() != constructingThread) throw new
IllegalStateException();
constructingThread = null;
// create and return a String, possibly passing the 'value'
array by reference
...
}
Such class could be used for string concatenation, but I think JEP280
has a better answer for that and more.
Regards, Peter
sent from my phone
On Nov 27, 2015 10:16 AM, "Peter Levart" <[email protected]
<mailto:[email protected]>> wrote:
On 11/27/2015 01:39 PM, Ivan Gerasimov wrote:
Hello!
Prior to Java5, StringBuffer used to be able to share its
internal char[] buffer with the String, returned with toString().
With introducing of the new StringBuilder class this
functionality was removed.
It seems tempting to reintroduce this feature now in
AbstractStringBuilder.
The rationale here is that StringBuilder already provides a
way of accepting the hint about the result's size.
The constructor with the explicitly specified capacity is used
for increasing the efficiency of strings concatenations.
Optimizing this case by avoiding additional memory allocation
and copying looks sensible to me.
Here's the draft webrev
http://cr.openjdk.java.net/~igerasim/XXXXXXX-ASB-shared-value/00/webrev/
<http://cr.openjdk.java.net/%7Eigerasim/XXXXXXX-ASB-shared-value/00/webrev/>
It is tempting, yes. But I think there was a reason that this was
dropped when StringBuilder was introduced and that reason was
thread-safety. StringBuilder doesn't use any synchronization. If a
concurrent thread gets a reference to a StringBuilder and executes
mutating operations while some other thread calls toString() on
the same StringBuilder, then returned String could appear to be
changing it's content (be mutable), which can have security
implications:
413 public String toString() {
414 final byte[] value = this.value;
415 if (isLatin1()) {
416 if ((count << coder) < value.length) {
417 return StringLatin1.newString(value, 0, count);
418 }
419 shared = true;
420 return new String(value, String.LATIN1);
421 }
422 return StringUTF16.newStringSB(value, 0, count);
423 }
927 public AbstractStringBuilder replace(int start, int end,
String str) {
928 if (end > count) {
929 end = count;
930 }
931 checkRangeSIOOBE(start, end, count);
932 int len = str.length();
933 int newCount = count + len - (end - start);
934 ensureCapacityInternal(newCount);
935 unshareValue();
936 shift(end, newCount - count);
937 count = newCount;
938 putStringAt(start, str);
939 return this;
940 }
For example:
static final StringBuilder sb = new StringBuilder(3).append("abc");
Thread 1:
String s = sb.toString();
System.out.println(s);
Thread 2:
sb.replace(0, 3, "def");
Question: What can Thread 1 print?
Thread 1 sets shared = true and shares value array with String,
but when Thread 2 executes replace, it may not yet notice the
write from Thread 1 and may not do anything in unshareValue(),
effectively overwriting the shared array with "def".
Answer: I think anything of the following:
abc
dbc
aec
abf
dec
dbf
aef
def
...and the answer may change over time. Now imagine handing over
such string to new FileInputStream()...
Regards, Peter
This variant showed a significant speed improvement for the
cases, when the the capacity is equal to the result's size (up
to +25% to throughput).
For the other cases, the difference isn't very clear based on
my benchmarks :)
But is seems to be small enough, as it only adds a few
comparisons, coupled with already relatively heavy operations,
like allocation and copying.
Here's the benchmark that I've used:
http://cr.openjdk.java.net/~igerasim/XXXXXXX-ASB-shared-value/MyBenchmark.java
<http://cr.openjdk.java.net/%7Eigerasim/XXXXXXX-ASB-shared-value/MyBenchmark.java>
And the corresponding graphs.
http://cr.openjdk.java.net/~igerasim/XXXXXXX-ASB-shared-value/ASB-Shared-bench-test15.png
<http://cr.openjdk.java.net/%7Eigerasim/XXXXXXX-ASB-shared-value/ASB-Shared-bench-test15.png>
http://cr.openjdk.java.net/~igerasim/XXXXXXX-ASB-shared-value/ASB-Shared-bench-test17.png
<http://cr.openjdk.java.net/%7Eigerasim/XXXXXXX-ASB-shared-value/ASB-Shared-bench-test17.png>
The blue line here stands for the baseline throughput.
If people agree, this is a useful addition, a test should also
be added, of course.
Sincerely yours,
Ivan