A total benchmark improvement of ~3.7% is certainly nothing to ignore. However, it won't all be pure gain: we do, after all, need to factor in the cost of creating new string headers, and possibly allocating new buffer storage, on string modifications. We're going to come out ahead, but I don't think we'll still be at 3.7% after all those changes are made.
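To make that trade-off concrete, here is a minimal sketch in C of value semantics for strings. It is an illustration only and assumes a much simpler layout than Parrot's real STRING headers: `ImmString`, `imm_new`, `imm_share`, `imm_concat`, `imm_free`, and the `headers_allocated` counter are all hypothetical names, not part of Parrot's API. Reads hand back the existing header with no defensive copy, while every modification pays for a new header and a new buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified string header; NOT Parrot's STRING struct. */
typedef struct ImmString {
    char  *buf;  /* character data, never modified after creation */
    size_t len;  /* length in bytes, excluding the terminating NUL */
} ImmString;

/* Counts header allocations so the cost of each path is visible. */
long headers_allocated = 0;

/* Create a string with a private copy of the first len bytes of src. */
ImmString *imm_new(const char *src, size_t len) {
    assert(src != NULL);
    ImmString *s   = malloc(sizeof *s);
    char      *buf = malloc(len + 1);
    if (s == NULL || buf == NULL) {
        free(s);
        free(buf);
        return NULL;
    }
    memcpy(buf, src, len);
    buf[len] = '\0';
    s->buf = buf;
    s->len = len;
    headers_allocated++;
    return s;
}

/* Read path: the same header is returned. Nothing can modify it in
 * place, so no copy is needed when the string escapes. */
const ImmString *imm_share(const ImmString *s) {
    return s;
}

/* Write path: always builds a new header and a new buffer; neither
 * input is touched. This is the cost that offsets the savings. */
ImmString *imm_concat(const ImmString *a, const ImmString *b) {
    assert(a != NULL && b != NULL);
    size_t     len = a->len + b->len;
    ImmString *s   = malloc(sizeof *s);
    char      *buf = malloc(len + 1);
    if (s == NULL || buf == NULL) {
        free(s);
        free(buf);
        return NULL;
    }
    memcpy(buf, a->buf, a->len);
    memcpy(buf + a->len, b->buf, b->len);
    buf[len] = '\0';
    s->buf = buf;
    s->len = len;
    headers_allocated++;
    return s;
}

/* Release a string created by imm_new or imm_concat. */
void imm_free(ImmString *s) {
    if (s != NULL) {
        free(s->buf);
        free(s);
    }
}
```

In this sketch, creating two strings, sharing one of them, and concatenating the pair leaves `headers_allocated` at 3: two from `imm_new`, none from `imm_share`, and one from `imm_concat`. Whether the net result is a win depends on how often code reads versus modifies strings.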
Besides the few test cases that you mention, do we have a lot of places where strings are deliberately used with reference semantics, so that one in-place modification is visible from multiple places in the code? That is, is this going to be a huge change to existing PIR code?

--Andrew Whitworth

On Wed, Dec 23, 2009 at 1:58 PM, chromatic <[email protected]> wrote:
> Do STRINGs have value or reference semantics?
>
> I'm exploring the idea of forbidding in-place modification of STRINGs in the C
> API; the functions will return new STRING headers with the changes.  This has
> implications for PIR code which expects that STRINGs have reference semantics
> -- that you can modify a STRING referred by multiple locations.
>
> Currently Parrot seems to prefer reference semantics.  A handful of
> frequently-called C functions perform a copy-on-write (COW) operation to
> create a new STRING header every time a STRING header escapes -- in other
> words, because they can't tell if the escaping header will get modified, they
> have to allocate a new header with COW semantics for every escaping header,
> even if the header only ever gets read (or becomes garbage immediately).
>
> The NQP-rx benchmark represents some likely HLL performance:
>
>     ./parrot ext/nqp-rx/nqp-rx.pbc --target=pir Actions.pm
>
> Some ~72% of all STRING COW headers created are for internal bookkeeping only
> -- to prevent the accidental modification of a STRING out from underneath
> something else that uses it.  This occurs in two places in the benchmark.  The
> first is when fetching the STRING contents of a Key PMC.  The second is when
> using a constant STRING (one created with CONST_STRING in our .c files, for
> example, or appearing as a literal in PIR) as a parameter to a function.
>
> Another occasion which does not appear in this benchmark is when fetching the
> name of a Class.  (You can imagine how modifying that STRING in place would
> cause problems.)
>
> Note that the String PMC's get_string() vtable entry always returns a COW
> STRING.  The set S, SC opcode performs COW on the STRING constant.
>
> Removing the always-COW from the Key PMC (when dealing with STRINGs) speeds up
> the benchmark by 2.504%.
>
> Removing the always-COW from constant STRINGs used as function parameters
> speeds up the benchmark by 1.204%.
>
> Both together speed up the benchmark by 3.678%.
>
> This particular benchmark shows no change in GC performance, which suggests
> that the GC pressure primarily comes from PMCs.  Another benchmark with
> different STRING usage would show more benefit if it had STRING pressure on
> the GC.
>
> A couple of test files show failures with these changes, but they're where you
> might expect them:
>
> t/op/string.t (Wstat: 11 Tests: 392 Failed: 0)
>   Non-zero wait status: 11
>   Parse errors: Bad plan.  You planned 411 tests but ran 392.
> t/pmc/key.t (Wstat: 11 Tests: 8 Failed: 0)
>   Non-zero wait status: 11
>   Parse errors: Bad plan.  You planned 9 tests but ran 8.
>
> -- c
> _______________________________________________
> http://lists.parrot.org/mailman/listinfo/parrot-dev
