A total benchmark improvement of ~3.7% is certainly nothing to ignore.
However, this isn't going to be a 100% gain: we do after all need to
factor in the need to create new string headers and possibly allocate
new buffer storage on string modifications. We're going to be better
than even, but I don't think we're going to be at 3.7% after all those
changes are made.

Besides the few test cases that you mention, do we have a lot of
places where strings are specifically used with reference semantics in
order to do in-place modifications that are visible in multiple places
in the code? That is, is this going to be a huge change to existing PIR
code?
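
To make the trade-off concrete, here is a minimal C sketch of the two
behaviors being discussed. It is a toy model only: ToyString,
toy_new, toy_chop_inplace, toy_chop_copy and toy_allocations are
hypothetical names invented for illustration, not Parrot's STRING
struct or its C API. It contrasts an in-place modification (visible
through every alias) with a function that returns a new header (which
costs an extra header and buffer allocation).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy string header; NOT Parrot's real STRING struct. */
typedef struct ToyString {
    char  *buf;   /* character storage, NUL-terminated */
    size_t len;   /* length in bytes, excluding the NUL */
} ToyString;

/* Counts how many headers (and buffers) have been allocated so far. */
static int toy_allocations = 0;

/* Allocate a new header holding a private copy of the first n bytes of s. */
static ToyString *toy_new(const char *s, size_t n) {
    ToyString *h = malloc(sizeof *h);
    if (!h) abort();
    h->buf = malloc(n + 1);
    if (!h->buf) abort();
    memcpy(h->buf, s, n);
    h->buf[n] = '\0';
    h->len = n;
    toy_allocations++;
    return h;
}

/* Release a header and its buffer. */
static void toy_free(ToyString *h) {
    free(h->buf);
    free(h);
}

/* Reference semantics: drop the last byte in place.
 * Every pointer to this header sees the change. */
static void toy_chop_inplace(ToyString *h) {
    if (h->len > 0)
        h->buf[--h->len] = '\0';
}

/* Value semantics: leave the input untouched and return a new header.
 * This is where the extra header and buffer allocation comes from. */
static ToyString *toy_chop_copy(const ToyString *h) {
    return toy_new(h->buf, h->len > 0 ? h->len - 1 : 0);
}
```

In this sketch, calling toy_chop_inplace through a second pointer to
the same header changes what the first pointer sees, which is the
behavior existing PIR code might be relying on. Calling toy_chop_copy
leaves the original alone but bumps toy_allocations by one, which is
the cost that would eat into the 3.7%.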

--Andrew Whitworth



On Wed, Dec 23, 2009 at 1:58 PM, chromatic <[email protected]> wrote:
> Do STRINGs have value or reference semantics?
>
> I'm exploring the idea of forbidding in-place modification of STRINGs in the C
> API; the functions will return new STRING headers with the changes.  This has
> implications for PIR code which expects that STRINGs have reference semantics
> -- that you can modify a STRING referred to by multiple locations.
>
> Currently Parrot seems to prefer reference semantics.  A handful of
> frequently-called C functions perform a copy-on-write (COW) operation to
> create a new STRING header every time a STRING header escapes -- in other
> words, because they can't tell if the escaping header will get modified, they
> have to allocate a new header with COW semantics for every escaping header,
> even if the header only ever gets read (or becomes garbage immediately).
>
> The NQP-rx benchmark represents some likely HLL performance:
>
>        ./parrot ext/nqp-rx/nqp-rx.pbc --target=pir Actions.pm
>
> Some ~72% of all STRING COW headers created are for internal bookkeeping only
> -- to prevent the accidental modification of a STRING out from underneath
> something else that uses it.  This occurs in two places in the benchmark.  The
> first is when fetching the STRING contents of a Key PMC.  The second is when
> using a constant STRING (one created with CONST_STRING in our .c files, for
> example, or appearing as a literal in PIR) as a parameter to a function.
>
> Another occasion which does not appear in this benchmark is when fetching the
> name of a Class.  (You can imagine how modifying that STRING in place would
> cause problems.)
>
> Note that the String PMC's get_string() vtable entry always returns a COW
> STRING.  The set S, SC opcode performs COW on the STRING constant.
>
> Removing the always-COW from the Key PMC (when dealing with STRINGs) speeds up
> the benchmark by 2.504%.
>
> Removing the always-COW from constant STRINGs used as function parameters
> speeds up the benchmark by 1.204%.
>
> Both together speed up the benchmark by 3.678%.
>
> This particular benchmark shows no change in GC performance, which suggests
> that the GC pressure primarily comes from PMCs.  Another benchmark with
> different STRING usage would show more benefit if it had STRING pressure on
> the GC.
>
> A couple of test files show failures with these changes, but they're where you
> might expect them:
>
> t/op/string.t                      (Wstat: 11 Tests: 392 Failed: 0)
>  Non-zero wait status: 11
>  Parse errors: Bad plan.  You planned 411 tests but ran 392.
> t/pmc/key.t                        (Wstat: 11 Tests: 8 Failed: 0)
>  Non-zero wait status: 11
>  Parse errors: Bad plan.  You planned 9 tests but ran 8.
>
> -- c
> _______________________________________________
> http://lists.parrot.org/mailman/listinfo/parrot-dev
>