I implemented most of the changes you (Vitaly) suggested and also tested
this with regexp=interpreted.
To check the encoding of a string I introduced the methods
Is{Ascii,TwoByte}RepresentationUnderneath(). I think my implementation is
reasonably fast for the non-indirect case. I haven't yet replaced
Is{Ascii,TwoByte}Representation() in all the places where that is
necessary, as there are no test cases for it right now. I'll do that in
another CL when dealing with externalized strings. Since cons strings may
suffer from the same problem of a wrong encoding tag and already have to be
unpacked to check the encoding in some places, using the two new methods
should not have a big overall performance impact. It mostly concerns
places that choose between ToAsciiVector() and ToUC16Vector().
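To illustrate the difference between the two kinds of check, here is a
minimal sketch on a toy model. It is not V8's object layout: ToyString,
Shape, Encoding and the parent field are hypothetical names made up for
this example, and only the two method names are taken from the CL.

```cpp
#include <cassert>

// Toy model only; none of these types exist in V8 under these names.
enum class Encoding { kAscii, kTwoByte };
enum class Shape { kSequential, kSliced };

struct ToyString {
  Shape shape;
  Encoding tag;             // encoding recorded on this object itself
  const ToyString* parent;  // backing string of a slice, nullptr otherwise
};

// Looks only at the tag on the object; may be wrong for indirect strings.
inline bool IsAsciiRepresentation(const ToyString& s) {
  return s.tag == Encoding::kAscii;
}

// Follows indirections and reports the encoding of the string that
// actually holds the characters.
inline bool IsAsciiRepresentationUnderneath(const ToyString& s) {
  const ToyString* cur = &s;
  // A direct (sequential) string skips the loop entirely, which is the
  // cheap non-indirect case.
  while (cur->shape != Shape::kSequential) {
    cur = cur->parent;
  }
  return cur->tag == Encoding::kAscii;
}
```

In this model, a slice tagged as ASCII on top of a two-byte parent answers
true to the first check and false to the second, which is the situation a
caller choosing between ToAsciiVector() and ToUC16Vector() has to get right.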
Cheers,
Yang
http://codereview.chromium.org/7477045/diff/39001/src/heap.cc
File src/heap.cc (right):
http://codereview.chromium.org/7477045/diff/39001/src/heap.cc#newcode2681
src/heap.cc:2681: MaybeObject* maybe_result = Allocate(map, NEW_SPACE);
On 2011/08/17 19:20:23, Vitaly Repeshko wrote:
Consider adding Heap::Allocate{Ascii,TwoByte}SlicedString.
I decided not to, at least for now, since this is not used anywhere else
and there are no analogous methods for cons strings either.
http://codereview.chromium.org/7477045/diff/39001/src/ia32/code-stubs-ia32.cc
File src/ia32/code-stubs-ia32.cc (right):
http://codereview.chromium.org/7477045/diff/39001/src/ia32/code-stubs-ia32.cc#newcode3405
src/ia32/code-stubs-ia32.cc:3405: __ j(zero, &seq_two_byte_string);
On 2011/08/17 19:20:23, Vitaly Repeshko wrote:
Looks like this code was never updated to use near labels...
I updated the part of code that I touched. I'll make a CL to update all
the jumps in the ia32 code in the future.
http://codereview.chromium.org/7477045/diff/39001/src/ia32/code-stubs-ia32.cc#newcode3524
src/ia32/code-stubs-ia32.cc:3524: __ cmp(edi, kNotAStringSlice);
On 2011/08/17 19:20:23, Vitaly Repeshko wrote:
Will this still work if we initialize edi to 0 or to the slice offset and
unconditionally add it to ebx and the length loaded from the original
string?
It would. However, I also use edi as an indicator of whether the subject
is a sliced string (whenever edi != kNotAStringSlice). In that case the
slice has to be unpacked, and the length of the slice has to be used
instead of the length of the parent string (stored in esi). Using
kNotAStringSlice = 0 does not work as an indicator, since there are
slices that start at offset 0 but are shorter than the parent string.
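The argument can be sketched in plain C++ rather than ia32 assembly. The
value -1 for kNotAStringSlice, the Window struct and SubjectWindow() are
assumptions made for this illustration; the CL only requires that the
sentinel is not a valid slice offset.

```cpp
#include <cassert>

// Assumed sentinel value; any value that cannot be a real offset would do.
constexpr int kNotAStringSlice = -1;

// The range of the backing string that the regexp should operate on.
struct Window {
  int start;
  int length;
};

// slice_offset plays the role of edi: kNotAStringSlice for a non-slice
// subject, otherwise the slice's start offset in the parent.
// slice_length is only read when the subject is a slice.
inline Window SubjectWindow(int parent_length, int slice_offset,
                            int slice_length) {
  if (slice_offset == kNotAStringSlice) {
    return Window{0, parent_length};  // whole string
  }
  return Window{slice_offset, slice_length};  // slice: use its own length
}
```

With 0 as the sentinel, a slice covering the first 4 characters of a
10-character parent would be indistinguishable from the parent itself and
would be matched against all 10 characters.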
http://codereview.chromium.org/7477045/diff/39001/src/ia32/code-stubs-ia32.cc#newcode5665
src/ia32/code-stubs-ia32.cc:5665: if (FLAG_string_slices) {
On 2011/08/17 19:20:23, Vitaly Repeshko wrote:
Whooops.
This is intended. I'm currently working on generated code for creating
substrings. Until then, whenever string slices are used, we fall back to
the runtime system to create substrings. This method doesn't create
slices yet.
http://codereview.chromium.org/7477045/diff/39001/src/jsregexp.cc
File src/jsregexp.cc (right):
http://codereview.chromium.org/7477045/diff/39001/src/jsregexp.cc#newcode255
src/jsregexp.cc:255: subject->ToAsciiVector(),
On 2011/08/17 19:20:23, Vitaly Repeshko wrote:
To{Ascii,UC16}Vector won't work in case an indirect string encoding
disagrees with its underlying string encoding.
I solved this by changing the assertion in both ToAsciiVector and
ToUC16Vector to consider indirect strings.
http://codereview.chromium.org/7477045/
--
v8-dev mailing list
[email protected]
http://groups.google.com/group/v8-dev