Awesome, thanks for the pointer. I'll open up an issue for this.
One thought: it might be nice to have something a little more automatic.
Specifically, I was thinking that if there were an 'ImmutableString' type,
you would always know that you should make a view rather than a copy. It
would also be in keeping with the general design where specifying a type
allows the compiler to do nice optimizations.
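
For reference, here is a minimal sketch of how the benchmark from the
original message might look with SubString in place of the custom view
type. The helper names count_copies and count_views are made up for
illustration, and the string is assumed to be ASCII-only so that every
integer index is valid. No timings are claimed here; it only shows the
shape of the comparison.

```julia
# Sketch only: compare copying slices with SubString views.
# Assumes an ASCII-only string, so every integer index is a valid index.

# Hypothetical helper: slices with s[i:j], which copies the characters.
function count_copies(s, width)
    total = 0
    for i = 1:(length(s) - width)
        piece = s[i:(i + width)]            # new string on every iteration
        total += length(piece)
    end
    return total
end

# Hypothetical helper: same loop, but SubString only references s.
function count_views(s, width)
    total = 0
    for i = 1:(length(s) - width)
        piece = SubString(s, i, i + width)  # keeps a reference plus a range
        total += length(piece)
    end
    return total
end

s = "abcdefghij"^1000   # deterministic ASCII test string

@time count_copies(s, 10)
@time count_views(s, 10)
```

Both helpers sum the piece lengths so the loop bodies do comparable work
and the results can be checked against each other.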
On Wednesday, June 11, 2014 9:14:27 AM UTC-4, Kevin Squire wrote:
>
> Hi Abe,
>
> Looks like you just reimplemented SubString.
>
> julia> x = "Hi there!"
> "Hi there!"
>
> julia> SubString(x, 2, 4)
> "i t"
>
> julia> typeof(ans)
> SubString{ASCIIString} (constructor with 1 method)
>
>
> Which is totally understandable, as there seems to be almost zero
> documentation about them. Would you mind opening an issue about that?
>
> Cheers,
> Kevin
>
>
>
> On Wed, Jun 11, 2014 at 5:08 AM, Abe Schneider wrote:
>
>> I'm in the middle of profiling my code, and I noticed that I'm paying a
>> penalty for doing a lot of sub-string copies. For what I'm doing, I don't
>> actually need copies of the string, but rather just want to keep a pointer
>> to the string with the range the view occupies. I thought I'd write up a
>> quick test to see if I could speed things up (please ignore the horrible
>> names):
>>
>> import Base: getindex, endof, substring
>>
>> immutable StringView
>>     value::String
>>     first::Int64
>>     last::Int64
>> end
>>
>> immutable FastString <: String
>>     value::String
>> end
>>
>> getindex(s::FastString, r::UnitRange{Int64}) = StringView(s.value, r.start, r.stop)
>> endof(s::FastString) = endof(s.value)
>>
>>
>> const size = 10000000
>>
>> function teststring()
>>     s = randstring(size)
>>     for i=1:size-10
>>         value = s[i:(i+10)]
>>     end
>> end
>>
>> function teststringview()
>>     local s::FastString = FastString(randstring(size))
>>     for i=1:size-10
>>         value = s[i:(i+10)]
>>     end
>> end
>>
>> teststring()
>> @time teststring()
>>
>> teststringview()
>> @time teststringview()
>>
>>
>> The results I get are:
>> elapsed time: 0.910467582 seconds (890006256 bytes allocated)
>> elapsed time: 0.392893967 seconds (409999912 bytes allocated)
>>
>> The speed-up isn't incredible, but if you're doing a lot of text
>> processing, it might help.
>>
>> I'd be curious if anyone had thoughts or better ways of doing this. Or
>> for that matter, reasons why this may be a bad idea.
>>
>>
>