I'm in the middle of profiling my code, and I noticed that I'm paying a 
penalty for doing a lot of substring copies. For what I'm doing, I don't 
actually need copies of the string; I just want to keep a reference to the 
original string along with the range the view covers. I thought I'd write 
up a quick test to see if I could speed things up (please ignore the 
horrible names):

import Base: getindex, endof

immutable StringView
  value::String
  first::Int64
  last::Int64
end

immutable FastString <: String
  value::String
end

getindex(s::FastString, r::UnitRange{Int64}) = StringView(s.value, r.start, r.stop)
endof(s::FastString) = endof(s.value)


const size = 10000000

function teststring()
  s = randstring(size)
  for i=1:size-10
    value = s[i:(i+10)]
  end
end

function teststringview()
  local s::FastString = FastString(randstring(size))
  for i=1:size-10
    value = s[i:(i+10)]
  end
end

teststring()
@time teststring()

teststringview()
@time teststringview()


The results I get are (teststring first, then teststringview):
elapsed time: 0.910467582 seconds (890006256 bytes allocated)
elapsed time: 0.392893967 seconds (409999912 bytes allocated)

The speed-up isn't incredible (about 2.3x, with less than half the 
allocation), but if you're doing a lot of text processing, it might help.

I'd be curious if anyone had thoughts or better ways of doing this. Or for 
that matter, reasons why this may be a bad idea.
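
For comparison, Base also has a SubString type that wraps a parent string 
and a range without copying the characters. Below is a minimal sketch of 
the same loop using it. The function name testsubstring is made up for 
illustration, it assumes ASCII input so that every index is a valid 
character boundary, and it has not been timed against the numbers above:

```julia
# Sketch only: SubString is Base's built-in string view type.
s = "hello, world"
v = SubString(s, 1, 5)   # view of s[1:5]; wraps s instead of copying it

# Hypothetical counterpart to teststringview(), taking the string as an
# argument so the same input can be reused across runs.
function testsubstring(s)
  for i = 1:length(s)-10
    value = SubString(s, i, i+10)   # same 11-character window as s[i:(i+10)]
  end
end

testsubstring("abcdefghijklmnopqrstuvwxyz")
```

A SubString compares equal to an ordinary string with the same contents, 
so it can usually stand in wherever the copy was only being read.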
