On 11/26/2017 9:11 PM, Neia Neutuladh wrote:
The culprit for the C# version's poor performance was System.String.Substring,
which allocates a copy of its input data. So "Hello world".Substring(0, 5) creates
a new char* pointing at a new memory allocation containing "Hello\0". C++'s
std::string does the same thing. So if I reimplemented subtex naively in C++,
its performance would be closer to the C# version than to the D version.
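A minimal C++ sketch of the copy described above. The helper names `points_into` and `prefix_copy` are made up for illustration; only `std::string::substr` is standard. It shows that the result of `substr` lives in its own storage rather than pointing into the original string.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper: true if p points into the characters owned by s.
// std::less gives a total order over pointers, so comparing pointers
// from unrelated objects this way is well defined.
inline bool points_into(const std::string& s, const char* p) {
    std::less<const char*> lt;
    return !lt(p, s.data()) && lt(p, s.data() + s.size());
}

// Hypothetical helper: take the first n characters the way a naive port would.
inline std::string prefix_copy(const std::string& s, std::size_t n) {
    assert(n <= s.size());
    return s.substr(0, n);  // builds a new std::string and copies n characters
}
```

Calling `prefix_copy(text, 5)` on "Hello world" yields a separate string "Hello" whose `data()` does not point into `text`. Whether that copy also hits the heap depends on the implementation (short strings are often stored inline), but the characters are copied either way.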
I could probably get slightly better performance than the D version by writing a
special `stringslice` struct. But that's a lot of work, and it's currently just
barely fast enough that I realize that I've actually run the command instead of
my shell inexplicably swallowing it.
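As a rough idea of what such a `stringslice` could look like in C++, here is a sketch under the assumption that it is just a non-owning pointer plus a length. The functions `make_slice`, `sub` and `equals` are illustrative names, not taken from subtex. C++17's `std::string_view` provides the same idea in the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical slice type: does not own or copy the characters it refers to.
struct stringslice {
    const char* ptr;
    std::size_t len;
};

// Wrap a 0-terminated string; this is the only place the length is scanned.
inline stringslice make_slice(const char* cstr) {
    return {cstr, std::strlen(cstr)};
}

// O(1) sub-slice covering [start, start + count): no allocation, no copy.
inline stringslice sub(stringslice s, std::size_t start, std::size_t count) {
    assert(start <= s.len && count <= s.len - start);
    return {s.ptr + start, count};
}

// Compare contents; the stored lengths make a terminator unnecessary.
inline bool equals(stringslice a, stringslice b) {
    return a.len == b.len && std::memcmp(a.ptr, b.ptr, a.len) == 0;
}
```

With this, `sub(make_slice("Hello world"), 0, 5)` refers to the first five characters of the original buffer. The catch is lifetime: the slice is only valid while the underlying buffer is.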
0-terminated strings in C (and C++) have always been a severe performance issue
for programs that deal a lot in strings, for these reasons:
1. To get a substring, a copy must be made, which also means storage must be allocated
and managed for it.
2. To do most operations on it, you need to do a strlen() or equivalent.
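Both costs show up in even the simplest substring routine. The following is a sketch in C-style C++; `substring_copy` is a hypothetical name, not a library function. (A suffix is the one case that can skip the copy, since `text + n` shares the original terminator.)

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: copy text[start, start + count) into a new
// 0-terminated buffer. The caller owns the result and must std::free it.
inline char* substring_copy(const char* text, std::size_t start, std::size_t count) {
    std::size_t len = std::strlen(text);  // reason 2: O(n) scan just to learn the length
    assert(start <= len && count <= len - start);

    // reason 1: the substring needs its own storage so it can have its own terminator
    char* out = static_cast<char*>(std::malloc(count + 1));
    if (out == nullptr) return nullptr;

    std::memcpy(out, text + start, count);
    out[count] = '\0';
    return out;
}
```

Every call pays for a length scan, an allocation and a copy, and every caller has to remember the matching `std::free`.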
You can always write your own string package to deal with this, and I've written many
:-( and they all failed for one reason or another, mostly because just about
everything in the C/C++ ecosystem is built around 0-terminated strings.