29-Sep-2014 00:44, Uranuz writes:
It's Tolstoy actually:
http://en.wikipedia.org/wiki/War_and_Peace
You don't need byGrapheme for a simple DSL. In fact, as long as the DSL is
simple enough (ASCII only) you may safely avoid decoding. If it's in
Russian you might want to decode. Even in this case there are ways to
avoid decoding, though it may involve a bit of extra writing, as for a
typical short novel ;)
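For illustration, here is a minimal sketch of such a no-decoding scanner.
It is hypothetical (isIdentChar and lexIdent are made-up names) and assumes
the input really is ASCII-only:

import std.ascii : isAlphaNum;

bool isIdentChar(char c) { return isAlphaNum(c) || c == '_'; }

// Returns the leading identifier of the input, possibly empty.
string lexIdent(string input)
{
    size_t i = 0;
    while (i < input.length && isIdentChar(input[i]))
        ++i;                  // plain byte stepping, no UTF decoding
    return input[0 .. i];     // slicing at byte positions is safe for ASCII
}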
Yes, my mistake ;) I was thinking about *Crime and Punishment* but
wrote *War and Peace*. Don't know why. Maybe because it is longer.
Admittedly both are way too long for my taste :)
Thanks for the useful links. Since we are talking about the standard
library, I think some standard approach should be provided for common
tasks: searching, sorting, parsing, splitting strings. I see that
currently we have a lot of ways of doing similar things with strings. I
think this is partly a documentation problem.
Some of this is historical; in particular, std.string is way older than
std.algorithm.
When I parse
text I can't understand why I need to use all of these range interfaces
instead of just manipulating a raw narrow string. We have several
modules for working with strings: std.range, std.algorithm, std.string,
std.array,
std.range publicly imports std.array, thus I really do not see why we
still have std.array as a standalone module.
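A quick way to check (assuming Phobos keeps that public import in place):

import std.range; // note: no direct import of std.array

void main()
{
    auto a = iota(3).array;        // std.array.array, reachable via std.range
    auto app = appender!string();  // std.array.appender, likewise
    app.put("hi");
    assert(a == [0, 1, 2] && app.data == "hi");
}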
std.utf, and I can't see how they help me solve my
problems. On the contrary, they just create a new problem for me: I have
to think about all of them in order to find the *right* way.
There is no *right* way; every level of abstraction has its uses. Also
there is a bit of a trade-off between performance and easy/obvious/nice code.
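To make the trade-off concrete, a rough sketch of the two styles (the raw
version assumes ASCII-only data):

import std.uni : isWhite;

// Fast path: touches bytes only, correct as long as the data is ASCII.
size_t countSpacesRaw(string s)
{
    size_t n;
    foreach (char c; s)
        if (c == ' ') ++n;
    return n;
}

// Obvious path: foreach over dchar auto-decodes every code point.
size_t countWhiteDecoded(string s)
{
    size_t n;
    foreach (dchar c; s)
        if (isWhite(c)) ++n;
    return n;
}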
So I spend most of my time thinking
about all this instead of solving my task.
It takes time to get accustomed to a standard library. See also std.conv
and std.format. String processing is indeed shotgunned across the entire
Phobos.
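For instance, just to show where things live (not a recommendation of any
particular style):

import std.conv : to;
import std.format : format;

void main()
{
    int n = to!int("42");                                // parsing lives in std.conv
    string s = format("n = %d, pi = %.2f", n, 3.14159);  // formatting in std.format
    assert(n == 42 && s == "n = 42, pi = 3.14");
}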
It is hard for me to accept that we don't need to decode to do some
operations. What is annoying is that I always need to keep track of both
the code-point length that I should show to the user and the byte length
that is used to slice the char array. It's very easy to confuse the two
and do something wrong.
As long as you use decoding primitives you keep getting back proper
indices automatically. That must be what some folks considered the correct
way to do Unicode until it became apparent to everybody that Unicode is
way more than this.
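Roughly, what I mean by proper indices (a minimal sketch around
std.utf.decode):

import std.utf : decode;

void main()
{
    string s = "привет"; // 6 code points, 12 bytes in UTF-8
    size_t i = 0;
    dchar first = decode(s, i);   // reads 'п' and advances i past it
    assert(first == 'п' && i == 2);
    assert(s[i .. $] == "ривет"); // i is always a valid slice boundary
}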
I see that it's all complicated: we have 3 character types and more than
5 modules for trivial string manipulation, with tens of functions.
It all goes to hell.
There are many tools, but when I write parsers I actually use almost
none of them. Well, nowadays I'm going to use the stuff in std.uni like
CodePointSet, utfMatcher etc. std.regex makes some use of these already,
but prior to that std.utf.decode was my lone workhorse.
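As a rough sketch of that style (the symbol is spelled CodepointSet in
std.uni; the identifier set below is an arbitrary example, not a real
lexer rule):

import std.uni : CodepointSet, unicode;
import std.utf : decode;

void main()
{
    // Code points allowed in an identifier: anything alphabetic plus '_'.
    auto identChar = unicode.Alphabetic | CodepointSet('_', '_' + 1);
    string s = "въезд = 42";
    size_t i = 0;
    while (i < s.length)
    {
        size_t prev = i;
        if (!identChar[decode(s, i)]) { i = prev; break; }
    }
    assert(s[0 .. i] == "въезд"); // leading identifier-like run, as a byte slice
}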
But I haven't even started to do my job. And we
don't have a *standard* way to deal with it in the standard library. At
least this way is not documented well enough.
Well, on the bright side, consider that C has lots of broken functions
in its stdlib, and even some that are _never_ safe, like "gets" ;)
--
Dmitry Olshansky