Re: Possible quick win in GC?

2014-09-28 Thread Abdulhaq via Digitalmars-d

You mean this?
https://en.wikipedia.org/wiki/Escape_analysis


Of course my proposal uses the technique of escape analysis as 
part of its methodology, but the essence of the idea is to 
greatly cut down on the work the GC has to do on each sweep 
when dealing with objects that have been found to belong to a 
particular set. The objects in each set form an object graph 
with no incoming references from objects external to the set, 
and they can therefore be allocated in their own heap, which is 
destroyed when the root object goes out of scope.


The saving arises because the GC does not need to scan the 
default heap for pointers into the new heaps (bands). For 
certain types of programs, such as compilers / lexers / 
parsers, where many temporary objects are allocated and 
deallocated shortly afterwards, this can yield a substantial 
saving in execution time. In terms of memory usage we would see 
multiple potentially large but short-lived spikes.
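One such "band" could be sketched as a bump allocator whose entire 
block is released when the root object's scope ends, so the GC never 
scans or sweeps the objects inside it. This is only an illustrative 
sketch of the idea; the names Band and alloc are invented, not part 
of the actual proposal.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical sketch of one "band": all objects of the set live in
// one malloc'd block, and the whole set dies with the root's scope.
struct Band
{
    private ubyte* buf;
    private size_t used, cap;

    @disable this(this);    // one owner, one free()

    this(size_t capacity)
    {
        cap = capacity;
        buf = cast(ubyte*) malloc(cap);
    }

    // Bump-pointer allocation: no per-object GC bookkeeping.
    void* alloc(size_t n)
    {
        if (used + n > cap)
            return null;    // a real band would grow instead
        auto p = buf + used;
        used += n;
        return p;
    }

    // Destroying the band frees the entire set at once: no sweep.
    ~this() { free(buf); }
}
```

A parser, say, could allocate its temporary nodes from such a band and 
drop them all in one step when parsing finishes.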




Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread via Digitalmars-d

On Monday, 29 September 2014 at 04:57:45 UTC, Walter Bright wrote:

Lots of bad practices are commonplace.


This is not an argument, it is a postulate.

You are arguing as if it is impossible to know whether the 
logic error is local to the handler, or not, with a reasonable 
probability.


You're claiming to know that a program in an unknown and 
unanticipated state is really in a known state. It isn't.


It does not have to be known; it is sufficient that it is 
isolated, that it is unlikely to be global, or that it has low 
impact on long-term integrity.


Are you really suggesting that asserts should be replaced by 
thrown exceptions? I suspect we have little common ground here.


No, regular asserts should not be caught except for mailing the 
error log to the developer. They are for testing only.


Pre/postconditions between subsystems are on a different level 
though. They should not be conflated with regular asserts.
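In D, such subsystem-boundary pre/postconditions can be written as 
contracts, kept separate from the plain assert()s inside the 
implementation. A minimal sketch; the function itself is made up for 
illustration (and, like asserts, contracts are compiled out with 
-release):

```d
// A boundary precondition expressed as an in-contract, rather than a
// plain assert() buried in the implementation body.
int percentOf(int part, int whole)
in
{
    assert(whole != 0, "caller broke the subsystem precondition");
}
out (result)
{
    // sanity check on the subsystem's own output
    assert(result <= 100 || part > whole);
}
do
{
    return part * 100 / whole;
}
```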


A vast assumption here that you know in advance what bugs 
you're going to have and what causes them.


I know in advance that a "division-by-zero" error is of limited 
scope with high probability or that an error in a strictly pure 
validator is of low impact with high probability. I also know 
that any sign of a flaw in a transaction engine is a critical 
error that warrants a shutdown.


We know in advance that all programs above low complexity will 
contain bugs, most of them innocent, and for many services they 
are not a good excuse for shutting down the entire service.


If you have memory safety, reasonable isolation and well-tested 
global data structures, it is most desirable to keep the system 
running if it is incapable of corrupting a critical database.



You're using exceptions as a program bug reporting mechanism.


Uncaught exceptions are bugs and should be logged as such. If a 
form validator throws an unexpected exception then it is a bug. 
It makes the validation questionable, but does not affect the 
rest of the system. It is a non-critical bug that needs attention.



Whoa camel, indeed!


By your line of reasoning no software should ever be shipped 
without a formal proof, because it will most certainly be buggy 
and contain unspecified, undetected state.


Keep in mind that a working program, in the real world, is a 
program that provides reasonable output for reasonable input. 
Total correctness is a pipe dream, it is not within reach for 
most real programs. Not even with formal proofs.


Re: Local functions infer attributes?

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 9:23 PM, Timon Gehr wrote:

On 09/29/2014 04:43 AM, Walter Bright wrote:


You're right that tuples in D cannot contain storage classes


void foo(ref int x){}
alias p=ParameterTypeTuple!foo;
pragma(msg, p); // (ref int)

(But this does not help.)


You're right, I had forgotten about that.


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 9:03 PM, Sean Kelly wrote:

On Monday, 29 September 2014 at 02:57:03 UTC, Walter Bright wrote:
Right.  But if the condition that caused the restart persists, the process can
end up in a cascading restart scenario.  Simply restarting on error isn't
necessarily enough.


When it isn't enough, use the "engage the backup" technique.



I don't believe that the way to get 6 sigma reliability is by ignoring errors
and hoping. Airplane software is most certainly not done that way.


I believe I was arguing the opposite.  More to the point, I think it's necessary
to expect undefined behavior to occur and to plan for it.  I think we're on the
same page here and just miscommunicating.


Assuming that the program bug couldn't have affected other threads is relying on 
hope. Bugs happen when the program went into an unknown and unanticipated state. 
You cannot know, until after you debug it, what other damage the fault caused, 
or what other damage caused the detected fault.




My point was that it's often more complicated than that.  There have been papers
written on self-repairing systems, for example, and ways to design systems that
are inherently durable when it comes to even internal errors.


I confess much skepticism about such things when it comes to software. I do know 
how reliable avionics software is done, and that stuff does work even in the 
face of all kinds of bugs, damage, and errors. I'll be betting my life on that 
tomorrow :-)


Would you bet your life on software that had random divide by 0 bugs in it that 
were just ignored in the hope that they weren't serious? Keep in mind that 
software is rather unique in that a single bit error in a billion bytes can 
render the software utterly demented.


Remember the Apollo 11 lunar landing, when the descent computer software started 
showing self-detected faults? Armstrong turned it off and landed manually. He 
wasn't going to bet his ass that the faults could be ignored. You and I 
wouldn't, either.




I think what I'm
trying to say is that simply aborting on error is too brittle in some cases,
because it only deals with one vector--memory corruption that is unlikely to
reoccur.  But I've watched always-on systems fall apart from some unexpected
ongoing situation, and simply restarting doesn't actually help.


In such a situation, ignoring the error seems hardly likely to do any better.


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Walter Bright via Digitalmars-d
On 9/28/2014 9:31 PM, "Ola Fosheim Grøstad" wrote:

Nothing wrong with it. Quite common and useful for a non-critical web service to
log the exception, then re-throw something like "internal error", catch the
internal error at the root, return the appropriate 5xx HTTP response, then
keep going.


Lots of bad practices are commonplace.



You are arguing as if it is impossible to know whether the logic error is local
to the handler, or not, with a reasonable probability.


You're claiming to know that a program in an unknown and unanticipated state is 
really in a known state. It isn't.




assert()s should also not be left in production code. They are not for catching
runtime errors, but for testing at the expense of performance.


Are you really suggesting that asserts should be replaced by thrown exceptions? 
I suspect we have little common ground here.




Uncaught exceptions should be re-thrown higher up in the call chain to a
different error level based on the possible impact on the system. Getting an
unexpected mismatch exception in a form-validator is not a big deal. Getting an
out-of-bounds error in main storage is a big deal. Whether it is a big deal can
only be decided at the higher level.


A vast assumption here that you know in advance what bugs you're going to have 
and what causes them.




It is no doubt useful to be able to obtain a stack trace so that you can log it
when an exception turns out to fall into the "big deal" category and therefore
should be re-thrown as a critical failure. The deciding factor should be
performance.


You're using exceptions as a program bug reporting mechanism. Whoa camel, 
indeed!


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread via Digitalmars-d

On Sunday, 28 September 2014 at 22:59:46 UTC, Walter Bright wrote:
If anyone is writing code that throws an Exception with 
"internal error", then they are MISUSING exceptions to throw on 
logic bugs. I've been arguing this all along.


Nothing wrong with it. Quite common and useful for a non-critical 
web service to log the exception, then re-throw something like 
"internal error", catch the internal error at the root, return 
the appropriate 5xx HTTP response, then keep going.


You are arguing as if it is impossible to know whether the logic 
error is local to the handler, or not, with a reasonable 
probability. "Division by zero" is usually not a big deal, but it 
is a logic error. No need to shut down the service.


I'm not suggesting that Exceptions are to be thrown on 
programmer screwups - I suggest the OPPOSITE.


It is impossible to verify what the source is. It might be a bug 
in a boolean expression leading to a throw when the system is ok.


assert()s should also not be left in production code. They are 
not for catching runtime errors, but for testing at the expense 
of performance.


Uncaught exceptions should be re-thrown higher up in the call 
chain to a different error level based on the possible impact on 
the system. Getting an unexpected mismatch exception in a 
form-validator is not a big deal. Getting an out-of-bounds error 
in main storage is a big deal. Whether it is a big deal can only 
be decided at the higher level.


It is no doubt useful to be able to obtain a stack trace so that 
you can log it when an exception turns out to fall into the "big 
deal" category and therefore should be re-thrown as a critical 
failure. The deciding factor should be performance.
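The escalation scheme being described could be sketched like this: 
the same unexpected Exception is logged and absorbed in a low-impact 
component, but re-thrown at a higher error level for a critical one. 
The class CriticalFailure and the delegate-based shape are invented 
here for illustration.

```d
import std.stdio : writeln;

// An Error subclass for exceptions promoted to "big deal" status.
class CriticalFailure : Error
{
    this(string msg, Throwable cause)
    {
        super(msg, cause);
    }
}

// Run one component step; decide at this higher level what an
// unexpected Exception from it means.
void runStep(void delegate() step, bool critical)
{
    try
    {
        step();
    }
    catch (Exception e)
    {
        writeln("logged for the developer: ", e.msg);
        if (critical)
            throw new CriticalFailure("internal error", e);
        // non-critical: result is questionable, service keeps going
    }
}
```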





Re: Local functions infer attributes?

2014-09-28 Thread Timon Gehr via Digitalmars-d

On 09/29/2014 04:43 AM, Walter Bright wrote:


You're right that tuples in D cannot contain storage classes


void foo(ref int x){}
alias p=ParameterTypeTuple!foo;
pragma(msg, p); // (ref int)

(But this does not help.)


A different compiler written in D

2014-09-28 Thread Andre Kostur via Digitalmars-d
I ran across mention of a JavaScript compiler written in D on the ACM 
TechNews...


https://github.com/maximecb/Higgs


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Timon Gehr via Digitalmars-d

On 09/29/2014 12:59 AM, Walter Bright wrote:

...


Unless, of course, you're suggesting that we put this around every
main() function:

void main() {
try {
...
} catch(Exception e) {
assert(0, "Unhandled exception: I screwed up");
}
}


I'm not suggesting that Exceptions are to be thrown on programmer
screwups - I suggest the OPPOSITE.



He does not suggest that Exceptions are to be thrown on programmer 
screw-ups, but rather that the thrown exception itself is the screw-up, 
with a possibly complex cause.


It is not:

if(screwedUp()) throw Exception("");


It is rather:

void foo(int x){
if(!test(x)) throw Exception(""); // this may be an expected code 
path for some callers

}

void bar(){
// ...
int y=screwUp();
foo(y); // yet it is unexpected here
}


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Timon Gehr via Digitalmars-d

On 09/29/2014 02:47 AM, Walter Bright wrote:

On 9/28/2014 4:18 PM, Joseph Rushton Wakeling via Digitalmars-d wrote:

I don't follow this point.  How can this approach work with programs
that are
built with the -release switch?


All -release does is not generate code for assert()s. ...


(Euphemism for undefined behaviour.)



Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Timon Gehr via Digitalmars-d

On 09/29/2014 06:06 AM, Timon Gehr wrote:

On 09/29/2014 02:47 AM, Walter Bright wrote:

On 9/28/2014 4:18 PM, Joseph Rushton Wakeling via Digitalmars-d wrote:

I don't follow this point.  How can this approach work with programs
that are
built with the -release switch?


All -release does is not generate code for assert()s. ...


(Euphemism for undefined behaviour.)



Also, -release additionally removes contracts, in particular invariant 
calls, and enables version(assert).


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Sean Kelly via Digitalmars-d

On Monday, 29 September 2014 at 02:57:03 UTC, Walter Bright wrote:


I've said that processes are different, because the scope of 
the effects is limited by the hardware.


If a system with threads that share memory cannot be restarted, 
there are serious problems with the design of it, because a 
crash and the necessary restart are going to happen sooner or 
later, probably sooner.


Right.  But if the condition that caused the restart persists, 
the process can end up in a cascading restart scenario.  Simply 
restarting on error isn't necessarily enough.



I don't believe that the way to get 6 sigma reliability is by 
ignoring errors and hoping. Airplane software is most certainly 
not done that way.


I believe I was arguing the opposite.  More to the point, I think 
it's necessary to expect undefined behavior to occur and to plan 
for it.  I think we're on the same page here and just 
miscommunicating.



I recall Toyota got into trouble with their computer-controlled 
cars because of their approach to handling inevitable bugs and 
errors. It was one process that controlled everything. When 
something unexpected went wrong, it kept right on operating, 
any unknown and unintended consequences be damned.


The way to get reliable systems is to design to accommodate 
errors, not pretend they didn't happen, or hope that nothing 
else got affected, etc. In critical software systems, that 
means shut down and restart the offending system, or engage the 
backup.


My point was that it's often more complicated than that.  There 
have been papers written on self-repairing systems, for example, 
and ways to design systems that are inherently durable when it 
comes to even internal errors.  I think what I'm trying to say is 
that simply aborting on error is too brittle in some cases, 
because it only deals with one vector--memory corruption that is 
unlikely to reoccur.  But I've watched always-on systems fall 
apart from some unexpected ongoing situation, and simply 
restarting doesn't actually help.


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 6:17 PM, Sean Kelly wrote:

On Sunday, 28 September 2014 at 22:00:24 UTC, Walter Bright wrote:


I can't get behind the notion of "reasonably certain". I certainly would not
use such techniques in any code that needs to be robust, and we should not be
using such cowboy techniques in Phobos nor officially advocate their use.


I think it's a fair stance not to advocate this approach.  But as it is I spend
a good portion of my time diagnosing bugs in production systems based entirely
on archived log data, and analyzing the potential impact on the system to
determine the importance of a hot fix.  The industry seems to be moving towards
lowering the barrier between engineering and production code (look at what
Netflix has done for example), and some of this comes from an isolation model
akin to the Erlang approach, but the typical case is still that hot fixing code
is incredibly expensive and so you don't want to do it if it isn't necessary.
For me, the correct approach may simply be to eschew assert() in favor of
enforce() in some cases.  But the direction I want to be headed is the one
you're encouraging.  I simply don't know if it's practical from a performance
perspective.  This is still developing territory.


You've clearly got a tough job to do, and I understand you're doing the best you 
can with it. I know I'm hardcore and uncompromising on this issue, but that's 
where I came from (the aviation industry).


I know what works (airplanes are incredibly safe) and what doesn't work 
(Toyota's approach was in the news not too long ago). Deepwater Horizon and 
Fukushima are also prime examples of not dealing properly with modest failures 
that cascaded into disaster.


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 6:39 PM, Sean Kelly wrote:

Well... suppose you design a system with redundancy such that an error in a
specific process isn't enough to bring down the system.  Say it's a quorum
method or whatever.  In the instance that a process goes crazy, I would argue
that the system is in an undefined state but a state that it's designed
specifically to handle, even if that state can't be explicitly defined at design
time.  Now if enough things go wrong at once the whole system will still fail,
but it's about building systems with the expectation that errors will occur.
They may even be logic errors--I think it's kind of irrelevant at that point.

Even in a network of communicating processes, one process getting into a bad
state can theoretically poison the entire system, and you're often not in a
position to simply shut down the whole thing and wait for a repairman.  And simply rebooting
the system if it's a bad sensor that's causing the problem just means a pause
before another failure cascade.  I think any modern program designed to run
continuously (increasingly the typical case) must be designed with some degree
of resiliency or self-healing in mind.  And that means planning for and limiting
the scope of undefined behavior.


I've said that processes are different, because the scope of the effects is 
limited by the hardware.


If a system with threads that share memory cannot be restarted, there are 
serious problems with the design of it, because a crash and the necessary 
restart are going to happen sooner or later, probably sooner.


I don't believe that the way to get 6 sigma reliability is by ignoring errors 
and hoping. Airplane software is most certainly not done that way.


I recall Toyota got into trouble with their computer-controlled cars because of 
their approach to handling inevitable bugs and errors. It was one process that 
controlled everything. When something unexpected went wrong, it kept right on 
operating, any unknown and unintended consequences be damned.


The way to get reliable systems is to design to accommodate errors, not pretend 
they didn't happen, or hope that nothing else got affected, etc. In critical 
software systems, that means shut down and restart the offending system, or 
engage the backup.


There's no other way that works.


Re: [Semi OT] Language for Game Development talk

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 1:25 PM, H. S. Teoh via Digitalmars-d wrote:

This is not directly related to this thread, but recently in a Phobos PR
we discovered the following case:

// This function gets inlined:
int func1(int a) {
if (someCondition) {
return value1;
} else {
return value2;
}
}

// But this one doesn't:
int func2(int a) {
if (someCondition) {
return value1;
}
return value2;
}

IIRC Kenji said something about the first case being convertible to an
expression, but the second can't be. It would be nice if inlining worked
for both cases, since semantically they are the same.


https://issues.dlang.org/show_bug.cgi?id=7625



Re: Local functions infer attributes?

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 6:31 PM, Manu via Digitalmars-d wrote:

   S!(int, S*)

That's different.

I feel like I have to somehow justify to you guys how meta code works
in D. I have meta code that is no less than 5 layers deep. It's
complex, but at the same time, somehow surprisingly elegant and simple
(this is the nature of D I guess).
If I now assume throughout my meta "pointer means ref", then when I
actually pass a pointer in, the meta can't know if it was meant to be
a ref or not. It results in complex explicit logic to handle at almost
every point due to a loss of information.

You can't call f() with the same syntax anymore (you need an '&')
which is a static if in the meta, you can't use the S* arg in the same
meta (needs a '*') which is another static if. Assignments are
changed, and unexpected indexing mechanics appear. When implementation
logic expects and understands the distinction between pointers and
ref's, this confuses that logic. When I interface between languages
(everything I ever do binds to at least C++, and in this case, also
Lua), this complicates the situation.

I can't conflate 2 things that aren't the same. It leads to a lot of
mess in a lot of places.


You're right that tuples in D cannot contain storage classes (and ref is just 
one storage class, there's also out and in, etc.).


You can use auto ref, but I haven't understood why that doesn't work for you.
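For reference, this is the kind of thing auto ref does in a template: 
ref-ness is inferred per instantiation, so the same meta code accepts 
both lvalues and rvalues without a pointer/ref split. A minimal sketch, 
not Manu's actual code:

```d
// auto ref infers ref-ness per instantiation: lvalues bind by ref,
// rvalues by value, and __traits(isRef, ...) lets meta code tell
// the two cases apart without a separate pointer-based path.
void bump(T)(auto ref T x)
{
    static if (__traits(isRef, x))
        x += 1;     // lvalue instantiation: the caller sees the change
    // rvalue instantiation: x is a private copy
}
```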



Re: Creeping Bloat in Phobos

2014-09-28 Thread Martin Nowak via Digitalmars-d

On 09/28/2014 01:02 AM, Walter Bright wrote:


It's the autodecode'ing front(), which is a fairly complex function.


At least for dmd it's caused by a long-standing compiler bug.
https://issues.dlang.org/show_bug.cgi?id=7625

https://github.com/D-Programming-Language/phobos/pull/2566


tired of not having exit(code), ER proposed

2014-09-28 Thread ketmar via Digitalmars-d
Hello.

i've grown really tired of writing try/catch boilerplate in main() just
to be able to exit from some inner function, set the exit code and
suppress the stack trace. so i wrote this small ER:
https://issues.dlang.org/show_bug.cgi?id=13554

it adds ExitError to core.exception. when the runtime catches it, it
will set the error code and exit silently instead of printing an error
message and a stack trace. so now you can avoid wrapping your main in
boilerplate try/catch and easily exit from anywhere just by throwing
ExitError.

sure, you can still catch ExitError in your main(), do some cleanup and
rethrow it if you want to.

it's simple, it's easy, it's handy.

happy hacking.
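For comparison, the boilerplate this ER is aimed at looks something 
like the following today, with a user-defined wrapper exception. 
ExitException and runProtected are invented names for illustration, 
not the proposed core.exception.ExitError.

```d
// The try/catch boilerplate the ER would make unnecessary: a wrapper
// exception carrying an exit code, caught once near main().
class ExitException : Exception
{
    int code;
    this(int code, string msg = "")
    {
        super(msg);
        this.code = code;
    }
}

// main() delegates here so the catch lives in exactly one place.
int runProtected(void delegate() app)
{
    try
    {
        app();
    }
    catch (ExitException e)
    {
        return e.code;  // silent exit: no message, no stack trace
    }
    return 0;
}
```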




Re: Creeping Bloat in Phobos

2014-09-28 Thread Martin Nowak via Digitalmars-d

On 09/28/2014 02:23 AM, Andrei Alexandrescu wrote:

front() should follow a simple pattern that's been very successful in
HHVM: small inline function that covers most cases with "if (c < 0x80)"
followed by an out-of-line function on the multicharacter case. That
approach would make the cost of auto-decoding negligible in the
overwhelming majority of cases.

Andrei


Well, we've been using the same trick for 3 years now :).
https://github.com/D-Programming-Language/phobos/pull/299
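The shape of that trick, sketched for a D string front(): a tiny, 
inlinable ASCII fast path with the rare multi-byte case pushed out of 
line. frontFast and decodeSlow are illustrative names, not the actual 
Phobos internals.

```d
import std.utf : decode;

// HHVM-style split: keep the hot path small enough to inline, and
// move the multi-byte decode into a separate out-of-line function.
dchar frontFast(string s)
{
    immutable c = s[0];
    if (c < 0x80)          // single code unit: the common case
        return c;
    return decodeSlow(s);  // out-of-line UTF-8 decode
}

dchar decodeSlow(string s)
{
    size_t i = 0;
    return decode(s, i);
}
```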


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread ketmar via Digitalmars-d
On Mon, 29 Sep 2014 01:18:02 +0200
Joseph Rushton Wakeling via Digitalmars-d 
wrote:

> > Then use assert(). That's just what it's for.
> 
> I don't follow this point.  How can this approach work with programs
> that are built with the -release switch?

don't use the "-release" switch. the whole concept of a "release
version" is broken by design. ship what you debugged, not what you
think you debugged.




Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Sean Kelly via Digitalmars-d

On Monday, 29 September 2014 at 00:09:59 UTC, Walter Bright wrote:
On 9/28/2014 3:51 PM, Joseph Rushton Wakeling via Digitalmars-d 
wrote:
However, it's clearly very desirable in this use-case for the 
application to keep going if at all possible and for any 
problem, even an Error, to be contained in its local context if 
we can do so.  (By "local context", in practice this probably 
means a thread or fiber or some other similar programming 
construct.)


If the program has entered an unknown state, its behavior from 
then on cannot be predictable. There's nothing I or D can do 
about that. D cannot officially endorse such a practice, though 
D being a systems programming language it will let you do what 
you want.


I would not even consider such a practice for a program that is 
in charge of anything that could result in injury, death, 
property damage, security breaches, etc.


Well... suppose you design a system with redundancy such that an 
error in a specific process isn't enough to bring down the 
system.  Say it's a quorum method or whatever.  In the instance 
that a process goes crazy, I would argue that the system is in an 
undefined state but a state that it's designed specifically to 
handle, even if that state can't be explicitly defined at design 
time.  Now if enough things go wrong at once the whole system 
will still fail, but it's about building systems with the 
expectation that errors will occur.  They may even be logic 
errors--I think it's kind of irrelevant at that point.


Even a network of communicating processes, one getting in a bad 
state can theoretically poison the entire system and you're often 
not in a position to simply shut down the whole thing and wait 
for a repairman.  And simply rebooting the system if it's a bad 
sensor that's causing the problem just means a pause before 
another failure cascade.  I think any modern program designed to 
run continuously (increasingly the typical case) must be designed 
with some degree of resiliency or self-healing in mind.  And that 
means planning for and limiting the scope of undefined behavior.


Re: Creeping Bloat in Phobos

2014-09-28 Thread ketmar via Digitalmars-d
On Sun, 28 Sep 2014 19:44:39 +
Uranuz via Digitalmars-d  wrote:

> I speaking language which graphemes coded by 2 bytes
UCS-4? KOI8? my locale is KOI8, and i HATE D for assuming that everyone
on the planet is using UTF-8 and happy with it. from my POV, almost all
string decoding is broken. a string i got from the filesystem? good god,
let it not contain anything outside the ASCII range! a string i got from
a text file? the same. a string i must write to a text file or stdout?
oh, c'mon, what do you mean telling me "п©я─п╦п╡п╣я┌"?! i can't read
that!




Re: Local functions infer attributes?

2014-09-28 Thread Manu via Digitalmars-d
On 29 September 2014 10:51, Walter Bright via Digitalmars-d
 wrote:
> On 9/28/2014 5:38 PM, Manu via Digitalmars-d wrote:
>>
>> That said, my friend encountered one of my frequently recurring pain
>> cases himself yesterday:
>> struct S(T...)
>> {
>>void f(T args) {}
>> }
>>
>> S!(int, ref S) fail; // <-- no clean way to do this. I need this very
>> frequently, and he reached for it too, so I can't be that weird.
>
>
>   S!(int, S*)

That's different.

I feel like I have to somehow justify to you guys how meta code works
in D. I have meta code that is no less than 5 layers deep. It's
complex, but at the same time, somehow surprisingly elegant and simple
(this is the nature of D I guess).
If I now assume throughout my meta "pointer means ref", then when I
actually pass a pointer in, the meta can't know if it was meant to be
a ref or not. It results in complex explicit logic to handle at almost
every point due to a loss of information.

You can't call f() with the same syntax anymore (you need an '&')
which is a static if in the meta, you can't use the S* arg in the same
meta (needs a '*') which is another static if. Assignments are
changed, and unexpected indexing mechanics appear. When implementation
logic expects and understands the distinction between pointers and
ref's, this confuses that logic. When I interface between languages
(everything I ever do binds to at least C++, and in this case, also
Lua), this complicates the situation.

I can't conflate 2 things that aren't the same. It leads to a lot of
mess in a lot of places.


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Sean Kelly via Digitalmars-d

On Sunday, 28 September 2014 at 22:00:24 UTC, Walter Bright wrote:


I can't get behind the notion of "reasonably certain". I 
certainly would not use such techniques in any code that needs 
to be robust, and we should not be using such cowboy techniques 
in Phobos nor officially advocate their use.


I think it's a fair stance not to advocate this approach.  But as 
it is I spend a good portion of my time diagnosing bugs in 
production systems based entirely on archived log data, and 
analyzing the potential impact on the system to determine the 
importance of a hot fix.  The industry seems to be moving towards 
lowering the barrier between engineering and production code 
(look at what Netflix has done for example), and some of this 
comes from an isolation model akin to the Erlang approach, but 
the typical case is still that hot fixing code is incredibly 
expensive and so you don't want to do it if it isn't necessary.  
For me, the correct approach may simply be to eschew assert() in 
favor of enforce() in some cases.  But the direction I want to be 
headed is the one you're encouraging.  I simply don't know if 
it's practical from a performance perspective.  This is still 
developing territory.


Re: Local functions infer attributes?

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 5:38 PM, Manu via Digitalmars-d wrote:

That said, my friend encountered one of my frequently recurring pain
cases himself yesterday:
struct S(T...)
{
   void f(T args) {}
}

S!(int, ref S) fail; // <-- no clean way to do this. I need this very
frequently, and he reached for it too, so I can't be that weird.


  S!(int, S*)



Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 4:18 PM, Joseph Rushton Wakeling via Digitalmars-d wrote:

I don't follow this point.  How can this approach work with programs that are
built with the -release switch?


All -release does is not generate code for assert()s. To leave the asserts in, 
do not use -release. If you still want the asserts to be in even with -release,


if (condition) assert(0);
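That guarded form survives because assert(0) is special-cased: under
-release it is not removed but lowered to a halt instruction, unlike an
ordinary assert(condition). A small sketch (the function is made up for
illustration):

```d
// assert(0) is kept even under -release (as a halt), so this check
// stays active in release builds.
void requirePositive(int x)
{
    if (x <= 0)
        assert(0, "x must be positive");
}
```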



Moreover, Sean's points here are absolutely on the money -- there are cases
where the "users" of a program may indeed want to see traces even for
anticipated errors.
And even if you design a nice structure of throwing and
catching exceptions so that the simple error message _always_ gives good enough
context to understand what went wrong, you still have the other issue that Sean
raised -- of an exception accidentally escaping its intended scope, because you
forgot to handle it -- when a trace may be extremely useful.

Put it another way -- I think you make a good case that stack traces for
exceptions should be turned off by default (possibly just in -release mode?),
but if that happens I think there's also a good case for a build flag that
ensures stack traces _are_ shown for Exceptions as well as Errors.


The -g switch should take care of that. It's what I use when I need a stack 
trace, as there are many ways a program can fail (not just Errors).




Re: Local functions infer attributes?

2014-09-28 Thread Manu via Digitalmars-d
On 28 September 2014 22:21, Andrei Alexandrescu via Digitalmars-d
 wrote:
> On 9/27/14, 7:42 PM, Manu via Digitalmars-d wrote:
>>
>> void f() pure nothrow @nogc
>> {
>> void localFunc()
>> {
>> }
>>
>> localFunc();
>> }
>>
>> Complains because localFunc is not @nogc or nothrow.
>> Doesn't complain about pure though.
>>
>> Is it reasonable to say that the scope of the outer function is
>> nothrow+@nogc, and therefore everything declared within should also be
>> so?
>
>
> Interesting. I'd guess probably not, e.g. a function may define a static
> local function and return its address (without either throwing or creating
> garbage), whereas that local function itself may do whatever it pleases.
>
> However, local functions have their body available by definition so they
> should have all deducible attributes deduced. That should take care of the
> problem.
>
>
> Andrei
>
> P.S. I also notice that my latest attempt at establishing communication has
> remained ignored.

I was out of town (was on my phone), and now I'm home with 2 guests,
and we're working together. I can't sit and craft a pile of example
cases until I'm alone and have time to do so. I haven't ignored it,
but I need to find the time to give you what you want.

That said, my friend encountered one of my frequently recurring pain
cases himself yesterday:
struct S(T...)
{
  void f(T args) {}
}

S!(int, ref S) fail; // <-- no clean way to do this. I need this very
frequently, and he reached for it too, so I can't be that weird.
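For context, `ref` is not a type in D, so it cannot appear in a template argument list like `S!(int, ref S)`. One hypothetical workaround (not from the post, just a sketch) is to bind by `auto ref` on a templated member instead:

```d
// Sketch: since ref cannot be a template argument, accept the
// parameters with auto ref on a templated member function, so that
// lvalues bind by reference and rvalues by value.
struct S(T...)
{
    void f()(auto ref T args) {}
}

void main()
{
    S!(int) s;
    int x = 1;
    s.f(x);  // x binds by reference
    s.f(42); // rvalue binds by value
}
```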


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 3:51 PM, Joseph Rushton Wakeling via Digitalmars-d wrote:

However, it's clearly very desirable in this use-case for the application to
keep going if at all possible and for any problem, even an Error, to be
contained in its local context if we can do so.  (By "local context", in
practice this probably means a thread or fiber or some other similar programming
construct.)


If the program has entered an unknown state, its behavior from then on cannot be 
predictable. There's nothing I or D can do about that. D cannot officially 
endorse such a practice, though D being a systems programming language it will 
let you do what you want.


I would not even consider such a practice for a program that is in charge of 
anything that could result in injury, death, property damage, security breaches, 
etc.




Re: Creeping Bloat in Phobos

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 2:00 PM, Dmitry Olshansky wrote:

I've already stated my perception of the "no stinking exceptions", and "no
destructors 'cause i want it fast" elsewhere.

Code must be correct and fast, with correct being a precondition for any
performance tuning and speed hacks.


Sure. I'm not arguing for preferring incorrect code.



Correct usually entails exceptions and automatic cleanup. I also do not believe
the "exceptions have to be slow" motto, they are costly but proportion of such
costs was largely exaggerated.


I think it was you that suggested that instead of throwing on invalid UTF, that 
the replacement character be used instead? Or maybe not, I'm not quite sure.


Regardless, the replacement character method is widely used and accepted 
practice. There's no reason to throw.




Re: Creeping Bloat in Phobos

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 1:33 PM, Andrei Alexandrescu wrote:

On 9/28/14, 11:36 AM, Walter Bright wrote:

Currently, the autodecoding functions allocate with the GC and throw as
well. (They'll GC allocate an exception and throw it if they encounter
an invalid UTF sequence. The adapters use the more common method of
inserting a substitution character and continuing on.) This makes it
harder to make GC-free Phobos code.


The right solution here is refcounted exception plus policy-based functions in
conjunction with RCString. I can't believe this focus has already been lost and
we're back to let's remove autodecoding and ban exceptions. -- Andrei


Or setExt() can simply insert .byCodeUnit as I suggested in the PR, and it's 
done and working correctly and doesn't throw and doesn't allocate and goes fast.


Not everything in Phobos can be dealt with so easily, of course, but there's 
quite a bit of low hanging fruit of this nature we can just take care of now.


Re: Creeping Bloat in Phobos

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 1:39 PM, H. S. Teoh via Digitalmars-d wrote:

It can work just fine, and I wrote it. The problem is convincing
someone to pull it :-( as the PR was closed and reopened with
autodecoding put back in.


The problem with pulling such PRs is that they introduce a dichotomy
into Phobos. Some functions autodecode, some don't, and from a user's
POV, it's completely arbitrary and random. Which leads to bugs because
people can't possibly remember exactly which functions autodecode and
which don't.


That's ALREADY the case, as I explained to bearophile.

The solution is not to have the ranges autodecode, but to have the ALGORITHMS 
decide to autodecode (if they need it) or not (if they don't).




As I've explained many times, very few string algorithms actually need
decoding at all. 'find', for example, does not. Trying to make a
separate universe out of autodecoding algorithms is missing the point.

[...]

Maybe what we need to do, is to change the implementation of
std.algorithm so that it internally uses byCodeUnit for narrow strings
where appropriate. We're already specialcasing Phobos code for narrow
strings anyway, so it wouldn't make things worse by making those special
cases not autodecode.


Those special cases wind up going everywhere and impacting everyone who attempts 
to write generic algorithms.




This doesn't quite solve the issue of composing ranges, since a
range whose .front returns dchar, composed with another range, will
have autodecoding built into it. For those cases, perhaps one way to
hack around the present situation is to use Phobos-private enums in the
wrapper ranges (e.g., enum isNarrowStringUnderneath=true; in struct
Filter or something), that ranges downstream can test for, and do the
appropriate bypasses.


More complexity :-( for what should be simple tasks.



(BTW, before you pick on specific algorithms you might want to actually
look at the code for things like find(), because I remember there were a
couple o' PRs where find() of narrow strings will use (presumably) fast
functions like strstr or strchr, bypassing a foreach loop over an
autodecoding .front.)


Oh, I know that many algorithms have such specializations. Doesn't it strike you 
as sucky to have to special case a whole basket of algorithms when the 
InputRange does not behave in a reliable manner?


It's very simple for an algorithm to decode if it needs to: it just adds a 
.byDchar adapter to its input range. Done. No special casing needed. The lines 
of code written drop in half. And it works with arrays of chars, arrays of 
dchars, and input ranges of either.
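Walter's point can be illustrated with a small sketch (assuming Phobos's `std.utf.byCodeUnit`, mentioned elsewhere in the thread):

```d
// Sketch: find() on a string auto-decodes (it iterates dchars); with
// byCodeUnit the same algorithm runs over raw char code units, with
// no decoding and no possible UTF exception or GC allocation.
import std.algorithm.searching : find;
import std.utf : byCodeUnit;

void main()
{
    string s = "héllo";
    auto decoded = s.find('l');            // iterates dchars (auto-decoding)
    auto raw     = s.byCodeUnit.find('l'); // iterates chars, no decoding
    assert(!raw.empty);
}
```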


---

The stalling of setExt() has basically halted my attempts to adjust Phobos so 
that one can write nothrow and @nogc algorithms that work on strings.


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Joseph Rushton Wakeling via Digitalmars-d

On 28/09/14 19:33, Walter Bright via Digitalmars-d wrote:

On 9/28/2014 9:23 AM, Sean Kelly wrote:

Also, I think the idea that a program is created and shipped to an end user is
overly simplistic.  In the server/cloud programming world, when an error occurs,
the client who submitted the request will get a response appropriate for them
and the system will also generate log information intended for people working on
the system.  So things like stack traces and assertion failure information is
useful even for production software.  Same with any critical system, as I'm sure
you're aware.  The systems are designed to handle failures in specific ways, but
they also have to leave a breadcrumb trail so the underlying problem can be
diagnosed and fixed.  Internal testing is never perfect, and achieving a high
coverage percentage is nearly impossible if the system wasn't designed from the
ground up to be testable in such a way (mock frameworks and such).


Then use assert(). That's just what it's for.


I don't follow this point.  How can this approach work with programs that are 
built with the -release switch?


Moreover, Sean's points here are absolutely on the money -- there are cases 
where the "users" of a program may indeed want to see traces even for 
anticipated errors.  And even if you design a nice structure of throwing and 
catching exceptions so that the simple error message _always_ gives good enough 
context to understand what went wrong, you still have the other issue that Sean 
raised -- of an exception accidentally escaping its intended scope, because you 
forgot to handle it -- when a trace may be extremely useful.


Put it another way -- I think you make a good case that stack traces for 
exceptions should be turned off by default (possibly just in -release mode?), 
but if that happens I think there's also a good case for a build flag that 
ensures stack traces _are_ shown for Exceptions as well as Errors.


Re: Creeping Bloat in Phobos

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 1:38 PM, bearophile wrote:

Walter Bright:


It can work just fine, and I wrote it. The problem is convincing someone to
pull it :-( as the PR was closed and reopened with autodecoding put back in.


Perhaps you need a range2 and algorithm2 modules. Introducing your changes in a
sneaky way may not produce well working and predictable user code.


I'm not suggesting sneaky ways. setExt() was a NEW function.



I know that you care about performance - you post about it often. I would
expect that unnecessary and pervasive decoding would be of concern to you.


I care first of all about program correctness (that's why I proposed unusual
things like optional strong typing for built-in array indexes, or the
"enum preconditions").


Ok, but you implied at one point that you were not aware of which parts of your 
string code decoded and which did not. That's not consistent with being very 
careful about correctness.


Note that autodecode does not always happen - it doesn't happen for ranges of 
chars. It's very hard to look at a piece of code and tell if autodecode is going 
to happen or not.



Secondly, I care about performance in the functions or parts of code where
performance is needed. There is plenty of code where performance is not the
most important thing. That's why I have tons of range-based code. In such
large parts of code, having short, nice-looking code that reads as correct
is more important. Please don't assume I am simple minded :-)


It's very hard to disable the autodecode when it is not needed, though the new 
.byCodeUnit has made that much easier.




Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 1:56 PM, H. S. Teoh via Digitalmars-d wrote:

It looks even more awful when the person who wrote the library code is
Russian, and the user speaks English, and when an uncaught exception
terminates the program, you get a completely incomprehensible message in
a language you don't know. Not much different from a line number and
filename that has no meaning for a user.


I cannot buy into the logic that since Russian error messages are 
incomprehensible to me, that therefore incomprehensible messages are ok.



That's why I said, an uncaught exception is a BUG.


It's a valid opinion, but is not the way D is designed to work.



The only place where
user-readable messages can be output is in a catch block where you
actually have the chance to localize the error string. But if no catch
block catches it, then by definition it's a bug, and you might as well
print some useful info with it that your users can send back to you,
rather than unhelpful bug reports of the form "the program crashed with
error message 'internal error'".


If anyone is writing code that throws an Exception with "internal error", then 
they are MISUSING exceptions to throw on logic bugs. I've been arguing this all 
along.




if the program failed to catch an exception, you're already screwed
anyway


This is simply not true. One can write utilities with no caught exceptions at 
all, and yet have the program emit user friendly messages about "disk full" and 
stuff like that.




so why not provide more info rather than less?


Because having an internal stack dump presented to the app user for when he, 
say, puts in invalid command line arguments, is quite inappropriate.




Unless, of course, you're suggesting that we put this around every
main() function:

void main() {
try {
...
} catch(Exception e) {
assert(0, "Unhandled exception: I screwed up");
}
}


I'm not suggesting that Exceptions are to be thrown on programmer screwups - I 
suggest the OPPOSITE.
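The distinction Walter keeps drawing maps onto two different D tools; a minimal sketch of that convention (the function names here are illustrative, not from the thread):

```d
// Sketch of the convention argued for here: enforce() (throws an
// Exception) for environmental/input errors, assert() (an Error /
// abort) for programmer screwups.
import std.conv : to;
import std.exception : enforce;

int parsePort(string arg)
{
    auto port = arg.to!int;               // bad input -> ConvException
    enforce(port > 0 && port < 65536,
            "port out of range: " ~ arg); // environmental -> Exception
    return port;
}

void usePort(int port)
{
    // By the time we get here, a bad port is a logic bug -> assert.
    assert(port > 0 && port < 65536);
}

void main()
{
    usePort(parsePort("8080"));
}
```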






Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Joseph Rushton Wakeling via Digitalmars-d

On 28/09/14 22:13, Walter Bright via Digitalmars-d wrote:

On 9/28/2014 12:33 PM, Sean Kelly wrote:

Then use assert(). That's just what it's for.

What if I don't want to be forced to abort the program in the event of such an
error?


Then we are back to the discussion about can a program continue after a logic
error is uncovered, or not.

In any program, the programmer must decide if an error is a bug or not, before
shipping it. Trying to avoid making this decision leads to confusion and using
the wrong techniques to deal with it.

A program bug is, by definition, unknown and unanticipated. The idea that one
can "recover" from it is fundamentally wrong. Of course, in D one can try and
recover from them anyway, but you're on your own trying that, just as you're on
your own when casting integers to pointers.


Allowing for your "you can try ..." remarks, I still feel this doesn't really 
cover the practical realities of how some applications need to behave.


Put it this way: suppose we're writing the software for a telephone exchange, 
which is handling thousands of simultaneous calls.  If an Error is thrown inside 
the part of the code handling one single call, is it correct to bring down 
everyone else's call too?


I appreciate that you might tell me "You need to find a different means of error 
handling that can distinguish errors that are recoverable", but the bottom line 
is, in such a scenario it's not possible to completely rule out an Error being 
thrown (an obvious cause would be an assert that gets triggered because the 
programmer forgot to put a corresponding enforce() statement at a higher level 
in the code).


However, it's clearly very desirable in this use-case for the application to 
keep going if at all possible and for any problem, even an Error, to be 
contained in its local context if we can do so.  (By "local context", in 
practice this probably means a thread or fiber or some other similar programming 
construct.)


Sean's touched on this in the current thread with his reference to Erlang, and I 
remember that he and Dicebot brought the issue up in an earlier discussion on 
the Error vs. Exception question, but I don't recall that discussion having any 
firm conclusion, and I think it's important to address; we can't simply take "An 
Error is unrecoverable" as a point of principle for every application.


(Related note: If I recall right, an Error or uncaught Exception thrown within a 
thread or fiber will not actually bring the application down, only cause that 
thread/fiber to hang, without printing any indication of anything going wrong. 
So on a purely practical basis, it can be essential for the top-level code of a 
thread or fiber to have a catch {} block for both Errors and Exceptions, just in 
order to be able to report what has happened effectively.)
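The practical advice in that parenthetical can be sketched as follows (assuming `core.thread`; the exact default behavior of escaping Throwables may differ by runtime version):

```d
// Sketch: a top-level catch in a thread's entry point, so that an
// escaping Exception or Error is at least reported, rather than the
// thread ending silently as described above.
import core.thread : Thread;
import std.stdio : stderr;

void worker()
{
    try
    {
        // ... the thread's real work ...
        throw new Exception("simulated failure");
    }
    catch (Throwable t) // Throwable covers both Exception and Error
    {
        stderr.writeln("worker terminated: ", t.msg);
    }
}

void main()
{
    auto t = new Thread(&worker);
    t.start();
    t.join();
}
```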


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 1:50 PM, Sean Kelly wrote:

On Sunday, 28 September 2014 at 20:31:03 UTC, Walter Bright wrote:

> The scope of a logic bug can be known to be quite limited.

If you know about the bug, then you'd have fixed it already instead of
inserting recovery code for unknown problems. I can't really accept that one
has "unknown bugs of known scope".


Well, say you're using SafeD or some other system where you know that memory
corruption is not possible (pure functional programming, for example).  In this
case, if you know what data a particular execution flow touches, you know the
scope of the potential damage.  And if the data touched is all either shared but
read-only or generated during the processing of the request, you can be
reasonably certain that nothing outside the scope of the transaction has been
adversely affected at all.


You may know the error is not a memory corrupting one, but that doesn't mean 
there aren't non-corrupting changes to the shared memory that would result in 
additional unexpected failures. Also, the logic bug may be the result of an 
@system part of the code going wrong. You do not know, because YOU DO NOT KNOW 
the cause of the error. And if you knew the cause, you wouldn't need a stack trace 
to debug it anyway.


I.e. despite being 'safe' it does not imply the program is in a predictable or 
anticipated state.


I can't get behind the notion of "reasonably certain". I certainly would not use 
such techniques in any code that needs to be robust, and we should not be using 
such cowboy techniques in Phobos nor officially advocate their use.


Re: DQuick a GUI Library (prototype)

2014-09-28 Thread Xavier Bigand via Digitalmars-d

Le 28/09/2014 02:48, Ivan a écrit :

On Tuesday, 20 August 2013 at 21:22:48 UTC, Flamaros wrote:

I want to share a short presentation of the project I am working on
with friends. It's a prototype of a GUI library written in D.

This pdf contains our vision of what the project would be. Samples are
directly extracted from our prototype and works. We are not able to
share more than this presentation for the moment because a lot of
things are missing and it is full of bugs.

The development is really slow, so don't expect to see a real
demonstration a day.

The link :
https://docs.google.com/file/d/0BygGiQfhIcvGOGlrWlBFNWNBQ3M/edit?usp=sharing


PS : Download it for a better quality



Is this still being developed?


Yes it is, but development is still slow.

You can follow last progress on github : https://github.com/D-Quick/DQuick




Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Dmitry Olshansky via Digitalmars-d

29-Sep-2014 01:21, Sean Kelly wrote:

On Sunday, 28 September 2014 at 21:16:51 UTC, Dmitry Olshansky wrote:


But otherwise agreed, dropping the whole process is not always a good
idea or it easily becomes a DoS attack vector in a public service.


What I really want to work towards is the Erlang model where an app is a
web of communicating processes (though Erlang processes are effectively
equivalent to D objects).  Then, killing a process on an error is
absolutely correct.  It doesn't affect the resilience of the system.
But if these processes are actually threads or fibers with memory
protection, things get a lot more complicated.


One thing I really appreciated about the JVM is exactly the memory safety, 
with the ability to handle this pretty much the same way Erlang does.



I really need to spend
some time investigating how modern Linux systems handle tons of
processes running on them and try to find a happy medium.


Keep us posted.

--
Dmitry Olshansky


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Sean Kelly via Digitalmars-d
On Sunday, 28 September 2014 at 21:16:51 UTC, Dmitry Olshansky 
wrote:


But otherwise agreed, dropping the whole process is not always 
a good idea or it easily becomes a DoS attack vector in a 
public service.


What I really want to work towards is the Erlang model where an 
app is a web of communicating processes (though Erlang processes 
are effectively equivalent to D objects).  Then, killing a 
process on an error is absolutely correct.  It doesn't affect the 
resilience of the system.  But if these processes are actually 
threads or fibers with memory protection, things get a lot more 
complicated.  I really need to spend some time investigating how 
modern Linux systems handle tons of processes running on them and 
try to find a happy medium.


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Cliff via Digitalmars-d
On Sunday, 28 September 2014 at 20:58:20 UTC, H. S. Teoh via 
Digitalmars-d wrote:
I do not condone adding file/line to exception *messages*. 
Catch blocks
can print / translate those messages, which can be made 
user-friendly,
but if the program failed to catch an exception, you're already 
screwed

anyway so why not provide more info rather than less?

Unless, of course, you're suggesting that we put this around 
every

main() function:

void main() {
try {
...
} catch(Exception e) {
assert(0, "Unhandled exception: I screwed up");
}
}



In our production C# code, we had a few practices which might be 
applicable here:


1. main() definitely had a top-level try/catch handler to produce 
useful output messages.  Because throwing an uncaught exception 
out to the user *is* a bug, we naturally want to not just toss 
out a stack trace but information on what to do with it should a 
user encounter it.  Even better if there is additional runtime 
information which can be provided for a bug report.


2. We also registered a top-level unhandled exception handler on 
the AppDomain (equivalent to a process in .NET, except that 
multiple AppDomains may exist within a single OS process), which 
allows the catching to exceptions which would otherwise escape 
background threads.  Depending on the nature of the application, 
these could be logged to some repository to which the user could 
be directed.  It's hard to strictly automate this because exactly 
what you can do with an exception which escapes a thread will be 
application dependent.  In our case, these exceptions were 
considered bugs, were considered to be unrecoverable and resulted 
in a program abort with a user message indicating where to find 
the relevant log outputs and how to contact us.


3. For some cases, throwing an exception would also trigger an 
application dump suitable for post-mortem debugging from the 
point the exception was about to be thrown.  This functionality 
is, of course, OS-specific, but helped us on more than a few 
occasions by eliminating the need to try to pre-determine which 
information was important and which was not so the exception 
could be usefully populated.


I'm not a fan of eliminating the stack from exceptions.  While 
exceptions should not be used to catch logic errors, an uncaught 
exception is itself a logic error (that is, one has omitted some 
required conditions in their code) and thus the context of the 
error needs to be made available somehow.


Re: Any libunwind experts n da house?

2014-09-28 Thread IgorStepanov via Digitalmars-d
On Saturday, 27 September 2014 at 23:31:17 UTC, Andrei 
Alexandrescu wrote:

On 9/27/14, 1:31 PM, IgorStepanov wrote:
No, that is for throwing from C++ into D: to catch an exception, 
we should pass a type_info object to a special C++ runtime 
function. The C++ runtime determines whether the thrown object's 
type can be cast to the requested type, and if so, allows it to 
be caught and runs the catch code. If you look at my example, 
you will see that I do this manually: get the thrown type_info 
and compare its mangle with the requested mangle. If we make 
this possible, it will work better, faster, and more reliably. 
As a bonus: the possibility of implementing dynamic_cast over 
C++ classes.


If that's what's needed, definitely please do explore it! But I 
defer expertise to Walter. -- Andrei


Ok. Anyway, I can't work on this to the full extent, because I 
have three pieces of D work (six pull requests) waiting for 
action from the D collaborators (the UDA-for-modules PR is 
reviewed and waiting for approval; multiple alias this and the 
new AA implementation are waiting for review).


However, I've looked into this direction and I want to report a 
few points:
1. The C++ type_info/TypeHandle for classes is located at index 
-1 of the vtbl;
2. The type_info layout isn't standardized (as expected) and 
differs between G++, VS, and (probably) DMC and SunCC.
3. type_info is used in C++ in many different cases, like 
dynamic_cast and exception handling.
4. D doesn't generate type_info, which can cause dangerous 
situations, e.g.:



//C++ code
class CppBase
{
public:
virtual void test() = 0;
};

class CppDerived : public CppBase
{
public:
void test();
};


void CppDerived::test()
{
std::cout << "CppDerived::test()" << std::endl;
}

void doTest(CppBase *obj)
{
obj->test();
CppDerived *dobj = dynamic_cast<CppDerived*>(obj); //Attention!

if (dobj)
{
std::cout << "casted" << std::endl;
}
else
{
std::cout << "fail" << std::endl;
}
}

//D code

extern(C++) interface CppBase
{
void test();
}


class DDerived : CppBase
{
extern(C++) override void test()
{
writeln("DDerived.test()");
}
}

extern(C++) void doTest(CppBase);

void main()
{
writeln("start test");
doTest(new DDerived()); //BOOM! segfault while processing
// dynamic_cast, because the DDerived type_info is wrong.

writeln("finish test");
}


//Now my suggestions:
1. We can implement type_info generation as a library template 
like GenTypeInfo(T). It will return a valid type_info object for 
supported platforms and null for platforms that aren't supported 
yet.
2. The compiler will use this template to get the type_info and 
push it into the vtbl (at position -1).
3. In situations where the compiler needs to use the type_info 
(say, try-catch with C++ exceptions, or dynamic_cast), it will 
raise an error if the type_info isn't implemented.


This approach moves complex, platform-dependent code from the 
compiler to the library. It also allows some platforms to remain 
unimplemented without restricting users.


In conclusion: this is a g++ type_info definitions:
https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/libsupc%2B%2B/cxxabi.h#L535

This has a flag "__diamond_shaped_mask" in __vmi_class_type_info, 
and "__virtual_mask" in __base_class_type_info.
D allows multiple inheritance for interfaces. When mapping to 
C++: is this inheritance virtual? Should we set 
__diamond_shaped_mask when A : B, C; B : D; C : D (where B, C, 
and D are interfaces)?


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Dmitry Olshansky via Digitalmars-d

29-Sep-2014 00:50, Sean Kelly wrote:

On Sunday, 28 September 2014 at 20:31:03 UTC, Walter Bright wrote:


If the threads share memory, the only robust choice is to terminate
all the threads and the application.

If the thread is in another process, where the memory is not shared,
then terminating and possibly restarting that process is quite
acceptable.

> The scope of a logic bug can be known to be quite limited.

If you know about the bug, then you'd have fixed it already instead of
inserting recovery code for unknown problems. I can't really accept
that one has "unknown bugs of known scope".


Well, say you're using SafeD or some other system where you know that
memory corruption is not possible (pure functional programming, for
example).



In this case, if you know what data a particular execution
flow touches, you know the scope of the potential damage.  And if the
data touched is all either shared but read-only or generated during the
processing of the request, you can be reasonably certain that nothing
outside the scope of the transaction has been adversely affected at all.



not possible / highly unlikely (i.e. bug in VM or said system)

But otherwise agreed, dropping the whole process is not always a good 
idea or it easily becomes a DoS attack vector in a public service.



--
Dmitry Olshansky


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Sean Kelly via Digitalmars-d
On Sunday, 28 September 2014 at 20:58:20 UTC, H. S. Teoh via 
Digitalmars-d wrote:


That's why I said, an uncaught exception is a BUG. The only  
place where
user-readable messages can be output is in a catch block where  
you
actually have the chance to localize the error string. But if  
no catch
block catches it, then by definition it's a bug, and you might  
as well
print some useful info with it that your users can send back to 
 you,
rather than unhelpful bug reports of the form "the program 
crashed with

error message 'internal error'".


Pretty much every system should generate a localized error 
message for the user and a detailed log of the problem for the 
programmer.  Ideally, the user message will indicate how to 
provide the detailed information to the developer so the problem 
can be fixed.


The one case where uncaught exceptions aren't really a bug is 
with programs that aren't being used outside the group that 
developed them.  In these cases, the default behavior is pretty 
much exactly what's desired--a message, file/line info, and a 
stack trace.  Which is why it's there.  The vast bulk of today's 
shipping code doesn't run from the command line anyway, so the 
default exception handler should be practically irrelevant.


Re: Creeping Bloat in Phobos

2014-09-28 Thread Dmitry Olshansky via Digitalmars-d

29-Sep-2014 00:44, Uranuz wrote:

It's Tolstoy actually:
http://en.wikipedia.org/wiki/War_and_Peace

You don't need byGrapheme for simple DSL. In fact as long as DSL is
simple enough (ASCII only) you may safely avoid decoding. If it's in
Russian you might want to decode. Even in this case there are ways to
avoid decoding, it may involve a bit of writing in as for typical
short novel ;)


Yes, my mistake ;) I was thinking about *Crime and Punishment* but
wrote *War and Peace*. Don't know why. Maybe because it is longer.



Admittedly both are way too long for my taste :)


Thanks for the useful links. As far as we are talking about the standard
library, I think some standard approach should be provided for common
tasks: searching, sorting, parsing, splitting strings. I see that
currently we have a lot of ways of doing similar things with strings. I
think this is partly a documentation problem.


Some of this is historical; in particular std.string is way older than 
std.algorithm.



When I parse text I can't understand why I need to use all of these
range interfaces instead of just manipulating a raw narrow string. We
have several modules about working with strings: std.range,
std.algorithm, std.string, std.array,


std.range publicly imports std.array thus I really do not see why we 
still have std.array as standalone module.


 std.utf and I can't see how they help me solve my
problems. On the contrary, they just create a new problem for me: to
think about them in order to find the *right* way.


There is no *right* way, every level of abstraction has its uses. Also 
there is a bit of trade-off on performance vs easy/obvious/nice code.



So most of my time is spent thinking
about that rather than solving my task.


It takes time to get accustomed to a standard library. See also std.conv 
and std.format. String processing is indeed shotgunned across all of Phobos.



It is hard for me to accept that we don't need to decode to do some
operations. What is annoying is that I always need to think about the
code-point length that I should show to the user and the byte length
that is used to slice the char array. It's very easy to confuse them
and do something wrong.


As long as you use decoding primitives you keep getting back proper 
indices automatically. That must be what some folks considered the 
correct way to do Unicode, until it became apparent to everybody that 
Unicode is much more than this.
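Dmitry's point about decoding primitives returning proper indices can be sketched with `std.utf.decode`:

```d
// Sketch: decode() returns a full code point while advancing the
// index in code units, so the index stays a valid slice boundary
// for the original string.
import std.utf : decode;

void main()
{
    string s = "пример";   // each Cyrillic letter is 2 UTF-8 code units
    size_t i = 0;
    dchar c = decode(s, i);
    assert(c == 'п');
    assert(i == 2);        // advanced by the code-unit length
    auto rest = s[i .. $]; // a valid slice boundary, no manual counting
}
```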




I see that all is complicated we have 3 types of character and more than
5 modules for trivial manipulations on strings with 10ths of functions.
It all goes into hell.


There are many tools, but when I write parsers I actually use almost 
none of them. Well, nowadays I'm going to use the stuff in std.uni like 
CodePointSet, utfMatcher, etc. std.regex makes some use of these already, 
but prior to that std.utf.decode was my lone workhorse.



But I don't even started to do my job. And we
don't have *standard* way to deal with it in std lib. At least this way
in not documented enough.


Well on the bright side consider that C has lots of broken functions in 
stdlib, and even some that are _never_ safe like "gets" ;)


--
Dmitry Olshansky


Re: Creeping Bloat in Phobos

2014-09-28 Thread Dmitry Olshansky via Digitalmars-d

29-Sep-2014 00:33, Andrei Alexandrescu wrote:

On 9/28/14, 11:36 AM, Walter Bright wrote:

Currently, the autodecoding functions allocate with the GC and throw as
well. (They'll GC allocate an exception and throw it if they encounter
an invalid UTF sequence. The adapters use the more common method of
inserting a substitution character and continuing on.) This makes it
harder to make GC-free Phobos code.



The right solution here is refcounted exception plus policy-based
functions in conjunction with RCString. I can't believe this focus has
already been lost and we're back to let's remove autodecoding and ban
exceptions. -- Andrei


I've already stated my perception of the "no stinking exceptions", and 
"no destructors 'cause i want it fast" elsewhere.


Code must be correct and fast, with correct being a precondition for any 
performance tuning and speed hacks.


Correct usually entails exceptions and automatic cleanup. I also do not 
believe the "exceptions have to be slow" motto; they are costly, but the 
proportion of such costs has been largely exaggerated.


--
Dmitry Olshansky


Re: Creeping Bloat in Phobos

2014-09-28 Thread Dmitry Olshansky via Digitalmars-d

29-Sep-2014 00:39, H. S. Teoh via Digitalmars-d writes:

On Sun, Sep 28, 2014 at 12:57:17PM -0700, Walter Bright via Digitalmars-d wrote:

On 9/28/2014 11:51 AM, bearophile wrote:

Walter Bright:


but do want to stop adding more autodecoding functions like the
proposed std.path.withExtension().


I am not sure that can work. Perhaps you need to create a range2 and
algorithm2 modules, and keep adding some autodecoding functions to
the old modules.


It can work just fine, and I wrote it. The problem is convincing
someone to pull it :-( as the PR was closed and reopened with
autodecoding put back in.


The problem with pulling such PRs is that they introduce a dichotomy
into Phobos. Some functions autodecode, some don't, and from a user's
POV, it's completely arbitrary and random. Which leads to bugs because
people can't possibly remember exactly which functions autodecode and
which don't.



Agreed.




As I've explained many times, very few string algorithms actually need
decoding at all. 'find', for example, does not. Trying to make a
separate universe out of autodecoding algorithms is missing the point.

[...]

Maybe what we need to do, is to change the implementation of
std.algorithm so that it internally uses byCodeUnit for narrow strings
where appropriate. We're already specialcasing Phobos code for narrow
strings anyway, so it wouldn't make things worse by making those special
cases not autodecode.

This doesn't quite solve the issue of composing ranges, since one
composed range returns dchar in .front composed with another range will
have autodecoding built into it. For those cases, perhaps one way to
hack around the present situation is to use Phobos-private enums in the
wrapper ranges (e.g., enum isNarrowStringUnderneath=true; in struct
Filter or something), that ranges downstream can test for, and do the
appropriate bypasses.



We need to either generalize the hack we did for char[] and wchar[] or 
start creating a whole new phobos without auto-decoding.


I'm not sure what's best but the latter is more disruptive.


(BTW, before you pick on specific algorithms you might want to actually
look at the code for things like find(), because I remember there were a
couple o' PRs where find() of narrow strings will use (presumably) fast
functions like strstr or strchr, bypassing a foreach loop over an
autodecoding .front.)



Yes, it has fast path.


--
Dmitry Olshansky


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread H. S. Teoh via Digitalmars-d
On Sun, Sep 28, 2014 at 10:32:14AM -0700, Walter Bright via Digitalmars-d wrote:
> On 9/28/2014 9:16 AM, Sean Kelly wrote:
[...]
> >What if an API you're using throws an exception you didn't expect,
> >and therefore don't handle?
> 
> Then the app user sees the error message. This is one of the cool
> things about D - I can write small apps with NO error handling logic
> in it, and I still get appropriate and friendly messages when things
> go wrong like missing files.
> 
> That is, until recently, when I get a bunch of debug stack traces and
> internal file/line messages, which are of no use at all to an app user
> and look awful.

It looks even more awful when the person who wrote the library code is
Russian, and the user speaks English, and when an uncaught exception
terminates the program, you get a completely incomprehensible message in
a language you don't know. Not much different from a line number and
filename that has no meaning for a user.

That's why I said, an uncaught exception is a BUG. The only place where
user-readable messages can be output is in a catch block where you
actually have the chance to localize the error string. But if no catch
block catches it, then by definition it's a bug, and you might as well
print some useful info with it that your users can send back to you,
rather than unhelpful bug reports of the form "the program crashed with
error message 'internal error'". Good luck finding where in the code
that is. (And no, grepping does not work -- the string 'internal error'
could have come from a system call or C library error code translated by
a generic code-to-message function, which could've been called from
*anywhere*.)


> >This might be considered a logic error if the exception is
> >recoverable and you don't intend the program to abort from that
> >operation.
> 
> Adding file/line to all exceptions implies that they are all bugs, and
> encourages them to be thought of as bugs and debugging tools, when
> they are NOT. Exceptions are for:
> 
> 1. enabling recovery from input/environmental errors
> 2. reporting input/environmental errors to the app user
> 3. making input/environmental errors not ignorable by default
> 
> They are not for detecting logic errors. Assert is designed for that.

I do not condone adding file/line to exception *messages*. Catch blocks
can print / translate those messages, which can be made user-friendly,
but if the program failed to catch an exception, you're already screwed
anyway so why not provide more info rather than less?

Unless, of course, you're suggesting that we put this around every
main() function:

void main() {
try {
...
} catch(Exception e) {
assert(0, "Unhandled exception: I screwed up");
}
}


T

-- 
I think Debian's doing something wrong, `apt-get install pesticide', doesn't 
seem to remove the bugs on my system! -- Mike Dresser


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Sean Kelly via Digitalmars-d

On Sunday, 28 September 2014 at 20:31:03 UTC, Walter Bright wrote:


If the threads share memory, the only robust choice is to 
terminate all the threads and the application.


If the thread is in another process, where the memory is not 
shared, then terminating and possibly restarting that process 
is quite acceptable.


> The scope of a logic bug can be known to be quite limited.

If you know about the bug, then you'd have fixed it already 
instead of inserting recovery code for unknown problems. I 
can't really accept that one has "unknown bugs of known scope".


Well, say you're using SafeD or some other system where you know 
that memory corruption is not possible (pure functional 
programming, for example).  In this case, if you know what data a 
particular execution flow touches, you know the scope of the 
potential damage.  And if the data touched is all either shared 
but read-only or generated during the processing of the request, 
you can be reasonably certain that nothing outside the scope of 
the transaction has been adversely affected at all.


Re: Creeping Bloat in Phobos

2014-09-28 Thread Uranuz via Digitalmars-d

It's Tolstoy actually:
http://en.wikipedia.org/wiki/War_and_Peace

You don't need byGrapheme for a simple DSL. In fact, as long as 
the DSL is simple enough (ASCII only) you may safely avoid 
decoding. If it's in Russian you might want to decode. Even in 
this case there are ways to avoid decoding, though it may involve 
a bit of writing, as for a typical short novel ;)


Yes, my mistake ;) I was thinking about *Crime and Punishment* 
but wrote *War and Peace*. I don't know why. Maybe because it is 
longer.


Thanks for the useful links. As far as we are talking about the 
standard library, I think some standard approach should be 
provided for common tasks: searching, sorting, parsing, splitting 
strings. I see that we currently have a lot of ways of doing 
similar things with strings. I think this is partly a problem of 
documentation. When I parse text I can't understand why I need to 
use all of these range interfaces instead of just manipulating a 
raw narrow string. We have several modules for working on 
strings: std.range, std.algorithm, std.string, std.array, 
std.utf, and I can't see how they help me solve my problems. On 
the contrary, they just create new problems to think about in 
order to find the *right* way. So I spend most of my time 
thinking about that rather than solving my task.


It is hard for me to accept that we don't need to decode to do 
some operations. What is annoying is that I always need to think 
of the code-point length that I should show to the user versus 
the byte length that is used to slice the char array. It's very 
easy to confuse them and do something wrong.


I see that it is all complicated: we have 3 character types and 
more than 5 modules for trivial manipulations on strings, with 
tens of functions. It all goes to hell. But I haven't even 
started to do my job. And we don't have a *standard* way to deal 
with it in the std lib. At least this way is not documented enough.


Re: Creeping Bloat in Phobos

2014-09-28 Thread Andrei Alexandrescu via Digitalmars-d

On 9/28/14, 1:39 PM, H. S. Teoh via Digitalmars-d wrote:

On Sun, Sep 28, 2014 at 12:57:17PM -0700, Walter Bright via Digitalmars-d wrote:

On 9/28/2014 11:51 AM, bearophile wrote:

Walter Bright:


but do want to stop adding more autodecoding functions like the
proposed std.path.withExtension().


I am not sure that can work. Perhaps you need to create a range2 and
algorithm2 modules, and keep adding some autodecoding functions to
the old modules.


It can work just fine, and I wrote it. The problem is convincing
someone to pull it :-( as the PR was closed and reopened with
autodecoding put back in.


The problem with pulling such PRs is that they introduce a dichotomy
into Phobos. Some functions autodecode, some don't, and from a user's
POV, it's completely arbitrary and random. Which leads to bugs because
people can't possibly remember exactly which functions autodecode and
which don't.


I agree. -- Andrei




Re: Creeping Bloat in Phobos

2014-09-28 Thread H. S. Teoh via Digitalmars-d
On Sun, Sep 28, 2014 at 12:57:17PM -0700, Walter Bright via Digitalmars-d wrote:
> On 9/28/2014 11:51 AM, bearophile wrote:
> >Walter Bright:
> >
> >>but do want to stop adding more autodecoding functions like the
> >>proposed std.path.withExtension().
> >
> >I am not sure that can work. Perhaps you need to create a range2 and
> >algorithm2 modules, and keep adding some autodecoding functions to
> >the old modules.
> 
> It can work just fine, and I wrote it. The problem is convincing
> someone to pull it :-( as the PR was closed and reopened with
> autodecoding put back in.

The problem with pulling such PRs is that they introduce a dichotomy
into Phobos. Some functions autodecode, some don't, and from a user's
POV, it's completely arbitrary and random. Which leads to bugs because
people can't possibly remember exactly which functions autodecode and
which don't.


> As I've explained many times, very few string algorithms actually need
> decoding at all. 'find', for example, does not. Trying to make a
> separate universe out of autodecoding algorithms is missing the point.
[...]

Maybe what we need to do, is to change the implementation of
std.algorithm so that it internally uses byCodeUnit for narrow strings
where appropriate. We're already specialcasing Phobos code for narrow
strings anyway, so it wouldn't make things worse by making those special
cases not autodecode.

This doesn't quite solve the issue of composing ranges, since one
composed range returns dchar in .front composed with another range will
have autodecoding built into it. For those cases, perhaps one way to
hack around the present situation is to use Phobos-private enums in the
wrapper ranges (e.g., enum isNarrowStringUnderneath=true; in struct
Filter or something), that ranges downstream can test for, and do the
appropriate bypasses.
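
A hedged sketch of what such a marker could look like; every name below is hypothetical, invented for illustration, and not an existing Phobos API:

```d
// Hypothetical sketch only: a wrapper range advertises that a narrow string
// sits underneath, and downstream code tests for the marker to bypass
// auto-decoding.
struct MyFilter(alias pred, R)
{
    R source;
    static if (is(R : const(char)[]) || is(R : const(wchar)[]))
        private enum isNarrowStringUnderneath = true;
    // ... the usual empty/front/popFront members would go here ...
}

// A downstream range can check for the marker and choose a fast path:
template hasNarrowStringUnderneath(R)
{
    enum hasNarrowStringUnderneath =
        __traits(hasMember, R, "isNarrowStringUnderneath");
}
```
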

(BTW, before you pick on specific algorithms you might want to actually
look at the code for things like find(), because I remember there were a
couple o' PRs where find() of narrow strings will use (presumably) fast
functions like strstr or strchr, bypassing a foreach loop over an
autodecoding .front.)


T

-- 
I think Debian's doing something wrong, `apt-get install pesticide',
doesn't seem to remove the bugs on my system! -- Mike Dresser


Re: Creeping Bloat in Phobos

2014-09-28 Thread bearophile via Digitalmars-d

Walter Bright:

It can work just fine, and I wrote it. The problem is 
convincing someone to pull it :-( as the PR was closed and 
reopened with autodecoding put back in.


Perhaps you need range2 and algorithm2 modules. Introducing 
your changes in a sneaky way may not produce well-working and 
predictable user code.



I know that you care about performance - you post about it 
often. I would expect that unnecessary and pervasive decoding 
would be of concern to you.


I care first of all about program correctness (that's why I 
proposed unusual things like optional strong typing for built-in 
array indexes, or the "enum preconditions"). Secondly, I care 
about performance in the functions or parts of code where 
performance is needed. There is plenty of code where performance 
is not the most important thing. That's why I have tons of 
range-based code. In such large parts of code, having short, 
correct, nice-looking code is more important. 
Please don't assume I am simple-minded :-)


Bye,
bearophile


Re: Creeping Bloat in Phobos

2014-09-28 Thread Andrei Alexandrescu via Digitalmars-d

On 9/28/14, 11:36 AM, Walter Bright wrote:

Currently, the autodecoding functions allocate with the GC and throw as
well. (They'll GC allocate an exception and throw it if they encounter
an invalid UTF sequence. The adapters use the more common method of
inserting a substitution character and continuing on.) This makes it
harder to make GC-free Phobos code.


The right solution here is refcounted exception plus policy-based 
functions in conjunction with RCString. I can't believe this focus has 
already been lost and we're back to let's remove autodecoding and ban 
exceptions. -- Andrei


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 12:38 PM, Sean Kelly wrote:

Exceptions are meant for RECOVERABLE errors. If you're using them instead of
assert for logic bugs, you're looking at undefined behavior. Logic bugs are
not recoverable.


In a multithreaded program, does this mean that the thread must be terminated or
the entire process?  In a multi-user system, does this mean the transaction or
the entire process?  The scope of a logic bug can be known to be quite limited.
Remember my earlier point about Erlang, where a "process" there is actually just
a logical thread in the VM.


This has been asked many times before.

If the threads share memory, the only robust choice is to terminate all the 
threads and the application.


If the thread is in another process, where the memory is not shared, then 
terminating and possibly restarting that process is quite acceptable.


> The scope of a logic bug can be known to be quite limited.

If you know about the bug, then you'd have fixed it already instead of inserting 
recovery code for unknown problems. I can't really accept that one has "unknown 
bugs of known scope".


Re: [Semi OT] Language for Game Development talk

2014-09-28 Thread H. S. Teoh via Digitalmars-d
On Sun, Sep 28, 2014 at 10:19:36AM -0700, Walter Bright via Digitalmars-d wrote:
[...]
> Inlining is not a random thing. If there's a case that doesn't inline,
> ask about it.
[...]

This is not directly related to this thread, but recently in a Phobos PR
we discovered the following case:

// This function gets inlined:
int func1(int a) {
if (someCondition) {
return value1;
} else {
return value2;
}
}

// But this one doesn't:
int func2(int a) {
if (someCondition) {
return value1;
}
return value2;
}

IIRC Kenji said something about the first case being convertible to an
expression, while the second can't be. It would be nice if inlining worked
for both cases, since semantically they are the same.
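
For what it's worth, the second function can be made expression-shaped by hand; assuming Kenji's explanation above is accurate, this form should inline like the first:

```d
// Semantically identical to func1/func2, written as a single expression
int func3(int a) {
    return someCondition ? value1 : value2;
}
```
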


T

-- 
Famous last words: I *think* this will work...


Re: Possible quick win in GC?

2014-09-28 Thread David Nadlinger via Digitalmars-d

On Sunday, 28 September 2014 at 16:29:45 UTC, Abdulhaq wrote:
I got the idea after thinking that it should be fairly simple 
for the compiler to detect straightforward cases of when a 
variable can be declared as going on the stack - i.e. no 
references to it are retained after its enclosing function 
returns.


LDC does the "fairly simple" part of this already in a custom 
LLVM optimizer pass. The issue is that escape analysis is fairly 
hard in general, and currently even more limited because we only 
do it on the LLVM IR level (i.e. don't leverage any additional 
attributes like scope, pure, … that might be present in the D 
source code).


David


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 12:33 PM, Jacob Carlborg wrote:

On 2014-09-28 19:36, Walter Bright wrote:


I suggest removal of stack trace for exceptions, but leaving them in for
asserts.


If you don't like the stack trace, just wrap the "main" function in a try-catch
block, catch all exceptions and print the error message.


That's what the runtime that calls main() is supposed to do.



Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 12:33 PM, Sean Kelly wrote:

Then use assert(). That's just what it's for.

What if I don't want to be forced to abort the program in the event of such an
error?


Then we are back to the discussion about can a program continue after a logic 
error is uncovered, or not.


In any program, the programmer must decide if an error is a bug or not, before 
shipping it. Trying to avoid making this decision leads to confusion and using 
the wrong techniques to deal with it.


A program bug is, by definition, unknown and unanticipated. The idea that one 
can "recover" from it is fundamentally wrong. Of course, in D one can try and 
recover from them anyway, but you're on your own trying that, just as you're on 
your own when casting integers to pointers.


On the other hand, input/environmental errors must be anticipated and can often 
be recovered from. But presenting debug traces to the users for these implies at 
the very least a sloppily engineered product, in my not so humble opinion :-)


Re: Possible quick win in GC?

2014-09-28 Thread Freddy via Digitalmars-d

On Sunday, 28 September 2014 at 17:47:42 UTC, Abdulhaq wrote:
Here's a code snippet which hopefully makes things a bit 
clearer:



/**
 * In this example the variable foo can be statically analysed as safe to go
 * on the stack. The new instance of Bar allocated in funcLevelB is only
 * referred to by foo. foo can be considered a root 'scoped' variable and the
 * GC can delete both foo and the new Bar() when foo goes out of scope. There
 * is no need (except when under memory pressure) for the GC to scan the band
 * created for foo and its related child allocations.
 */

import std.stdio;

class Bar {
public:
    int x;

    this(int x) {
        this.x = x;
    }
}

class Foo {
public:
    Bar bar;
}

void funcLevelA() {
    Foo foo = new Foo(); // static analysis could detect this as able to go on the stack
    funcLevelB(foo);
    writeln(foo.bar.x);
}

void funcLevelB(Foo foo) {
    foo.bar = new Bar(12); // this allocated memory is only referred to by foo,
                           // which static analysis has established can go on the stack
}

void main() {
    funcLevelA();
}

You mean this?
https://en.wikipedia.org/wiki/Escape_analysis


Re: Creeping Bloat in Phobos

2014-09-28 Thread Dmitry Olshansky via Digitalmars-d

28-Sep-2014 23:44, Uranuz writes:

I totally agree with all of that.

It's one of those cases where correct by default is far too slow (that
would have to be graphemes) but fast by default is far too broken.
Better to force an explicit choice.

There is no magic bullet for unicode in a systems language such as D.
The programmer must be aware of it and make choices about how to treat
it.


I see, I didn't know about the difference between byCodeUnit and
byGrapheme, because I speak Russian and it is close to English in
that it doesn't have diacritics. As far as I remember, German,
which I learned at school, has diacritics. So you opened my eyes
on this question. My position as an ordinary programmer is that I
speak a language whose graphemes are coded in 2 bytes


In UTF-16 and UTF-8.


and I always
need to decode, otherwise my program will be broken. The other
possibility is to use wstring or dstring, but it is less memory
efficient. Also UTF-8 is more commonly used on the Internet, so I
don't want to do conversions to UTF-32, for example.

Where can I read about byGrapheme?


std.uni docs:
http://dlang.org/phobos/std_uni.html#.byGrapheme
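
A minimal sketch of what byGrapheme changes, using a combining character as the example:

```d
import std.uni : byGrapheme;
import std.range : walkLength;

void main()
{
    // 'e' followed by a combining acute accent:
    // 3 UTF-8 bytes, 2 code points, but 1 user-perceived character.
    string s = "e\u0301";
    assert(s.length == 3);                // code units
    assert(s.walkLength == 2);            // code points (auto-decoded)
    assert(s.byGrapheme.walkLength == 1); // graphemes
}
```
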


Isn't this approach
overcomplicated? I don't want to write Dostoevskiy's book "War
and Peace" in order to write some parser for a simple DSL.


It's Tolstoy actually:
http://en.wikipedia.org/wiki/War_and_Peace

You don't need byGrapheme for a simple DSL. In fact, as long as the DSL is 
simple enough (ASCII only) you may safely avoid decoding. If it's in 
Russian you might want to decode. Even in this case there are ways to 
avoid decoding, though it may involve a bit of writing, as for a typical 
short novel ;)


In fact I did a couple of such literature exercises in std library.

For codepoint lookups on non-decoded strings:
http://dlang.org/phobos/std_uni.html#.utfMatcher

And to create sets of codepoints to detect with matcher:
http://dlang.org/phobos/std_uni.html#.CodepointSet
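
A small usage sketch, to the best of my reading of the std.uni documentation (intervals passed to the constructor are half-open, and sets support the `in` operator and set algebra):

```d
import std.uni : CodepointSet, unicode;

void main()
{
    // A set built from half-open intervals ['a', 'z'+1) and ['A', 'Z'+1)
    auto letters = CodepointSet('a', 'z' + 1, 'A', 'Z' + 1);
    assert('q' in letters);
    assert(!('7' in letters));

    // Sets can also come from Unicode properties and be combined
    auto cyrillicOrLatin = unicode.Cyrillic | letters;
    assert('д' in cyrillicOrLatin);
}
```
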

--
Dmitry Olshansky


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread luka8088 via Digitalmars-d

On 28.9.2014. 21:32, Walter Bright wrote:

On 9/28/2014 11:25 AM, bearophile wrote:

Exceptions are often used to help debugging...



https://www.youtube.com/watch?v=hBhlQgvHmQ0


Example exception messages:

Unable to connect to database
Invalid argument count
Invalid network package format

All these messages do not require a stack trace, as they do not require 
code fixes; they indicate an issue outside the program itself. If a stack 
trace is required then assert should have been used instead.


Or to put it better: can anyone give an example of an exception that 
would require a stack trace?




Re: Creeping Bloat in Phobos

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 11:51 AM, bearophile wrote:

Walter Bright:


but do want to stop adding more autodecoding functions like the proposed
std.path.withExtension().


I am not sure that can work. Perhaps you need to create a range2 and algorithm2
modules, and keep adding some autodecoding functions to the old modules.


It can work just fine, and I wrote it. The problem is convincing someone to pull 
it :-( as the PR was closed and reopened with autodecoding put back in.


As I've explained many times, very few string algorithms actually need decoding 
at all. 'find', for example, does not. Trying to make a separate universe out of 
autodecoding algorithms is missing the point.


Certainly, setExtension() does not need autodecoding, and in fact all the 
autodecoding in it does is slow it down, allocate memory on errors, make it 
throwable, and produce dchar output, meaning at some point later you'll need to 
put it back to char.


I.e. there are no operations on paths that require decoding.
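
A sketch of why that holds: UTF-8 guarantees that ASCII bytes such as '.' and '/' never occur inside a multi-byte sequence, so scanning raw code units is already correct (extOf below is a made-up helper for illustration, not a Phobos API):

```d
// Find a path's extension without decoding: iterate code units directly.
string extOf(string path)
{
    foreach_reverse (i, char c; path) // char element type: no decoding
    {
        if (c == '/')        // hit a directory separator first: no extension
            return null;
        if (c == '.')
            return path[i .. $];
    }
    return null;
}

void main()
{
    // Works even with multi-byte UTF-8 in the path components
    assert(extOf("фото/снимок.jpeg") == ".jpeg");
    assert(extOf("no_extension") is null);
}
```
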

I know that you care about performance - you post about it often. I would expect 
that unnecessary and pervasive decoding would be of concern to you.


Re: Creeping Bloat in Phobos

2014-09-28 Thread Uranuz via Digitalmars-d

I totally agree with all of that.

It's one of those cases where correct by default is far too 
slow (that would have to be graphemes) but fast by default is 
far too broken. Better to force an explicit choice.


There is no magic bullet for unicode in a systems language such 
as D. The programmer must be aware of it and make choices about 
how to treat it.


I see, I didn't know about the difference between byCodeUnit and
byGrapheme, because I speak Russian and it is close to English in
that it doesn't have diacritics. As far as I remember, German,
which I learned at school, has diacritics. So you opened my eyes
on this question. My position as an ordinary programmer is that I
speak a language whose graphemes are coded in 2 bytes, and I always
need to decode, otherwise my program will be broken. The other
possibility is to use wstring or dstring, but it is less memory
efficient. Also UTF-8 is more commonly used on the Internet, so I
don't want to do conversions to UTF-32, for example.

Where can I read about byGrapheme? Isn't this approach
overcomplicated? I don't want to write Dostoevskiy's book "War
and Peace" in order to write some parser for a simple DSL.


Re: Creeping Bloat in Phobos

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 11:39 AM, bearophile wrote:

Walter Bright:


I'm painfully aware of what a large change removing autodecoding is. That
means it'll take a long time to do it. In the meantime, we can stop adding new
code to Phobos that does autodecoding. We have taken the first step by adding
the .byDchar and .byCodeUnit adapters.


We have .representation and .assumeUTF, I am using it to avoid most autodecoding
problems. Have you tried to use them in your D code?


Yes. They don't work. Well, technically they do "work", but your code gets 
filled with explicit casts, which is awful.


The problem is the "representation" of char[] is type char, not type ubyte.
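
For context, a hedged sketch of how the two adapters bearophile mentions are used; the friction Walter describes shows up once such byte views have to flow back into char-expecting code:

```d
import std.string : representation, assumeUTF;

void main()
{
    string s = "résumé";
    // View the same memory as raw code units; no decoding, no allocation
    immutable(ubyte)[] bytes = s.representation;
    assert(bytes.length == 8); // 6 characters, two of them 2-byte sequences

    // And back again: assumeUTF asserts (rather than validates) that the
    // bytes are well-formed UTF
    string t = bytes.assumeUTF;
    assert(t == s);
}
```
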



The changes you propose seem able to break almost every D program I have written
(most or all code that uses strings with Phobos ranges/algorithms, and I use
them everywhere). Compared to this change, disallowing the comma operator to
enable nice built-in tuples would cause nearly no breakage in my code (I have
done a small analysis of the damage caused by disallowing the comma operator in
my code). It sounds like a change fit for a D3 language, even more than the
introduction of reference counting. I think this change will cause some people
to permanently stop using D.


It's quite possible we will be unable to make this change. But the question that 
started all this would be what would I change if breaking code was allowed.


I suggest that in the future write code that is explicit about the intention - 
by character or by decoded character - by using adapters .byChar or .byDchar.
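
Being explicit looks roughly like this, assuming the std.utf adapters named in this thread (.byCodeUnit and .byDchar):

```d
import std.utf : byCodeUnit, byDchar;
import std.range : walkLength;

void main()
{
    string s = "привет"; // 6 code points, 12 UTF-8 code units

    // Explicit about the unit of iteration:
    assert(s.byCodeUnit.walkLength == 12); // code units: no decoding, no throwing
    assert(s.byDchar.walkLength == 6);     // decoded code points; invalid
                                           // sequences become U+FFFD instead
                                           // of throwing
}
```
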




Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Sean Kelly via Digitalmars-d

On Sunday, 28 September 2014 at 17:40:49 UTC, Walter Bright wrote:


You can hook D's assert and do what you want with it.


With the caveat that you must finish by either exiting the app or 
throwing an exception, since the compiler doesn't generate a 
stack frame that can be returned from.
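
A sketch of such a hook, assuming core.exception's assertHandler property works as I recall (and respecting the constraint Sean mentions: the handler must throw or abort, never return normally):

```d
import core.exception : assertHandler;
import core.stdc.stdio : fprintf, stderr;
import core.stdc.stdlib : abort;

// Custom assert handler: log the failure location, then abort.
void myAssertHandler(string file, size_t line, string msg) nothrow
{
    fprintf(stderr, "assert failed at %.*s:%zu\n",
            cast(int) file.length, file.ptr, line);
    abort(); // must not return: no stack frame is generated for that
}

void main()
{
    assertHandler = &myAssertHandler; // install the hook
}
```
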


Exceptions are meant for RECOVERABLE errors. If you're using 
them instead of assert for logic bugs, you're looking at 
undefined behavior. Logic bugs are not recoverable.


In a multithreaded program, does this mean that the thread must 
be terminated or the entire process?  In a multi-user system, 
does this mean the transaction or the entire process?  The scope 
of a logic bug can be known to be quite limited.  Remember my 
earlier point about Erlang, where a "process" there is actually 
just a logical thread in the VM.


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Jacob Carlborg via Digitalmars-d

On 2014-09-28 19:36, Walter Bright wrote:


I suggest removal of stack trace for exceptions, but leaving them in for
asserts.


If you don't like the stack trace, just wrap the "main" function in a 
try-catch block, catch all exceptions and print the error message.


--
/Jacob Carlborg


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Sean Kelly via Digitalmars-d

On Sunday, 28 September 2014 at 17:36:14 UTC, Walter Bright wrote:

On 9/28/2014 2:28 AM, bearophile wrote:
And for exceptions I agree completely with your arguments and 
I think that

there is no need for stack.
I think Walter is not suggesting to remove the stack trace for 
exceptions.


I suggest removal of stack trace for exceptions, but leaving 
them in for asserts.


Asserts are a deliberately designed debugging tool. Exceptions 
are not.


Fair.  So we generate traces for Errors but not Exceptions.


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 11:25 AM, bearophile wrote:

Exceptions are often used to help debugging...



https://www.youtube.com/watch?v=hBhlQgvHmQ0


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Sean Kelly via Digitalmars-d

On Sunday, 28 September 2014 at 17:33:38 UTC, Walter Bright wrote:

On 9/28/2014 9:23 AM, Sean Kelly wrote:
Also, I think the idea that a program is created and shipped 
to an end user is overly simplistic.  In the server/cloud 
programming world, when an error occurs, the client who 
submitted the request will get a response appropriate for them 
and the system will also generate log information intended for 
people working on the system.  So things like stack traces and 
assertion failure information is useful even for production 
software.  Same with any critical system, as I'm sure you're 
aware.  The systems are designed to handle failures in 
specific ways, but they also have to leave a breadcrumb trail 
so the underlying problem can be diagnosed and fixed.  
Internal testing is never perfect, and achieving a high 
coverage percentage is nearly impossible if the system wasn't 
designed from the ground up to be testable in such a way (mock 
frameworks and such).


Then use assert(). That's just what it's for.


What if I don't want to be forced to abort the program in the 
event of such an error?


[OT] Graal (next generation JIT compiler in Java) now has a product page

2014-09-28 Thread Paulo Pinto via Digitalmars-d

For the language geeks in the forum.

Graal, the meta-circular JIT compiler that goes back to the Maxine VM 
done at Sun Research Labs, now has a product page.


http://www.oracle.com/technetwork/oracle-labs/program-languages/overview/index.html

This is part of Oracle's ongoing effort to reduce the amount of C++
code in the JVM.

Graal is already used by AMD on their GPGPU project with Oracle for Java 
(Sumatra), JRuby optimization efforts on the JVM and SubstrateVM, a 
static code compiler for Java (part of Graal toolchain).


Whether it will ever fully replace HotSpot remains an open question.

--
Paulo


Re: Creeping Bloat in Phobos

2014-09-28 Thread bearophile via Digitalmars-d

Walter Bright:

but do want to stop adding more autodecoding functions like the 
proposed std.path.withExtension().


I am not sure that can work. Perhaps you need to create a range2 
and algorithm2 modules, and keep adding some autodecoding 
functions to the old modules.


Bye,
bearophile


Re: [Semi OT] Language for Game Development talk

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 10:57 AM, po wrote:

  His entire reason for not using C++ lambdas, and wanting local functions
instead, is faulty.

His reason: I heard they can perform heap allocation.

Reality: No, they do not perform heap allocation; that would only happen if you
stuffed one into a std::function.



D's will also do heap allocation, but only if you take the address of the local 
function.
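To make the distinction concrete, here is a minimal sketch (assuming current dmd/druntime closure semantics): taking the address of a local function that captures stack variables yields a delegate, and the captured frame must be moved to the GC heap so it outlives the enclosing call.

```d
import std.stdio;

// Returning &next forces the compiler to heap-allocate the
// enclosing frame, because `count` must outlive makeCounter().
int delegate() makeCounter()
{
    int count = 0;
    int next() { return ++count; }
    return &next; // the closure allocation happens here
}

void main()
{
    auto c = makeCounter();
    assert(c() == 1);
    assert(c() == 2);

    // A local function called directly, without taking its
    // address, needs no heap allocation.
    int x = 40;
    int addTwo() { return x + 2; }
    assert(addTwo() == 42);
    writeln("ok");
}
```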


Re: Creeping Bloat in Phobos

2014-09-28 Thread bearophile via Digitalmars-d

Walter Bright:

I'm painfully aware of what a large change removing 
autodecoding is. That means it'll take a long time to do it. In 
the meantime, we can stop adding new code to Phobos that does 
autodecoding. We have taken the first step by adding the 
.byDchar and .byCodeUnit adapters.


We have .representation and .assumeUTF, I am using it to avoid 
most autodecoding problems. Have you tried to use them in your D 
code?
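For readers who haven't tried them, a minimal sketch of what the two adapters do (assuming current std.string behavior):

```d
import std.string : representation, assumeUTF;

void main()
{
    string s = "héllo";

    // .representation views the string as immutable(ubyte)[], so
    // range algorithms operate on raw code units, with no autodecoding.
    immutable(ubyte)[] bytes = s.representation;
    assert(bytes.length == 6); // 'é' occupies two UTF-8 code units

    // .assumeUTF reinterprets the bytes as a string again,
    // without validation and without copying.
    string back = bytes.assumeUTF;
    assert(back == "héllo");
}
```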


The changes you propose seem able to break almost every D program 
I have written (most or all code that uses strings with Phobos 
ranges/algorithms, and I use them everywhere). Compared to this 
change, disallowing the comma operator to implement nice built-in 
tuples would cause nearly no breakage in my code (I have done a 
small analysis of the damage caused by disallowing the comma 
operator in my code). It sounds like a change fit for a D3 
language, even more than the introduction of reference counting. 
I think this change will cause some people to permanently stop 
using D.


In the end you are the designer and the benevolent dictator of D; 
I am not qualified to refuse or oppose such changes. But before 
making this change I suggest studying how many changes it causes 
in an average small D program that uses strings and 
ranges/algorithms.


Bye,
bearophile


Re: Creeping Bloat in Phobos

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 5:09 AM, Andrei Alexandrescu wrote:

Stuff that's missing:

* Reasonable effort to improve performance of auto-decoding;

* A study of the matter revealing either new artifacts and idioms, or the
insufficiency of such;

* An assessment of the impact on compilability of existing code

* An assessment of the impact on correctness of existing code (that compiles and
runs in both cases)

* An assessment of the improvement in speed of eliminating auto-decoding

I think there's a very strong need for this stuff, because claims that the
current alternatives for selectively avoiding auto-decoding are insufficient
come with much throwing of hands (and the occasional chair out a window) but
no real investigation into how library artifacts may help. This approach to
justifying risky moves is frustratingly unprincipled.


I know I have to go a ways further to convince you :-) This is definitely a 
longer term issue, not a stop-the-world-we-must-fix-it-now thing.




Also I submit that diverting into this is a huge distraction at probably the
worst moment in the history of the D programming language.


I don't plan to work on this particular issue for the time being, but do want to 
stop adding more autodecoding functions like the proposed std.path.withExtension().




C++ and GC. C++ and GC...


Currently, the autodecoding functions allocate with the GC and throw as well. 
(They'll GC allocate an exception and throw it if they encounter an invalid UTF 
sequence. The adapters use the more common method of inserting a substitution 
character and continuing on.) This makes it harder to make GC-free Phobos code.
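The substitution behavior can be sketched like this (assuming a byUTF-style adapter as found in std.utf, whose default is to substitute U+FFFD rather than throw):

```d
import std.algorithm.comparison : equal;
import std.utf : byUTF;

void main()
{
    // "Hi" followed by an invalid UTF-8 byte.
    auto bad = x"48 69 FF";

    // byUTF!dchar neither GC-allocates nor throws on bad input: it
    // inserts the U+FFFD replacement character and continues,
    // unlike the throwing autodecoding path.
    assert(bad.byUTF!dchar.equal("Hi\uFFFD"d));
}
```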




Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread bearophile via Digitalmars-d

Walter Bright:

I suggest removal of stack trace for exceptions, but leaving 
them in for asserts.


I suggest keeping the stack trace in both cases, and improving it 
with colors :-) Another possibility is to keep the stack trace 
for exceptions in non-release mode only.



Asserts are a deliberately designed debugging tool. Exceptions 
are not.


Exceptions are often used to help debugging... We have even 
allowed exceptions inside D contracts (but I don't know why).


Bye,
bearophile


Re: [Semi OT] Language for Game Development talk

2014-09-28 Thread bearophile via Digitalmars-d

Walter Bright:

Inlining is not a random thing. If there's a case that doesn't 
inline, ask about it.


Even with annotations like "forced_inline" you can't be certain 
the compiler is doing what you ask for. I have seen several 
critical template functions not inlined even by ldc2 (dmd is 
usually worse). And if your function is in a pre-compiled object 
file whose source code is not available, inlining doesn't happen.




@outer implies purity with a list of exceptions.


Right, an empty "@outer()" is equivalent to D's "weakly pure". 
But if I list exceptions, then I am essentially stating that the 
function is not pure. So you usually don't put @outer() on pure 
functions.



Again, template functions do that rather nicely, and no type is 
required.


I think templates are a blunt tool for this purpose, but I will 
try them to see how they fare. I have not seen people use 
templates this way, so I suspect it's not very natural.




void foo() {
    int x;
    @outer(in x) void bar() {
        writeln(x);
    }
    bar();
}


This I find to be a bit more lame, because the declarations 
used will be right there.


Yes, and a module-level variable can be defined the line before 
the function definition, or it can be far from it (it can even 
come from an imported module). The same is true for variables 
defined in the outer function; that's why I named it @outer 
instead of @globals. Even a large function is usually smaller 
than a whole module (and modern coding practice suggests avoiding 
very large functions), so the variable definition should be 
closer (and in D it must appear lexically before the definition 
of the inner function). The problem with module-level variables 
is bigger, but it's similar.




But if you really wanted to do it,

void foo() {
    int x;
    static void bar(int x) {
        writeln(x);
    }
    bar(x);
}


As you said, the same is possible with global variables, you can 
often pass them as arguments, by value or by reference. I agree 
that the case of using @outer() for inner functions is weaker.


Bye,
bearophile


Re: Creeping Bloat in Phobos

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 3:14 AM, bearophile wrote:

I get refusals if I propose tiny breaking changes that require changes in a
small amount of user code.  In comparison the user code changes you are
suggesting are very large.


I'm painfully aware of what a large change removing autodecoding is. That means 
it'll take a long time to do it. In the meantime, we can stop adding new code to 
Phobos that does autodecoding. We have taken the first step by adding the 
.byDchar and .byCodeUnit adapters.




Re: Creeping Bloat in Phobos

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 10:03 AM, John Colvin wrote:

There is no magic bullet for unicode in a systems language such as D. The
programmer must be aware of it and make choices about how to treat it.


That's really the bottom line.

The trouble with autodecode is it is done at the lowest level, meaning it is 
very hard to bypass. By moving the decision up a level (by using .byDchar or 
.byCodeUnit adapters) the caller makes the decision.
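A sketch of what "moving the decision up a level" looks like in caller code, using the adapters mentioned above (assuming std.utf's byCodeUnit and byDchar):

```d
import std.range : walkLength;
import std.utf : byCodeUnit, byDchar;

void main()
{
    string s = "αβγ"; // three code points, six UTF-8 code units

    // The caller states the abstraction level explicitly instead
    // of inheriting autodecoding from the library:
    assert(s.byCodeUnit.walkLength == 6); // raw code units
    assert(s.byDchar.walkLength == 3);    // decoded code points
}
```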


Re: Creeping Bloat in Phobos

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 4:46 AM, Dmitry Olshansky wrote:

In all honesty - 2 RAII structs w/o inlining + setting up exception frame +
creating and allocating an exception + idup-ing a string does amount to about
this much.


Twice as much generated code as actually necessary, and this is just for 3 lines 
of source code.




Re: Creeping Bloat in Phobos

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 5:06 AM, Uranuz wrote:

A question: can you list some languages that represent UTF-8 narrow strings 
as arrays of single bytes?


C and C++.


Re: [Semi OT] Language for Game Development talk

2014-09-28 Thread po via Digitalmars-d
 His entire reason for not using C++ lambda, and wanting local 
functions instead, is faulty


His reason: I heard they can perform heap allocation.

Reality: No they do not perform heap allocation, that would only 
happen if you stuffed it into an std::function





Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Andrei Alexandrescu via Digitalmars-d

On 9/28/14, 10:36 AM, Walter Bright wrote:

On 9/28/2014 2:28 AM, bearophile wrote:

And for exceptions I agree completely with your arguments and I think
that
there is no need for stack.

I think Walter is not suggesting to remove the stack trace for
exceptions.


I suggest removal of stack trace for exceptions, but leaving them in for
asserts.

Asserts are a deliberately designed debugging tool. Exceptions are not.


I'm fine with that philosophy, with the note that it's customary 
nowadays to inflict things like the stack trace on the user. -- Andrei




Re: Possible quick win in GC?

2014-09-28 Thread Abdulhaq via Digitalmars-d

Here's a code snippet which hopefully makes things a bit clearer:


/**
* In this example the variable foo can be statically analysed as 
* safe to go on the stack. The new instance of Bar allocated in 
* funcLevelB is only referred to by foo. foo can be considered a 
* root 'scoped' variable, and the GC can delete both foo and the 
* new Bar() when foo goes out of scope. There is no need (except 
* when under memory pressure) for the GC to scan the band created 
* for foo and its related child allocations.
*/

import std.stdio;

class Bar {
public:
int x;

this(int x) {
this.x = x;
}
}

class Foo {
public:
Bar bar;
}

void funcLevelA() {
    Foo foo = new Foo(); // static analysis could detect this as able to go on the stack
    funcLevelB(foo);
    writeln(foo.bar.x);
}

void funcLevelB(Foo foo) {
    foo.bar = new Bar(12); // this allocated memory is only referred to by foo, which
                           // static analysis has established can go on the stack
}

void main() {
funcLevelA();
}


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 8:10 AM, Ary Borenszweig wrote:

For me, assert is useless.

We are developing a language using LLVM as our backend. If you give LLVM
something it doesn't like, you get something like this:

~~~
Assertion failed: (S1->getType() == S2->getType() && "Cannot create binary
operator with two operands of differing type!"), function Create, file
Instructions.cpp, line 1850.

Abort trap: 6
~~~

That is what the user gets when there is a bug in the compiler, at least when we
are generating invalid LLVM code. And that's one of the good paths, if you
compiled LLVM with assertions, because otherwise I guess it's undefined 
behaviour.

What I'd like to do, as a compiler, is to catch those errors and tell the user:
"You've found a bug in the app, could you please report it in this URL? Thank
you.". We can't: the assert is there and we can't change it.


You can hook D's assert and do what you want with it.
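For reference, a sketch of such a hook via druntime's assert handler (assuming the core.exception.assertHandler property; the handler runs in place of throwing AssertError, and note that -release compiles asserts out entirely):

```d
import core.exception : assertHandler;
import core.stdc.stdlib : exit;
import std.stdio : stderr;

// Runs instead of the default AssertError throw.
void reportBug(string file, size_t line, string msg) nothrow
{
    try
        stderr.writefln("You've found a bug in the app, could you " ~
                        "please report it? %s(%s): %s", file, line, msg);
    catch (Exception) {}
    exit(1);
}

void main()
{
    assertHandler = &reportBug;
    assert(1 + 1 == 3, "demonstration failure"); // invokes reportBug
}
```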



Now, this is when you interface with C++/C code. But inside our language code we
always use exceptions so that programmers can choose what to do in case of an
error. With assert you lose that possibility.


If you want to use Exceptions for debugging in your code, I won't try and stop 
you. But using them for debugging in official Phobos I strongly object to.




Installing an exception handler is cost-free,


Take a look at the assembler dump from std.file.copy() that I posted in the 
other thread.




so I don't see why there is a need
for a less powerful construct like assert.


Exceptions are meant for RECOVERABLE errors. If you're using them instead of 
assert for logic bugs, you're looking at undefined behavior. Logic bugs are not 
recoverable.




Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 2:28 AM, bearophile wrote:

And for exceptions I agree completely with your arguments and I think that
there is no need for stack.

I think Walter is not suggesting to remove the stack trace for exceptions.


I suggest removal of stack trace for exceptions, but leaving them in for 
asserts.

Asserts are a deliberately designed debugging tool. Exceptions are not.


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 9:16 AM, Sean Kelly wrote:

On Sunday, 28 September 2014 at 00:40:26 UTC, Walter Bright wrote:


Whoa, Camel! You're again thinking of Exceptions as a debugging tool.


They can be.


Of course they can be. But it's inappropriate to use them that way, and we 
should not be eschewing such in the library.



What if an API you're using throws an exception you didn't expect,
and therefore don't handle?


Then the app user sees the error message. This is one of the cool things about D 
- I can write small apps with NO error handling logic in it, and I still get 
appropriate and friendly messages when things go wrong like missing files.


That is, until recently, when I get a bunch of debug stack traces and internal 
file/line messages, which are of no use at all to an app user and look awful.



This might be considered a logic error if the
exception is recoverable and you don't intend the program to abort from that
operation.


Adding file/line to all exceptions implies that they are all bugs, and 
encourages them to be thought of as bugs and debugging tools, when they are NOT. 
Exceptions are for:


1. enabling recovery from input/environmental errors
2. reporting input/environmental errors to the app user
3. making input/environmental errors not ignorable by default

They are not for detecting logic errors. Assert is designed for that.


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 9:23 AM, Sean Kelly wrote:

Also, I think the idea that a program is created and shipped to an end user is
overly simplistic.  In the server/cloud programming world, when an error occurs,
the client who submitted the request will get a response appropriate for them
and the system will also generate log information intended for people working on
the system.  So things like stack traces and assertion failure information are
useful even for production software.  Same with any critical system, as I'm sure
you're aware.  The systems are designed to handle failures in specific ways, but
they also have to leave a breadcrumb trail so the underlying problem can be
diagnosed and fixed.  Internal testing is never perfect, and achieving a high
coverage percentage is nearly impossible if the system wasn't designed from the
ground up to be testable in such a way (mock frameworks and such).


Then use assert(). That's just what it's for.


Re: Messaging

2014-09-28 Thread Paulo Pinto via Digitalmars-d

Am 28.09.2014 15:42, schrieb Paulo Pinto:

Am 28.09.2014 15:16, schrieb Andrei Alexandrescu:

On 9/28/14, 5:46 AM, Paulo Pinto wrote:

Am 28.09.2014 14:16, schrieb Andrei Alexandrescu:

On 9/27/14, 5:57 PM, Walter Bright wrote:

On 9/27/2014 4:27 PM, Peter Alexander wrote:

I've now filed a bug.

https://issues.dlang.org/show_bug.cgi?id=13547


Thanks for filing the bug report. I was going to raise its priority,
and
found you'd already done so!

Any takers?

Andrei, wanna put a bounty on it?


This probably doesn't warrant a bounty. I'm traveling and will be
nearly
out of touch, somebody please just fix it. Peter? -- Andrei


I can try a shot at it.

Should it provide just additional information that allocators and manual
memory management are also available, or something more detailed?


I'd probably refer to the existing std.allocator draft but clarifying
it's work in progress.

Also mention there's work undergoing on making the stdlib usable without
a garbage collector.


Thanks!

Andrei




The ticket has been updated accordingly. I am now updating some parts of
it.


As I need to go out, I will finish it by the evening.

--
Paulo



Changes are now awaiting review.


--
Paulo


Re: [Semi OT] Language for Game Development talk

2014-09-28 Thread Walter Bright via Digitalmars-d

On 9/28/2014 2:16 AM, bearophile wrote:

Walter Bright:

If the function is small enough that parameter setup time is significant, it
is likely to be inlined anyway which will erase the penalty.

This is the theory :-) Sometimes I have seen this not to be true.


Inlining is not a random thing. If there's a case that doesn't inline, ask 
about it.



Then you really aren't encapsulating the globals the function uses, anyway,
and the feature is useless.


The point of having an @outer() is to enforce what module-level names a function
is using and how it is using them (in/out/inout), so it's useful for _impure_
functions. If your functions can be annotated with "pure" you don't need @outer
much.


@outer implies purity with a list of exceptions.



- @outer is more DRY, because you don't need to specify the type of the global
variable received by ref, you just need to know its name.


That's why gawd invented templates.


Templates are a blunt tool to solve the problems faced by @outer(). @outer()
allows you to specify for each variable if it's going to just be read, written,
or read-written.


Again, template functions do that rather nicely, and no type is required.



And @outer() is not named @globals() because it works for inner
functions too, it allows to control and specify the access of names from the
outer scopes, so an inner impure nonstatic function can use @outer() to specify
what names it can use from the scope of the outer function:

void foo() {
    int x;
    @outer(in x) void bar() {
        writeln(x);
    }
    bar();
}


This I find to be a bit more lame, because the declarations used will be right 
there. But if you really wanted to do it,


void foo() {
    int x;
    static void bar(int x) {
        writeln(x);
    }
    bar(x);
}
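The template alternative Walter alludes to can be sketched like this (a hypothetical example, not code from the thread): the parameter's type is inferred, and the qualifier documents the access mode much as @outer's in/out/inout would.

```d
import std.stdio;

int counter; // module-level state

// The type is inferred by the template; the qualifier states the
// access: `ref const` for read-only, plain `ref` for read-write.
void show(T)(ref const T v) { writeln(v); }
void bump(T)(ref T v) { ++v; }

void main()
{
    bump(counter);
    bump(counter);
    show(counter); // prints 2
    assert(counter == 2);
}
```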




Re: Creeping Bloat in Phobos

2014-09-28 Thread John Colvin via Digitalmars-d
On Sunday, 28 September 2014 at 14:38:57 UTC, H. S. Teoh via 
Digitalmars-d wrote:
On Sun, Sep 28, 2014 at 12:06:16PM +, Uranuz via 
Digitalmars-d wrote:
On Sunday, 28 September 2014 at 00:13:59 UTC, Andrei 
Alexandrescu wrote:

>On 9/27/14, 3:40 PM, H. S. Teoh via Digitalmars-d wrote:
>>If we can get Andrei on board, I'm all for killing off 
>>autodecoding.

>
>That's rather vague; it's unclear what would replace it. -- 
>Andrei


I believe that removing autodecoding will make things even 
worse. As far as I understand, if we remove it from the front() 
function that operates on narrow strings, then it will return 
just bytes (char). I believe that processing narrow strings by 
`user perceived chars` (graphemes) is the more common use case.

[...]

Unfortunately this is not what autodecoding does today. Today's
autodecoding only segments strings into code *points*, which 
are not the
same thing as graphemes. For example, combining diacritics are 
normally
not considered separate characters from the user's POV, but 
they *are*
separate codepoints from their base character. The only reason 
today's
autodecoding is even remotely considered "correct" from an 
intuitive POV
is because most Western character sets happen to use only 
precomposed
characters rather than combining diacritic sequences. If you 
were
processing, say, Korean text, the present autodecoding .front 
would
*not* give you what you might imagine is a "single character"; 
it would
only be halves of Korean graphemes. Which, from a user's POV, 
would
suffer from the same issues as dealing with individual bytes in 
a UTF-8
stream -- any mistake on the program's part in handling these 
half-units
will cause "corruption" of the text (not corruption in the same 
sense as
an improperly segmented UTF-8 byte stream, but in the sense 
that the
wrong glyphs will be displayed on the screen -- from the user's 
POV

these two are basically the same thing).

You might then be tempted to say, well let's make .front return
graphemes instead. That will solve the "single intuitive 
character"
issue, but the performance will be FAR worse than what it is 
today.


So basically, what we have today is neither efficient nor 
complete, but
a halfway solution that mostly works for Western character sets 
but
is incomplete for others. We're paying efficiency for only a 
partial

benefit. Is it worth the cost?

I think the correct solution is not for Phobos to decide for the
application at what level of abstraction a string ought to be 
processed.
Rather, let the user decide. If they're just dealing with 
opaque blocks
of text, decoding or segmenting by grapheme is completely 
unnecessary --
they should just operate on byte ranges as opaque data. They 
should use
byCodeUnit. If they need to work with Unicode codepoints, let 
them use

byCodePoint. If they need to work with individual user-perceived
characters (i.e., graphemes), let them use byGrapheme.

This is why I proposed the deprecation path of making it 
illegal to pass
raw strings to Phobos algorithms -- the caller should specify 
what level
of abstraction they want to work with -- byCodeUnit, 
byCodePoint, or
byGrapheme. The standard library's job is to empower the D 
programmer by
giving him the choice, not to shove a predetermined solution 
down his

throat.


T


I totally agree with all of that.

It's one of those cases where correct by default is far too slow 
(that would have to be graphemes) but fast by default is far too 
broken. Better to force an explicit choice.


There is no magic bullet for unicode in a systems language such 
as D. The programmer must be aware of it and make choices about 
how to treat it.


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Xiao Xie via Digitalmars-d
On Sunday, 28 September 2014 at 15:10:26 UTC, Ary Borenszweig 
wrote:


What I'd like to do, as a compiler, is to catch those errors 
and tell the user: "You've found a bug in the app, could you 
please report it in this URL? Thank you.". We can't: the assert 
is there and we can't change it.




Why does a SIGABRT handler not work for your use case? Print and 
exit?


Possible quick win in GC?

2014-09-28 Thread Abdulhaq via Digitalmars-d
Perhaps I've had too much caffeine today, but I've had an idea 
which might give a fairly quick win on the GC speed situation. 
It's a simple idea at heart, so it's very possible/likely that 
this is a well known idea that has already been discarded, but 
anyway here goes.


I got the idea after thinking that it should be fairly simple for 
the compiler to detect straightforward cases of when a variable 
can be declared as going on the stack - i.e. no references to it 
are retained after its enclosing function returns. At the moment 
AIUI it is necessary for a class instance to be declared by the 
programmer as 'scoped' for this to take place.


Further, I was considering the type of ownership and boundary 
considerations that could be used to improve memory management - 
e.g. using the notion of an owner instance which, upon 
destruction, destroys all owned objects.


After some consideration it seems to me that by using only static 
analysis a tree of references could be constructed of references 
from a root 'scoped' object to all referred to objects that are 
allocated after the allocation of the root object. When the root 
object goes out of scope it is destroyed and all the descendent 
objects from the root object (as identified by the static 
analysis) could also be destroyed in one simple shot. The static 
analysis of course constructs the tree by analysing the capturing 
of references from one object to another. It could be the case 
that even a simple static analysis at first (e.g. discard the 
technique in difficult situations) could cover a lot of use cases 
(statistically).


Of course, if one of the descendent objects is referred to by an 
object which is not in the object tree, then this technique 
cannot be used. However, I envisage that there are many 
situations where upon the destruction of a root object all 
related post-allocated objects can also be destroyed.


In terms of implementation I see this being done by what I am 
calling 'bands' within the GC. With the allocation of any 
identified root object, a new band (heap) is created in the GC. 
Child objects of the root object (i.e. only referred to by the 
root object and other child objects in its tree) are placed in 
the same band. When the root object goes out of scope the entire 
band is freed. This by definition is safe because the static 
analysis has ensured that there are no 'out-of-tree' references 
between child objects in the tree and out-of-tree (out-of-band) 
objects. This property also means that normal GC runs do not need 
to add the scoped root object as a GC root object - this memory 
will normally only be freed when the scoped root object at the 
top of the tree goes out of scope. If memory becomes constrained 
then the bands can be added as root objects to the GC and memory 
incrementally freed just as with regularly allocated objects.



Sorry if this idea is daft and I've wasted your time!



Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Sean Kelly via Digitalmars-d

On Sunday, 28 September 2014 at 16:16:09 UTC, Sean Kelly wrote:
On Sunday, 28 September 2014 at 00:40:26 UTC, Walter Bright 
wrote:


Whoa, Camel! You're again thinking of Exceptions as a 
debugging tool.


They can be.  What if an API you're using throws an exception 
you didn't expect, and therefore don't handle?  This might be 
considered a logic error if the exception is recoverable and 
you don't intend the program to abort from that operation.


Also, I think the idea that a program is created and shipped to 
an end user is overly simplistic.  In the server/cloud 
programming world, when an error occurs, the client who submitted 
the request will get a response appropriate for them and the 
system will also generate log information intended for people 
working on the system.  So things like stack traces and assertion 
failure information are useful even for production software.  Same 
with any critical system, as I'm sure you're aware.  The systems 
are designed to handle failures in specific ways, but they also 
have to leave a breadcrumb trail so the underlying problem can be 
diagnosed and fixed.  Internal testing is never perfect, and 
achieving a high coverage percentage is nearly impossible if the 
system wasn't designed from the ground up to be testable in such 
a way (mock frameworks and such).


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Sean Kelly via Digitalmars-d

On Sunday, 28 September 2014 at 00:40:26 UTC, Walter Bright wrote:


Whoa, Camel! You're again thinking of Exceptions as a debugging 
tool.


They can be.  What if an API you're using throws an exception you 
didn't expect, and therefore don't handle?  This might be 
considered a logic error if the exception is recoverable and you 
don't intend the program to abort from that operation.


Re: Program logic bugs vs input/environmental errors

2014-09-28 Thread Ary Borenszweig via Digitalmars-d

On 9/27/14, 8:15 PM, Walter Bright wrote:

This issue comes up over and over, in various guises. I feel like
Yosemite Sam here:

 https://www.youtube.com/watch?v=hBhlQgvHmQ0

In that vein, Exceptions are for either being able to recover from
input/environmental errors, or report them to the user of the application.

When I say "They are NOT for debugging programs", I mean they are NOT
for debugging programs.

assert()s and contracts are for debugging programs.


For me, assert is useless.

We are developing a language using LLVM as our backend. If you give LLVM 
something it doesn't like, you get something like this:


~~~
Assertion failed: (S1->getType() == S2->getType() && "Cannot create 
binary operator with two operands of differing type!"), function Create, 
file Instructions.cpp, line 1850.


Abort trap: 6
~~~

That is what the user gets when there is a bug in the compiler, at least 
when we are generating invalid LLVM code. And that's one of the good 
paths, if you compiled LLVM with assertions, because otherwise I guess 
it's undefined behaviour.


What I'd like to do, as a compiler, is to catch those errors and tell 
the user: "You've found a bug in the app, could you please report it in 
this URL? Thank you.". We can't: the assert is there and we can't change it.


Now, this is when you interface with C++/C code. But inside our language 
code we always use exceptions so that programmers can choose what to do 
in case of an error. With assert you lose that possibility.


Raising an exception is costly, but that should happen in exceptional 
cases. Installing an exception handler is cost-free, so I don't see why 
there is a need for a less powerful construct like assert.


Re: User-defined "opIs"

2014-09-28 Thread via Digitalmars-d

On Sunday, 28 September 2014 at 14:27:56 UTC, Marco Leise wrote:

Am Sun, 28 Sep 2014 10:44:47 +
schrieb "Marc Schütz" :

On Saturday, 27 September 2014 at 11:38:51 UTC, Marco Leise 
wrote:

>   A byte for byte comparison of both operands is performed.
>   For reference types this is the reference itself.

Maybe allow this only for types that somehow implicitly 
convert to each other, i.e. via alias this?


That sounds like a messy rule set on top of the original when
alias this does not represent all of the type. You have a
size_t and a struct with a pointer that aliases itself to some
size_t that can be retrieved through that pointer.
The alias this will make it implicitly convert to size_t and
the byte for byte comparison will happily compare two equally
sized variables (size_t and pointer).
So how narrow would the rule have to be defined before it
reads:

  If you compare with a struct that consists only of one member
  that the struct aliases itself with, a variable of the type
  of that member will be compared byte for byte with the
  struct.


Yeah, it wasn't a good idea. Somehow it felt strange to throw all 
type safety out of the window, but on the other hand, bit-level 
comparison is the purpose of `is`, which isn't typesafe to begin 
with.


Re: Creeping Bloat in Phobos

2014-09-28 Thread H. S. Teoh via Digitalmars-d
On Sun, Sep 28, 2014 at 12:06:16PM +, Uranuz via Digitalmars-d wrote:
> On Sunday, 28 September 2014 at 00:13:59 UTC, Andrei Alexandrescu wrote:
> >On 9/27/14, 3:40 PM, H. S. Teoh via Digitalmars-d wrote:
> >>If we can get Andrei on board, I'm all for killing off autodecoding.
> >
> >That's rather vague; it's unclear what would replace it. -- Andrei
> 
> I believe that removing autodecoding will make things even worse. As
> far as I understand, if we remove it from the front() function that
> operates on narrow strings, then it will return just bytes (char). I
> believe that processing narrow strings by `user perceived chars`
> (graphemes) is the more common use case.
[...]

Unfortunately this is not what autodecoding does today. Today's
autodecoding only segments strings into code *points*, which are not the
same thing as graphemes. For example, combining diacritics are normally
not considered separate characters from the user's POV, but they *are*
separate codepoints from their base character. The only reason today's
autodecoding is even remotely considered "correct" from an intuitive POV
is because most Western character sets happen to use only precomposed
characters rather than combining diacritic sequences. If you were
processing, say, Korean text, the present autodecoding .front would
*not* give you what you might imagine is a "single character"; it would
only be halves of Korean graphemes. Which, from a user's POV, would
suffer from the same issues as dealing with individual bytes in a UTF-8
stream -- any mistake on the program's part in handling these half-units
will cause "corruption" of the text (not corruption in the same sense as
an improperly segmented UTF-8 byte stream, but in the sense that the
wrong glyphs will be displayed on the screen -- from the user's POV
these two are basically the same thing).
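
The mismatch is easy to demonstrate with a combining-diacritic sequence.
This is a minimal sketch (names and values chosen for illustration):
"e" followed by U+0301 COMBINING ACUTE ACCENT renders as one
user-perceived character, but it is two code points.

```d
import std.range : front, walkLength;
import std.uni : byGrapheme;

void main() {
    string s = "e\u0301"; // "é" as a base letter plus combining accent

    // Autodecoding .front yields only the base code point, not the
    // full user-perceived character:
    assert(s.front == 'e');

    // Counting by code points sees two elements...
    assert(s.walkLength == 2);

    // ...while byGrapheme sees the single character a user expects:
    assert(s.byGrapheme.walkLength == 1);
}
```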

You might then be tempted to say, well let's make .front return
graphemes instead. That will solve the "single intuitive character"
issue, but the performance will be FAR worse than what it is today.

So basically, what we have today is neither efficient nor complete, but
a halfway solution that mostly works for Western character sets but
is incomplete for others. We're paying efficiency for only a partial
benefit. Is it worth the cost?

I think the correct solution is not for Phobos to decide for the
application at what level of abstraction a string ought to be processed.
Rather, let the user decide. If they're just dealing with opaque blocks
of text, decoding or segmenting by grapheme is completely unnecessary --
they should just operate on byte ranges as opaque data. They should use
byCodeUnit. If they need to work with Unicode codepoints, let them use
byCodePoint. If they need to work with individual user-perceived
characters (i.e., graphemes), let them use byGrapheme.
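
The three levels of abstraction described above can be sketched side by
side. This assumes current Phobos, where byCodeUnit lives in std.utf and
byCodePoint/byGrapheme live in std.uni:

```d
import std.range : walkLength;
import std.uni : byCodePoint, byGrapheme;
import std.utf : byCodeUnit;

void main() {
    // One grapheme, two code points, three UTF-8 code units:
    string s = "e\u0301";

    assert(s.byCodeUnit.walkLength == 3);  // raw UTF-8 code units (bytes)
    assert(s.byCodePoint.walkLength == 2); // Unicode code points
    assert(s.byGrapheme.walkLength == 1);  // user-perceived characters
}
```

The caller picks the level explicitly, which is exactly the deprecation
path argued for here: no hidden decoding, no one-size-fits-all default.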

This is why I proposed the deprecation path of making it illegal to pass
raw strings to Phobos algorithms -- the caller should specify what level
of abstraction they want to work with -- byCodeUnit, byCodePoint, or
byGrapheme. The standard library's job is to empower the D programmer by
giving him the choice, not to shove a predetermined solution down his
throat.


T

-- 
Life is unfair. Ask too much from it, and it may decide you don't
deserve what you have now either.


Re: User-defined "opIs"

2014-09-28 Thread Marco Leise via Digitalmars-d
On Sun, 28 Sep 2014 10:44:47 +, "Marc Schütz" wrote:

> On Saturday, 27 September 2014 at 11:38:51 UTC, Marco Leise wrote:
> >   A byte for byte comparison of both operands is performed.
> >   For reference types this is the reference itself.
> 
> Maybe allow this only for types that somehow implicitly convert 
> to each other, i.e. via alias this?

That sounds like a messy rule set on top of the original when
alias this does not represent all of the type. Suppose you have
a size_t and a struct with a pointer that aliases itself to some
size_t retrieved through that pointer. The alias this will make
the struct implicitly convert to size_t, and the byte-for-byte
comparison will happily compare two equally sized variables
(a size_t and a pointer).
So how narrowly would the rule have to be defined before it
reads:

  If you compare with a struct that consists only of one member
  that the struct aliases itself with, a variable of the type
  of that member will be compared byte for byte with the
  struct.
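
The scenario above can be sketched in D. `Handle` is a hypothetical
name for the struct being described: it holds a pointer and
alias-this's itself to a size_t reached through that pointer.

```d
struct Handle {
    size_t* p;
    ref size_t value() { return *p; }
    alias value this; // Handle implicitly converts to size_t
}

void main() {
    size_t x = 42;
    auto h = Handle(&x);

    size_t n = h; // via alias this: reads *h.p
    assert(n == 42);

    // A bitwise user-defined `is` comparing h against n would compare
    // the *pointer* bits stored in h.p against the value 42 -- almost
    // never equal, even though the two "values" match. That is the
    // pitfall with restricting the rule by alias this alone.
}
```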

-- 
Marco



Re: Messaging

2014-09-28 Thread Paulo Pinto via Digitalmars-d

On 28.09.2014 15:24, Mike wrote:

On Sunday, 28 September 2014 at 12:46:30 UTC, Paulo Pinto wrote:

Should it provide just additional information that allocators and
manual memory management are also available, or something more detailed?



You may want to point readers here: http://wiki.dlang.org/Memory_Management

Mike


Thanks for the info.

Already added a small intro to @nogc.

http://wiki.dlang.org/Memory_Management#Writing_GC_free_code
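
For readers who haven't seen it yet, a minimal sketch of what @nogc
means (this is an illustration, not content from the wiki page): the
attribute makes the compiler reject any GC allocation in the function.

```d
// @nogc: the compiler statically rejects GC allocations in here.
@nogc int sum(const(int)[] a) {
    int s = 0;
    foreach (x; a)
        s += x;
    return s;
}

@nogc void main() {
    int[3] buf = [1, 2, 3]; // stack storage, no GC involved
    assert(sum(buf[]) == 6);
    // auto arr = new int[](3); // would not compile: allocates via the GC
}
```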

--
Paulo


Re: std.experimental.logger formal review round 3

2014-09-28 Thread Robert burner Schadek via Digitalmars-d

On Sunday, 28 September 2014 at 12:24:23 UTC, Dicebot wrote:
Previous review thread : 
http://forum.dlang.org/post/zhvmkbahrqtgkptdl...@forum.dlang.org


Previous voting thread (also contains discussion in the end) : 
http://forum.dlang.org/post/vbotavcclttrgvzcj...@forum.dlang.org


Code : 
https://github.com/D-Programming-Language/phobos/pull/1500


Important changes since last review:
- new approach for compile-time log level filtering
- thread-safe API (by Marco Leise, commits 3b32618..e71f317)
- documentation enhancements all over the place
- more @nogc annotations
- "raw" log overload that expects pre-formatted message and all 
metadata (file/line/function) as run-time function arguments


(anything I have missed Robert?)


I added more unittests, but unfortunately didn't find any bugs.
The "raw" log overload is actually a template function that has 
one template parameter, the value you want to log.


That's all, I think


