RE: FFI, safe vs unsafe

2006-04-13 Thread Simon Marlow
On 13 April 2006 10:02, Marcin 'Qrczak' Kowalczyk wrote:

> John Meacham <[EMAIL PROTECTED]> writes:
> 
>> Checking thread local state for _every_ foreign call is definitely
>> not an option either. (But for specifically annotated ones it is
>> fine.)
> 
> BTW, does Haskell support foreign code calling Haskell in a thread
> which the Haskell runtime has not seen before? Does it work in GHC?

Yes, yes.

> If so, does it show the same ThreadId from that point until OS
> thread's death (like in Kogut), or a new ThreadId for each callback
> (like in Python)?

A new ThreadId, but that's not a conscious design decision, just a
symptom of the fact that we don't re-use old threads.
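That non-reuse is observable from Haskell alone: two threads forked one
after the other never share a ThreadId, even once the first has finished.
A minimal sketch (assuming GHC's Control.Concurrent):

```haskell
import Control.Concurrent

main :: IO ()
main = do
  m  <- newEmptyMVar
  t1 <- forkIO (putMVar m ())   -- first thread runs and finishes
  takeMVar m
  t2 <- forkIO (putMVar m ())   -- second thread, forked after t1 is done
  takeMVar m
  print (t1 /= t2)              -- ThreadIds are distinct: threads are not re-used
```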

Cheers,
Simon
___
Haskell-prime mailing list
Haskell-prime@haskell.org
http://haskell.org/mailman/listinfo/haskell-prime


Re: FFI, safe vs unsafe

2006-04-13 Thread Marcin 'Qrczak' Kowalczyk
John Meacham <[EMAIL PROTECTED]> writes:

> Checking thread local state for _every_ foreign call is definitely
> not an option either. (But for specifically annotated ones it is
> fine.)

BTW, does Haskell support foreign code calling Haskell in a thread
which the Haskell runtime has not seen before? Does it work in GHC?

If so, does it show the same ThreadId from that point until OS
thread's death (like in Kogut), or a new ThreadId for each callback
(like in Python)?

-- 
   __("< Marcin Kowalczyk
   \__/   [EMAIL PROTECTED]
^^ http://qrnik.knm.org.pl/~qrczak/


Re: FFI, safe vs unsafe

2006-04-12 Thread John Meacham
On Wed, Apr 12, 2006 at 11:37:57PM -0400, Wolfgang Thaller wrote:
> John Meacham wrote:
> 
> >However, in order to achieve that we would have to annotate the foreign
> >functions with whether they use thread local state.
> 
> I am not opposed to that; however, you might not like that here  
> again, there would be the safe, possibly inefficient default choice,  
> which means "might access thread local data", and the possibly more  
> efficient annotation that comes with a proof obligation, which says  
> "guaranteed not to access thread local data".
> The main counterargument is that some libraries, like OpenGL, require
> many *fast* nonconcurrent, nonreentrant but TLS-using calls (and,
> most likely, one reentrant and possibly concurrent call for the GLUT
> main event loop). Using OpenGL would probably be infeasible from an
> implementation which requires a "notls" annotation to make foreign
> imports fast.

This is getting absurd. 95% of foreign imports are going to be
nonreentrant, nonconcurrent, and non-TLS-using. Worrying about the
minor inconvenience that someone might accidentally write buggy code
is silly when you have 'peek' and 'poke' and the ability to deadlock
right out there in the open.

The FFI is inherently unsafe. We do not need to coddle the programmer
who is writing raw FFI code.

_Any_ time you use the FFI there are a large number of proof obligations
you are committing to that aren't necessarily apparent, so why make these
_very rare_ cases so visible? There is a reason they aren't named
'unsafePoke' and 'unsafePeek': the convenience of using the names poke
and peek outweighs the unsafety concern, because you are already using
the FFI and already know everything is unsafe and that you need to be
careful. These problems can't even crash the runtime, which makes them
far safer than a lot of the unannotated routines in the FFI.
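As a reminder of how little checking 'peek' and 'poke' already do, here is
a sketch using the standard Foreign modules: writing through a raw pointer
is completely unchecked, and a wrong pointer would crash the process
rather than raise an exception.

```haskell
import Foreign.Marshal.Alloc (alloca)
import Foreign.Storable (peek, poke)

main :: IO ()
main = alloca $ \p -> do
  poke p (42 :: Int)  -- raw write through a pointer: no bounds or lifetime checks
  v <- peek p         -- raw read back, equally unchecked
  print v             -- a bad pointer here would be a crash, not an exception
```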


> >it would pretty much
> >be vital for implementing them efficiently on a non-OS-threaded
> >implementation of the language.
> 
> True, with the implementation plan you've outlined so far.
> Have you considered hybrid models where most threads are state  
> threads (all running in one OS thread) and a few threads (=the bound  
> threads) are OS threads which are prevented from actually executing  
> in parallel by a few well-placed locks and condition variables? You  
> could basically write a wrapper around the state threads and  
> pthreads libraries, and you'd get the best of both worlds. I feel it  
> wouldn't be that hard to implement, either.

Well, I plan a hybrid model of some sort, simply because one is needed to
support foreign concurrent calls. Exactly where I will draw the line
between them is still up in the air.

But in any case, I really like explicit annotations on everything, as we
can't predict what future implementations might come about, so we should
play it safe in the standard.

> >Oddly enough, depending on the implementation it might actually be
> >easier to just make every 'threadlocal' function fully concurrent. you
> >have already paid the cost of dealing with OS threads.
> 
> Depending on the implementation, yes. This is the case for the  
> inefficient implementation we recommended for interpreters like Hugs  
> in our bound threads paper; there, the implementation might be  
> constrained by the fact that Hugs implements cooperative threading in  
> Haskell using continuation passing in the IO monad; the interpreter  
> itself doesn't even really know about threads. For jhc, I feel that a  
> hybrid implementation would be better.

Yeah, what I am planning is just providing a create-new-stack and a
jump-to-a-different-stack (longjmp) primitive, with everything else
implemented in Haskell as part of the standard libraries (with
liberal use of the FFI to call things like pthread_create and epoll).

So it is actually fairly close to the Hugs implementation in that it is
mostly Haskell based, but with some better primitives to work with (from
what I understand of how Hugs works).



> >you seem to be contradicting yourself, above you say a performance
> >penalty is vitally important in the GUI case if a call takes too  
> >long, [...]
> 
> I am not. What I was talking about above was not performance, but  
> responsiveness; it's somewhat related to fairness in scheduling.
> If a foreign call takes 10 microseconds instead of 10 nanoseconds,  
> that is a performance penalty that will matter in some circumstances,  
> and not in others (after all, people are writing real programs in  
> Python...). If a GUI does not respond to events for more than two  
> seconds, it is badly written. If the computer or the programming  
> language implementation are just too slow (performance) to achieve a  
> certain task in that time, the Right Thing To Do is to put up a  
> progress bar and keep processing screen update events while doing it,  
> or even do it entirely "in the background".
> Of course, responsiveness is not an issue for non-interactive  
> processes, but for GUIs it is very important.

Re: FFI, safe vs unsafe

2006-04-12 Thread Wolfgang Thaller

John Meacham wrote:

However, in order to achieve that we would have to annotate the foreign
functions with whether they use thread local state.


I am not opposed to that; however, you might not like that here  
again, there would be the safe, possibly inefficient default choice,  
which means "might access thread local data", and the possibly more  
efficient annotation that comes with a proof obligation, which says  
"guaranteed not to access thread local data".
The main counterargument is that some libraries, like OpenGL, require
many *fast* nonconcurrent, nonreentrant but TLS-using calls (and,
most likely, one reentrant and possibly concurrent call for the GLUT
main event loop). Using OpenGL would probably be infeasible from an
implementation which requires a "notls" annotation to make foreign
imports fast.



it would pretty much
be vital for implementing them efficiently on a non-OS-threaded
implementation of the language.


True, with the implementation plan you've outlined so far.
Have you considered hybrid models where most threads are state  
threads (all running in one OS thread) and a few threads (=the bound  
threads) are OS threads which are prevented from actually executing  
in parallel by a few well-placed locks and condition variables? You  
could basically write a wrapper around the state threads and  
pthreads libraries, and you'd get the best of both worlds. I feel it  
wouldn't be that hard to implement, either.



Oddly enough, depending on the implementation it might actually be
easier to just make every 'threadlocal' function fully concurrent. you
have already paid the cost of dealing with OS threads.


Depending on the implementation, yes. This is the case for the  
inefficient implementation we recommended for interpreters like Hugs  
in our bound threads paper; there, the implementation might be  
constrained by the fact that Hugs implements cooperative threading in  
Haskell using continuation passing in the IO monad; the interpreter  
itself doesn't even really know about threads. For jhc, I feel that a  
hybrid implementation would be better.



They are a bonus in that you can't run concurrently computing Haskell
threads at the same time. You get "free" concurrent threads in other
languages that you would not get if the libraries just happened to be
implemented in Haskell. However, if the libraries were implemented in
Haskell, you would still get concurrency on OS blocking events because
the progress guarantee says so.


Hmm... it sounds like you've been assuming cooperative scheduling,  
while I've been assuming preemptive scheduling (at least GHC-style  
preemption, which only checks after x bytes of allocation). Maybe, in  
a cooperative system, it is a little bit of a bonus, although I'd  
still want it for practical reasons. I can make my Haskell  
computations call yield, but how do I make a foreign library (whose  
author will just say "Let them use threads") cooperate? In a  
preemptive system, the ability to run a C computation in the  
background remains a normal use case, not a bonus.



The question "can I provide a certain guarantee or not" could be
answered with "no" by default to flatten the learning curve a bit. My
objection against having "no default" is not very strong, but I do
object to specifying this "in neutral language". This situation does
not call for neutral language; rather, it has to be made clear that
one of the options comes with a proof obligation and the other only
with a performance penalty.


you seem to be contradicting yourself, above you say a performance
penalty is vitally important in the GUI case if a call takes too  
long, [...]


I am not. What I was talking about above was not performance, but  
responsiveness; it's somewhat related to fairness in scheduling.
If a foreign call takes 10 microseconds instead of 10 nanoseconds,  
that is a performance penalty that will matter in some circumstances,  
and not in others (after all, people are writing real programs in  
Python...). If a GUI does not respond to events for more than two  
seconds, it is badly written. If the computer or the programming  
language implementation are just too slow (performance) to achieve a  
certain task in that time, the Right Thing To Do is to put up a  
progress bar and keep processing screen update events while doing it,  
or even do it entirely "in the background".
Of course, responsiveness is not an issue for non-interactive  
processes, but for GUIs it is very important.



Who is to say whether an app that
muddles along is better or worse than one that is generally snappy but
has an occasional delay?


I am ;-). Apart from that, I feel that is a false dichotomy, as even  
a factor 1000 slowdown in foreign calls is no excuse to make a GUI  
"generally muddle along".



Though, I am a fan of neutral language in general. You can't crash the
system like you can with unsafePerformIO, and FFI calls that take a while
and aren't already wrapped by the standard libraries are relatively rare.

Re: FFI, safe vs unsafe

2006-04-12 Thread Marcin 'Qrczak' Kowalczyk
Taral <[EMAIL PROTECTED]> writes:

> fast - takes very little time to execute

I was thinking about "quick". If my feel for English is good enough, it
is less literal about speed; the effect is indeed not just speed.

Both fit as a description of the foreign function, and as a
request to an implementation (to make the call faster at the expense
of something).

The trouble with both is that they raise the question "shouldn't I
mark all FFI calls as fast, so they are faster?". They don't hint
at a cost of the optimization: faster at the expense of what?

-- 
   __("< Marcin Kowalczyk
   \__/   [EMAIL PROTECTED]
^^ http://qrnik.knm.org.pl/~qrczak/


Re: FFI, safe vs unsafe

2006-04-12 Thread John Meacham
On Wed, Apr 12, 2006 at 07:35:22PM -0400, Wolfgang Thaller wrote:
> John Meacham wrote:
> >This doesn't have to do with bound threads, [...]
> 
> I brought it up because the implementation you are proposing
> fulfills the most important feature provided by bound threads,
> namely being able to access the thread-local state of the "main" OS
> thread (the one that runs C main()), only for nonconcurrent calls,
> but not concurrent calls. This gives people a reason to specify some
> calls as nonconcurrent, even when they are actually expected to
> block, and it is desirable for other threads to continue running.
> This creates an implementation-specific link between the concurrent/
> nonconcurrent question and support for OS-thread-local state. I would
> probably end up writing different code for different Haskell
> implementations in this situation.

Oh, I made that proposal a while ago as a first draft; bound threads
should be possible whether calls are concurrent or not. I am not
positive I like the GHC semantics, but bound threads themselves pose not
much more overhead than supporting concurrent calls in the first place
(which is a fairly substantial overhead to begin with). But that doesn't
matter to me as long as there isn't a performance impact in the case
where they aren't used.

However, in order to achieve that we would have to annotate the foreign
functions with whether they use thread local state. It would pretty much
be vital for implementing them efficiently on a non-OS-threaded
implementation of the language. You need to perform a
stack-pass-the-baton dance between threads to hand the Haskell stack to
the right OS thread, which is a substantial overhead you can't pay just
in case the call might be running in a 'forkOS'-created thread. Checking
thread local state for _every_ foreign call is definitely not an option
either (but for specifically annotated ones it is fine). GHC doesn't
have this issue because it can have multiple Haskell threads running at
once on different OS threads, so it just needs to create one that
doesn't jump between threads and let foreign calls proceed naturally.
Non-OS-threaded implementations have the opposite problem: they need to
support a Haskell thread that _can_ (and does) jump between OS threads.
One pays the cost at thread creation time, the other pays the cost at
foreign call time. The only way to reconcile these would be to annotate
both (which is perfectly fine by me if bound threads are needed, which I
presume they are).

Oddly enough, depending on the implementation, it might actually be
easier to just make every 'threadlocal' function fully concurrent: you
have already paid the cost of dealing with OS threads.

> Calculations done by foreign calls are not a "bonus", but an  
> important use case for concurrent calls. Think of a library call that  
> causes a multimedia library to recompress an entire video file;  
> estimated time required is between a few seconds and a day. In a  
> multithreaded program, this call needs to be concurrent. It is true  
> that the program will still terminate even if the call is  
> nonconcurrent, but in the real world, termination will probably occur  
> by the user choosing to "force quit" an application that is "not  
> responding" (also known as sending SIGTERM or even SIGKILL).

They are a bonus in that you can't run concurrently computing Haskell
threads at the same time. You get "free" concurrent threads in other
languages that you would not get if the libraries just happened to be
implemented in Haskell. However, if the libraries were implemented in
Haskell, you would still get concurrency on OS blocking events because
the progress guarantee says so.

> The question "can I provide a certain guarantee or not" could be  
> answered with "no" by default to flatten the learning curve a bit. My  
> objection against having "no default" is not very strong, but I do  
> object to specifying this "in neutral language". This situation does  
> not call for neutral language; rather, it has to be made clear that  
> one of the options comes with a proof obligation and the other only  
> with a performance penalty.

You seem to be contradicting yourself: above you say a performance
penalty is vitally important in the GUI case if a call takes too long,
but here you call it 'just a performance penalty'. The overhead of
concurrent calls is quite substantial. Who is to say whether an app that
muddles along is better or worse than one that is generally snappy but
has an occasional delay?

Though, I am a fan of neutral language in general. You can't crash the
system like you can with unsafePerformIO, and FFI calls that take a while
and aren't already wrapped by the standard libraries are relatively rare.
No need for strong language.


John


-- 
John Meacham - ⑆repetae.net⑆john⑈


Re: FFI, safe vs unsafe

2006-04-12 Thread Wolfgang Thaller

John Meacham wrote:

This doesn't have to do with bound threads, [...]


I brought it up because the implementation you are proposing
fulfills the most important feature provided by bound threads,
namely being able to access the thread-local state of the "main" OS
thread (the one that runs C main()), only for nonconcurrent calls,
but not concurrent calls. This gives people a reason to specify some
calls as nonconcurrent, even when they are actually expected to
block, and it is desirable for other threads to continue running.
This creates an implementation-specific link between the concurrent/
nonconcurrent question and support for OS-thread-local state. I would
probably end up writing different code for different Haskell
implementations in this situation.


Note that some predictable way of interacting with OS threads (and
OS-thread-local state) is necessary in order to be able to use some  
libraries at all, so not using OS threads *at all* might not be a  
feasible method of implementing a general-purpose programming  
language (or at least, not a feasible implementation method for  
general purpose implementations of general purpose programming  
languages).


The whole point of having bound threads is to NOT require a 1:1  
correspondence between OS threads and Haskell threads but still be  
able to interact with libraries that use OS-thread-local-state. They  
allow implementers to use OS threads just for *some* threads (i.e.  
just where necessary), while still having full efficiency and freedom  
of implementation for the other ("non-bound") threads.
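A sketch of what bound threads look like on the Haskell side (GHC
assumed; forkOS needs the threaded RTS, so the code checks
rtsSupportsBoundThreads first). The forked thread is pinned to one OS
thread, so every foreign call it makes would see the same thread-local
C state:

```haskell
import Control.Concurrent

main :: IO ()
main
  | rtsSupportsBoundThreads = do
      m <- newEmptyMVar
      -- forkOS creates a *bound* thread: all its foreign calls are made
      -- from the same dedicated OS thread.
      _ <- forkOS (isCurrentThreadBound >>= putMVar m)
      bound <- takeMVar m
      putStrLn (if bound then "bound" else "unbound")
  | otherwise = putStrLn "non-threaded RTS"
```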


There might be simpler schemes that can support libraries requiring  
OS-thread-local-state for the most common use cases, but there is a  
danger that it will interact with the concurrent/nonconcurrent issue  
in implementation-specific ways if we don't pay attention.


I object to the idea that concurrent calls are 'safer'. Getting it wrong
either way is a bug. It should fail in the most obvious way rather than
the way that can remain hidden for a long time.


How can making a call "concurrent" rather than "nonconcurrent" ever  
be a bug?



In any case, blocking is a pretty well defined operation on operating
systems: it is when the kernel can context switch you away waiting for a
resource, which is the main use of concurrent calls. The ability to use
them for long calculations in C is a sort of bonus; the actual main use

is to ensure the progress guarantee,


I disagree with this. First, concurrent calls serve a real-world  
purpose for all interactive programs. GUI applications are soft  
realtime systems; if a GUI application stops processing events for  
more than 2 seconds (under regular system load), I consider it buggy.
Second, although blocking is well-defined for kernel operations, the  
documented interface of most libraries does not include any  
guarantees on whether they will block the process or not; sometimes  
the difference might be entirely irrelevant; does it make a  
difference whether a drawing function in a library writes to video  
memory or sends an X request across the network? Saying something  
"takesawhile" doesn't muddy things; it is a strictly weaker condition  
than whether something blocks.
Calculations done by foreign calls are not a "bonus", but an  
important use case for concurrent calls. Think of a library call that  
causes a multimedia library to recompress an entire video file;  
estimated time required is between a few seconds and a day. In a  
multithreaded program, this call needs to be concurrent. It is true  
that the program will still terminate even if the call is  
nonconcurrent, but in the real world, termination will probably occur  
by the user choosing to "force quit" an application that is "not  
responding" (also known as sending SIGTERM or even SIGKILL).


Reducing the issue to the question whether a function blocks or not  
is just plain wrong.



I'd actually prefer it if there were no default and it had to
be specified in neutral language because I think this is one of those
things I want FFI library writers to think about.


But as I have been saying, the decision that FFI library writers have  
to make, or rather the only decision that they *can* make, is the  
simple decision of "can I guarantee that this call will return to its  
caller (or reenter Haskell via a callback) before the rest of the  
program is negatively affected by a pause". If the function could  
block, the answer is a definite "no", otherwise the question is  
inherently fuzzy. Unfortunately, I don't see a way of avoiding this  
fuzziness.


The question "can I provide a certain guarantee or not" could be
answered with "no" by default to flatten the learning curve a bit. My
objection against having "no default" is not very strong, but I do
object to specifying this "in neutral language". This situation does
not call for neutral language; rather, it has to be made clear that
one of the options comes with a proof obligation and the other only
with a performance penalty.

Re: FFI, safe vs unsafe

2006-04-12 Thread Claus Reinke

if I may repeat myself (again), since my old suggestion now seems to
agree with Wolfgang, Ross, and Simon:

   http://www.haskell.org//pipermail/haskell-prime/2006-March/001129.html
   ...
   so my suggestion would be to make no assumption about
   unannotated calls (don't rely on the programmer too much;),
   and to have optional keywords "atomic" and "non-reentrant".

but yes, "non-reentrant" is rather too long - perhaps "external" (it is
outside Haskell and stays out)?

   foreign import - we don't know anything, some implementations
   might not support this
   foreign import atomic - function is neither blocking nor long-running
   foreign import external - function has no callbacks to Haskell
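As a sketch, the three forms might look like this in FFI-declaration
syntax (hypothetical surface syntax: no current compiler accepts these
keywords, and the function names are invented for illustration):

```haskell
foreign import ccall          "foo" foo :: CInt -> IO CInt
  -- no assumptions: an implementation may use its slowest, safest path
foreign import ccall atomic   "bar" bar :: CInt -> CInt
  -- neither blocking nor long-running
foreign import ccall external "baz" baz :: CInt -> IO CInt
  -- stays outside Haskell: never calls back in
```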

cheers,
claus

---
Wolfgang Thaller:
|Personally, I'm still in favour of inverting this. We are not in  
|court here, so every foreign function is guilty until proven  
|innocent. Every foreign function might be "longrunning" unless the  
|programmer happens to know otherwise. So maybe... "returnsquickly"?


---

On 2006-04-11, Ross Paterson <[EMAIL PROTECTED]> wrote:

On Tue, Apr 11, 2006 at 09:13:00AM +0100, Simon Marlow wrote:

 - the default should be... concurrent reentrant, presumably,
   because that is the safest.  (so we need to invert the notation).

I think the name "concurrent" has a similar problem to "safe": it
reads as an instruction to the implementation, rather than a
declaration by the programmer of the properties of a particular
function; as Wolfgang put it, "this function might spend a lot of
time in foreign lands". 

I'd like to second this.


I agree.  So other suggestions?  longrunning?  mightblock or mayblock?

I don't much like 'nonreentrant', it's a bit of a mouthful.  Any other
suggestions for that?  nocallback?

Cheers,
Simon


Re: FFI, safe vs unsafe

2006-04-12 Thread John Meacham
On Thu, Apr 13, 2006 at 12:43:26AM +0200, Marcin 'Qrczak' Kowalczyk wrote:
> What about getaddrinfo()? It doesn't synchronize with the rest of the
> program, it will eventually complete no matter whether other threads
> make progress, so making it concurrent is not necessary for correctness.
> It should be made concurrent nevertheless because it might take a long
> time. It does block; if it didn't block but needed the same time for
> an internal computation which doesn't go back to Haskell, it would
> still benefit from making the call concurrent.

getaddrinfo most definitely blocks, so it should be made concurrent; it
uses sockets internally. The progress guarantee is meant to imply "if
something can effectively use the CPU, it will be given it if nothing
else is using it", not merely that everything will eventually complete.
Performing a long calculation is progress, whether in Haskell or C;
waiting on a file descriptor isn't.
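In GHC's existing annotation scheme, the "concurrent" call discussed here
is spelled "safe": the RTS lets other Haskell threads keep running while
the C call blocks. A sketch using usleep as a stand-in for a blocking
call like getaddrinfo (POSIX and GHC assumed):

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types

-- A "safe" call: the runtime arranges for other Haskell threads to make
-- progress while this C call blocks in the kernel.
foreign import ccall safe "unistd.h usleep"
  c_usleep :: CUInt -> IO CInt

main :: IO ()
main = do
  _ <- c_usleep 1000   -- blocks this call for ~1ms; other threads keep running
  putStrLn "done"
```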

John

-- 
John Meacham - ⑆repetae.net⑆john⑈


Re: FFI, safe vs unsafe

2006-04-12 Thread Marcin 'Qrczak' Kowalczyk
John Meacham <[EMAIL PROTECTED]> writes:

> I object to the idea that concurrent calls are 'safer'. getting it
> wrong either way is a bug. it should fail in the most obvious way
> rather than the way that can remain hidden for a long time.

I wouldn't consider it a bug in an implementation if it makes a call
behave as concurrent when it's specified as non-concurrent. If a
library wants to make it a critical section, it should use a mutex
(MVar).

Or there should be another kind of foreign call which requires
serialization of calls. But serialization with which calls? It's rarely
the case that a call needs to be serialized only with other calls to the
same function, and also rare that it must be serialized with everything
else, so the granularity of the mutex must be explicit. It's fine to code
the mutex explicitly if there is a kosher way to make it global.
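A sketch of the explicit-mutex approach, using a hypothetical stand-in
for a non-reentrant foreign function (the NOINLINE/unsafePerformIO
pattern is the usual, if inelegant, way to get a top-level lock):

```haskell
import Control.Concurrent.MVar
import System.IO.Unsafe (unsafePerformIO)

-- A global lock guarding all calls that must be serialized.
{-# NOINLINE lock #-}
lock :: MVar ()
lock = unsafePerformIO (newMVar ())

-- Stand-in for a foreign import of a non-reentrant C function.
someForeignCall :: Int -> IO Int
someForeignCall x = return (x * 2)

-- withMVar takes the lock, runs the call, and releases the lock,
-- so at most one such call runs at a time.
serialized :: Int -> IO Int
serialized x = withMVar lock (\_ -> someForeignCall x)

main :: IO ()
main = serialized 21 >>= print
```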

Non-concurrent calls which really do block other threads should be
treated only as an efficiency trick: in implementations where the
runtime is non-reentrant and dispatches threads running Haskell code
internally, making such a call without ensuring that other Haskell
threads have other OS threads to run them is faster.

OTOH in implementations which run Haskell threads truly in parallel,
the natural default is to let C code behave concurrently. Ensuring
that it is serialized would require extra work which is counter-productive.
For functions like sqrt() the programmer wants to say that there is no
need to make it concurrent, without also saying that it requires calls
to be serialized.

> Which is why I'd prefer some term involving 'blocking' because that
> is the issue. blocking calls are exactly those you need to make
> concurrent in order to ensure the progress guarentee.

What about getaddrinfo()? It doesn't synchronize with the rest of the
program, it will eventually complete no matter whether other threads
make progress, so making it concurrent is not necessary for correctness.
It should be made concurrent nevertheless because it might take a long
time. It does block; if it didn't block but needed the same time for
an internal computation which doesn't go back to Haskell, it would
still benefit from making the call concurrent.

It is true that concurrent calls often coincide with blocking. It's
simply the most common reason for a single non-calling-back function
to take a long time, and one which can often be predicted statically
(operations on extremely long integers might take a long time too,
but it would be hard to differentiate them from the majority which don't).

The name 'concurrent' would be fine with me if the default is 'not
necessarily concurrent'. If concurrent calls are the default, the name
'nonconcurrent' is not so good, because it would seem to imply some
serialization which is not mandatory.

-- 
   __("< Marcin Kowalczyk
   \__/   [EMAIL PROTECTED]
^^ http://qrnik.knm.org.pl/~qrczak/


Re: FFI, safe vs unsafe

2006-04-12 Thread John Meacham
On Wed, Apr 12, 2006 at 04:40:29PM -0500, Taral wrote:
> pure - side-effect free
We don't really need "pure", because not having an IO type in the result
already implies pure.
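For example, a foreign import whose result type is not in IO is usable
directly in pure code, so purity needs no separate keyword (a sketch,
assuming GHC and the C math library):

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types

-- The non-IO result type *is* the purity annotation: c_sqrt can be used
-- in pure expressions, with no extra keyword.
foreign import ccall unsafe "math.h sqrt"
  c_sqrt :: CDouble -> CDouble

main :: IO ()
main = print (c_sqrt 4)
```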
John

-- 
John Meacham - ⑆repetae.net⑆john⑈


Re: FFI, safe vs unsafe

2006-04-12 Thread Taral
On 4/12/06, Wolfgang Thaller <[EMAIL PROTECTED]> wrote:
> Personally, I'm still in favour of inverting this. We are not in
> court here, so every foreign function is guilty until proven
> innocent. Every foreign function might be "longrunning" unless the
> programmer happens to know otherwise. So maybe... "returnsquickly"?

Hear, hear:

fast - takes very little time to execute
pure - side-effect free
nocallback - does not call back into Haskell

--
Taral <[EMAIL PROTECTED]>
"You can't prove anything."
-- Gödel's Incompetence Theorem


Re: FFI, safe vs unsafe

2006-04-12 Thread John Meacham
On Wed, Apr 12, 2006 at 12:07:06PM -0400, Wolfgang Thaller wrote:
> 3) There might be implementations where concurrent calls run on a  
> different thread than nonconcurrent calls.

This is necessarily true for non-OS-threaded implementations: there is
no other way to wait for an arbitrary C call to end than to spawn a
thread to run it in.

This doesn't have to do with bound threads; to support those you just
need to make sure the other thread you run concurrent calls on is always
the same thread. It is the cost of setting up the mechanism to pass
control to the other thread and wait for the response that is an issue:
turning a single call instruction into several system calls, some memory
mashing, and a context switch or two.

I object to the idea that concurrent calls are 'safer'. Getting it wrong
either way is a bug. It should fail in the most obvious way rather than
the way that can remain hidden for a long time.

in any case, blocking is a pretty well defined operation on operating
systems: it is when the kernel can context switch you away while you
wait for a resource, and that is the main use of concurrent calls. the
ability to use them for long calculations in C is a sort of bonus; the
actual main use is to ensure the progress guarantee, that if the OS is
going to take away the CPU because one part of your program is waiting
for something, another part of your program can make progress. Which is
why I'd prefer some term involving 'blocking', because that is the
issue: blocking calls are exactly those you need to make concurrent in
order to ensure the progress guarantee. saying something like
'takesawhile' muddies things: what is a while? not that concurrent
calls shouldn't be used for long C calculations, that is quite a nice
if uncommonly needed perk, but I don't want the report to confuse
matters by conflating a quantitative, real matter (meeting the progress
guarantee) with a qualitative one ("does this take a while?"). I'd
actually prefer it if there were no default and the behaviour had to be
specified in neutral language, because I think this is one of those
things I want FFI library writers to think about.
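The blocking/non-blocking split being argued for maps onto what GHC (and later the FFI addendum) spells 'safe' and 'unsafe'; a minimal sketch in that notation, where the particular C functions are just illustrations of the two categories:

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types

-- sin never blocks in the kernel: a non-concurrent ('unsafe' in
-- GHC's spelling) call avoids the call-out/call-in machinery.
foreign import ccall unsafe "math.h sin"
    c_sin :: CDouble -> CDouble

-- sleep can block for a long time: it must be a concurrent ('safe')
-- call, or no other Haskell thread can make progress while it waits.
foreign import ccall safe "unistd.h sleep"
    c_sleep :: CUInt -> IO CUInt

main :: IO ()
main = print (c_sin 0)
```

The annotation is per call site, which is exactly why the naming matters: it declares a property of the foreign function, not an instruction to the scheduler.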

John

-- 
John Meacham - ⑆repetae.net⑆john⑈
___
Haskell-prime mailing list
Haskell-prime@haskell.org
http://haskell.org/mailman/listinfo/haskell-prime


Re: FFI, safe vs unsafe

2006-04-12 Thread Wolfgang Thaller

Simon Marlow wrote:


I agree.  So other suggestions?  longrunning?  mightblock or mayblock?


I don't like "*block", because the question of blocking is irrelevant
to this issue. It's about whether the foreign call returns sooner or
later, not about whether it spends the time until then blocked or
running.


Personally, I'm still in favour of inverting this. We are not in  
court here, so every foreign function is guilty until proven  
innocent. Every foreign function might be "longrunning" unless the  
programmer happens to know otherwise. So maybe... "returnsquickly"?


As far as I can gather, there are three arguments *against* making
longrunning/concurrent the default:


1) It's not needed often, and it might be inefficient.
2) There might be implementations that don't support it at all (I  
might have convinced John that everyone should support it though..).
3) There might be implementations where concurrent calls run on a  
different thread than nonconcurrent calls.


Now I don't buy argument 1; for me the safety/expected-behaviour/easy-for-beginners
argument easily trumps the performance argument.


ad 3). For implementations that don't support Bound Threads, John
Meacham proposed saying that nonconcurrent calls are guaranteed to be
executed on the main OS thread, but no guarantees were made about
concurrent calls; that makes a lot of sense implementation-wise.
However, this means that calls to the main loops for GLUT, Carbon and  
Cocoa (and maybe others) have to be annotated  
"nonconcurrent"/"returnsquickly" despite the fact that they return  
only after a long time, just to keep access to the right thread-local  
state. I see a big fat #ifdef heading our way.


Cheers,

Wolfgang



RE: FFI, safe vs unsafe

2006-04-12 Thread Simon Marlow
On 11 April 2006 17:49, Aaron Denney wrote:

> On 2006-04-11, Ross Paterson <[EMAIL PROTECTED]> wrote:
>> On Tue, Apr 11, 2006 at 09:13:00AM +0100, Simon Marlow wrote:
>>>  - the default should be... concurrent reentrant, presumably,
>>>because that is the safest.  (so we need to invert the notation).
>> 
>> I think the name "concurrent" has a similar problem to "safe": it
>> reads as an instruction to the implementation, rather than a
>> declaration by the programmer of the properties of a particular
>> function; as Wolfgang put it, "this function might spend a lot of
>> time in foreign lands". 
> 
> I'd like to second this.

I agree.  So other suggestions?  longrunning?  mightblock or mayblock?

I don't much like 'nonreentrant', it's a bit of a mouthful.  Any other
suggestions for that?  nocallback?

Cheers,
Simon


Re: FFI, safe vs unsafe

2006-04-11 Thread Aaron Denney
On 2006-04-11, Ross Paterson <[EMAIL PROTECTED]> wrote:
> On Tue, Apr 11, 2006 at 09:13:00AM +0100, Simon Marlow wrote:
>>  - the default should be... concurrent reentrant, presumably, because
>>that is the safest.  (so we need to invert the notation).
>
> I think the name "concurrent" has a similar problem to "safe": it reads
> as an instruction to the implementation, rather than a declaration by the
> programmer of the properties of a particular function; as Wolfgang put it,
> "this function might spend a lot of time in foreign lands".

I'd like to second this.

-- 
Aaron Denney
-><-



Re: FFI, safe vs unsafe

2006-04-11 Thread John Meacham
On Tue, Apr 11, 2006 at 09:13:00AM +0100, Simon Marlow wrote:
> What are the conclusions of this thread?
> 
> I think, but correct me if I'm wrong, that the eventual outcome was:
> 
>  - concurrent reentrant should be supported, because it is not 
>significantly more difficult to implement than just concurrent.

It wasn't a difficulty-of-implementation issue, it was whether there
were unavoidable performance tradeoffs. I have no problem with very
difficult things if they are well specified and don't require
unreasonable concessions elsewhere in the design.

in any case, I think the __thread local storage trick makes this fast
enough to implement everywhere, and there were strong arguments against
letting it cause issues for library developers.


>  - the different varieties of foreign call should all be identifiable,
>because there are efficiency gains to be had in some implementations.

indeed. 

>  - the default should be... concurrent reentrant, presumably, because
>that is the safest.  (so we need to invert the notation).

well, I like to reserve the word 'safe' for things that might crash the
runtime, like unsafePerformIO, so making things nonconcurrent isn't so
much unsafe as a design decision. I'd prefer nonconcurrent be the
default because it is the much more common case and is just as safe in
that regard IMHO.

> So, can I go ahead and update the wiki?  I'll try to record the
> rationale from the discussion too.

sure.

> I'd like to pull out something from the discussion that got a bit lost
> in the swamp: the primary use case we have for concurrent reentrant is
> for calling the main loop of a GUI library.  The main loop usually never
> returns (at least, not until the application exits), hence concurrent,
> and it needs to invoke callbacks, hence reentrant.

this is a pain (making various libraries' main loops play nice
together), though it is not a haskell-specific problem; I guess we have
to deal with it. I was thinking of using something like
http://liboop.org/ internally in jhc, but am not sure and would prefer
a pure haskell solution without a compelling reason to do otherwise.


John

-- 
John Meacham - ⑆repetae.net⑆john⑈


Re: FFI, safe vs unsafe

2006-04-11 Thread Ross Paterson
On Tue, Apr 11, 2006 at 09:13:00AM +0100, Simon Marlow wrote:
>  - the default should be... concurrent reentrant, presumably, because
>that is the safest.  (so we need to invert the notation).

I think the name "concurrent" has a similar problem to "safe": it reads
as an instruction to the implementation, rather than a declaration by the
programmer of the properties of a particular function; as Wolfgang put it,
"this function might spend a lot of time in foreign lands".



RE: FFI, safe vs unsafe

2006-04-11 Thread Simon Marlow
What are the conclusions of this thread?

I think, but correct me if I'm wrong, that the eventual outcome was:

 - concurrent reentrant should be supported, because it is not 
   significantly more difficult to implement than just concurrent.

 - the different varieties of foreign call should all be identifiable,
   because there are efficiency gains to be had in some implementations.

 - the default should be... concurrent reentrant, presumably, because
   that is the safest.  (so we need to invert the notation).

So, can I go ahead and update the wiki?  I'll try to record the
rationale from the discussion too.

I'd like to pull out something from the discussion that got a bit lost
in the swamp: the primary use case we have for concurrent reentrant is
for calling the main loop of a GUI library.  The main loop usually never
returns (at least, not until the application exits), hence concurrent,
and it needs to invoke callbacks, hence reentrant.

Cheers,
Simon


Re: FFI, safe vs unsafe

2006-04-06 Thread Fergus Henderson
On 03-Apr-2006, John Meacham <[EMAIL PROTECTED]> wrote:
> On Mon, Apr 03, 2006 at 02:00:33PM -0400, Wolfgang Thaller wrote:
> > About how fast thread-local state really is:
> > __thread attribute on Linux: ~ 2 memory load instructions.
> > __declspec(thread) in MSVC++ on Windows: about the same.
> > pthread_getspecific on Mac OS X/x86 and Mac OS X/G5: ~10 instructions
> > pthread_getspecific on Linux and TlsGetValue on Windows: ~10-20  
> > instructions
> > pthread_getspecific on Mac OS X/G4: a system call :-(.
> 
> how prevelant is support for __thread BTW? is it required by any
> standards or an ELFism?

It's a recent innovation that standards have not yet had time to catch
up with.  But __thread is the way of the future, IMHO.
GNU C/C++ supports __thread for at least the following architectures:

IA-32, x86-64,
IA-64,
SPARC (32- and 64-bit),
Alpha,
S390 (31- and 64-bit),
SuperH (SH),
HP/PA 64-bit.

It's also supported by Borland C++ builder, Sun Studio C/C++, and Visual C++.
(VC++ does use a slightly different syntax, but you can use macros to work
around that.)

-- 
Fergus J. Henderson |  "I have always known that the pursuit
Galois Connections, Inc.|  of excellence is a lethal habit"
Phone: +1 503 626 6616  | -- the last words of T. S. Garp.


Re: FFI, safe vs unsafe

2006-04-05 Thread Marcin 'Qrczak' Kowalczyk
I think the following kinds of foreign calls wrt. concurrency are
sensible:

1. Other Haskell threads might get paused (but don't have to).

   Examples: sqrt, qsort (we assume that qsort never needs a long time
   between calls to the comparison function, so there is no need to
   allow other threads, and it's more important to avoid context
   switching overhead).

2. Other Haskell threads should run if possible, but this is not
   strictly necessary if the implementation doesn't support that.

   Examples: stat, computing md5 of a byte array (the call doesn't
   block for an arbitrarily long time, so pausing other threads is
   acceptable, but with slow hardware or on a multiprocessor it might
   be preferable to allow for more concurrency).

3. Other Haskell threads must run.

   Examples: wait, blocking read (a blocking call; not running Haskell
   threads might lead to deadlocks).

2 is the same as 1 on some implementations, and the same as 3 on others.

3 is possible only in implementations which make use of OS threads.
A foreign call annotated with 3 would be an error in some implementations.

Old GHC, before bound threads, can't support 3. In GHC with bound threads
2 is equivalent to 3. In the SMP version even 1 allows some other threads
to run (but multiple threads doing calls of kind 1 might stop other threads).

1 or 2 are reasonable defaults.

Variant 1 has two subvariants: allowing callbacks or not. I don't know
whether it makes sense to differentiate this for other variants, i.e.
whether disallowing callbacks allows to generate faster code. Anyway,
if 2 is chosen as the default, I wish to be able to specify variant 1
sans callbacks with a single keyword, like 'unsafe' today, because
it's quite common.

I'm not sure whether 3 should be provided at all. Perhaps when
wrapping a foreign function it's generally not known whether full
concurrency is essential or not, so it's always better to block other
threads than to reject code.

Some functions need to use a different implementation when variant 2
blocks other threads. For example, instead of calling a blocking
function, the program might call some variant of it with a timeout,
and after the timeout other threads are given a chance to run.
This way the waiting thread uses its timeslices and wakes up the
processor every couple of milliseconds, but at least it works.
In my implementation of my language Kogut I simply don't block
the timer signal in this case.

So I propose to not provide 3, but instead provide a constant which
a program can use to discover whether 2 blocks other threads or not.
Using that constant it can either choose an alternative strategy,
or abort if it knows that 3 would be essential.
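GHC did later expose a constant of roughly this shape, `rtsSupportsBoundThreads` in `Control.Concurrent` (True only under the threaded runtime). A sketch of the discovery-and-fallback idea using it — the strategy strings are illustrative, not an API:

```haskell
import Control.Concurrent (rtsSupportsBoundThreads)

-- Branch on what the runtime can do: with a threaded RTS,
-- kind-3 (fully concurrent) calls are available; without it,
-- fall back to the poll-with-timeout strategy described above.
chooseStrategy :: String
chooseStrategy
    | rtsSupportsBoundThreads = "blocking call on its own OS thread"
    | otherwise               = "retry with timeouts, yield in between"

main :: IO ()
main = putStrLn chooseStrategy
```

The point is that the program, not the foreign import declaration, decides what to do when full concurrency is unavailable.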


Wolfgang Thaller <[EMAIL PROTECTED]> writes:

> 1.) Assume thread A and B are running. Thread A makes a non-
> concurrent, reentrant call to Foreign Lands. The foreign function
> calls a foreign-exported Haskell function 'foo'.
> While 'foo' is executing, does thread B resume running?

Yes, when the scheduler chooses it.

> 2.) Assume the same situation as in 1, and assume that the answer to
> 1 is yes. While 'foo' is running, (Haskell) thread B makes a non-
> concurrent, reentrant foreign call. The foreign function calls back
> to the foreign-exported Haskell function 'bar'. Because the answer to
> 1 was yes, 'foo' will resume executing concurrently with 'bar'.
> If 'foo' finishes executing before 'bar' does, what will happen?

There are sensible implementations where the foreign code of thread A
after calling 'foo' continues running, and is running alone, while
thread B is paused until thread A either calls another Haskell
function or returns to Haskell. Bound threads in GHC work like this
I think.

And there are sensible implementations where thread A is paused trying
to return from 'foo', until 'bar' returns. This might lead to a
deadlock if 'bar' will wait for something only thread A can do, but
it's unavoidable if the implementation doesn't use OS threads at all.
I think these implementations coincide with those which are unable to
provide variant 3, so providing the constant I mentioned allows to
distinguish these cases too.

> 3.) Same situation as in 1. When 'foo' is called, it forks (using
> forkIO) a Haskell thread C. How many threads are running now?

Three.

> 4.) Should there be any guarantee about (Haskell) threads not making
> any progress while another (Haskell) thread is executing a non-
> concurrent call?

No. In an implementation which runs every Haskell threads on its
own OS thread, with a concurrent runtime, all foreign calls are
actually concurrent and the modifiers have no effect.

> 5.) Assume that Haskell Programmer A writes a Haskell library that
> uses some foreign code with callbacks, like for example, the GLU
> Tesselator (comes with OpenGL), or, as a toy example, the C Standard
> Library's qsort function. Should Programmer A specify "concurrent
> reentrant" on his foreign import?

If the call can take a long time before enter

Re: FFI, safe vs unsafe

2006-04-03 Thread Wolfgang Thaller

John Meacham wrote (... but I've reordered things):


My only real 'must-have' is that the 4 modes all can be explicitly and
unambiguously specified. I have opinions on the syntax/hints but that is
more flexible.


I basically agree (the syntax discussion will take place in the years  
after the semantics discussion), but...


I want programmers to have a way of saying "this function might spend  
a lot of time in foreign lands". These calls should be concurrent on  
all implementations that support it (because some separately  
developed/maintained piece of Haskell code might expect to run a  
computation in the background), but if there are implementations that  
don't support it shouldn't flag an error, because that would  
encourage library writers to specify nonconcurrent when they can't  
prove that it's safe, or make code needlessly nonportable.
Another way to look at it: You cannot decide whether the call  
actually has to be done concurrently by just looking at the call site  
- you'd need to look at the entire program, and asking people  
(especially library writers) to state and guarantee global properties  
of a program that might not even be finished yet is a Bad Thing.  
Therefore, the concurrency annotation on the foreign import can only  
be a hint on whether the foreign function is guaranteed to return  
quickly or not; the actual requirement for the call to be  
"concurrent" is hidden in the other code that expects to run at the  
same time. Therefore, it would be wrong for an implementation that  
doesn't support concurrent calls (reentrant or nonreentrant, I don't  
care) to flag an error; the foreign import declaration correctly  
refuses to give a guarantee that the function will return quickly.  
The error is in the code that expects to run concurrently with a  
foreign import on an implementation that doesn't support that (but of  
course, a compiler can't detect such an error).



Another nice minor thing would be if haskell implementations were
required to ignore annotations starting with 'x-' for implementation
specific hints.


Sounds good. Syntax discussion postponed again ('x-' looks so mime- 
typish. Could we add a meaningless 'application/' to the front? Just  
kidding).



In my survey of when 'reentrant concurrent' was needed, I looked at  
all
the standard libraries and didn't find anywhere it was actually  
needed.

Are there some compelling examples of when it is really needed in a
setting that doesn't have OS threads to begin with? (I am not  
asserting

they don't exist, I just want to see some example uses of this feature
to get a better handle on the implementation cost)


In my experience, reentrant calls are rare in non-GUI code, but they  
become quite common in GUI code (OK, in some GUI libraries, there is  
only one, called something like RunMainEventLoop, but then it runs  
almost all of the time and is absolutely crucial). And with most GUI  
libraries, the GUI's main event loop will refuse to cooperate well  
with a Haskell's implementation's scheduler, so it will need to be  
called as a "concurrent" foreign import if your application is to do  
any background processing while waiting for events.
Other libraries that rely on callbacks would include the GLU  
Tesselator that I already mentioned, as well as several packages for  
solving optimisation problems. For those, concurrency would probably  
only become an issue when they are used with a GUI (even if it's only  
to display a progress bar).
Another reason why you don't see them in Haskell standard library  
code might be that everyone prefers Data.List.sort to foreign import  
ccall qsort.
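That qsort toy example, written out in GHC notation: the "wrapper" import is what makes the call reentrant, because qsort calls back into Haskell for every comparison. (A sketch; declaring the base pointer as `Ptr CInt` rather than `Ptr ()` is a common simplification.)

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign
import Foreign.C.Types

type Cmp = Ptr CInt -> Ptr CInt -> IO CInt

-- Turn a Haskell comparison into a C function pointer that
-- qsort can invoke: a callback into Haskell mid-foreign-call.
foreign import ccall "wrapper"
    mkCmp :: Cmp -> IO (FunPtr Cmp)

foreign import ccall safe "stdlib.h qsort"
    c_qsort :: Ptr CInt -> CSize -> CSize -> FunPtr Cmp -> IO ()

sortViaC :: [CInt] -> IO [CInt]
sortViaC xs = do
    cmp <- mkCmp $ \pa pb -> do
        a <- peek pa
        b <- peek pb
        return (if a < b then -1 else if a > b then 1 else 0)
    res <- withArray xs $ \p -> do
        c_qsort p (fromIntegral (length xs))
                  (fromIntegral (sizeOf (undefined :: CInt))) cmp
        peekArray (length xs) p
    freeHaskellFunPtr cmp
    return res

main :: IO ()
main = sortViaC [3, 1, 2] >>= print  -- prints [1,2,3]
```

Whether this import also needs to be concurrent cannot be seen here, which is exactly the non-local property being discussed.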


Any particular reason hugs and GHC didn't use the state-threads  
approach
out of curiosity? did it conflict with the push-enter model?  (jhc  
uses

the eval-apply model so I am more familier with that)


It was before my time. I guess it's because GHC uses a separate heap- 
allocated Haskell thread, so it made sense not to bother to allocate  
a separate C stack for every one of them. Don't know about Hugs.



It also implies that a function call will run on the same OS thread as
the OS thread the current haskell thread is running on.


This shouldn't be put into a standard, as the bound threads proposal  
already gives a different guarantee about that, and both guarantees  
taken together probably guarantee too much - taken together, they  
probably mean every Haskell thread has to be an OS thread. It might  
be an implementation-specific guarantee, unless the bound threads  
become a part of the standard in their entirety.



'OS thread the current haskell
thread is running on' (GHC already doesn't when bound threads aren't
used, I am led to believe?)
I am led to believe?)


There should be no such thing as the 'OS thread the current haskell  
thread is running on' in any standard; OS thread identity is only  
observed through the FFI.



 this means that things like 'log' and 'sin' and
every basic operation goes through the FFI

Re: FFI, safe vs unsafe

2006-04-03 Thread John Meacham
On Mon, Apr 03, 2006 at 02:00:33PM -0400, Wolfgang Thaller wrote:
> Sorry for the length of this. There are three sections: the first is  
> about how I don't like for "nonconcurrent" to be the default, the  
> second is about bound threads and the third is about implementing  
> concurrent reentrant on top of state threads.
> 
> >no, state-threads, a la NSPR, state-threads.sf.net, or any other of a
> >bunch of implementations.
> 
> Ah. I was thinking of old-style GHC or hugs only, where there is one  
> C stack and only the Haskell state is per-haskell-thread. My bad.
> So now that I know of an implementation method where they don't cause  
> the same problems they used to cause in GHC, I am no longer opposed  
> to the existence of nonconcurrent reentrant imports.

Any particular reason hugs and GHC didn't use the state-threads approach,
out of curiosity? did it conflict with the push-enter model? (jhc uses
the eval-apply model so I am more familiar with that)

> To me, "nonconcurrent" is still nothing but a hint to the  
> implementation for improving performance; if an implementation  
> doesn't support concurrent reentrancy at all, that is a limitation of  
> the implementation.

It also implies that a function call will run on the same OS thread as
the OS thread the current haskell thread is running on. however, we need
not codify this behavior, as some future implementations might not follow
it or might have no well defined meaning for 'OS thread the current haskell
thread is running on' (GHC already doesn't when bound threads aren't used,
I am led to believe?)

> Maybe the default should be "as concurrent as the implementation  
> supports", with an optional "nonconcurrent" annotation for  
> performance, and an optional "concurrent" annotation to ensure an  
> error/warning when the implementation does not support it. Of course,  
> implementations would be free to provide a flag *as a non-standard  
> extension* that changes the behaviour of unannotated calls.

A concurrent hint would be okay. I have a preference for including an
explicit annotation in that case. In fact, I'd support it if all
properties were explicitly annotated and we didn't allow a blank
specification.

My only real 'must-have' is that the 4 modes all can be explicitly and
unambiguously specified. I have opinions on the syntax/hints but that is
more flexible.

Another nice minor thing would be if haskell implementations were
required to ignore annotations starting with 'x-' for implementation
specific hints.

In jhc there are no such thing as compiler primitives*, there are only
FFI imports and a couple primitive imports that don't translate to code
(seq,unsafeCoerce). this means that things like 'log' and 'sin' and
every basic operation goes through the FFI mechanism so it needs to be
_fast_ _fast_. A neat side effect is that jhcs implementation of the
prelude is mostly portable to different compilers.

* almost true, for historical reasons I hope to fix there are a few
  built in numeric operators.

>  Bound Threads 
> 
> In GHC, there is a small additional cost for each switch to and from  
> a bound thread, but no additional cost for actual foreign call-outs.
> For jhc, I think you could implement a similar system where there are  
> multiple OS threads, one of which runs multiple state threads; this  
> would have you end up in pretty much the same situation as GHC, with  
> the added bonus of being able to implement foreign import  
> nonconcurrent reentrant for greater performance.
> If you don't want to spend the time to implement that, then you could  
> go with a possibly simpler implementation involving inter-thread  
> messages for every foreign call from a bound thread, which would of  
> course be slow (that's the method I'd have recommended to hugs).

I am not quite sure whether you are saying something different from what
I plan for jhc or not, my current thinking for jhc is,

the single one true OS thread for all haskell threads in an EDSM loop
(epoll based). the EDSM loop has its own stack (and is, in fact, just
another haskell thread as the scheduler is implemented in haskell), each
haskell thread has its own stack.

non concurrent calls are just C 'call's. nothing more.

concurrent nonreentrant calls are desugared to haskell code that

FFI calls 'socketpair(2)'
pokes arguments into a structure
FFI calls 'pthread_create'
  pthread_create is passed a function that unpacks the args, calls the
  foreign function, stores the result, and writes one byte to one end
  of the socketpair.
calls 'Concurrent.IO.threadWaitRead' on the other end of the socketpair.
peeks the return value

(in practice, socketpair will be a much faster 'futex' on linux and OS
threads may or may not be cached)

very low overhead, probably the minimal possible for an arbitrary
unknown FFI call.
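The control transfer being described can be sketched in plain Concurrent Haskell. The jhc scheme would replace the MVar handshake below with a real OS thread plus the socketpair/threadWaitRead mechanism, but the shape — run the call elsewhere, block only the calling Haskell thread on the result — is the same (names here are illustrative):

```haskell
import Control.Concurrent
import Control.Concurrent.MVar

-- Run an action "elsewhere" and block only the calling Haskell
-- thread until its result arrives. In the jhc desugaring, 'act'
-- would be the foreign call running on a pthread, and the wait
-- would be a threadWaitRead on one end of a socketpair.
concurrentCall :: IO a -> IO a
concurrentCall act = do
    done <- newEmptyMVar
    _ <- forkIO (act >>= putMVar done)
    takeMVar done

main :: IO ()
main = concurrentCall (return (21 * 2 :: Int)) >>= print  -- prints 42
```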

An alternate mode I'd like to experiment with one day is the complete
opposite side of the spectrum:

one OS thread per haskell thread, no guarantees about

Re: FFI, safe vs unsafe

2006-04-03 Thread Wolfgang Thaller
Sorry for the length of this. There are three sections: the first is  
about how I don't like for "nonconcurrent" to be the default, the  
second is about bound threads and the third is about implementing  
concurrent reentrant on top of state threads.



no, state-threads, a la NSPR, state-threads.sf.net, or any other of a
bunch of implementations.


Ah. I was thinking of old-style GHC or hugs only, where there is one  
C stack and only the Haskell state is per-haskell-thread. My bad.
So now that I know of an implementation method where they don't cause  
the same problems they used to cause in GHC, I am no longer opposed  
to the existence of nonconcurrent reentrant imports.


To me, "nonconcurrent" is still nothing but a hint to the  
implementation for improving performance; if an implementation  
doesn't support concurrent reentrancy at all, that is a limitation of  
the implementation.
I think that this is a real problem for libraries; library writers  
will have to choose whether they preclude their library from being  
used in multithreaded programs or whether they want to sacrifice  
portability (unless they spend the time messing around with cpp or  
something like it).


Some foreign calls are known never to take much time; those can be  
annotated as nonconcurrent. For calls that might take nontrivial  
amounts of time, the question whether they should be concurrent or  
not *cannot be decided locally*; it depends on what other code is  
running in the same program.


Maybe the default should be "as concurrent as the implementation  
supports", with an optional "nonconcurrent" annotation for  
performance, and an optional "concurrent" annotation to ensure an  
error/warning when the implementation does not support it. Of course,  
implementations would be free to provide a flag *as a non-standard  
extension* that changes the behaviour of unannotated calls.


 Bound Threads 

In GHC, there is a small additional cost for each switch to and from  
a bound thread, but no additional cost for actual foreign call-outs.
For jhc, I think you could implement a similar system where there are  
multiple OS threads, one of which runs multiple state threads; this  
would have you end up in pretty much the same situation as GHC, with  
the added bonus of being able to implement foreign import  
nonconcurrent reentrant for greater performance.
If you don't want to spend the time to implement that, then you could  
go with a possibly simpler implementation involving inter-thread  
messages for every foreign call from a bound thread, which would of  
course be slow (that's the method I'd have recommended to hugs).


If the per-call cost is an issue, we could have an annotation that  
can be used whenever the programmer knows that a foreign function  
does not access thread-local storage. This annotation, the act of  
calling a foreign import from a forkIO'ed (=non-bound) thread, and  
the act of calling a foreign import from a Haskell implementation  
that does not support bound threads, all place this proof obligation  
on the programmer. Therefore I'd want it to be an explicit  
annotation, not the default.
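A hedged sketch of that convention: pin a TLS-touching action to a dedicated OS thread when the implementation supports bound threads, and just run it directly otherwise. `runInBoundThread` and `rtsSupportsBoundThreads` are the names GHC eventually shipped for these pieces; the wrapper itself is illustrative:

```haskell
import Control.Concurrent

-- With bound threads available, run the action on a dedicated OS
-- thread so any C thread-local state it touches stays consistent;
-- a single-OS-thread implementation gives that property for free.
withStableOSThread :: IO a -> IO a
withStableOSThread act
    | rtsSupportsBoundThreads = runInBoundThread act
    | otherwise               = act

main :: IO ()
main = withStableOSThread (return ()) >> putStrLn "ok"
```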



"if an implementation supports haskell code running on multiple OS
threads, it must support the bound threads proposal. if it does not,
then all 'nonconcurrent' foreign calls must be made on the one true OS
thread"


*) "Haskell code running on multiple OS threads" is irrelevant. Only  
the FFI allows you to observe which OS thread you are running in.  
This should be worded in terms of what kind of concurrent FFI calls  
are supported, or whether call-in from arbitrary OS threads is  
supported.


*) Note though that this makes it *impossible* to make a concurrent  
call to one of Apple's GUI libraries (both Carbon and Cocoa insist on  
being called from the OS thread that runs the C main function). So  
good-bye to calculating things in the background while a GUI is  
waiting for user input.


We could also say that a modified form of the bound threads proposal  
is actually mandatory; the implementation you have in mind would  
support it with the following exceptions:


a) Foreign calls from forkIO'ed threads can read and write (a.k.a.  
interfere with) the thread local state of the "main" OS thread;  
people are not supposed to call functions that use thread local state  
from forkIO'ed threads anyway.


b) Concurrent foreign imports might not see the appropriate thread  
local state.


c) Call-ins from OS threads other than the main thread are not  
allowed, therefore there is no forkOS and no runInBoundThread. (Or,  
alternatively, call-ins from other OS threads create unbound threads  
instead).


 On the implementability of "concurrent reentrant" 


It might not be absolutely easy to implement "concurrent reentrant",
but it's no harder than concurrent non-reentrant calls.


it is much much harder. you have to deal with your haskell run-time
being called into from an _alternate O

Re: FFI, safe vs unsafe

2006-04-03 Thread John Meacham
On Sat, Apr 01, 2006 at 02:30:30PM +0400, Bulat Ziganshin wrote:
> new stacks can be allocated by alloca() calls. all these
> alloca-allocated stack segments can be used as pool of stacks assigned
> to the forked threads. although i don't tried this, my own library
> also used processor-specific method.

so you alloca new big areas and then use 'longjmp' to jump back and
forth within the same stack simulating many stacks?

that is a neat trick. will confuse the hell out of the Boehm garbage
collector but I don't want to rely on that much longer anyway :)

however, it would be a good thing to fall back to if no processor
specific stack creation routine is available.

this minimal threads library 
http://www.cs.uiowa.edu/%7Ejones/opsys/threads/
uses an interesting trick where it probes the setjmp structure to find
the SP reasonably portably on any stack-based architecture. pretty
clever.

John

-- 
John Meacham - ⑆repetae.net⑆john⑈


Re: FFI, safe vs unsafe

2006-03-31 Thread John Meacham
On Fri, Mar 31, 2006 at 06:41:18PM -0500, Wolfgang Thaller wrote:
> >I am confused, why would anything in particular need to happen at all?
> >
> >the threads are completly independent.  The non-concurrent calls could
> >just be haskell code that happens to not contain any pre-emption  
> >points
> >for all it cares. in particular, in jhc, non-concurrent foreign  
> >imports
> >and exports are just C function calls. no boilerplate at all in either
> >direction.  calling an imported foreign function is no different than
> >calling one written in haskell so the fact that threads A and B are
> >calling foreign functions doesn't really change anything.
> 
> In an implementation which runs more than one Haskell thread inside  
> one OS thread, like ghc without -threaded or hugs, the threads are  
> NOT completely independent, because they share one C stack. So while  
> bar executes, stack frames for both foreign functions will be on the  
> stack, and it will be impossible to return from foo before bar and  
> the foreign function that called it completes. I think this kind of  
> semantics is seriously scary and has no place as default behaviour in  
> the language definition.

no, state-threads, a la NSPR, state-threads.sf.net, or any other of a
bunch of implementations.

each thread has its own stack, you 'longjmp' between them. it can almost
practically be done in portable C except the mallocing of the new stack,
but there are many free libraries (less a library, more a header file
with some #ifdefs) for all processors out there that support that, or at
least that gcc supports in the first place.

this would be by far the easiest way to add concurrency to any haskell
compiler, other than the addition of the 'create a new stack' and
'longjmp' primitives, it can be implemented 100% in the standard
libraries with haskell code. that is why I am confident in saying it
probably won't rule out future implementations we have not thought of
yet, since it is mostly pure haskell anyway.


> If you implement concurrency by using the pthreads library, you need  
> to either make sure that only one thread mutates the heap at a time,  
> or deal with SMP. In either case, concurrent foreign calls would be  
> trivial.

indeed. but pthreads has its own tradeoffs. there is certainly room for
both types of haskell implementations. 

> >>4.) Should there be any guarantee about (Haskell) threads not making
> >>any progress while another (Haskell) thread is executing a non-
> >>concurrent call?
> >
> >I don't understand why we would need that at all.
> 
> Good. Neither do I, but in the discussions about this issue that we  
> had three years ago several people seemed to argue for that.

wacky. I can't think of a reason, it would be quite tricky to pull off
with a fully pthreaded implementation anyway. 

> >>5.) [...] So what
> >>should the poor library programmer A do?
> >
> >He should say just 'reentrant' since concurrent isn't needed for
> >correctness because the tessellation routines are basic calculations and
> >will return.
> 
> Let's say they will return after a few minutes. So having them block  
> the GUI is a show-stopper for programmer C.
> And if programmer C happens to use a Haskell implementation that  
> supports "concurrent reentrant" but also a more efficient "non- 
> concurrent reentrant", he will not be able to use the library.

well, I think he has a choice to make there about what is more important
to him. I admit, it has to be a judgement call at some point, as
eventually performance problems become correctness ones.

but perhaps this is an argument for a concurrent-hint flag, "make this
concurrent and reentrant if possible, but it's gonna be reentrant anyway
no matter what"

I mean, one could bend the rules and say cooperative systems do
implement "concurrent reentrant" with just an incredibly crappy
scheduling algorithm, but I think I'd rather have it fail outright than
"pretend".

but a 'concurrent-hint' flag could be useful, as a library writer may
not know the preference of his user.

a completely different solution would be just to foreign import the
routine twice, with each convention and have some way for the user of a
library to choose which one they want, perhaps with a flag. of course,
both might not be available with all implementations.

in any case, I don't think it is a showstopper.

> >everyone wins. in the absolute worst case there are always #ifdefs  
> >but I
> >doubt they will be needed.
> 
> Except for programmer C on some haskell implementations. I don't buy  
> it yet :-).

Well, certain implementations will always have their own extensions that
people might rely on. I just don't want the language standard itself to
rule out valid and useful implementation methods. Haskell with IO
multiplexing is a very powerful platform indeed and this proposal lets
us keep it in the language proper and that is very nice, from an
implementor's and a library writer's point of view. often concurrent

Re: FFI, safe vs unsafe

2006-03-31 Thread Wolfgang Thaller

John Meacham wrote:

> first of all, a quick note, for GHC, the answers will be "the same thing
> it does now with -threaded". but I will try to answer with what a simple
> cooperative system would do.

Sure. Unless someone dares answer "yes" to question 4, GHC will stay
as it is.

>> 2.) Assume the same situation as in 1, and assume that the answer to
>> 1 is yes. While 'foo' is running, (Haskell) thread B makes a non-
>> concurrent, reentrant foreign call. The foreign function calls back
>> to the foreign-exported Haskell function 'bar'. Because the answer to
>> 1 was yes, 'foo' will resume executing concurrently with 'bar'.
>> If 'foo' finishes executing before 'bar' does, what will happen?
>
> I am confused, why would anything in particular need to happen at all?
>
> the threads are completely independent.  The non-concurrent calls could
> just be haskell code that happens to not contain any pre-emption points
> for all it cares. in particular, in jhc, non-concurrent foreign imports
> and exports are just C function calls. no boilerplate at all in either
> direction.  calling an imported foreign function is no different than
> calling one written in haskell so the fact that threads A and B are
> calling foreign functions doesn't really change anything.

In an implementation which runs more than one Haskell thread inside
one OS thread, like ghc without -threaded or hugs, the threads are
NOT completely independent, because they share one C stack. So while
bar executes, stack frames for both foreign functions will be on the
stack, and it will be impossible to return from foo before bar and
the foreign function that called it completes. I think this kind of
semantics is seriously scary and has no place as default behaviour in
the language definition.

If you implement concurrency by using the pthreads library, you need
to either make sure that only one thread mutates the heap at a time,
or deal with SMP. In either case, concurrent foreign calls would be
trivial.

>> 4.) Should there be any guarantee about (Haskell) threads not making
>> any progress while another (Haskell) thread is executing a non-
>> concurrent call?
>
> I don't understand why we would need that at all.

Good. Neither do I, but in the discussions about this issue that we
had three years ago several people seemed to argue for that.

>> 5.) [...] So what
>> should the poor library programmer A do?
>
> He should say just 'reentrant' since concurrent isn't needed for
> correctness because the tessellation routines are basic calculations and
> will return.

Let's say they will return after a few minutes. So having them block
the GUI is a show-stopper for programmer C.
And if programmer C happens to use a Haskell implementation that
supports "concurrent reentrant" but also a more efficient
"non-concurrent reentrant", he will not be able to use the library.

> everyone wins. in the absolute worst case there are always #ifdefs but I
> doubt they will be needed.

Except for programmer C on some haskell implementations. I don't buy
it yet :-).

>> 6.) Why do people consider it too hard to do interthread messaging
>> for handling a "foreign export" from arbitrary OS threads, when they
>> already agree to spend the same effort on interthread messaging for
>> handling a "foreign import concurrent"? Are there any problems that I
>> am not aware of?
>
> it is not that it is hard (well it is sort of), it is just absurdly
> inefficient and you would have no choice but to pay that price for
> _every_ foreign export. even when not needed which it mostly won't be.
> the cost of a foreign export should be a simple 'call' instruction
> (potentially) when an implementation supports that.

As we seem to agree that the performance issue is non-existent for
implementations that use one OS thread for every haskell thread, and
that we don't want to change how GHC works, the following refers to a
system like hugs where all Haskell code and the entire runtime system
always runs in a single OS thread.

It might not be absolutely easy to implement "concurrent reentrant",
but it's no harder than concurrent non-reentrant calls. If a haskell
implementation has a hacker on its team who is able to do the former,
then this is no problem either.
As for the efficiency argument: if it is sufficiently slow, then that
is an argument for including "nonconcurrent reentrant" as an option.
It is not an argument for making it the default, or for leaving out
"concurrent reentrant".

> the cost of a foreign import concurrent nonreentrant is only paid when
> actually using such a function, and quite cheap. on linux at least, a
> single futex, a cached pthread and it gets rolled into the main event
> loop. so a couple system calls max overhead.

Sure. But what gives you the idea that the cost of a foreign export
or a foreign import concurrent reentrant would be paid when you are
not using them?
If we include nonconcurrent reentrant foreign imports in the system,
or if we just optimise foreign i

Re: FFI, safe vs unsafe

2006-03-31 Thread John Meacham
On Fri, Mar 31, 2006 at 03:16:50PM -0500, Wolfgang Thaller wrote:
> So I'm going to ask a few questions about the semantics of non- 
> concurrent reentrant calls, and if people can provide answers that  
> don't scare me, I'll concede that they have a place in the language  
> standard.

first of all, a quick note, for GHC, the answers will be "the same thing
it does now with -threaded". but I will try to answer with what a simple
cooperative system would do.

> 1.) Assume thread A and B are running. Thread A makes a non- 
> concurrent, reentrant call to Foreign Lands. The foreign function  
> calls a foreign-exported Haskell function 'foo'.
> While 'foo' is executing, does thread B resume running?

if 'foo' blocks on a mvar,read,write,etc... then yes.

> 2.) Assume the same situation as in 1, and assume that the answer to  
> 1 is yes. While 'foo' is running, (Haskell) thread B makes a non- 
> concurrent, reentrant foreign call. The foreign function calls back  
> to the foreign-exported Haskell function 'bar'. Because the answer to  
> 1 was yes, 'foo' will resume executing concurrently with 'bar'.
> If 'foo' finishes executing before 'bar' does, what will happen?

I am confused, why would anything in particular need to happen at all?

the threads are completely independent.  The non-concurrent calls could
just be haskell code that happens to not contain any pre-emption points
for all it cares. in particular, in jhc, non-concurrent foreign imports
and exports are just C function calls. no boilerplate at all in either
direction.  calling an imported foreign function is no different than
calling one written in haskell so the fact that threads A and B are
calling foreign functions doesn't really change anything.

> 3.) Same situation as in 1. When 'foo' is called, it forks (using  
> forkIO) a Haskell thread C. How many threads are running now?

3 potentially runable.

> 4.) Should there be any guarantee about (Haskell) threads not making  
> any progress while another (Haskell) thread is executing a non- 
> concurrent call?

I don't understand why we would need that at all.

> Two more questions, not related to semantics:
> 
> 5.) Assume that Haskell Programmer A writes a Haskell library that  
> uses some foreign code with callbacks, like for example, the GLU
> Tessellator (comes with OpenGL), or, as a toy example, the C Standard
> Library's qsort function. Should Programmer A specify "concurrent  
> reentrant" on his foreign import?
> Programmer B will say "please don't", as he wants to use a Haskell  
> implementation which doesn't support "concurrent reentrant".  
> Programmer C will say "please do", as he wants his application's GUI  
> to stay responsive while the library code is executing. So what  
> should the poor library programmer A do?

He should say just 'reentrant' since concurrent isn't needed for
correctness because the tessellation routines are basic calculations and
will return.

However, on a system like GHC that actually can run code concurrently,
and actually would have issues enforcing a 'non-concurrent' guarantee, it
would run concurrently anyway. It would be hard not to on an
implementation that supported true OS threads actually.

everyone wins. in the absolute worst case there are always #ifdefs but I
doubt they will be needed.

> 6.) Why do people consider it too hard to do interthread messaging  
> for handling a "foreign export" from arbitrary OS threads, when they  
> already agree to spend the same effort on interthread messaging for  
> handling a "foreign import concurrent"? Are there any problems that I  
> am not aware of?

it is not that it is hard (well it is sort of), it is just absurdly
inefficient and you would have no choice but to pay that price for
_every_ foreign export. even when not needed which it mostly won't be.
the cost of a foreign export should be a simple 'call' instruction
(potentially) when an implementation supports that.

the cost of a foreign import concurrent nonreentrant is only paid when
actually using such a function, and quite cheap. on linux at least, a
single futex, a cached pthread and it gets rolled into the main event
loop. so a couple system calls max overhead.


John

-- 
John Meacham - ⑆repetae.net⑆john⑈


Re: FFI, safe vs unsafe

2006-03-31 Thread John Meacham
On Fri, Mar 31, 2006 at 03:16:50PM -0500, Wolfgang Thaller wrote:
> Before adding non-concurrent, reentrant calls to the language  
> standard, please take some time to think about what that means. If  
> you have forkIO'ed multiple threads, things start to interact in  
> strange ways. I think this is a can of worms we don't want to open.  
> (Or open again. It's still open in GHC's non-threaded RTS, and the  
> worms are crawling all over  the place there).

I am still digesting your message, but a quick note is that when you
specify non-concurrent, you aren't saying "it can't be concurrent" but
rather "I don't absolutely need it to be".

so GHC would still treat all reentrant calls as concurrent and that is
a-okay by the spec.

John

-- 
John Meacham - ⑆repetae.net⑆john⑈


Re: FFI, safe vs unsafe

2006-03-31 Thread Claus Reinke

> This is the way it is right now in GHC: the default is "safe", and safe
> means both reentrant and concurrent.  This is for the reason you give:
> the default should be the safest, in some sense.
> ..
> So we can't have the default (unannotated) foreign call be something that
> isn't required by the standard.

why not? you'd only need to make sure that in standard mode,
no unannotated foreign declarations are accepted (or that a warning
is given).

claus



RE: FFI, safe vs unsafe

2006-03-31 Thread Simon Marlow
On 30 March 2006 21:40, Claus Reinke wrote:

>> I updated the ForeignBlocking wiki page with what I believe is the
>> current state of this proposal; see
> 
> didn't I mention that "concurrent" may be inappropriate and
> misleading, and that I think it is bad practice to rely on the
> programmer annotating the dangerous cases, instead of the safe cases?
> 
> wouldn't the safe approach be to assume that the foreign call may do
> anything, unless the programmer explicitly tells you about what things
> it won't do (thus taking responsibility).

This is the way it is right now in GHC: the default is "safe", and safe
means both reentrant and concurrent.  This is for the reason you give:
the default should be the safest, in some sense.

However, John has argued, and I agree, that requiring the combination of
concurrent and reentrant to be supported is too much, and furthermore is
often unnecessary.

So we can't have the default (unannotated) foreign call be something that
isn't required by the standard.  Hence, the proposal states that
concurrent foreign calls have to be annotated as such, and it is the
specific case of 'concurrent' alone, as opposed to 'concurrent
nonreentrant' that is an extension.

http://haskell.galois.com/cgi-bin/haskell-prime/trac.cgi/wiki/ForeignBlocking

Cheers,
Simon


Re: FFI, safe vs unsafe

2006-03-30 Thread John Meacham
On Fri, Mar 31, 2006 at 12:52:11AM +0100, Claus Reinke wrote:
> >>didn't I mention that "concurrent" may be inappropriate and misleading, 
> >>and that I think it is bad practice to rely on the programmer annotating 
> >>the dangerous cases, instead of the safe cases?
> >
> >I think dangerous is a misleading term here. you are already using the
> >FFI, all bets are off. and it is not really dangerous to accidentally
> >hold up your VM when you didn't expect, it is more just a simple bug.
> 
> perhaps "dangerous" was too strong a term, but if programmers don't
> annotate an ffi declaration, what is more likely: that they meant to state
> a property of that function, or that they didn't mean to? 
> 
> if there is a way to avoid simple bugs by not making assumptions about 
> undeclared properties, then I'd prefer that to be the default route. if, 
> on the other hand, programmers do annotate the ffi declaration, then 
> it is up to them to make sure that the function actually has the property 
> they claim for it (even in such cases, Haskell usually checks the 
> declaration, but that isn't an option here).

Well, I would consider the performance bug the more serious one. in
fact, they both are performance/scalability bugs rather than correctness
ones. but one is obvious when you get it wrong, the other is subtle and
could go unnoticed a long time and just make you think haskell is a slow
language. we should make it so the obvious one is the more likely
failure route so people fix it right away.
> 
> >>wouldn't the safe approach be to assume that the foreign call may do 
> >>anything, unless the programmer explicitly tells you about what things 
> >>it won't do (thus taking responsibility).
> >
> >I think the worse problem will be all the libraries that are only tested
> >on ghc that suddenly get very poor performance or don't compile at all
> >when attempted elsewhere.
> 
> - GHC and the other implementations should issue a warning for
>using non-standard or non-implemented features (that includes code
>that won't obviously run without non-standard features)
> - if an implementation doesn't implement a feature, there is no way
>around admitting that, standard or not

well, there is if you didn't need the feature in the first place, but
didn't realize it because it was obscured. the bigger danger is that the
feature will be implemented, but very sub-optimally (hundreds of
times slower than a fast call could easily be true), so you get a very
silent but fatal bug. FFI routines do need to be annotated correctly,
sometimes for correctness and sometimes for performance. when
correctness is at stake, you should err on the side of correct; when
performance is at stake, you should err on the side of what will cause
the most ruckus when you get it wrong :)

> >However, the 'nonreentrant' case actually is dangerous in that it could
> >lead to undefined behavior which is why that one was not on by default.
> 
> why not be consistent then, and name all attributes so that they are off 
> by default, and so that implementations that can't handle the off case will
> issue a warning at least?

yeah, that is what I originally proposed, but Simon brought up the good
point (paraphrasing, I think this was his reasoning) that 'reentrant' is
important for the safety of the system (as in, segfaults and corruption
result when getting it wrong) while 'concurrent' is simply a choice on
the part of the programmer as to what behavior they want.

John

-- 
John Meacham - ⑆repetae.net⑆john⑈


Re: FFI, safe vs unsafe

2006-03-30 Thread Claus Reinke
>> didn't I mention that "concurrent" may be inappropriate and misleading,
>> and that I think it is bad practice to rely on the programmer annotating
>> the dangerous cases, instead of the safe cases?
>
> I think dangerous is a misleading term here. you are already using the
> FFI, all bets are off. and it is not really dangerous to accidentally
> hold up your VM when you didn't expect, it is more just a simple bug.

perhaps "dangerous" was too strong a term, but if programmers don't
annotate an ffi declaration, what is more likely: that they meant to state
a property of that function, or that they didn't mean to?

if there is a way to avoid simple bugs by not making assumptions about
undeclared properties, then I'd prefer that to be the default route. if,
on the other hand, programmers do annotate the ffi declaration, then
it is up to them to make sure that the function actually has the property
they claim for it (even in such cases, Haskell usually checks the
declaration, but that isn't an option here).

> Unsafe or dangerous means potentially leading to undefined behavior,
> not just incorrect behavior or we'd have to label 2 as unsafe because
> you might have meant to write 3. :)

you mean your compiler won't catch such a simple mistake?-)

but, seriously, that isn't quite the same: if I write a Num, it's my
responsibility to write the Num I meant, because the implementation
can't check that. but if I don't write a Num, I'd rather not have the
implementation insert one that'll make the code go fastest, assuming
that would always be my main objective! (although that would be
a nice optional feature!-)

>> wouldn't the safe approach be to assume that the foreign call may do
>> anything, unless the programmer explicitly tells you about what things
>> it won't do (thus taking responsibility).
>
> I think the worse problem will be all the libraries that are only tested
> on ghc that suddenly get very poor performance or don't compile at all
> when attempted elsewhere.

- GHC and the other implementations should issue a warning for
  using non-standard or non-implemented features (that includes code
  that won't obviously run without non-standard features)
- if an implementation doesn't implement a feature, there is no way
  around admitting that, standard or not
- if adding valid annotations is necessary to make non-GHC
  implementations happy, then that's what programmers will have to do
  if they want portable code; if such annotation would not be valid, we
  can't pretend it is, and we can't pretend that other implementations
  will be able to handle the code

- if only performance is affected, that is another story; different
  implementations have different strengths, and the standard shouldn't
  assume any particular implementation, if several are viable
- but: if certain kinds of program will only run well on a single
  implementation, then programmers depending on that kind of program
  will only use that single implementation, no matter what the standard
  says (not all my Haskell programs need concurrency, but for those
  that do, trying to fit them into Hugs is not my aim)

> However, the 'nonreentrant' case actually is dangerous in that it could
> lead to undefined behavior which is why that one was not on by default.

why not be consistent then, and name all attributes so that they are off
by default, and so that implementations that can't handle the off case will
issue a warning at least?

cheers,
claus



Re: FFI, safe vs unsafe

2006-03-30 Thread John Meacham
On Thu, Mar 30, 2006 at 09:39:44PM +0100, Claus Reinke wrote:
> >I updated the ForeignBlocking wiki page with what I believe is the
> >current state of this proposal; see
> 
> didn't I mention that "concurrent" may be inappropriate and misleading, 
> and that I think it is bad practice to rely on the programmer annotating 
> the dangerous cases, instead of the safe cases?

I think dangerous is a misleading term here. you are already using the
FFI, all bets are off. and it is not really dangerous to accidentally
hold up your VM when you didn't expect, it is more just a simple bug.

Unsafe or dangerous means potentially leading to undefined behavior, not
just incorrect behavior, or we'd have to label 2 as unsafe because you
might have meant to write 3. :)

> wouldn't the safe approach be to assume that the foreign call may do 
> anything, unless the programmer explicitly tells you about what things 
> it won't do (thus taking responsibility).

I think the worse problem will be all the libraries that are only tested
on ghc that suddenly get very poor performance or don't compile at all
when attempted elsewhere.

However, the 'nonreentrant' case actually is dangerous in that it could
lead to undefined behavior which is why that one was not on by default.

John

-- 
John Meacham - ⑆repetae.net⑆john⑈


Re: FFI, safe vs unsafe

2006-03-30 Thread Claus Reinke

> I updated the ForeignBlocking wiki page with what I believe is the
> current state of this proposal; see
>
> http://haskell.galois.com/cgi-bin/haskell-prime/trac.cgi/wiki/ForeignBlocking

didn't I mention that "concurrent" may be inappropriate and misleading,
and that I think it is bad practice to rely on the programmer annotating
the dangerous cases, instead of the safe cases?

wouldn't the safe approach be to assume that the foreign call may do
anything, unless the programmer explicitly tells you about what things
it won't do (thus taking responsibility).

cheers,
claus



Re: FFI, safe vs unsafe

2006-03-30 Thread Malcolm Wallace
"Simon Marlow" <[EMAIL PROTECTED]> wrote:

> > I thought yhc supported unboxed values, so a loop like
> > 
> > count 0 = 0
> > count n = count (n - 1)
> > 
> > count 10
> > 
> > could block the runtime (assuming it was properly unboxed by the
> > compiler) since it never calls back into it and is just a straight
> > up countdown loop?
> 
> are we talking about the same compiler?  YHC is fully interpreted, has
> no unboxed types, and AFAIK it is impossible to write any code that
> doesn't get preempted after a while.

Indeed.  But unboxing is not the issue - the main reason is that yhc
cannot currently compile that code into a loop - jumps only go forwards
in the bytecode, never backwards.  The only possible bytecode
representation of a loop is as a recursive call, which immediately
presents an opportunity to insert a yield.

Regards,
Malcolm


RE: FFI, safe vs unsafe

2006-03-30 Thread Simon Marlow
I updated the ForeignBlocking wiki page with what I believe is the
current state of this proposal; see

 
http://haskell.galois.com/cgi-bin/haskell-prime/trac.cgi/wiki/ForeignBlocking

Cheers,
Simon


RE: FFI, safe vs unsafe

2006-03-30 Thread Simon Marlow
On 30 March 2006 13:05, John Meacham wrote:

> but the debugging/deterministic
> benefits could be useful. you could be guarenteed to reproduce a given
> sequence of context switches which could make finding concurrent
> heisenbugs easier.

Actually +RTS -C0 already gives deterministic concurrency in GHC.  And
you're right, it's essential for debugging.  SMP has made my life
somewhat more painful of late :-)

> or something like concurrent 'hat' or another
> debugger might find it easier to work in such a mode.
> 
> In any case, what I think of when I think of 'writing a portable app'
> is that from _the spec alone_ I can write something that I can expect
> to work on any compliant system. This goal can be achieved to various
> degrees. But if the specification says, 'the implementation might be
> cooperative' and I write assuming that, then it pretty much will
> definitly work anywhere perhaps with some spurious 'yields'.

Absolutely, but a preemptive implementation has no way to tell you if
you missed out a 'yield', and that essentially is the same as
non-portability.  It doesn't matter that the spec told you you needed the
yield, if the implementation you're using works fine without it,
non-portable software will be the result.

What's more, in some cases it isn't even possible to insert enough
yields.  It's entirely reasonable to have an application that runs some
CPU-bound pure computation in one thread and a GUI in some other
threads.  This type of application can't be implemented if the standard
guarantees nothing more than cooperative scheduling.  Worse, no static
property of the code tells you that.

> however
> if it says something to the effect of 'runnable threads will be
> timeshared via some fair algorithm for some definition of fair'

No, I'm suggesting the specific fairness guarantees mentioned earlier
(and on the wiki page).

> then
> it doesn't help much writing portable apps since you would want to
> test on the various compilers to see what their definition of "fair"
> is.

Given those fairness guarantees, programmers will not need to care
whether the implementation is using preemption based on allocation, or
one based on reductions, or arbitrary inter-instruction preemption.
Because it is hard to write a program that can tell the difference,
especially if you stick to using proper synchronisation primitives,
nobody will do it by accident.

Contrast this with a standard that allows both cooperative and
preemptive scheduling.  It's much easier to write a program that can
show the difference, and I'm worried that people will do it all the
time, by accident.  That's bad.

> I thought yhc supported unboxed values, so a loop like
> 
> count 0 = 0
> count n = count (n - 1)
> 
> count 10
> 
> could block the runtime (assuming it was properly unboxed by the
> compiler) since it never calls back into it and is just a straight up
> countdown loop?

are we talking about the same compiler?  YHC is fully interpreted, has
no unboxed types, and AFAIK it is impossible to write any code that
doesn't get preempted after a while.

Cheers,
Simon


Re: FFI, safe vs unsafe

2006-03-30 Thread John Meacham
On Thu, Mar 30, 2006 at 01:16:08PM +0100, Claus Reinke wrote:
> >It is not like inserting yields needs to be done much at all since we have
> >progress guarantees, so we know the program is doing something and on
> >any blocking call that could potentially take a while, the library will
> >yield for you.
> 
> where do we get the progress guarantees from? do we need a 
> "yield-analysis"? something that will automatically insert yields
> in the code after every n atomic steps, and complain if it cannot 
> infer that some piece of code is atomic, but cannot insert a yield 
> either? how much of the burden do you want to shift from the
> implementer to the programmer?

no, because there are only certain defined actions that can switch a
thread's state from 'runnable' to 'not-runnable'. In order to
meet the progress guarantee you just need to make sure that when the
current thread switches from 'runnable' to 'not-runnable' that another
thread is chosen.

examples of these points would be:

 - calling a foreign concurrent import
 - waiting for input on a handle
 - waiting for a UNIX signal
 - changing thread priorities (possibly)

in any case, the compiler need do nothing special in general, it is
basically a library issue.

John


-- 
John Meacham - ⑆repetae.net⑆john⑈


Re: FFI, safe vs unsafe

2006-03-30 Thread Claus Reinke

It is not like inserting yields needs to be done much at all since we have
progress guarantees, so we know the program is doing something and on
any blocking call that could potentially take a while, the library will
yield for you.


where do we get the progress guarantees from? do we need a 
"yield-analysis"? something that will automatically insert yields
in the code after every n atomic steps, and complain if it cannot 
infer that some piece of code is atomic, but cannot insert a yield 
either? how much of the burden do you want to shift from the
implementer to the programmer?

cheers,
claus


Re: FFI, safe vs unsafe

2006-03-30 Thread John Meacham
On Thu, Mar 30, 2006 at 12:26:58PM +0100, Simon Marlow wrote:
> On 30 March 2006 11:42, John Meacham wrote:
> 
> > Although I was skeptical at the beginning that we could come up with a
> > standard based on forkIO that could encompass both models without
> > compromising performance or implementation flexibility, I now think
> > that we can! and that is good, because it means we won't need to make
> > concurrency an addendum or just accept the fact that many
> > haskell-prime implementations will be incomplete!
> 
> Which sounds like a win-win, but the concern Manuel rose earlier in this
> thread still holds: namely that if we allow too much flexibility in the
> standard, it becomes too hard to write portable applications.  Someone
> coding for GHC will almost certainly not insert enough 'yield's to make
> their program work properly on a cooperative implementation.  (hmm, I
> wonder if GHC could be made to optionally behave like a cooperative
> scheduler without too much effort.  That would help).

I was actually just thinking that if not using '-threaded' gave a true
cooperative model, that could be useful. for pure IO multiplexing
applications (editors, chat clients, web servers) it tends to be faster
(hence the switch from pthreading to state-threading in many network
servers).

Though since ghc already has indirect function calls everywhere, the
speed benefit might be negligible. but the debugging/deterministic
benefits could be useful. you could be guaranteed to reproduce a given
sequence of context switches which could make finding concurrent
heisenbugs easier. or something like concurrent 'hat' or another
debugger might find it easier to work in such a mode.


In any case, what I think of when I think of 'writing a portable app' is
that from _the spec alone_ I can write something that I can expect to
work on any compliant system. This goal can be achieved to various
degrees. But if the specification says, 'the implementation might be
cooperative' and I write assuming that, then it pretty much will
definitely work anywhere, perhaps with some spurious 'yields'. however if
it says something to the effect of 'runnable threads will be timeshared
via some fair algorithm for some definition of fair' then it doesn't
help much writing portable apps since you would want to test on the
various compilers to see what their definition of "fair" is. It is not
like inserting yields needs to be done much at all since we have
progress guarantees, so we know the program is doing something and on
any blocking call that could potentially take a while, the library will
yield for you.

at least it is clear from the spec exactly when you need to resort to
implementation testing and that you shouldn't count on the scheduler
behaving in too specific a way, which is just good advice in general
when writing concurrent code.


> Still, I think this makes an interesting proposal.  Could you put it on
> the wiki, perhaps replacing Proposal 3?

Okay. I'll put something on the wiki along with the rationale. though,
feel free to preempt (pun?) me with anything anyone wants to put there
too.

> > A sticky point might be whether we say anything about duplicated work,
> > however, the haskell report never really says anything about
> > guaranteed sharing anyway so we can probably be silent on the matter.
> 
> Quite right, the standard doesn't need to mention this.
> 
> > we certainly shouldn't treat state-threads as second class or a
> > "lesser" implementation of the standard though! they can often be
> > faster than OS threads but with their own set of tradeoffs.
> > 
> > glossary:
> > 
> > OS threaded - ghc -threaded, context switching at arbitrary points,
> > not necessarily under the control of the haskell runtime.
> > 
> > state-threading - hugs,jhc context switching at block-points chosen
> > by the implementation and user via yield.
> > 
> > yhc is somewhere in between. basically state-threading, but with more
> > context switching under the control of the yhc run-time.
> 
> I'm not sure I'd separate YHC from GHC.  They both satisfy the fairness
> properties we talked about earlier, and from a programmer's point of
> view would be indistinguishable (to a very close approximation).  It's
> very hard to write a program that can tell them apart.

I thought yhc supported unboxed values, so a loop like 

count 0 = 0
count n = count (n - 1)

count 10

could block the runtime (assuming it was properly unboxed by the
compiler) since it never calls back into it and is just a straight up
countdown loop?

in any case, even if not, yhc might want to support unboxed values one
day at which point it might jump categories, but get a speed boost in
the process or they will come up with some clever way to have their cake
and eat it too :)
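
For comparison, the standard cooperative workaround is to insert the
yield by hand. A hedged sketch using the standard Control.Concurrent.yield
(how often to yield is a judgment call; every iteration is overkill):

```haskell
import Control.Concurrent (yield)

-- The same countdown loop, made cooperative by hand: every pass
-- through the loop re-enters the runtime, so other threads can run.
count :: Int -> IO Int
count 0 = return 0
count n = yield >> count (n - 1)
```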

> I think if the standard were to go in this direction, then it would be
> helpful to establish two versions of the fairness properties: the strong
> version that includes GHC/YHC, and a weaker version that 

RE: FFI, safe vs unsafe

2006-03-30 Thread Simon Marlow
On 30 March 2006 11:42, John Meacham wrote:

> Although I was skeptical at the beginning that we could come up with a
> standard based on forkIO that could encompass both models without
> compromising performance or implementation flexibility, I now think
> that we can! and that is good, because it means we won't need to make
> concurrency an addendum or just accept the fact that many
> haskell-prime implementations will be incomplete!

Which sounds like a win-win, but the concern Manuel rose earlier in this
thread still holds: namely that if we allow too much flexibility in the
standard, it becomes too hard to write portable applications.  Someone
coding for GHC will almost certainly not insert enough 'yield's to make
their program work properly on a cooperative implementation.  (hmm, I
wonder if GHC could be made to optionally behave like a cooperative
scheduler without too much effort.  That would help).

Still, I think this makes an interesting proposal.  Could you put it on
the wiki, perhaps replacing Proposal 3?

> A sticky point might be whether we say anything about duplicated work,
> however, the haskell report never really says anything about
> guaranteed sharing anyway so we can probably be silent on the matter.

Quite right, the standard doesn't need to mention this.

> we certainly shouldn't treat state-threads as second class or a
> "lesser" implementation of the standard though! they can often be
> faster than OS threads but with their own set of tradeoffs.
> 
> glossary:
> 
> OS threaded - ghc -threaded, context switching at arbitrary points,
> not necessarily under the control of the haskell runtime.
> 
> state-threading - hugs,jhc context switching at block-points chosen
> by the implementation and user via yield.
> 
> yhc is somewhere in between. basically state-threading, but with more
> context switching under the control of the yhc run-time.

I'm not sure I'd separate YHC from GHC.  They both satisfy the fairness
properties we talked about earlier, and from a programmer's point of
view would be indistinguishable (to a very close approximation).  It's
very hard to write a program that can tell them apart.

I think if the standard were to go in this direction, then it would be
helpful to establish two versions of the fairness properties: the strong
version that includes GHC/YHC, and a weaker version that also admits
Hugs (eg. the progress guarantee you mentioned earlier).  An
implementation would be required to say which of these it satisfies.

Cheers,
Simon


Re: FFI, safe vs unsafe

2006-03-30 Thread John Meacham
On Thu, Mar 30, 2006 at 10:44:36AM +0100, Simon Marlow wrote:
> You're optimising for the single-threaded case, and that's fine.  In
> GHC, a call-in is similar to what I outlined above except that we can
> optimise away the RPC and perform the call directly in the OS thread
> that requested it, due to the way bound threads are implemented.  Doing
> that requires that a lot more of the runtime needs to be thread-safe,
> though. 


yeah, if you actually have a OS threaded RTS, then everything is a whole
different ball of wax. But there is a lot to be said for a
state-threaded version like hugs. Even in C-land many people choose
state-threads over posix threads or vice versa depending on many
criteria and we shouldn't assume that one is necessarily superior.
state-threads aren't second class, just a different way to go.

Although I was skeptical at the beginning that we could come up with a
standard based on forkIO that could encompass both models without
compromising performance or implementation flexibility, I now think that
we can! and that is good, because it means we won't need to make
concurrency an addendum or just accept the fact that many haskell-prime
implementations will be incomplete!


mainly, I think we need to keep a couple goals in mind, which are sometimes
in opposition, but not really so much:

 * not require anything that will rule out or arbitrarily reduce the
   efficiency of an absolutely-zero-overhead implementation of
   straightforward state-threads in the non-concurrent case.

 * not require anything that will inhibit the SMP scalability or
   scheduling freedom of OS threaded implementations. 

I think if we stick to these 'caps' at both ends then all the
intermediate implementations we have talked about will be accommodated
and since state-threads can _almost_ be implemented in pure haskell, we
can be pretty sure we aren't constraining future as-yet-to-be-thought-of
implementation models too much.

A sticky point might be whether we say anything about duplicated work,
however, the haskell report never really says anything about guaranteed
sharing anyway so we can probably be silent on the matter.

we certainly shouldn't treat state-threads as second class or a "lesser"
implementation of the standard though! they can often be faster than OS
threads but with their own set of tradeoffs.

glossary:

OS threaded - ghc -threaded, context switching at arbitrary points, not
necessarily under the control of the haskell runtime.

state-threading - hugs,jhc context switching at block-points chosen by the
implementation and user via yield.

yhc is somewhere in between. basically state-threading, but with more
context switching under the control of the yhc run-time.

> It's true that this is a fairly large overhead to impose on all Haskell
> implementations.  I'm coming around to the idea that requiring this is
> too much, and perhaps multi-threaded call-ins should be an optional
> extra (including concurrent/reentrant foreign calls).

yeah, a much-touted feature of haskell concurrency is that it is _fast_;
we shouldn't compromise that or its potential without very good
reason.

John

-- 
John Meacham - ⑆repetae.net⑆john⑈


RE: FFI, safe vs unsafe

2006-03-30 Thread Simon Marlow
On 29 March 2006 16:53, John Meacham wrote:

> On Wed, Mar 29, 2006 at 04:11:56PM +0100, Simon Marlow wrote:
>> Ok, let's explore how difficult it really is.
>> 
>> Take a single-threaded implementation that forks OS threads for
>> concurrent foreign calls.  Let's call the OS thread running Haskell
>> code the "runtime thread".  An OS thread wanting to call a foreign
>> export must make an RPC to the runtime thread.  You could do this by:
>> 
>>   - have a channel for RPC requests
>> 
>>   - the callee creates a condition var for the result, and sends
>> the call details down the channel with the condition var.
>> 
>>   - the runtime thread picks up the request in its event loop and
>> forks a Haskell thread to handle it
>> 
>>   - when the Haskell thread completes it places the result in the
>> right place and signals the condition variable
>> 
>>   - the callee picks up the result, frees the resources and
>> continues on its merry way 
>> 
>> can't be more than a couple of days work, unless I'm missing
>> something? It's not particularly fast, but then call-ins in GHC
>> aren't fast either. 
> 
> still seems rather complicated for something that as far as I know has
> never come up as needed :) and given that, it is certainly
> unacceptable to pay that cost for every reentrant or blockable call
> on the off chance it might want to do both.

It's not just for callbacks: you need this working if you want to
support call-ins in a multi-threaded environment.  That is, implementing
a library in Haskell with a C interface that can be called by multiple
OS threads.

For example, our Visual Studio plugin needed this to be working because
Visual Studio likes to call APIs in the plugin from different threads.

> Trading a single 'call' instruction for a condition variable, a rpc
> call, and some value passing and many memory accesses and potential
> SMP bus locks is more than just not particularly fast :)
>
> why are call-ins in ghc not very fast? with jhc they are just
> straight C function calls.

You're optimising for the single-threaded case, and that's fine.  In
GHC, a call-in is similar to what I outlined above except that we can
optimise away the RPC and perform the call directly in the OS thread
that requested it, due to the way bound threads are implemented.  Doing
that requires that a lot more of the runtime needs to be thread-safe,
though.

It's true that this is a fairly large overhead to impose on all Haskell
implementations.  I'm coming around to the idea that requiring this is
too much, and perhaps multi-threaded call-ins should be an optional
extra (including concurrent/reentrant foreign calls).

Cheers,
Simon


Re: FFI, safe vs unsafe

2006-03-29 Thread Neil Mitchell
Hi

> - we've been told here that concurrency is just a library
No, it's not. The interface to concurrency is just a library, but
internally certain things in the runtime have to change.

> - FFI allows other Haskell' implementations to import that library
If all Haskell' prime implementations depend on "GHC the library",
then do we really have many Haskell' prime implementations, or just a
pile of wrappers around GHC?

Thanks

Neil


Re: FFI, safe vs unsafe

2006-03-29 Thread John Meacham
On Wed, Mar 29, 2006 at 04:11:56PM +0100, Simon Marlow wrote:
> Ok, let's explore how difficult it really is.
> 
> Take a single-threaded implementation that forks OS threads for
> concurrent foreign calls.  Let's call the OS thread running Haskell code
> the "runtime thread".  An OS thread wanting to call a foreign export
> must make an RPC to the runtime thread.  You could do this by:
> 
>   - have a channel for RPC requests
> 
>   - the callee creates a condition var for the result, and sends
> the call details down the channel with the condition var.
> 
>   - the runtime thread picks up the request in its event loop and
> forks a Haskell thread to handle it
> 
>   - when the Haskell thread completes it places the result in the
> right place and signals the condition variable
> 
>   - the callee picks up the result, frees the resources and continues
> on its merry way
> 
> can't be more than a couple of days work, unless I'm missing something?
> It's not particularly fast, but then call-ins in GHC aren't fast either.


still seems rather complicated for something that as far as I know has
never come up as needed :) and given that, it is certainly unacceptable
to pay that cost for every reentrant or blockable call on the off chance
it might want to do both. 

Trading a single 'call' instruction for a condition variable, a rpc
call, and some value passing and many memory accesses and potential SMP
bus locks is more than just not particularly fast :) 

why are call-ins in ghc not very fast? with jhc they are just straight C
function calls. I just make the appropriate haskell function with its
arguments unboxed available and optimize like normal then don't put
'static' in front of it when spitting out the c code. now... 'wrapped'
functions are a different beast. one I have not needed to solve yet and
I don't look forward to it. but not particularly more expensive (one
more call, a bit of stack rearrangement) just rather ugly.
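
For reference, the 'wrapped' case mentioned here is the standard FFI
"wrapper" import, which manufactures a C function pointer from an
arbitrary Haskell closure at run time (standard syntax from the FFI
addendum; the CInt callback type is just an example):

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CInt)
import Foreign.Ptr (FunPtr)

type Callback = CInt -> IO CInt

-- A "wrapper" import: the implementation must conjure up a fresh C
-- entry point for each Haskell closure passed in, which is why it is
-- harder to implement than a plain foreign export.
foreign import ccall "wrapper"
  mkCallback :: Callback -> IO (FunPtr Callback)
```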

John


-- 
John Meacham - ⑆repetae.net⑆john⑈


Re: FFI, safe vs unsafe

2006-03-29 Thread Claus Reinke
here's another possible way to look at the complexities, and interactions 
of FFI and Haskell' concurrency:


- we've been told here that concurrency is just a library
- GHC implements such a library
- all Haskell' implementations will support FFI
- FFI allows GHC to export that concurrency library
- FFI allows other Haskell' implementations to import that library
==> all Haskell' implementations can support GHC-style concurrency

even if that looks like one of those 0==1 proofs, it might still be 
worthwhile to discuss this as a concrete example, to highlight the
difficulties?-)

cheers,
claus



RE: FFI, safe vs unsafe

2006-03-29 Thread Simon Marlow
On 29 March 2006 14:35, John Meacham wrote:

> On Wed, Mar 29, 2006 at 02:05:35PM +0100, Simon Marlow wrote:
>
>> What you are suggesting is that there may be implementations that do
>> not support reentrant/blockable, but do support the others.  And in
>> that case, of course you really need to know the difference between
>> blockable and reentrant.  I'm just not sure the standard should
>> allow such implementations.
> 
> It would be a very odd thing to disallow, seeing as how it would rule
> out or at least place fairly large implementation restrictions on
> yhc, hugs, and jhc, and not a single foreign import in the standard
> libraries needs to be 'blockable reentrant' as far as I can tell.
>
> though, I do need to look at hugs and yhcs source more carefully to
> know whether that is the case for sure. it depends on a lot of
> implementation details and how they handle OS-level threads and
> blockable in general. 

Ok, let's explore how difficult it really is.

Take a single-threaded implementation that forks OS threads for
concurrent foreign calls.  Let's call the OS thread running Haskell code
the "runtime thread".  An OS thread wanting to call a foreign export
must make an RPC to the runtime thread.  You could do this by:

  - have a channel for RPC requests

  - the callee creates a condition var for the result, and sends
the call details down the channel with the condition var.

  - the runtime thread picks up the request in its event loop and
forks a Haskell thread to handle it

  - when the Haskell thread completes it places the result in the
right place and signals the condition variable

  - the callee picks up the result, frees the resources and continues
on its merry way

can't be more than a couple of days work, unless I'm missing something?
It's not particularly fast, but then call-ins in GHC aren't fast either.
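
The shape of this RPC can be sketched in ordinary Concurrent Haskell,
with a Chan as the request channel and an MVar standing in for the
condition variable (illustrative only; a real implementation would live
partly on the C side and carry real call details):

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan
import Control.Concurrent.MVar

-- A request carries the call details (here just an Int) plus an MVar
-- playing the role of the condition variable the callee waits on.
data Request = Request Int (MVar Int)

-- The "runtime thread" event loop: pick up each request and fork a
-- Haskell thread to handle it.
runtimeLoop :: Chan Request -> IO ()
runtimeLoop ch = do
  Request arg result <- readChan ch
  _ <- forkIO (putMVar result (arg * 2))  -- stand-in for the handler
  runtimeLoop ch

-- The callee's side of the RPC: send the request, block until the
-- result is signalled, then continue on its merry way.
callIn :: Chan Request -> Int -> IO Int
callIn ch arg = do
  result <- newEmptyMVar
  writeChan ch (Request arg result)
  takeMVar result

main :: IO ()
main = do
  ch <- newChan
  _ <- forkIO (runtimeLoop ch)
  callIn ch 21 >>= print  -- prints 42
```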

Cheers,
Simon


Re: FFI, safe vs unsafe

2006-03-29 Thread Claus Reinke

Malcolm correctly notes that when I say "non-blocking" I'm referring to
the behaviour from Haskell's point of view, not a property of the
foreign code being invoked.
In fact, whether the foreign code being invoked blocks or not is largely
immaterial.  The property we want to capture is just this:
 During execution of the foreign call, other Haskell threads
 should make progress as usual.


if that is really what you want to capture, the standard terminology 
would be "asynchronous call" (as opposed to "synchronous call"). 
hence all that separation between synchronous and asynchronous 
concurrent languages (so "concurrent" would not be a useful qualifier).


the only remaining ambiguity would be that concurrent languages
(eg, Erlang) tend to use "asynchronous calls" to mean that the 
_calling thread_ does not need to synchronise, whereas you 
want to express that the _calling RTS_ does not need to 
synchronise while the _calling thread_ does need to. 

which makes me wonder why one would ever want the RTS to 
block if one of its threads makes a call? if the RTS is sequential 
(with or without user-level threads), it can't do anything but 
synchronous foreign calls, can it? and if the RTS does support 
non-sequential execution, I can see few reasons for it to block 
other threads when one thread makes a foreign call.


I think what you're after is something quite different: by default,
we don't know anything about the behaviour of a foreign call, so
once we pass control to the foreign code, it is out of our hands
until it decides to return control to us.

for sequential RTS, that's the way it is, no way around it. for 
non-sequential RTS, that need not be a problem: if the foreign 
call can be given its own asynchronous (OS-level) thread of 
control, it can take however long it needs to before returning, 
and other (user-level) threads can continue to run, 
asynchronously. but that means overhead that may not 
always be necessary.


so what I think you're trying to specify is whether it is safe for
the RTS to assume that the foreign call is just another primitive
RTS execution step (it will return control, and it won't take long
before doing so). the standard terminology for that is, I believe,
"atomic action".

in other words, if the programmer assures the RTS that a foreign
call is "atomic", the RTS is free to treat it as any other RTS step
(it won't block the current OS-level thread of control entirely, 
and it won't hog the thread for long enough to upset scheduling
guarantees). if, on the other hand, a foreign call is not annotated
as "atomic", there is a potential problem: non-sequential RTS
can work around that, with some overhead, while sequential
RTS can at best issue a warning and hope for the best.

so my suggestion would be to make no assumption about
unannotated calls (don't rely on the programmer too much;),
and to have optional keywords "atomic" and "non-reentrant".

[one might assume that an "atomic" call should never be 
permitted to reenter, so the annotations could be ordered
instead of accumulated, but such assumptions tend to
have exceptions]

cheers,
claus


It doesn't matter whether the foreign call "blocks" or not (although
that is a common use for this feature).  I'd rather call it
'concurrent', to indicate that the foreign call runs concurrently with
other Haskell threads.




Re: FFI, safe vs unsafe

2006-03-29 Thread Duncan Coutts
On Wed, 2006-03-29 at 07:32 -0600, Taral wrote:
> On 3/29/06, Simon Marlow <[EMAIL PROTECTED]> wrote:
> > If we were to go down this route, we have to make reentrant the default:
> > 'unsafe' is so-called for a good reason, you should be required to write
> > 'unsafe' if you're doing something unsafe.  So I'd suggest
> >
> >   unsafe
> >   concurrent unsafe
> >   concurrent  -- the hard one
> >   {- nothing -}
> 
> Can I suggest "sef" in this? Most cases of "unsafe" are actually
> claims that the call is side-effect free.

c2hs uses the keyword "pure" for this purpose, which I rather like.

c2hs transforms:
{# call pure foo_bar #}

into a call plus a foreign import with the "unsafe" tag.
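
Schematically, the expansion is along these lines (the Haskell-side
name and type are invented here; the real output depends on the C
prototype):

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CInt)

-- Roughly what a `{# call pure foo_bar #}` hook turns into: an
-- "unsafe"-tagged import plus the call site itself.
foreign import ccall unsafe "foo_bar"
  fooBar :: CInt -> IO CInt
```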

Duncan



Re: FFI, safe vs unsafe

2006-03-29 Thread John Meacham
On Wed, Mar 29, 2006 at 02:05:35PM +0100, Simon Marlow wrote:
> > will all have different concrete implementations and generate
> > different code. for correctness reasons, not efficiency ones.
> 
> Well, for correctness all you need is reentrant/blockable.  If you have
> that, all the others are efficiency hacks. 

yeah, but sometimes efficiency hacks become so pronounced they turn into
correctness problems. (tail-call optimization being the canonical
example) 


> What you are suggesting is that there may be implementations that do not
> support reentrant/blockable, but do support the others.  And in that
> case, of course you really need to know the difference between blockable
> and reentrant.  I'm just not sure the standard should allow such
> implementations.

It would be a very odd thing to disallow, seeing as how it would rule
out or at least place fairly large implementation restrictions on
yhc, hugs, and jhc, and not a single foreign import in the standard
libraries needs to be 'blockable reentrant' as far as I can tell.

though, I do need to look at hugs and yhcs source more carefully to know
whether that is the case for sure. it depends on a lot of implementation
details and how they handle OS-level threads and blockable in general.


> If we were to go down this route, we have to make reentrant the default:
> 'unsafe' is so-called for a good reason, you should be required to write
> 'unsafe' if you're doing something unsafe.  So I'd suggest
> 
>   unsafe
>   concurrent unsafe
>   concurrent  -- the hard one
>   {- nothing -}

I don't really like the word 'unsafe' because it doesn't really tell you
much about what is actually unsafe. I'd go with the more descriptive:

>   nonreentrant
>   concurrent nonreentrant
>   concurrent  -- the hard one
>   {- nothing -}


where 'nonreentrant' means a proof obligation is on the programmer to
show that the routine does not call back into the haskell run-time.  This
feels more future-safe too in case we add another unrelated type of
unsafety in the future.
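
Spelled out as import declarations, the proposal might look like this
(hypothetical syntax: none of these annotations exist in any current
compiler, and the C functions are placeholders):

```haskell
-- Hypothetical Haskell' FFI syntax; not accepted by any compiler today.
foreign import ccall nonreentrant "sin"
  c_sin :: Double -> Double        -- never calls back into Haskell

foreign import ccall concurrent nonreentrant "sleep"
  c_sleep :: Int -> IO Int         -- may block, never calls back

foreign import ccall concurrent "run_event_loop"
  c_runLoop :: IO ()               -- the hard one: may block and re-enter

foreign import ccall "invoke_callbacks"
  c_invoke :: IO ()                -- default: reentrant, assumed quick
```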


John


-- 
John Meacham - ⑆repetae.net⑆john⑈


Re: FFI, safe vs unsafe

2006-03-29 Thread Taral
On 3/29/06, Simon Marlow <[EMAIL PROTECTED]> wrote:
> If we were to go down this route, we have to make reentrant the default:
> 'unsafe' is so-called for a good reason, you should be required to write
> 'unsafe' if you're doing something unsafe.  So I'd suggest
>
>   unsafe
>   concurrent unsafe
>   concurrent  -- the hard one
>   {- nothing -}

Can I suggest "sef" in this? Most cases of "unsafe" are actually
claims that the call is side-effect free.

--
Taral <[EMAIL PROTECTED]>
"You can't prove anything."
-- Gödel's Incompetence Theorem


RE: FFI, safe vs unsafe

2006-03-29 Thread Simon Marlow
On 29 March 2006 13:17, John Meacham wrote:

> On Wed, Mar 29, 2006 at 12:48:54PM +0100, Simon Marlow wrote:
>> I agree with what you say, but let me summarise it if I may, because
>> there's an assumption in what you're saying that's easy to miss.
>> 
>>   IF
>>  the combination of 'blockable' and 'reentrant' is not
>>  required by the standard,
>>   THEN
>>  we should allow foreign calls to be annotated with
>>  one or the other, rather than requiring both.
>> 
>> I agree with this statement, but I don't necessarily agree that the
>> predicate should be true.  Indeed, given that it requires us to
>> complicate the language and puts a greater burden on FFI library
>> writers, there's a good argument not to.
> 
> it is just an implementation fact.
> 
> In jhc (and likely yhc and hugs may find themselves in the same boat)
> 
> unsafe
> blockable
> reentrant
> reentrant blockable
> 
> will all have different concrete implementations and generate
> different code. for correctness reasons, not efficiency ones.

Well, for correctness all you need is reentrant/blockable.  If you have
that, all the others are efficiency hacks.

What you are suggesting is that there may be implementations that do not
support reentrant/blockable, but do support the others.  And in that
case, of course you really need to know the difference between blockable
and reentrant.  I'm just not sure the standard should allow such
implementations.

If we were to go down this route, we have to make reentrant the default:
'unsafe' is so-called for a good reason, you should be required to write
'unsafe' if you're doing something unsafe.  So I'd suggest

  unsafe
  concurrent unsafe
  concurrent  -- the hard one
  {- nothing -}

Cheers,
Simon


Re: FFI, safe vs unsafe

2006-03-29 Thread John Meacham
On Wed, Mar 29, 2006 at 12:48:54PM +0100, Simon Marlow wrote:
> I agree with what you say, but let me summarise it if I may, because
> there's an assumption in what you're saying that's easy to miss.
> 
>   IF 
>  the combination of 'blockable' and 'reentrant' is not
>  required by the standard,
>   THEN
>  we should allow foreign calls to be annotated with
>  one or the other, rather than requiring both.
> 
> I agree with this statement, but I don't necessarily agree that the
> predicate should be true.  Indeed, given that it requires us to
> complicate the language and puts a greater burden on FFI library
> writers, there's a good argument not to.

it is just an implementation fact.

In jhc (and likely yhc and hugs may find themselves in the same boat)

unsafe
blockable
reentrant
reentrant blockable

will all have different concrete implementations and generate different
code. for correctness reasons, not efficiency ones.

though, it would not surprise me if many did not support "reentrant
blockable" as it is a real pain to do properly.


or, to put it another way, if they were not separate concepts then
cooperative implementations would have no choice but to reject any
program using 'safe' since there is a chance they might mean 'reentrant
blockable' rather than just reentrant or just blockable.


John

-- 
John Meacham - ⑆repetae.net⑆john⑈


RE: FFI, safe vs unsafe

2006-03-29 Thread Simon Marlow
I agree with what you say, but let me summarise it if I may, because
there's an assumption in what you're saying that's easy to miss.

  IF 
 the combination of 'blockable' and 'reentrant' is not
 required by the standard,
  THEN
 we should allow foreign calls to be annotated with
 one or the other, rather than requiring both.

I agree with this statement, but I don't necessarily agree that the
predicate should be true.  Indeed, given that it requires us to
complicate the language and puts a greater burden on FFI library
writers, there's a good argument not to.

Nevertheless, we're filling out the design space, and that's a good
thing.  I'll try to digest the stuff that has gone past recently on to
the wiki.

Cheers,
Simon

On 29 March 2006 11:36, John Meacham wrote:

> On Wed, Mar 29, 2006 at 11:15:27AM +0100, Simon Marlow wrote:
>> On 29 March 2006 09:11, John Meacham wrote:
>> 
>>> It would be nice if we can deprecate the not very informative 
>>> 'safe' and 'unsafe' names and use more descriptive ones that tell
>>> you what is actually allowed. 
>>> 
>>> 'reentrant' - routine might call back into the haskell run-time
>>> 'blockable' - routine might block indefinitely
>> 
>> I've been meaning to bring this up.  First, I don't think
>> 'blockable' is the right term here.  This relates to Malcolm's point
>> too: 
> 
> yeah, I am not happy with that term either. 'blocking'? 'canblock'?
> 
>> 
>>> Another piece of terminology to clear up.  By "non-blocking foreign
>>> call", you actually mean a foreign call that *can* block.  As a
>>> consequence of the fairness policy, you wish to place the
>>> requirement on implementations that such a blocking foreign call
>>> _should_not_ block progress of other Haskell threads.  The
>>> thread-nature of the foreign call is "blocking".  The Haskell-API
>>> nature is desired to be "non-blocking".
>> 
>> Malcolm correctly notes that when I say "non-blocking" I'm referring
>> to the behaviour from Haskell's point of view, not a property of the
>> foreign code being invoked. 
>> 
>> In fact, whether the foreign code being invoked blocks or not is
>> largely immaterial.  The property we want to capture is just this:
>> 
>>   During execution of the foreign call, other Haskell threads
>>   should make progress as usual.
>> 
>> It doesn't matter whether the foreign call "blocks" or not (although
>> that is a common use for this feature).  I'd rather call it
>> 'concurrent', to indicate that the foreign call runs concurrently
>> with other Haskell threads.
> 
> 'concurrent' sounds fine to me, I have little preference. other than
> please not 'threadsafe', a word so overloaded as to be meaningless :)
> 
>> 
>> Back to 'reentrant' vs. 'blockable'.  I'm not convinced that
>> 'blockable unsafe' is that useful.  The reason is that if other
>> Haskell threads continue running during the call, at some point a GC
>> will be required, at which point the runtime needs to traverse the
>> stack of the thread involved in the foreign call, which means the
>> call is subject to the same requirements as a 'reentrant' call
>> anyway.  I don't think it's necessary to add this finer distinction.
>> Unless perhaps you have in mind an implementation that doesn't do GC
>> in the traditional way... but then I'm concerned that this is
>> requiring programmers to make a distinction in their code to improve
>> performance for a minority implementation technique, and that's not
>> good language design. 
> 
> it has nothing to do with performance, they are just fundamentally
> different concepts that just happen by coincidence to have the same
> solution in ghc. there is no fundamental relation between the two. 
> This is one of those things that I said was "GHC-centric even though
> no one realizes it" :)
> 
> in any cooperative/event loop based system, 'blockable unsafe' can be
> implemented by
> 
> spawning a new system thread, calling the routine in it, and having
> the routine write a value to a pipe when done; the pipe is integrated
> into the standard event loop of the run-time.
> 
> however, 'blockable safe' or 'blockable reentrant' now implies that a
> call may come back into the haskell run-time _on another OS level
> thread_ which implies we have to set up pthread_mutexes everywhere,
> perhaps switch to a completely different run-time or at least switch
> to a different incoming foreign calling boilerplate.
> 
> note that none of this has anything to do with the GC (though, likely
> implementations will have to do something special with their GC stack
> too) and there are a lot of other possible models of concurrency that
> we have not even thought of yet.
> 
> 
>> If 'reentrant' in its full glory is too hard to implement, then by
>> all means don't implement it, and emit a run-time error if someone
>> tries to use it.
> 
> but reentrant is perfectly fine, blocking is perfectly fine, the
> combination is not. giving up the ability to have haskell callbacks
> from C code is not so good.

Re: FFI, safe vs unsafe

2006-03-29 Thread Malcolm Wallace
John Meacham <[EMAIL PROTECTED]> wrote:

> It would be nice if we can deprecate the not very informative  'safe'
> and 'unsafe' names and use more descriptive ones that tell you what is
> actually allowed.

Yes.  I have always found that naming convention confusing and
non-declarative.  "Safe" means that the foreign call is unsafe, so
please can the compiler do some extra work to make it safe.  Rather than
declaring the nature of the foreign function, it is an imperative
instruction to the runtime system.  (Likewise "unsafe", which means the
foreign call is safe, so please tell the compiler to omit the extra
safety checking.)

> 'reentrant' - routine might call back into the haskell run-time
> 'blockable' - routine might block indefinitely

These are indeed more descriptive, and I do hope we don't get the sense
inverted with these terms.  I would be in favour of adding them to the
FFI spec, and deprecating the "safe" and "unsafe" terms.  Do you have a
suggestion for replacing "unsafe"?

> 'reentrant_tail' - will tail call a haskell routine
> 'reentrant_nonglobal' - will only call arguments passed to it.
> 'fatal' - routine always aborts or performs a non-local return
> 'cheap' - routine is cheap to call, may be duplicated freely by optimizer
> 'speculatable' - routine may be reordered and called speculatively, even
>    if the optimizer can't prove it will eventually be used

I don't see any need to standardise these extra terms - as you say, they
feel more like pragmas, and may well be meaningless for many compilers.

Regards,
Malcolm


Re: FFI, safe vs unsafe

2006-03-29 Thread John Meacham
On Wed, Mar 29, 2006 at 11:15:27AM +0100, Simon Marlow wrote:
> On 29 March 2006 09:11, John Meacham wrote:
> 
> > It would be nice if we can deprecate the not very informative  'safe'
> > and 'unsafe' names and use more descriptive ones that tell you what is
> > actually allowed.
> > 
> > 'reentrant' - routine might call back into the haskell run-time
> > 'blockable' - routine might block indefinitely
> 
> I've been meaning to bring this up.  First, I don't think 'blockable' is
> the right term here.  This relates to Malcolm's point too:

yeah, I am not happy with that term either. 'blocking'? 'canblock'?

> 
> > Another piece of terminology to clear up.  By "non-blocking foreign
> > call", you actually mean a foreign call that *can* block.  As a
> > consequence of the fairness policy, you wish to place the
> > requirement on implementations that such a blocking foreign call
> > _should_not_ block progress of other Haskell threads.  The
> > thread-nature of the foreign call is "blocking".  The Haskell-API
> > nature is desired to be "non-blocking".
> 
> Malcolm correctly notes that when I say "non-blocking" I'm referring to
> the behaviour from Haskell's point of view, not a property of the
> foreign code being invoked.
> 
> In fact, whether the foreign code being invoked blocks or not is largely
> immaterial.  The property we want to capture is just this:
> 
>   During execution of the foreign call, other Haskell threads
>   should make progress as usual.
> 
> It doesn't matter whether the foreign call "blocks" or not (although
> that is a common use for this feature).  I'd rather call it
> 'concurrent', to indicate that the foreign call runs concurrently with
> other Haskell threads.

'concurrent' sounds fine to me, I have little preference. other than
please not 'threadsafe', a word so overloaded as to be meaningless :)

> 
> Back to 'reentrant' vs. 'blockable'.  I'm not convinced that 'blockable
> unsafe' is that useful.  The reason is that if other Haskell threads
> continue running during the call, at some point a GC will be required,
> at which point the runtime needs to traverse the stack of the thread
> involved in the foreign call, which means the call is subject to the
> same requirements as a 'reentrant' call anyway.  I don't think it's
> necessary to add this finer distinction.  Unless perhaps you have in
> mind an implementation that doesn't do GC in the traditional way... but
> then I'm concerned that this is requiring programmers to make a
> distinction in their code to improve performance for a minority
> implementation technique, and that's not good language design.

it has nothing to do with performance, they are just fundamentally
different concepts that just happen by coincidence to have the same
solution in ghc. there is no fundamental relation between the two.  This
is one of those things that I said was "GHC-centric even though no one
realizes it" :)

in any cooperative/event loop based system, 'blockable unsafe' can be
implemented by 

spawning a new system thread, calling the routine in it, and having the
routine write a value to a pipe when done; the pipe is integrated into
the standard event loop of the run-time.
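The trick described above can be sketched in Haskell itself, with the event loop and the finished worker played by two Haskell threads (POSIX pipe(2)/read(2)/write(2) assumed; a real cooperative runtime would do all of this in its C core, on a genuine OS thread rather than via forkIO):

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
-- Sketch (POSIX assumed): a completed call wakes the runtime's event
-- loop by writing a byte to a pipe the loop is already watching.  The
-- "event loop" here is simulated by a Haskell thread blocked in read(2).
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Foreign (Ptr, Word8, alloca, allocaArray, peek, peekArray, poke)
import Foreign.C.Types (CInt (..), CSize (..))
import System.Posix.Types (CSsize (..))

foreign import ccall unsafe "pipe"  c_pipe  :: Ptr CInt -> IO CInt
foreign import ccall unsafe "write" c_write :: CInt -> Ptr Word8 -> CSize -> IO CSsize
foreign import ccall safe   "read"  c_read  :: CInt -> Ptr Word8 -> CSize -> IO CSsize

main :: IO ()
main = do
  (rd, wr) <- allocaArray 2 $ \p -> do
    _ <- c_pipe p
    [r, w] <- peekArray 2 p
    return (r, w)
  done <- newEmptyMVar
  _ <- forkIO $ do
    -- the "event loop": blocks on the pipe, not on the call itself
    v <- alloca $ \b -> c_read rd b 1 >> peek b
    putMVar done v
  _ <- alloca $ \b -> do
    -- the finished worker: signals completion through the pipe
    poke b 42
    c_write wr b 1
  v <- takeMVar done
  print v
```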

however, 'blockable safe' or 'blockable reentrant' now implies that a
call may come back into the haskell run-time _on another OS level
thread_ which implies we have to set up pthread_mutexes everywhere,
perhaps switch to a completely different run-time or at least switch to
a different incoming foreign calling boilerplate.

note that none of this has anything to do with the GC (though, likely
implementations will have to do something special with their GC stack
too) and there are a lot of other possible models of concurrency that
we have not even thought of yet.


> If 'reentrant' in its full glory is too hard to implement, then by all
> means don't implement it, and emit a run-time error if someone tries to
> use it.

but reentrant is perfectly fine, blocking is perfectly fine, the
combination is not. giving up the ability to have haskell callbacks from
C code is not so good.
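For reference, the classic example of a call that needs 'reentrant' (but not 'blockable') is handing qsort(3) a Haskell comparator through the standard FFI "wrapper" import, so that libc re-enters Haskell on every comparison:

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
-- qsort(3) with a Haskell comparator: the C library calls back into
-- Haskell on each comparison, so the call must be 'reentrant' (today:
-- 'safe'), even though qsort itself never blocks indefinitely.
import Foreign
import Foreign.C.Types

type Cmp = Ptr CInt -> Ptr CInt -> IO CInt

-- the standard "wrapper" import: turns a Haskell function into a C
-- function pointer
foreign import ccall "wrapper"
  mkCmp :: Cmp -> IO (FunPtr Cmp)

foreign import ccall safe "stdlib.h qsort"
  c_qsort :: Ptr CInt -> CSize -> CSize -> FunPtr Cmp -> IO ()

main :: IO ()
main = withArray [3, 1, 2 :: CInt] $ \arr -> do
  cmp <- mkCmp $ \pa pb -> (-) <$> peek pa <*> peek pb
  c_qsort arr 3 (fromIntegral (sizeOf (undefined :: CInt))) cmp
  freeHaskellFunPtr cmp
  peekArray 3 arr >>= print
```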


besides, for a language standard we should avoid any implementation
details so specifying _exactly_ what we mean is a good thing. the fact
that reentrant and blocking produce the same code in GHC is _very much_
an implementation detail.

John


-- 
John Meacham - ⑆repetae.net⑆john⑈


RE: FFI, safe vs unsafe

2006-03-29 Thread Simon Marlow
On 29 March 2006 09:11, John Meacham wrote:

> It would be nice if we can deprecate the not very informative  'safe'
> and 'unsafe' names and use more descriptive ones that tell you what is
> actually allowed.
> 
> 'reentrant' - routine might call back into the haskell run-time
> 'blockable' - routine might block indefinitely

I've been meaning to bring this up.  First, I don't think 'blockable' is
the right term here.  This relates to Malcolm's point too:

> Another piece of terminology to clear up.  By "non-blocking foreign
> call", you actually mean a foreign call that *can* block.  As a
> consequence of the fairness policy, you wish to place the
> requirement on implementations that such a blocking foreign call
> _should_not_ block progress of other Haskell threads.  The
> thread-nature of the foreign call is "blocking".  The Haskell-API
> nature is desired to be "non-blocking".

Malcolm correctly notes that when I say "non-blocking" I'm referring to
the behaviour from Haskell's point of view, not a property of the
foreign code being invoked.

In fact, whether the foreign code being invoked blocks or not is largely
immaterial.  The property we want to capture is just this:

  During execution of the foreign call, other Haskell threads
  should make progress as usual.

It doesn't matter whether the foreign call "blocks" or not (although
that is a common use for this feature).  I'd rather call it
'concurrent', to indicate that the foreign call runs concurrently with
other Haskell threads.
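A small program exercising the property (GHC-specific; assumes POSIX sleep(3)): under the -threaded runtime, the forked Haskell thread makes progress while the main thread sits inside the 'safe' call. The output itself is deterministic either way; it is the scheduling during the call that differs:

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
-- With GHC's -threaded runtime, a 'safe' call that sits in sleep(3)
-- does not stop other Haskell threads from making progress.
import Control.Concurrent (forkIO, newEmptyMVar, putMVar, takeMVar)
import Foreign.C.Types (CUInt (..))

foreign import ccall safe "unistd.h sleep"
  c_sleep :: CUInt -> IO CUInt

main :: IO ()
main = do
  done <- newEmptyMVar
  -- with -threaded, this thread runs while main is inside c_sleep
  _ <- forkIO $ putMVar done "other Haskell thread made progress"
  _ <- c_sleep 1
  takeMVar done >>= putStrLn
```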


Back to 'reentrant' vs. 'blockable'.  I'm not convinced that 'blockable
unsafe' is that useful.  The reason is that if other Haskell threads
continue running during the call, at some point a GC will be required,
at which point the runtime needs to traverse the stack of the thread
involved in the foreign call, which means the call is subject to the
same requirements as a 'reentrant' call anyway.  I don't think it's
necessary to add this finer distinction.  Unless perhaps you have in
mind an implementation that doesn't do GC in the traditional way... but
then I'm concerned that this is requiring programmers to make a
distinction in their code to improve performance for a minority
implementation technique, and that's not good language design.

If 'reentrant' in its full glory is too hard to implement, then by all
means don't implement it, and emit a run-time error if someone tries to
use it.

Cheers,
Simon