Re: [Haskell-cafe] Re: [Haskell] Re: SimonPJ and Tim Harris explain STM - video

2006-11-23 Thread Liyang HU


On 24/11/06, Benjamin Franksen <[EMAIL PROTECTED]> wrote:

>> So, you could simply return the console output as (part of) the result
>> of the atomic action. Wrap it in a WriterT monad transformer, even.
> But this would break atomicity, wouldn't it?

In the sense you just described, yes. You're right: nothing guarantees
that something else won't jump in between the call to atomic and the
following putStr, so the TVar changes in the atomic block no longer
take place in step with the output actions.
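
To make the gap concrete, here is a minimal sketch of the WriterT variant. The names bump and bumpAndPrint are made up for illustration, and it uses atomically from the stm package rather than the atomic of the talk. The transaction commits first and the messages are printed afterwards, so another thread can commit and print in between.

```haskell
import Control.Concurrent.STM
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Writer (WriterT, runWriterT, tell)

-- Hypothetical transaction: bump a counter and record a message
-- instead of printing it.
bump :: TVar Int -> WriterT [String] STM Int
bump counter = do
  n <- lift (readTVar counter)
  let n' = n + 1
  lift (writeTVar counter n')
  tell ["counter is now " ++ show n']
  return n'

-- Commit first, print afterwards. Another thread may commit and print
-- between the two steps, so output order need not match commit order.
bumpAndPrint :: TVar Int -> IO Int
bumpAndPrint counter = do
  (n, msgs) <- atomically (runWriterT (bump counter))
  mapM_ putStrLn msgs
  return n
```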

> I have a pretty good idea how much data is going to be produced by
> my own code, and if it's a bit more than I calculated then the whole
> process merely uses up some more memory, which is usually not a
> big problem. However, with input things are different:

Really? I'd have said that I have a pretty good idea how much data is
going to be consumed by my own code, and if it's a bit more than I
calculated then I'd merely read some more at the beginning (putting
any unused bits back on the input queue afterwards of course), which
is usually not a big problem. :)

Yes, I do get your point. It's easier to allocate a larger buffer for
your output as needed, than to anticipate how much input you might
require. I'd still claim they're different instances of the same
scheme though!

> [If] I still haven't got enough data, my transaction will be stuck
> with no way to demand more input.

Take your output channel idea, and use that for input too? (Separate
thread to read the input and place it at the end of some queue.) You
would basically retry and block (or rather, STM would do the latter
for you) if you haven't enough data, until more came along.
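
A rough sketch of what I mean, with assumed names (InputBuf, feed, takeN) and a plain TVar String standing in for the queue. The reader thread calls feed as data arrives; a transaction calls takeN and simply retries when the buffer is too short, which blocks it until feed commits again.

```haskell
import Control.Concurrent.STM

-- Hypothetical input buffer: a reader thread appends, transactions consume.
type InputBuf = TVar String

-- Reader side (called from a separate IO thread): append a chunk.
feed :: InputBuf -> String -> STM ()
feed buf chunk = modifyTVar' buf (++ chunk)

-- Transaction side: take exactly n characters, or retry until the
-- reader thread has supplied enough.
takeN :: Int -> InputBuf -> STM String
takeN n buf = do
  s <- readTVar buf
  let (front, rest) = splitAt n s
  if length front < n
    then retry
    else do
      writeTVar buf rest
      return front
```

Since retry only re-runs the transaction once a TVar it has read changes, the consumer sleeps rather than spins while it waits for input.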

Haskell-Cafe mailing list

[Haskell-cafe] Re: [Haskell] Re: SimonPJ and Tim Harris explain STM - video

2006-11-23 Thread Benjamin Franksen
Hi Liyang HU

you wrote:
> On 23/11/06, Benjamin Franksen <[EMAIL PROTECTED]> wrote:
>> One answer is in fact "to make it so that Console.Write can be rolled
>> back too". To achieve this one can factor the actual output to another
>> task and inside the transaction merely send the message to a
>> transactional channel (TChan):
> So, you could simply return the console output as (part of) the result
> of the atomic action. Wrap it in a WriterT monad transformer, even.

But this would break atomicity, wouldn't it? Another call to doSomething
from another task could interrupt before I get the chance to do the actual
output. With a channel whatever writes will happen in the same order in
which the STM actions commit (which coincides with the order in which the
counters get incremented).

>> Another task regularly takes messages from the channel
> With STM, the outputter task won't see any messages from the channel
> until your main atomic block completes, after which you're living in
> IO-land, so you might as well do the output yourself.

Yeah, right. Separate task might still be preferable, otherwise you have to
take care not to forget to actually do the IO after each transaction. I
guess it even makes sense to hide the channel stuff behind some nice
abstraction, so inside the transaction it looks similar to a plain IO

  output port msg

The result is in fact mostly indistinguishable from a direct IO call, since
IO is buffered in the OS anyway.
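
Such an abstraction might look roughly like this. It is only a sketch: Port, newPort and output are names I am making up here, and atomically is the stm package's spelling of atomic. The port hides a TChan, and a separate task forked by newPort performs the real IO.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.STM
import Control.Monad (forever)

-- Hypothetical wrapper hiding the channel.
newtype Port = Port (TChan String)

-- Create a port and start the task that performs the real output.
-- The 'write' argument is the actual IO action, e.g. putStrLn.
newPort :: (String -> IO ()) -> IO Port
newPort write = do
  chan <- newTChanIO
  _ <- forkIO (forever (atomically (readTChan chan) >>= write))
  return (Port chan)

-- Inside a transaction: queue a message. If the transaction retries or
-- aborts, the message is discarded along with its other effects.
output :: Port -> String -> STM ()
output (Port chan) msg = writeTChan chan msg
```

With a single outputter task per port, messages are written in the order in which the transactions that queued them committed.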

> Pugs/Perl 6 takes the approach that any IO inside an atomic block
> raises an exception.
>> Unfortunately I can't see how to generalize this to input as well...
> The dual of how you described the output situation: read a block of
> input before the transaction starts, and consume this during the
> transaction. I guess you're not seeing how this generalises because
> potentially you won't know how much of the input you will need to read
> beforehand... (so read all available input?(!) You have the dual
> situation in the output case, in that you can't be sure how much
> output it may generate / you will need to buffer.)

You say it. I guess the main difference is that I have a pretty good idea
how much data is going to be produced by my own code, and if it's a bit
more than I calculated then the whole process merely uses up some more
memory, which is usually not a big problem. However, with input things are
different: in many cases the input length is not under my control and could
be arbitrarily large. If I read until my buffer is full and I still haven't
got enough data, my transaction will be stuck with no way to demand more
input. (however, see below)

>     input <- hGetContents file
>     atomic $ flip runReaderT input $ do
>         input <- ask
>         -- do something with input
>         return 42
> (This is actually a bad example, since hGetContents reads the file
> lazily with interleaved IO...)

In fact reading everything lazily seems to be the only way out, if you don't
want to have arbitrary limits for chunks of input.

OTOH, maybe limiting the input chunks to some maximum length is a good idea
regardless of STM and whatnot. Some evil data source may want to crash my
process by making it eat more and more memory...
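
One way such a limit could be enforced on the reader side, as a sketch only: maxBuffered and feedBounded are hypothetical names, and the cap of 4096 characters is arbitrary. The reader task blocks instead of growing the buffer once the cap is reached.

```haskell
import Control.Concurrent.STM

-- Hypothetical cap on buffered input, in characters.
maxBuffered :: Int
maxBuffered = 4096

-- Reader side: append a chunk only if it fits under the cap; otherwise
-- retry, which blocks the reader until a consumer has drained the buffer.
-- A single chunk longer than maxBuffered would never fit, so the caller
-- is assumed to split its input into smaller pieces.
feedBounded :: TVar String -> String -> STM ()
feedBounded buf chunk = do
  s <- readTVar buf
  check (length s + length chunk <= maxBuffered)
  writeTVar buf (s ++ chunk)
```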

So, after all you are probably right and there is an obvious generalization
to input. Cool.

