I forgot about links. In that case, consider:
getUniqueFilesInDirRecursive.
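A minimal sketch of what such a function could do today -- deduplicating
by canonical path so that hard or symbolic links to the same file are
reported only once. This is an illustration, not a hardened
implementation; error handling (e.g. dangling links) is omitted:

    import qualified Data.Set as Set
    import System.Directory (canonicalizePath, doesDirectoryExist, listDirectory)
    import System.FilePath ((</>))

    -- Recursively list files under a directory, skipping any file or
    -- directory whose canonical path has already been visited.
    getUniqueFilesInDirRecursive :: FilePath -> IO [FilePath]
    getUniqueFilesInDirRecursive root = snd <$> go Set.empty root
      where
        go seen dir = do
          canon <- canonicalizePath dir
          if canon `Set.member` seen
            then pure (seen, [])
            else do
              entries <- map (dir </>) <$> listDirectory dir
              walk (Set.insert canon seen) entries
        walk seen [] = pure (seen, [])
        walk seen (p : ps) = do
          isDir <- doesDirectoryExist p
          (seen', here) <-
            if isDir
              then go seen p
              else do
                canon <- canonicalizePath p
                pure $ if canon `Set.member` seen
                         then (seen, [])
                         else (Set.insert canon seen, [p])
          (seen'', rest) <- walk seen' ps
          pure (seen'', here ++ rest)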
Attacking irrelevant details in an argument is often called a
"straw man" attack. Such attacks are pointless because they do not
address the real substance of the issue. My example is easily modified
to avoid the issues you raise.
Consider the fact that many file-based operations _can be, and are,
parallelized manually by developers_. The challenge for next-generation
language and effect system designers is to figure out _how_
such operations can be automatically parallelized, given sufficient
constraints, high-level constructs, and a powerful effect system.
Saying, "I don't know exactly how it will look," is quite a bit
different from saying "It can't be done." I claim the former.
Regards,
John A. De Goes
N-Brain, Inc.
The Evolution of Collaboration
http://www.n-brain.net | 877-376-2724 x 101
On Aug 16, 2009, at 12:38 AM, Artem V. Andreev wrote:
"John A. De Goes" <j...@n-brain.net> writes:
On Aug 15, 2009, at 6:36 AM, Jason Dusek wrote:
2009/08/14 John A. De Goes <j...@n-brain.net>:
Hmmm, my point (perhaps I wasn't clear) is that different
effects have different commutability properties. In the case
of a file system, you can commute two sequential reads from
two different files.
I think this is a bad example -- it's not something that's
safe in general and discredits your idea. How would the
compiler even know that two files are not actually the same
file?
I don't think the file system is the best example. However, I do
think it's a reasonable one.
Let's say the type of the function getFilesInDir is annotated in
such a way as to tell the effect system that every file in the
returned array is unique. Further, let's say the type of the
function makeNewTempFile is annotated in such a way as to tell the
effect system that the function will succeed in creating a new temp
file, with a name distinct from that of any existing file.
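For concreteness, such annotated types might be written something like
the sketch below. The uniqueness guarantees live only in comments and
an abstract newtype here, since no current Haskell effect system can
express them; the names are John's informal annotations, not a real API:

    -- Hypothetical sketch only.
    newtype UniqueFiles = UniqueFiles [FilePath]
      -- invariant: the listed paths refer to pairwise distinct files

    getFilesInDir :: FilePath -> IO UniqueFiles
    getFilesInDir = undefined   -- implementation elided

    -- The result names a freshly created file, distinct from every
    -- existing file.
    makeNewTempFile :: IO FilePath
    makeNewTempFile = undefined -- implementation elided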
Sorry, but this example is ridiculous. While file *names* in this
case might reasonably be assumed to be unique, the *files* themselves
may not be. Every modern filesystem supports file aliasing, usually
in several forms. And what does the makeNewTempFile function do? Does
it merely generate a fresh name, like POSIX mktemp(), or does it
create and open the file atomically, like POSIX mkstemp()?
The first case is a well-known security hole, and the second, it
seems to me, does not fit well with the rest of your reasoning.
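For reference, the mkstemp()-style interface -- create and open in a
single step, returning both the name and an open handle -- is what
Haskell's standard System.IO already provides:

    import System.IO (openTempFile, hClose, hPutStr)

    main :: IO ()
    main = do
      -- openTempFile creates and opens the file atomically, so there
      -- is no window between picking a name and creating the file,
      -- i.e. the window that makes a mktemp()-style interface a
      -- security hole.
      (path, h) <- openTempFile "/tmp" "example.txt"
      hPutStr h "scratch data"
      hClose h
      putStrLn ("created " ++ path)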
However, let's consider file system tree traversal further. In some
cases you might not care whether some of the directories you descend
into are actually the same directory, so your proposed optimization
would be `safe'. In other cases a sequential traversal would work,
while a parallelized version would not, unless special additional
measures are taken. Consider, e.g., the case of a build system. It
traverses a source tree, finds source files, and, if the corresponding
object files are non-existent or outdated, does something to
regenerate them. Now if you have a directory that's actually a link to
another directory and you do a sequential traversal, everything is
fine: you descend into the directory the first time, build everything
there, and when you descend into it the second time, there's just
nothing to do. If you do a parallel traversal, you may well end up in
a situation where two threads simultaneously check for an object file,
discover it's outdated, and run two build processes simultaneously,
with the most likely result being a corrupted object file.
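The special additional measures would have to look something like the
sketch below: claim each source by canonical path before building, so
two threads reaching the same file through different links cannot both
run the build step. Here buildObject is a hypothetical stand-in for
the real build step:

    import Control.Concurrent.MVar (MVar, modifyMVar)
    import qualified Data.Set as Set
    import System.Directory (canonicalizePath)

    -- Atomically record the canonical path; only the thread that
    -- claimed it first actually builds.
    claimAndBuild :: MVar (Set.Set FilePath)
                  -> (FilePath -> IO ())   -- hypothetical buildObject
                  -> FilePath -> IO ()
    claimAndBuild claimed buildObject src = do
      canon <- canonicalizePath src
      fresh <- modifyMVar claimed $ \s ->
        pure (Set.insert canon s, not (canon `Set.member` s))
      if fresh then buildObject src else pure ()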
If you write a recursive function that loops through all the files in
a directory and, for each file, parses and compiles it into a new temp
file, then a sufficiently sophisticated compiler should be able to
safely transform the recursion into parallel parsing and compilation
-- in a way that's provably correct, assuming the original program
was correct.
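Written by hand today with the async package, that transformation
looks something like the sketch below; parseAndCompile is
hypothetical, and the point of the effect system would be to prove the
per-file steps independent so the compiler could introduce the
concurrency itself:

    import Control.Concurrent.Async (mapConcurrently_)
    import System.Directory (listDirectory)
    import System.FilePath ((</>))

    -- Manual version of the rewrite: run the per-file
    -- parse-and-compile step over every file in parallel.
    compileDirInParallel :: (FilePath -> IO ()) -> FilePath -> IO ()
    compileDirInParallel parseAndCompile dir = do
      files <- map (dir </>) <$> listDirectory dir
      mapConcurrently_ parseAndCompile files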
The promise of a language with a purely functional part and a
powerful effect system for everything else is very great, and very
important in the massively concurrent world we are entering.
Well, yes -- which sounds like there are no guarantees
in general. Something that works half the time leaves you with
two responsibilities -- the old responsibility of the work you
did when you didn't have it, and the new responsibility of
knowing when it applies and when it doesn't.
In the other thread, I brought up the example of buffering reads.
Library authors make the decision to buffer for one reason: if some
other program is messing with the data, you're screwed no matter what.
And yeah, "they might be screwing with the data in just the way you
need it to be screwed with" (Sebastian), in which case my advice is
to use C and hope for the best. :-)
Regards,
John A. De Goes
N-Brain, Inc.
The Evolution of Collaboration
http://www.n-brain.net | 877-376-2724 x 101
--
S. Y. A(R). A.
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe