Re: [julia-users] Re: [ANN] Two packages: Lazy.jl Mathematica.jl

2014-03-09 Thread Mike Innes
I haven't done any real benchmarks but I imagine custom iterator types are
still much faster than generators, given that they're essentially zero
overhead.

Lazy sequences aren't exactly speed demons either - they're basically just
closures, which are a known sore spot for performance in Julia. This may
well improve, but you would generally want to use a more functional style
when you aren't as worried about performance anyway.
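
To make the "basically just closures" point concrete, here is a minimal sketch of a lazy sequence built from a value plus a zero-argument closure. The names `LazyCons`, `naturals_from` and `takefirst` are hypothetical and chosen for illustration; this is not how Lazy.jl is implemented, only the general technique.

```julia
# Hypothetical minimal lazy list: a head value plus a closure (thunk)
# that builds the rest of the list only when it is called.
struct LazyCons
    head::Any
    tail::Function   # zero-argument closure returning the next LazyCons
end

# Infinite sequence n, n+1, n+2, ...; nothing past the head is computed yet.
naturals_from(n) = LazyCons(n, () -> naturals_from(n + 1))

# Collect the first k elements, calling one closure per extra element.
function takefirst(k, xs)
    out = Any[]
    while k > 0 && xs !== nothing
        push!(out, xs.head)
        k -= 1
        xs = k > 0 ? xs.tail() : nothing
    end
    return out
end

first_five = takefirst(5, naturals_from(1))
```

Every element after the first costs a closure call and an allocation, which is the overhead being referred to.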

Stefan: "you want all the best and most powerful tools available". Yes,
yes, yes - this a thousand times.

The problem with demanding that every problem has a single solution is
that you end up with every solution fitting only a single problem as well.
You can't possibly foresee all of the solutions and determine which is best,
and you can't foresee all of the problems a tool can solve either. So
instead of trying to create a one-size-fits-all solution to each class of
problem, let's make powerful tools that work seamlessly together and trust
our users to do what they do best - problem solving.



Re: [julia-users] Re: [ANN] Two packages: Lazy.jl Mathematica.jl

2014-03-09 Thread Mike Innes
I'm sorry if that came off as though it was targeted at you - I meant it as
a general statement about the philosophy of having zero duplication. Of
course, you're right, duplication has a cost too, and it doesn't work to
just throw everything together either, so like everything else in life it's
about compromise.

For what it's worth, I think you asked a good question, made a valid point,
and an interesting discussion came out of it. Regardless of one's approach,
it's always right to question whether you're really doing the right thing -
so I didn't take your question as being demanding at all, far from it.
Thanks for taking the time to join the discussion, and again, apologies if
you've felt that there's any hostility involved.


On 9 March 2014 11:23, andrew cooke and...@acooke.org wrote:



 On Sunday, 9 March 2014 07:48:25 UTC-3, Mike Innes wrote:

 Stefan: you want all the best and most powerful tools available Yes,
 yes, yes - this a thousand times.

 The problem with demanding that every problem has a single solution is
 that you end up with every solution fits a single problem as well. You
 can't possibly foresee all of the solutions and determine which is best,
 and you can't foresee all of the problems a tool can solve either. So
 instead of trying to create a one-size-fits all solution to each class of
 problem, let's make powerful tools that work seamlessly together and trust
 our users to do what they do best - problem solving.



 Go read the Julia issues.  They're full of tradeoffs between simplicity
 and functionality.

 Life, and language and library design, aren't as simple as you are making
 out.  There are costs to duplication and alternatives.

 And painting my posts as demanding a single solution is plain wrong.  I
 asked a question.  I didn't demand anything.

 Andrew




[julia-users] Re: [ANN] Two packages: Lazy.jl Mathematica.jl

2014-03-08 Thread Mike Innes
So, to clarify, Iterators aren't a thing in themselves. Iteration is an 
interface, and to call something an iterator just means that you can put it 
in a for loop. Tasks and Lazy Lists are both iterators; so are arrays, 
sets, dictionaries, and a whole bunch of other things. But although you can 
use them in a similar way if you want to, they are all designed to solve 
very different problems.

Now, Tasks and Lazy Lists do look similar in that you can produce and 
consume a stream of values with both, but conceptually they are quite 
different - Tasks are a mechanism for control flow, whereas Lazy Lists are 
a data structure. Perhaps you could call them the procedural and functional 
analogues of each other. I can't tell you what's best for you, but if 
you're thinking of Tasks as representing a sequence of data, then there's a 
good chance you'll find Lazy Lists easier to reason about.

For example, consider the partition() function. In Lazy.jl terms this 
splits a single list into a list of lists - it's fairly easy to visualise 
this:

 partition(3, seq(1:9))
List:
  (1 2 3)
  (4 5 6)
  (7 8 9)

If you wanted to write partition() for Tasks, you'd end up with tasks that 
produce tasks. I don't know about you, but that gives me a headache.
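
For readers without Lazy.jl installed, the same "sequence of sequences" shape can be seen with `Iterators.partition` from current Julia's Base. This is offered only as a stand-in for illustration: it is not Lazy.jl's `partition()`, and its argument order is the reverse of the call shown above.

```julia
# Base.Iterators.partition is lazy: no chunk exists until it is iterated.
chunks = Iterators.partition(1:9, 3)

# Each element of the outer sequence is itself a sequence of three items.
for chunk in chunks
    println(collect(chunk))
end

# Materialise everything into a vector of vectors for inspection.
nested = [collect(chunk) for chunk in chunks]
```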

You'll also notice that working with general iterators takes a lot of work; 
consider the Iterators.jl version of take(), which takes about twenty 
lines, versus the two-liner in Lazy.jl. Some things are simply impossible 
to do generically, like flatten().

That's not to say that Tasks aren't useful - they're better if you want to 
do more things in terms of control flow and less in terms of manipulating 
the data itself, for example. Both Tasks and Lazy Lists are extremely 
powerful, but each within their own scope - hence it's useful to have both.

Is this roughly what you were looking for? Let me know if I've missed 
anything.


[julia-users] Re: [ANN] Two packages: Lazy.jl Mathematica.jl

2014-03-08 Thread andrew cooke

i realise that in julia iterators are a protocol (that they rely on start, 
done and next, and that the underlying type used to do the iteration 
depends on what is being iterated over).  but that's not true in python, 
for example, where all iterators are implemented as coroutines.  the only 
reason i can see for julia adding a separate mechanism for iterators 
separate from tasks is efficiency - it's less work to use the iterator 
protocol to effectively manage an integer than to have a task.  or maybe 
it's that consume is explicit in julia while it's not in python, so tasks 
look uglier in julia?
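
As a rough illustration of why the protocol is cheap, here is what a `for` loop amounts to when written out by hand. One assumption to flag: the sketch uses the single `iterate` function of current Julia (1.x), which replaced the `start`, `done` and `next` functions named in this thread. The helper name `manual_sum` is hypothetical.

```julia
# Sum the items of any iterable by spelling out what `for x in iter` does.
function manual_sum(iter)
    total = 0
    next = iterate(iter)              # first (item, state) pair, or nothing
    while next !== nothing
        (item, state) = next
        total += item
        next = iterate(iter, state)   # advance using only the state
    end
    return total
end

range_total = manual_sum(1:4)         # for a range the state is just an integer
array_total = manual_sum([10, 20, 30])
```

No task switch or exception is involved; the loop only calls an ordinary function and passes a small state value along.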

to me this seems confusing.  for example, it would be nice to have 
something that takes a task and generates a new task that is the contents 
of the old task repeated.  but the repeat() function in Iterators.jl 
doesn't do that.  instead it gives you an iterator.  i don't know if this 
matters in practice - i haven't used tasks and iterators enough - but it 
seems like a mess.  why two different things?

similarly, i understand, i think, that both lazy streams and tasks are 
implemented differently.  but a task that produces tasks doesn't give me a 
headache any more than lazy streams of lazy streams.  in fact tasks 
generally seem simpler (to me) because you don't have to worry about making 
the flow work nicely - you can just bail out with a produce.  but maybe 
it's just that i am more used to python than to scheme.  again, why two 
different things?  just because you are used to programming in scheme and i 
am used to python?  that's not a great answer in my book.

(and the task version of take doesn't require 20 lines, for example - 
https://github.com/andrewcooke/BlockCipherSelfStudy.jl/blob/master/src/Tasks.jl#L5)
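
A sketch of the generator style under discussion, with one important assumption: `produce()` and `consume()` no longer exist in current Julia, so this uses a `Channel` with `put!`, which is the present-day replacement. It is not the code at the link above, and the function name `squares` is hypothetical.

```julia
# A task-backed generator: the body runs in its own task and hands
# values to the consumer one at a time through the channel.
function squares()
    Channel() do ch
        n = 1
        while true
            put!(ch, n^2)   # blocks until the consumer asks for a value
            n += 1
        end
    end
end

# A channel is iterable, so generic tools such as Iterators.take apply to it.
first_three = collect(Iterators.take(squares(), 3))
```

The producer is written as a plain loop with no explicit state object, which is the convenience being argued for; the cost is a task switch per item.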

someone else has pointed me to 
http://journal.stuffwithstuff.com/2013/01/13/iteration-inside-and-out/ 
which i haven't read yet, saying that it explains the difference between 
iterators and tasks.  maybe that will help me.

thinking more about this last night i did realise that my instinctive 
aversion to having lots of ways to do the same thing isn't necessarily 
reasonable in julia.  in a sense, what does it matter if julia has lazy 
streams, tasks and iterators, if they all use the same names for 
functions?  because then you can swap types out and code will still work.  
so i guess the cost to have take defined for iterators, and for tasks and 
for lazy streams is less than i imagined.

andrew




Re: [julia-users] Re: [ANN] Two packages: Lazy.jl Mathematica.jl

2014-03-08 Thread Mike Innes
Ok, fair enough - I think the confusion for me lies in the fact that I
wouldn't have said that Julia has lazy lists, tasks and iterators, in the
same way that I wouldn't say it has floats, integers and numbers, because
the former two are just types of the latter. But now I think I understand
that by "iterator" you mean "iterator implementation via a custom type" -
like the Take and Repeat types that Iterators.jl uses. Right? Also, I want
to separate the idea of tasks and generators, because tasks are just
coroutines - they can be used to make generators, as you have, but it's not
their only purpose.

I think I'm in agreement with you that iterators, in that sense, are best
reserved for when they have a specific purpose (like Ranges, for example).
I'm not convinced that the Iterators.jl style is the best idea myself, so
let's leave that alone for now. Then it comes down to generators and lazy
sequences, which as you've pointed out are two different ways to solve the
same problem.

As I've mentioned, these are both reflections of two very different styles
of programming, procedural vs. functional. In my view, the fact that
different people have different tastes is *exactly* the reason to support
both paradigms, as opposed to deciding on one true way for everyone. That
article, while it doesn't apply 1:1 to our discussion, also looks at the
idea that in many cases one style is objectively preferable to another - in
which case, it only makes sense for Julia to support both.

I'd be interested to see the tree-walking iterator mentioned in the article
implemented via a task. I could be wrong, but I imagine it would be
reasonably difficult compared to the lazy sequence version. Equally, I
don't know of anything that's harder with sequences than with generators,
so if you can think of anything I'd be interested in having a go at it.



Re: [julia-users] Re: [ANN] Two packages: Lazy.jl Mathematica.jl

2014-03-08 Thread Stefan Karpinski
I'm going to try to answer the question by taking it to its logical
conclusion. Why does Julia have if statements, while loops and for loops
when all of these things can be accomplished with closures? In fact,
coroutines are even more powerful still, so why bother with anything
besides coroutines? Everything that can be accomplished with iteration can
also be done with recursion. Should we get rid of while loops and for
loops? Or maybe we should disallow recursion – you can always rewrite a
recursive algorithm using iteration and an explicit stack.

Python's use of generators for custom iteration – not to mention raising
and catching an exception to terminate iteration – ensures that it is all
but impossible to make iteration over custom data types fast or efficient.
Python gets away with this because no one expects using data structures
defined in Python to be especially fast or efficient. The language's
built-in types are special and iterating over them efficiently is baked into
the implementation. In Julia, ranges are just a user-defined type
(https://github.com/JuliaLang/julia/blob/master/base/range.jl) that
happens to be defined before your program begins – if iteration used
coroutines and exceptions in Julia, all for loops – even just iterating
from 1 to n – would be ridiculously slow and inefficient. We'd either be
forced to implement everything important in C, or we'd still be screwing
around with crazy optimization techniques to eliminate the coroutines and
exceptions. Instead, we used a design for iteration that's simple and easy
to make fast and efficient for simple things like iterating over a range or
an array, yet flexible enough for all kinds of user-defined types. There
are also coroutines and exceptions if you need them, but they have some
overhead, so they should be used as appropriate.
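
A minimal sketch of such an iteration type follows. Two assumptions: it is written against the `iterate` function of current Julia (1.x) rather than the `start`/`done`/`next` trio in use when this thread was written, and the `Countdown` type is a hypothetical example, not something from Base.

```julia
# Hypothetical user-defined type: counts down from `from` to 1.
struct Countdown
    from::Int
end

# The iteration interface: return (item, next_state), or nothing when done.
# The state here is a plain integer, so no allocation or task is needed.
Base.iterate(c::Countdown, state=c.from) = state < 1 ? nothing : (state, state - 1)

# Optional methods that let collect() size and type its result up front.
Base.length(c::Countdown) = max(c.from, 0)
Base.eltype(::Type{Countdown}) = Int

for i in Countdown(3)
    println(i)
end

counted = collect(Countdown(3))
```

This is more code than a task that produces values, but the loop compiles down to ordinary integer arithmetic, which is the tradeoff described above.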

So you are left with a choice between using a task that produces values –
which is easy but not very efficient – or implementing an iteration type –
which is more work, but also much more efficient. But this kind of choice
isn't unusual in programming – writing code is full of choices between
different ways to accomplish the same thing with different tradeoffs. The
Python mantra that there should be one – and preferably only one – obvious
way to do it has always struck me as hopelessly naïve – or maybe it's just
wishful thinking. Sure, it's nice if there is an obvious way to do
something, and even better if there's one clearly right way. But that's a
lucky and rare situation. Most problems don't have exactly one obvious
solution. Easy problems have many obvious solutions. Hard problems have no
obvious solutions (by definition). Trying to design a language so that
there's only one obvious solution to most problems strikes me as a recipe
for having a language that's not powerful enough to solve really hard
problems well. When you're trying to solve a truly difficult problem,
don't you want all the best and most powerful tools available – even if
that means that there are lots of ways to solve easier problems?




Re: [julia-users] Re: [ANN] Two packages: Lazy.jl Mathematica.jl

2014-03-08 Thread andrew cooke

i don't think anyone was doubting that iterators are more efficient than 
tasks (me: "the only reason i can see for julia adding a separate mechanism 
for iterators separate from tasks is efficiency"; mike: "[coroutines...] 
impossible to make iteration over custom data types fast or efficient.").

what i was questioning more (although in a circumspect way, which i thought 
was being polite but may have only been annoying) was whether there is a 
need for (yet) another mechanism besides iterators and tasks.  lazy 
sequences (at least when i've used them) are equivalent to generators, so 
are less expressive than tasks.  and it seems the justification is one or 
more of: (1) some (perhaps many) people find them easier to understand; (2) 
they are more suited to some problems than others (and i was hoping to look 
at the challenge posed for tasks at some point); (3) they are more 
efficient than tasks (i don't think mike said this, but it may be true).

andrew


[julia-users] Re: [ANN] Two packages: Lazy.jl Mathematica.jl

2014-03-07 Thread andrew cooke

oh, so does iterators.jl work for tasks?

i guess i'm confused why julia needs three different ways of doing what to 
me seem like very similar things.  why do iterators, tasks and lazy 
sequences all need to exist?

i may be missing something obvious...

andrew  

On Friday, 7 March 2014 21:10:22 UTC-3, Mike Innes wrote:

 Sorry if I've misunderstood you, but I take it you're referring 
 specifically to using Tasks as generators (via produce()/consume()) and 
 iterating over them?

 Lazy.jl doesn't use tasks for its implementation, no, but you can turn a 
 task (or indeed any iterator) into a lazy list using seq() and use any of 
 these functions on it that way. You can also iterate over lazy lists 
 themselves just like tasks.

 You're right that many of these functions are useful for iterators/tasks 
 too, so there's a similar set of functionality in Iterators.jl.