Re: [julia-users] Re: [ANN] Two packages: Lazy.jl Mathematica.jl
I haven't done any real benchmarks but I imagine custom iterator types are still much faster than generators, given that they're essentially zero overhead. Lazy sequences aren't exactly speed demons either - they're basically just closures, which are a known sore spot for performance in Julia. This may well improve, but you would generally want to use a more functional style when you aren't as worried about performance anyway. Stefan: you want all the best and most powerful tools available Yes, yes, yes - this a thousand times. The problem with demanding that every problem has a single solution is that you end up with every solution fits a single problem as well. You can't possibly foresee all of the solutions and determine which is best, and you can't foresee all of the problems a tool can solve either. So instead of trying to create a one-size-fits all solution to each class of problem, let's make powerful tools that work seamlessly together and trust our users to do what they do best - problem solving. On 9 March 2014 00:03, andrew cooke and...@acooke.org wrote: i don't think anyone was doubting that iterators are more efficient than tasks (me: the only reason i can see for julia adding a separate mechanism for iterators separate from tasks is efficiency; mike: [coroutines...] impossible to make iteration over custom data types fast or efficient.). what i was questioning more (although in a circumspect way, which i thought was being poite but may have only been annoying) was whether there is a need for (yet) another mechanism besides iterators and tasks. lazy sequences (at least when i've used them) are equivalent to generators, so are less expressive than tasks. and it seems the justification is one or more of: (1) some (perhaps many) people find them easier to understand; (2) they are more suited to some problems than others (and i was hoping to look at the challenge posed for tasks at some point); (3) they are more efficient than tasks (i don't think mike said this, but it may be true). andrew On Saturday, 8 March 2014 20:12:50 UTC-3, Stefan Karpinski wrote: I'm going to try to answer the question by taking it to its logical conclusion. Why does Julia have if statements, while loops and for loops when all of these things can be accomplished with closures? In fact, coroutines are even more powerful still, so why bother with anything besides coroutines? Everything that can be accomplished with iteration can also be done with recursion. Should we get rid of while loops and for loops? Or maybe we should disallow recursion - you can always rewrite a recursive algorithm using iteration and an explicit stack. Python's use of generators for custom iteration - not to mention raising and catching an exception to terminate iteration - ensures that it is all but impossible to make iteration over custom data types fast or efficient. Python gets away with this because no one expects using data structures defined in Python to be especially fast or efficient. The language's built in types are special and iterating over them efficiently is baked into the implementation. In Julia, ranges are just a user-defined typehttps://github.com/JuliaLang/julia/blob/master/base/range.jlthat happens to be defined before your program begins - if iteration used coroutines and exceptions in Julia, all for loops - even just iterating from 1 to n - would be ridiculously slow and inefficient. We'd either be forced to implement everything important in C, or we'd still be screwing around with crazy optimization techniques to eliminate the coroutines and exceptions. Instead, we used a design for iteration that's simple and easy to make fast and efficient for simple things like iterating over a range or an array, yet flexible enough for all kinds of user-defined types. There are also coroutines and exceptions if you need them, but they have some overhead, so they should be used as appropriate. So you are left with a choice between using a task that produces values - which is easy but not very efficient - or implementing an iteration type - which is more work, but also much more efficient. But this kind of choice isn't unusual in programming - writing code is full of choices between different ways to accomplish the same thing with different tradeoffs. The Python mantra that there should be one - and preferably only one - obvious way to do it has always stuck me as hopelessly naïve - or maybe it's just wishful thinking. Sure, it's nice if there is an obvious way to do something, and even better if there's one clearly right way. But that's a lucky and rare situation. Most problems don't have exactly one obvious solution. Easy problems have many obvious solutions. Hard problems have no obvious solutions (by definition). Trying to design a language so that there's only one obvious solution to most problems strikes me as a recipe for having a language that's not powerful
Re: [julia-users] Re: [ANN] Two packages: Lazy.jl Mathematica.jl
I'm sorry if that came off as though it was targeted at you - I meant to a general statement about the philosophy of having zero duplication. Of course, you're right, duplication has a cost too, and it doesn't work to just throw everything together either, so like everything else in life it's about compromise. For what it's worth, I think you asked a good question, made a valid point, and an interesting discussion came out of it. Regardless of one's approach, it's always right to question whether you're really doing the right thing - so I didn't take your question as being demanding at all, far from it. Thanks for taking the time to join the discussion, and again, apologies if you've felt that there's any hostility involved. On 9 March 2014 11:23, andrew cooke and...@acooke.org wrote: On Sunday, 9 March 2014 07:48:25 UTC-3, Mike Innes wrote: Stefan: you want all the best and most powerful tools available Yes, yes, yes - this a thousand times. The problem with demanding that every problem has a single solution is that you end up with every solution fits a single problem as well. You can't possibly foresee all of the solutions and determine which is best, and you can't foresee all of the problems a tool can solve either. So instead of trying to create a one-size-fits all solution to each class of problem, let's make powerful tools that work seamlessly together and trust our users to do what they do best - problem solving. Go read the Julia issues. They're full of tradeoffs between simplicity and functionality. Life, and language and library design, aren't as simple as you are making out. There are costs to duplication and alternatives. And painting my posts as demanding a single solution is plain wrong. I asked a question. I didn't demand anything. Andrew
[julia-users] Re: [ANN] Two packages: Lazy.jl Mathematica.jl
So, to clarify, Iterators aren't a thing in themselves. Iteration is an interface, and to call something an iterator just means that you can put it in a for loop. Tasks and Lazy Lists are both iterators; so are arrays, sets, dictionaries, and a whole bunch of other things. But although you can use them in a similar way if you want to, they are all designed to solve very different problems. Now, Tasks and Lazy Lists do look similar in that you can produce and consume a stream of values with both, but conceptually they are quite different - Tasks are a mechanism for control flow, whereas Lazy Lists are a data structure. Perhaps you could call them the procedural and functional analogies of each other. I can't tell you what's best for you, but if you're thinking of Tasks as representing a sequence of data, then there's a good chance you'll find Lazy Lists easier to reason about. For example, consider the partition() function. In Lazy.jl terms this splits a single list into a list of lists - it's fairly easy to visualise this: partition(3, seq(1:9)) List: (1 2 3) (4 5 6) (7 8 9) If you wanted to write partition() for Tasks, you'd end up with tasks that produce tasks. I don't know about you, but that gives me a headache. You'll also notice that working with general iterators takes a lot of work; consider the Iterators.jl version of take(), which takes about twenty lines, versus the two-liner in Lazy.jl. Some things are simply impossible to do generically, like flatten(). That's not to say that Tasks aren't useful - they're better if you want to do more things in terms of control flow and less in terms of manipulating the data itself, for example. Both Tasks and Lazy Lists are extremely powerful, but each within their own scope - hence it's useful to have both. Is this roughly what you were looking for? Let me know if I've missed anything.
[julia-users] Re: [ANN] Two packages: Lazy.jl Mathematica.jl
i realise that in julia iterators are a protocol (that they rely on start, done and next, and that the underlying type used to do the iteration depends on what is being iterated over). but that's not true in python, for example, where all iterators are implemented as coroutines. the only reason i can see for julia adding a separate mechanism for iterators separate from tasks is efficiency - it's less work to use the iterator protocol to effectively manage an integer than to have a task. or maybe it's that consume is explicit in julia while it's not in python, so tasks look uglier in julia? to me this seems confusing. for example, it would be nice to have something that takes a task and generates a new task than is the contents of the old task repeated? but the repeat() function in Iterators.jl doesn't do that. instead it gives you an iterator. i don't know if this matters in practice - i haven't use tasks and iterators enough - but it seems like a mess. why two different things? similarly, i understand, i think, that both lazy streams and tasks are implemented differently. but a task that produces tasks doesn't give me a headache any more than lazy streams of lazy streams. in fact tasks generally seem simpler (to me) because you don't have to worry about making the flow work nicely - you can just bail out with a produce. but maybe it's just that i am more used to python than to scheme. again, why two different things? just because you are used to programming in scheme and i am used to python? that's not a great answer in my book. (and the task version of take doesn't require 20 lines, for example - https://github.com/andrewcooke/BlockCipherSelfStudy.jl/blob/master/src/Tasks.jl#L5 ) someone else has pointed me to http://journal.stuffwithstuff.com/2013/01/13/iteration-inside-and-out/ which i haven't read yet, saying that it explains the difference between iterators and tasks. maybe that will help me. thinking more about this last night i did realise that my instinctive aversion to having lots of ways to do the same thing isn't necessarily reasonable in julia. in a sense, what does it matter if julia has lazy streams, tasks and iterators, if they all use the same names for functions? because then you can swap types out and code will still work. so i guess the cost to have take defined for iterators, and for tasks and for lazy streams is less than i imagined. andrew On Saturday, 8 March 2014 08:06:33 UTC-3, Mike Innes wrote: So, to clarify, Iterators aren't a thing in themselves. Iteration is an interface, and to call something an iterator just means that you can put it in a for loop. Tasks and Lazy Lists are both iterators; so are arrays, sets, dictionaries, and a whole bunch of other things. But although you can use them in a similar way if you want to, they are all designed to solve very different problems. Now, Tasks and Lazy Lists do look similar in that you can produce and consume a stream of values with both, but conceptually they are quite different - Tasks are a mechanism for control flow, whereas Lazy Lists are a data structure. Perhaps you could call them the procedural and functional analogies of each other. I can't tell you what's best for you, but if you're thinking of Tasks as representing a sequence of data, then there's a good chance you'll find Lazy Lists easier to reason about. For example, consider the partition() function. In Lazy.jl terms this splits a single list into a list of lists - it's fairly easy to visualise this: partition(3, seq(1:9)) List: (1 2 3) (4 5 6) (7 8 9) If you wanted to write partition() for Tasks, you'd end up with tasks that produce tasks. I don't know about you, but that gives me a headache. You'll also notice that working with general iterators takes a lot of work; consider the Iterators.jl version of take(), which takes about twenty lines, versus the two-liner in Lazy.jl. Some things are simply impossible to do generically, like flatten(). That's not to say that Tasks aren't useful - they're better if you want to do more things in terms of control flow and less in terms of manipulating the data itself, for example. Both Tasks and Lazy Lists are extremely powerful, but each within their own scope - hence it's useful to have both. Is this roughly what you were looking for? Let me know if I've missed anything.
Re: [julia-users] Re: [ANN] Two packages: Lazy.jl Mathematica.jl
Ok, fair enough - I think the confusion for me lies in the fact that I wouldn't have said that Julia has lazy lists, tasks and iterators, in the same way that I wouldn't say it has floats, integers and numbers, because the former two are just types of the latter. But now I think I understand that by iterator you mean iterator implementation via a custom type - like the Take and Repeat types that Iterators.jl uses. Right? Also, I want to separate the idea of tasks and generators, because tasks are just coroutines - they can be used to make generators, as you have, but it's not their only purpose. I think I'm in agreement with you that iterators, in that sense, are best reserved for when they have a specific purpose (like Ranges, for example). I'm not convinced that the Iterators.jl style is the best idea myself, so lets leave that alone for now. Then it comes down to generators and lazy sequences, which as you've pointed out are two different ways to solve the same problem. As I've mentioned, these are both reflections of two very different styles of programming, procedural vs. functional. In my view, the fact that different people have different tastes is *exactly *the reason to support both paradigms, as opposed to deciding on one true way for everyone. That article, while it doesn't apply 1:1 to our discussion, also looks at the idea that in many cases one style is objectively preferable to another - in which case, it only make sense for Julia to support both. I'd be interested to see the tree-walking iterator mentioned in the article implemented via a task. I could be wrong, but I imagine it would be reasonably difficult compared to the lazy sequence version. Equally, I don't know of anything that's harder with sequences than with generators, so if you can think of anything I'd be interested in having a go at it. On 8 March 2014 11:44, andrew cooke and...@acooke.org wrote: i realise that in julia iterators are a protocol (that they rely on start, done and next, and that the underlying type used to do the iteration depends on what is being iterated over). but that's not true in python, for example, where all iterators are implemented as coroutines. the only reason i can see for julia adding a separate mechanism for iterators separate from tasks is efficiency - it's less work to use the iterator protocol to effectively manage an integer than to have a task. or maybe it's that consume is explicit in julia while it's not in python, so tasks look uglier in julia? to me this seems confusing. for example, it would be nice to have something that takes a task and generates a new task than is the contents of the old task repeated? but the repeat() function in Iterators.jl doesn't do that. instead it gives you an iterator. i don't know if this matters in practice - i haven't use tasks and iterators enough - but it seems like a mess. why two different things? similarly, i understand, i think, that both lazy streams and tasks are implemented differently. but a task that produces tasks doesn't give me a headache any more than lazy streams of lazy streams. in fact tasks generally seem simpler (to me) because you don't have to worry about making the flow work nicely - you can just bail out with a produce. but maybe it's just that i am more used to python than to scheme. again, why two different things? just because you are used to programming in scheme and i am used to python? that's not a great answer in my book. (and the task version of take doesn't require 20 lines, for example - https://github.com/andrewcooke/BlockCipherSelfStudy.jl/blob/master/src/Tasks.jl#L5) someone else has pointed me to http://journal.stuffwithstuff.com/2013/01/13/iteration-inside-and-out/which i haven't read yet, saying that it explains the difference between iterators and tasks. maybe that will help me. thinking more about this last night i did realise that my instinctive aversion to having lots of ways to do the same thing isn't necessarily reasonable in julia. in a sense, what does it matter if julia has lazy streams, tasks and iterators, if they all use the same names for functions? because then you can swap types out and code will still work. so i guess the cost to have take defined for iterators, and for tasks and for lazy streams is less than i imagined. andrew On Saturday, 8 March 2014 08:06:33 UTC-3, Mike Innes wrote: So, to clarify, Iterators aren't a thing in themselves. Iteration is an interface, and to call something an iterator just means that you can put it in a for loop. Tasks and Lazy Lists are both iterators; so are arrays, sets, dictionaries, and a whole bunch of other things. But although you can use them in a similar way if you want to, they are all designed to solve very different problems. Now, Tasks and Lazy Lists do look similar in that you can produce and consume a stream of values with both, but conceptually they are quite different -
Re: [julia-users] Re: [ANN] Two packages: Lazy.jl Mathematica.jl
I'm going to try to answer the question by taking it to its logical conclusion. Why does Julia have if statements, while loops and for loops when all of these things can be accomplished with closures? In fact, coroutines are even more powerful still, so why bother with anything besides coroutines? Everything that can be accomplished with iteration can also be done with recursion. Should we get rid of while loops and for loops? Or maybe we should disallow recursion – you can always rewrite a recursive algorithm using iteration and an explicit stack. Python's use of generators for custom iteration – not to mention raising and catching an exception to terminate iteration – ensures that it is all but impossible to make iteration over custom data types fast or efficient. Python gets away with this because no one expects using data structures defined in Python to be especially fast or efficient. The language's built in types are special and iterating over them efficiently is baked into the implementation. In Julia, ranges are just a user-defined typehttps://github.com/JuliaLang/julia/blob/master/base/range.jlthat happens to be defined before your program begins – if iteration used coroutines and exceptions in Julia, all for loops – even just iterating from 1 to n – would be ridiculously slow and inefficient. We'd either be forced to implement everything important in C, or we'd still be screwing around with crazy optimization techniques to eliminate the coroutines and exceptions. Instead, we used a design for iteration that's simple and easy to make fast and efficient for simple things like iterating over a range or an array, yet flexible enough for all kinds of user-defined types. There are also coroutines and exceptions if you need them, but they have some overhead, so they should be used as appropriate. So you are left with a choice between using a task that produces values – which is easy but not very efficient – or implementing an iteration type – which is more work, but also much more efficient. But this kind of choice isn't unusual in programming – writing code is full of choices between different ways to accomplish the same thing with different tradeoffs. The Python mantra that there should be one – and preferably only one – obvious way to do it has always stuck me as hopelessly naïve – or maybe it's just wishful thinking. Sure, it's nice if there is an obvious way to do something, and even better if there's one clearly right way. But that's a lucky and rare situation. Most problems don't have exactly one obvious solution. Easy problems have many obvious solutions. Hard problems have no obvious solutions (by definition). Trying to design a language so that there's only one obvious solution to most problems strikes me as a recipe for having a language that's not powerful enough to solve really hard problems well. When you're trying to solve a truly difficult problems, don't you want all the best and most powerful tools available – even if that means that there are lots of ways to solve easier problems? On Sat, Mar 8, 2014 at 8:36 AM, Mike Innes mike.j.in...@gmail.com wrote: Ok, fair enough - I think the confusion for me lies in the fact that I wouldn't have said that Julia has lazy lists, tasks and iterators, in the same way that I wouldn't say it has floats, integers and numbers, because the former two are just types of the latter. But now I think I understand that by iterator you mean iterator implementation via a custom type - like the Take and Repeat types that Iterators.jl uses. Right? Also, I want to separate the idea of tasks and generators, because tasks are just coroutines - they can be used to make generators, as you have, but it's not their only purpose. I think I'm in agreement with you that iterators, in that sense, are best reserved for when they have a specific purpose (like Ranges, for example). I'm not convinced that the Iterators.jl style is the best idea myself, so lets leave that alone for now. Then it comes down to generators and lazy sequences, which as you've pointed out are two different ways to solve the same problem. As I've mentioned, these are both reflections of two very different styles of programming, procedural vs. functional. In my view, the fact that different people have different tastes is *exactly *the reason to support both paradigms, as opposed to deciding on one true way for everyone. That article, while it doesn't apply 1:1 to our discussion, also looks at the idea that in many cases one style is objectively preferable to another - in which case, it only make sense for Julia to support both. I'd be interested to see the tree-walking iterator mentioned in the article implemented via a task. I could be wrong, but I imagine it would be reasonably difficult compared to the lazy sequence version. Equally, I don't know of anything that's harder with sequences than with generators, so if you can think of anything I'd be interested in
Re: [julia-users] Re: [ANN] Two packages: Lazy.jl Mathematica.jl
i don't think anyone was doubting that iterators are more efficient than tasks (me: the only reason i can see for julia adding a separate mechanism for iterators separate from tasks is efficiency; mike: [coroutines...] impossible to make iteration over custom data types fast or efficient.). what i was questioning more (although in a circumspect way, which i thought was being poite but may have only been annoying) was whether there is a need for (yet) another mechanism besides iterators and tasks. lazy sequences (at least when i've used them) are equivalent to generators, so are less expressive than tasks. and it seems the justification is one or more of: (1) some (perhaps many) people find them easier to understand; (2) they are more suited to some problems than others (and i was hoping to look at the challenge posed for tasks at some point); (3) they are more efficient than tasks (i don't think mike said this, but it may be true). andrew On Saturday, 8 March 2014 20:12:50 UTC-3, Stefan Karpinski wrote: I'm going to try to answer the question by taking it to its logical conclusion. Why does Julia have if statements, while loops and for loops when all of these things can be accomplished with closures? In fact, coroutines are even more powerful still, so why bother with anything besides coroutines? Everything that can be accomplished with iteration can also be done with recursion. Should we get rid of while loops and for loops? Or maybe we should disallow recursion – you can always rewrite a recursive algorithm using iteration and an explicit stack. Python's use of generators for custom iteration – not to mention raising and catching an exception to terminate iteration – ensures that it is all but impossible to make iteration over custom data types fast or efficient. Python gets away with this because no one expects using data structures defined in Python to be especially fast or efficient. The language's built in types are special and iterating over them efficiently is baked into the implementation. In Julia, ranges are just a user-defined typehttps://github.com/JuliaLang/julia/blob/master/base/range.jlthat happens to be defined before your program begins – if iteration used coroutines and exceptions in Julia, all for loops – even just iterating from 1 to n – would be ridiculously slow and inefficient. We'd either be forced to implement everything important in C, or we'd still be screwing around with crazy optimization techniques to eliminate the coroutines and exceptions. Instead, we used a design for iteration that's simple and easy to make fast and efficient for simple things like iterating over a range or an array, yet flexible enough for all kinds of user-defined types. There are also coroutines and exceptions if you need them, but they have some overhead, so they should be used as appropriate. So you are left with a choice between using a task that produces values – which is easy but not very efficient – or implementing an iteration type – which is more work, but also much more efficient. But this kind of choice isn't unusual in programming – writing code is full of choices between different ways to accomplish the same thing with different tradeoffs. The Python mantra that there should be one – and preferably only one – obvious way to do it has always stuck me as hopelessly naïve – or maybe it's just wishful thinking. Sure, it's nice if there is an obvious way to do something, and even better if there's one clearly right way. But that's a lucky and rare situation. Most problems don't have exactly one obvious solution. Easy problems have many obvious solutions. Hard problems have no obvious solutions (by definition). Trying to design a language so that there's only one obvious solution to most problems strikes me as a recipe for having a language that's not powerful enough to solve really hard problems well. When you're trying to solve a truly difficult problems, don't you want all the best and most powerful tools available – even if that means that there are lots of ways to solve easier problems? On Sat, Mar 8, 2014 at 8:36 AM, Mike Innes mike.j...@gmail.comjavascript: wrote: Ok, fair enough - I think the confusion for me lies in the fact that I wouldn't have said that Julia has lazy lists, tasks and iterators, in the same way that I wouldn't say it has floats, integers and numbers, because the former two are just types of the latter. But now I think I understand that by iterator you mean iterator implementation via a custom type - like the Take and Repeat types that Iterators.jl uses. Right? Also, I want to separate the idea of tasks and generators, because tasks are just coroutines - they can be used to make generators, as you have, but it's not their only purpose. I think I'm in agreement with you that iterators, in that sense, are best reserved for when they have a
[julia-users] Re: [ANN] Two packages: Lazy.jl Mathematica.jl
oh, so does iterators.jl work for tasks? i guess i'm confused why julia needs three different ways of doing what to me seem like very similar things. why do iterators, tasks and lazy sequences all need to exist? i may be missing something obvious... andrew On Friday, 7 March 2014 21:10:22 UTC-3, Mike Innes wrote: Sorry if I've misunderstood you, but I take it you're referring specifically to using Tasks as generators (via produce()/consume()) and iterating over them? Lazy.jl doesn't use tasks for its implementation, no, but you can turn a task (or indeed any iterator) into a lazy list using seq() and use any of these functions on it that way. You can also iterate over lazy lists themselves just like tasks. You're right that many of these functions are useful for iterators/tasks too, so there's a similar set of functionality in Iterators.jl.