I haven't done any real benchmarks but I imagine custom iterator types are still much faster than generators, given that they're essentially zero overhead.
Lazy sequences aren't exactly speed demons either - they're basically just closures, which are a known sore spot for performance in Julia. This may well improve, but you would generally want to use a more functional style when you aren't as worried about performance anyway. Stefan: "you want all the best and most powerful tools available" Yes, yes, yes - this a thousand times. The problem with demanding that "every problem has a single solution" is that you end up with "every solution fits a single problem" as well. You can't possibly foresee all of the solutions and determine which is best, and you can't foresee all of the problems a tool can solve either. So instead of trying to create a one-size-fits all solution to each class of problem, let's make powerful tools that work seamlessly together and trust our users to do what they do best - problem solving. On 9 March 2014 00:03, andrew cooke <[email protected]> wrote: > > i don't think anyone was doubting that iterators are more efficient than > tasks (me: "the only reason i can see for julia adding a separate mechanism > for iterators separate from tasks is efficiency; mike: "[coroutines...] > impossible to make iteration over custom data types fast or efficient."). > > what i was questioning more (although in a circumspect way, which i > thought was being poite but may have only been annoying) was whether there > is a need for (yet) another mechanism besides iterators and tasks. lazy > sequences (at least when i've used them) are equivalent to generators, so > are less expressive than tasks. and it seems the justification is one or > more of: (1) some (perhaps many) people find them easier to understand; (2) > they are more suited to some problems than others (and i was hoping to look > at the challenge posed for tasks at some point); (3) they are more > efficient than tasks (i don't think mike said this, but it may be true). > > andrew > > > On Saturday, 8 March 2014 20:12:50 UTC-3, Stefan Karpinski wrote: > >> I'm going to try to answer the question by taking it to its logical >> conclusion. Why does Julia have if statements, while loops and for loops >> when all of these things can be accomplished with closures? In fact, >> coroutines are even more powerful still, so why bother with anything >> besides coroutines? Everything that can be accomplished with iteration can >> also be done with recursion. Should we get rid of while loops and for >> loops? Or maybe we should disallow recursion - you can always rewrite a >> recursive algorithm using iteration and an explicit stack. >> >> Python's use of generators for custom iteration - not to mention raising >> and catching an exception to terminate iteration - ensures that it is all >> but impossible to make iteration over custom data types fast or efficient. >> Python gets away with this because no one expects using data structures >> defined in Python to be especially fast or efficient. The language's built >> in types are special and iterating over them efficiently is baked into the >> implementation. In Julia, ranges are just a user-defined >> type<https://github.com/JuliaLang/julia/blob/master/base/range.jl>that >> happens to be defined before your program begins - if iteration used >> coroutines and exceptions in Julia, all for loops - even just iterating >> from 1 to n - would be ridiculously slow and inefficient. We'd either be >> forced to implement everything important in C, or we'd still be screwing >> around with crazy optimization techniques to eliminate the coroutines and >> exceptions. Instead, we used a design for iteration that's simple and easy >> to make fast and efficient for simple things like iterating over a range or >> an array, yet flexible enough for all kinds of user-defined types. There >> are also coroutines and exceptions if you need them, but they have some >> overhead, so they should be used as appropriate. >> >> So you are left with a choice between using a task that produces values - >> which is easy but not very efficient - or implementing an iteration type - >> which is more work, but also much more efficient. But this kind of choice >> isn't unusual in programming - writing code is full of choices between >> different ways to accomplish the same thing with different tradeoffs. The >> Python mantra that "there should be one - and preferably only one - obvious >> way to do it" has always stuck me as hopelessly naïve - or maybe it's just >> wishful thinking. Sure, it's nice if there is an obvious way to do >> something, and even better if there's one clearly right way. But that's a >> lucky and rare situation. Most problems don't have exactly one obvious >> solution. Easy problems have many obvious solutions. Hard problems have no >> obvious solutions (by definition). Trying to design a language so that >> there's only one obvious solution to most problems strikes me as a recipe >> for having a language that's not powerful enough to solve really hard >> problems well. When you're trying to solve a truly difficult problems, >> don't you want all the best and most powerful tools available - even if >> that means that there are lots of ways to solve easier problems? >> >> >> >> On Sat, Mar 8, 2014 at 8:36 AM, Mike Innes <[email protected]> wrote: >> >>> Ok, fair enough - I think the confusion for me lies in the fact that I >>> wouldn't have said that Julia has "lazy lists, tasks and iterators", in the >>> same way that I wouldn't say it has "floats, integers and numbers", because >>> the former two are just types of the latter. But now I think I understand >>> that by "iterator" you mean "iterator implementation via a custom type" - >>> like the Take and Repeat types that Iterators.jl uses. Right? Also, I want >>> to separate the idea of "tasks" and "generators", because "tasks" are just >>> coroutines - they can be used to make generators, as you have, but it's not >>> their only purpose. >>> >>> I think I'm in agreement with you that iterators, in that sense, are >>> best reserved for when they have a specific purpose (like Ranges, for >>> example). I'm not convinced that the Iterators.jl style is the best idea >>> myself, so lets leave that alone for now. Then it comes down to generators >>> and lazy sequences, which as you've pointed out are two different ways to >>> solve the same problem. >>> >>> As I've mentioned, these are both reflections of two very different >>> styles of programming, procedural vs. functional. In my view, the fact that >>> different people have different tastes is *exactly *the reason to >>> support both paradigms, as opposed to deciding on "one true way" for >>> everyone. That article, while it doesn't apply 1:1 to our discussion, also >>> looks at the idea that in many cases one style is objectively preferable to >>> another - in which case, it only make sense for Julia to support both. >>> >>> I'd be interested to see the tree-walking iterator mentioned in the >>> article implemented via a task. I could be wrong, but I imagine it would be >>> reasonably difficult compared to the lazy sequence version. Equally, I >>> don't know of anything that's harder with sequences than with generators, >>> so if you can think of anything I'd be interested in having a go at it. >>> >>> >>> On 8 March 2014 11:44, andrew cooke <[email protected]> wrote: >>> >>>> >>>> i realise that in julia iterators are a protocol (that they rely on >>>> start, done and next, and that the underlying type used to "do" the >>>> iteration depends on what is being iterated over). but that's not true in >>>> python, for example, where all iterators are implemented as coroutines. >>>> the only reason i can see for julia adding a separate mechanism for >>>> iterators separate from tasks is efficiency - it's less work to use the >>>> iterator protocol to effectively manage an integer than to have a task. or >>>> maybe it's that consume is explicit in julia while it's not in python, so >>>> tasks look uglier in julia? >>>> >>>> to me this seems confusing. for example, it would be nice to have >>>> something that takes a task and generates a new task than is the contents >>>> of the old task repeated? but the repeat() function in Iterators.jl >>>> doesn't do that. instead it gives you an iterator. i don't know if this >>>> matters in practice - i haven't use tasks and iterators enough - but it >>>> seems like a mess. why two different things? >>>> >>>> similarly, i understand, i think, that both lazy streams and tasks are >>>> implemented differently. but a task that produces tasks doesn't give me a >>>> headache any more than lazy streams of lazy streams. in fact tasks >>>> generally seem simpler (to me) because you don't have to worry about making >>>> the flow work nicely - you can just bail out with a produce. but maybe >>>> it's just that i am more used to python than to scheme. again, why two >>>> different things? just because you are used to programming in scheme and i >>>> am used to python? that's not a great answer in my book. >>>> >>>> (and the task version of take doesn't require 20 lines, for example - >>>> https://github.com/andrewcooke/BlockCipherSelfStudy.jl/blob/ >>>> master/src/Tasks.jl#L5 ) >>>> >>>> someone else has pointed me to http://journal.stuffwithstuff. >>>> com/2013/01/13/iteration-inside-and-out/ which i haven't read yet, >>>> saying that it explains the difference between iterators and tasks. maybe >>>> that will help me. >>>> >>>> thinking more about this last night i did realise that my instinctive >>>> aversion to having lots of ways to do the same thing isn't necessarily >>>> reasonable in julia. in a sense, what does it matter if julia has lazy >>>> streams, tasks and iterators, if they all use the same names for >>>> functions? because then you can swap types out and code will still work. >>>> so i guess the cost to have take defined for iterators, and for tasks and >>>> for lazy streams is less than i imagined. >>>> >>>> andrew >>>> >>>> >>>> On Saturday, 8 March 2014 08:06:33 UTC-3, Mike Innes wrote: >>>>> >>>>> So, to clarify, Iterators aren't a thing in themselves. Iteration is >>>>> an interface, and to call something an iterator just means that you can >>>>> put >>>>> it in a for loop. Tasks and Lazy Lists are both iterators; so are arrays, >>>>> sets, dictionaries, and a whole bunch of other things. But although you >>>>> can >>>>> use them in a similar way if you want to, they are all designed to solve >>>>> very different problems. >>>>> >>>>> Now, Tasks and Lazy Lists do look similar in that you can produce and >>>>> consume a stream of values with both, but conceptually they are quite >>>>> different - Tasks are a mechanism for control flow, whereas Lazy Lists are >>>>> a data structure. Perhaps you could call them the procedural and >>>>> functional >>>>> analogies of each other. I can't tell you what's best for you, but if >>>>> you're thinking of Tasks as representing a sequence of data, then there's >>>>> a >>>>> good chance you'll find Lazy Lists easier to reason about. >>>>> >>>>> For example, consider the partition() function. In Lazy.jl terms this >>>>> splits a single list into a list of lists - it's fairly easy to visualise >>>>> this: >>>>> >>>>> > partition(3, seq(1:9)) >>>>> List: >>>>> (1 2 3) >>>>> (4 5 6) >>>>> (7 8 9) >>>>> >>>>> If you wanted to write partition() for Tasks, you'd end up with tasks >>>>> that produce tasks. I don't know about you, but that gives me a headache. >>>>> >>>>> You'll also notice that working with general iterators takes a lot of >>>>> work; consider the Iterators.jl version of take(), which takes about >>>>> twenty >>>>> lines, versus the two-liner in Lazy.jl. Some things are simply impossible >>>>> to do generically, like flatten(). >>>>> >>>>> That's not to say that Tasks aren't useful - they're better if you >>>>> want to do more things in terms of control flow and less in terms of >>>>> manipulating the data itself, for example. Both Tasks and Lazy Lists are >>>>> extremely powerful, but each within their own scope - hence it's useful to >>>>> have both. >>>>> >>>>> Is this roughly what you were looking for? Let me know if I've missed >>>>> anything. >>>>> >>>> >>> >>
