Re: [julia-users] Re: [ANN] Two packages: Lazy.jl & Mathematica.jl

andrew cooke Sat, 08 Mar 2014 16:03:21 -0800

i don't think anyone was doubting that iterators are more efficient than 
tasks (me: "the only reason i can see for julia adding a separate mechanism 
for iterators separate from tasks is efficiency; mike: "[coroutines...] 
impossible to make iteration over custom data types fast or efficient.").


what i was questioning more (although in a circumspect way, which i thought 
was being poite but may have only been annoying) was whether there is a 
need for (yet) another mechanism besides iterators and tasks.  lazy 
sequences (at least when i've used them) are equivalent to generators, so 
are less expressive than tasks.  and it seems the justification is one or 
more of: (1) some (perhaps many) people find them easier to understand; (2) 
they are more suited to some problems than others (and i was hoping to look 
at the challenge posed for tasks at some point); (3) they are more 
efficient than tasks (i don't think mike said this, but it may be true).

andrew

On Saturday, 8 March 2014 20:12:50 UTC-3, Stefan Karpinski wrote:
>
> I'm going to try to answer the question by taking it to its logical 
> conclusion. Why does Julia have if statements, while loops and for loops 
> when all of these things can be accomplished with closures? In fact, 
> coroutines are even more powerful still, so why bother with anything 
> besides coroutines? Everything that can be accomplished with iteration can 
> also be done with recursion. Should we get rid of while loops and for 
> loops? Or maybe we should disallow recursion – you can always rewrite a 
> recursive algorithm using iteration and an explicit stack.
>
> Python's use of generators for custom iteration – not to mention raising 
> and catching an exception to terminate iteration – ensures that it is all 
> but impossible to make iteration over custom data types fast or efficient. 
> Python gets away with this because no one expects using data structures 
> defined in Python to be especially fast or efficient. The language's built 
> in types are special and iterating over them efficiently is baked into the 
> implementation. In Julia, ranges are just a user-defined 
> type<https://github.com/JuliaLang/julia/blob/master/base/range.jl>that 
> happens to be defined before your program begins – if iteration used 
> coroutines and exceptions in Julia, all for loops – even just iterating 
> from 1 to n – would be ridiculously slow and inefficient. We'd either be 
> forced to implement everything important in C, or we'd still be screwing 
> around with crazy optimization techniques to eliminate the coroutines and 
> exceptions. Instead, we used a design for iteration that's simple and easy 
> to make fast and efficient for simple things like iterating over a range or 
> an array, yet flexible enough for all kinds of user-defined types. There 
> are also coroutines and exceptions if you need them, but they have some 
> overhead, so they should be used as appropriate.
>
> So you are left with a choice between using a task that produces values – 
> which is easy but not very efficient – or implementing an iteration type – 
> which is more work, but also much more efficient. But this kind of choice 
> isn't unusual in programming – writing code is full of choices between 
> different ways to accomplish the same thing with different tradeoffs. The 
> Python mantra that "there should be one – and preferably only one – obvious 
> way to do it" has always stuck me as hopelessly naïve – or maybe it's just 
> wishful thinking. Sure, it's nice if there is an obvious way to do 
> something, and even better if there's one clearly right way. But that's a 
> lucky and rare situation. Most problems don't have exactly one obvious 
> solution. Easy problems have many obvious solutions. Hard problems have no 
> obvious solutions (by definition). Trying to design a language so that 
> there's only one obvious solution to most problems strikes me as a recipe 
> for having a language that's not powerful enough to solve really hard 
> problems well. When you're trying to solve a truly difficult problems, 
> don't you want all the best and most powerful tools available – even if 
> that means that there are lots of ways to solve easier problems?
>
>
>
> On Sat, Mar 8, 2014 at 8:36 AM, Mike Innes <[email protected]<javascript:>
> > wrote:
>
>> Ok, fair enough - I think the confusion for me lies in the fact that I 
>> wouldn't have said that Julia has "lazy lists, tasks and iterators", in the 
>> same way that I wouldn't say it has "floats, integers and numbers", because 
>> the former two are just types of the latter. But now I think I understand 
>> that by "iterator" you mean "iterator implementation via a custom type" - 
>> like the Take and Repeat types that Iterators.jl uses. Right? Also, I want 
>> to separate the idea of "tasks" and "generators", because "tasks" are just 
>> coroutines - they can be used to make generators, as you have, but it's not 
>> their only purpose.
>>
>> I think I'm in agreement with you that iterators, in that sense, are best 
>> reserved for when they have a specific purpose (like Ranges, for example). 
>> I'm not convinced that the Iterators.jl style is the best idea myself, so 
>> lets leave that alone for now. Then it comes down to generators and lazy 
>> sequences, which as you've pointed out are two different ways to solve the 
>> same problem.
>>
>> As I've mentioned, these are both reflections of two very different 
>> styles of programming, procedural vs. functional. In my view, the fact that 
>> different people have different tastes is *exactly *the reason to 
>> support both paradigms, as opposed to deciding on "one true way" for 
>> everyone. That article, while it doesn't apply 1:1 to our discussion, also 
>> looks at the idea that in many cases one style is objectively preferable to 
>> another - in which case, it only make sense for Julia to support both.
>>
>> I'd be interested to see the tree-walking iterator mentioned in the 
>> article implemented via a task. I could be wrong, but I imagine it would be 
>> reasonably difficult compared to the lazy sequence version. Equally, I 
>> don't know of anything that's harder with sequences than with generators, 
>> so if you can think of anything I'd be interested in having a go at it.
>>
>>
>> On 8 March 2014 11:44, andrew cooke <[email protected] <javascript:>>wrote:
>>
>>>
>>> i realise that in julia iterators are a protocol (that they rely on 
>>> start, done and next, and that the underlying type used to "do" the 
>>> iteration depends on what is being iterated over).  but that's not true in 
>>> python, for example, where all iterators are implemented as coroutines.  
>>> the only reason i can see for julia adding a separate mechanism for 
>>> iterators separate from tasks is efficiency - it's less work to use the 
>>> iterator protocol to effectively manage an integer than to have a task.  or 
>>> maybe it's that consume is explicit in julia while it's not in python, so 
>>> tasks look uglier in julia?
>>>
>>> to me this seems confusing.  for example, it would be nice to have 
>>> something that takes a task and generates a new task than is the contents 
>>> of the old task repeated?  but the repeat() function in Iterators.jl 
>>> doesn't do that.  instead it gives you an iterator.  i don't know if this 
>>> matters in practice - i haven't use tasks and iterators enough - but it 
>>> seems like a mess.  why two different things?
>>>
>>> similarly, i understand, i think, that both lazy streams and tasks are 
>>> implemented differently.  but a task that produces tasks doesn't give me a 
>>> headache any more than lazy streams of lazy streams.  in fact tasks 
>>> generally seem simpler (to me) because you don't have to worry about making 
>>> the flow work nicely - you can just bail out with a produce.  but maybe 
>>> it's just that i am more used to python than to scheme.  again, why two 
>>> different things?  just because you are used to programming in scheme and i 
>>> am used to python?  that's not a great answer in my book.
>>>
>>> (and the task version of take doesn't require 20 lines, for example - 
>>> https://github.com/andrewcooke/BlockCipherSelfStudy.jl/blob/master/src/Tasks.jl#L5)
>>>
>>> someone else has pointed me to 
>>> http://journal.stuffwithstuff.com/2013/01/13/iteration-inside-and-out/which 
>>> i haven't read yet, saying that it explains the difference between 
>>> iterators and tasks.  maybe that will help me.
>>>
>>> thinking more about this last night i did realise that my instinctive 
>>> aversion to having lots of ways to do the same thing isn't necessarily 
>>> reasonable in julia.  in a sense, what does it matter if julia has lazy 
>>> streams, tasks and iterators, if they all use the same names for 
>>> functions?  because then you can swap types out and code will still work.  
>>> so i guess the cost to have take defined for iterators, and for tasks and 
>>> for lazy streams is less than i imagined.
>>>
>>> andrew
>>>
>>>
>>> On Saturday, 8 March 2014 08:06:33 UTC-3, Mike Innes wrote:
>>>>
>>>> So, to clarify, Iterators aren't a thing in themselves. Iteration is an 
>>>> interface, and to call something an iterator just means that you can put 
>>>> it 
>>>> in a for loop. Tasks and Lazy Lists are both iterators; so are arrays, 
>>>> sets, dictionaries, and a whole bunch of other things. But although you 
>>>> can 
>>>> use them in a similar way if you want to, they are all designed to solve 
>>>> very different problems.
>>>>
>>>> Now, Tasks and Lazy Lists do look similar in that you can produce and 
>>>> consume a stream of values with both, but conceptually they are quite 
>>>> different - Tasks are a mechanism for control flow, whereas Lazy Lists are 
>>>> a data structure. Perhaps you could call them the procedural and 
>>>> functional 
>>>> analogies of each other. I can't tell you what's best for you, but if 
>>>> you're thinking of Tasks as representing a sequence of data, then there's 
>>>> a 
>>>> good chance you'll find Lazy Lists easier to reason about.
>>>>
>>>> For example, consider the partition() function. In Lazy.jl terms this 
>>>> splits a single list into a list of lists - it's fairly easy to visualise 
>>>> this:
>>>>
>>>> > partition(3, seq(1:9))
>>>> List:
>>>>   (1 2 3)
>>>>   (4 5 6)
>>>>   (7 8 9)
>>>>
>>>> If you wanted to write partition() for Tasks, you'd end up with tasks 
>>>> that produce tasks. I don't know about you, but that gives me a headache.
>>>>
>>>> You'll also notice that working with general iterators takes a lot of 
>>>> work; consider the Iterators.jl version of take(), which takes about 
>>>> twenty 
>>>> lines, versus the two-liner in Lazy.jl. Some things are simply impossible 
>>>> to do generically, like flatten().
>>>>
>>>> That's not to say that Tasks aren't useful - they're better if you want 
>>>> to do more things in terms of control flow and less in terms of 
>>>> manipulating the data itself, for example. Both Tasks and Lazy Lists are 
>>>> extremely powerful, but each within their own scope - hence it's useful to 
>>>> have both.
>>>>
>>>> Is this roughly what you were looking for? Let me know if I've missed 
>>>> anything.
>>>>
>>>
>>
>

Re: [julia-users] Re: [ANN] Two packages: Lazy.jl & Mathematica.jl

Reply via email to