Re: parallel sequence side-effect processor

2016-09-25 Thread Mars0i
Thanks everyone.  There's a lot here--I'm still digesting it all.  It's 
clear that a lot may depend on the underlying collections, and that, among 
other things, foldmap is worth considering for benchmarking in 
performance-critical code, as is mapping vector across multiple sequences 
to combine them into one that can be passed to run!.  And Alex's comment 
about range conveys that I shouldn't make superficial assumptions about 
what is and isn't going to run quickly!  More to learn.



-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: parallel sequence side-effect processor

2016-09-24 Thread Dragan Djuric
Just for reference, this is the code with neanderthal, so we can see 
the difference when running with primitives. No native code has been used 
here, just pure Clojure.

(def sv1 (sv v1))
(def sv2 (sv v2))

(defn side-effect-4 ^double [^double a ^double b] 1.0)

(with-progress-reporting (quick-bench  (foldmap (double-fn +) 0 
side-effect-4 sv1 sv2)))

 Execution time mean : 16.436371 µs



On Saturday, September 24, 2016 at 8:07:21 PM UTC+2, Dragan Djuric wrote:
>
> Francis,
>
> The times you got also heavily depend on the actual side-effect function, 
> which in this case is much faster when called with one arg instead of with 
> varargs, which fluokitten needs here.
>
> If we give fluokitten a function that does not create a sequence for 
> multiple arguments, it is much faster than the fastest example that you 
> give (although it does not do any numerical optimizations).
>
> For example, your code runs the following times on my machine (measured by 
> criterium):
>
> (def v1 (vec (range 1)))
>
> (def v2 (vec (range 1)))
>
> (def side-effect (constantly nil))
>
> (defn side-effect-2 [a b] nil)
>
> (with-progress-reporting (quick-bench  (foldmap op nil side-effect-2 v1 v2
> )))
>
> Execution time mean : 290.822794 µs
>
> (with-progress-reporting (quick-bench  (run! side-effect (mapv vector v1 
> v2))))
>
> Execution time mean : 847.292924 µs
>
> (defn side-effect-3 [a] nil)
>
> (with-progress-reporting (quick-bench  (run! side-effect-3 (mapv vector 
> v1 v2))))
>
>  Execution time mean : 847.554524 µs
>
> So, in your previous example, all the "normal" Clojure code seems to be 
> optimally written, but fluokitten was handicapped by the side effect's use 
> of varargs for only 2 arguments. When the side effect is written "better", 
> fluokitten is 3 times faster than the fastest "vanilla" solution (of those 
> mentioned).
>
> Of course, most of the time in real life, except with numerics, the 
> side-effect function would be much slower than in this example, so I expect 
> all of those examples to have similar performance. The actual issue becomes 
> memory allocation, and it seems to me that Mars0i asked the question with 
> this in mind.
>
> On Saturday, September 24, 2016 at 4:59:35 PM UTC+2, Francis Avila wrote:
>>
>> BTW I noticed that sequence got hotter and eventually became the fastest, 
>> but it took many more runs:
>>
>> (time (dorun (sequence (map side-effect) col1 col2)))
>> "Elapsed time: 31.321698 msecs"
>> => nil
>> (time (dorun (sequence (map side-effect) col1 col2)))
>> "Elapsed time: 15.492247 msecs"
>> => nil
>> (time (dorun (sequence (map side-effect) col1 col2)))
>> "Elapsed time: 10.9549 msecs"
>> => nil
>> (time (dorun (sequence (map side-effect) col1 col2)))
>> "Elapsed time: 9.122967 msecs"
>> => nil
>> (time (dorun (sequence (map side-effect) col1 col2)))
>> "Elapsed time: 18.056823 msecs"
>> => nil
>> (time (dorun (sequence (map side-effect) col1 col2)))
>> "Elapsed time: 9.381068 msecs"
>> => nil
>>
>>
>> This is likely close to the theoretical maximum (using loop+recur plus a 
>> mutable iterator over multiple colls). The only thing faster would get rid 
>> of the iterator, which would require using a memory-indexable data 
>> structure like an array, and now you are in core.matrix and neanderthal 
>> territory.
>>
>>
>> On Thursday, September 22, 2016 at 11:02:09 PM UTC-5, Mars0i wrote:
>>>
>>> This is almost the same as an issue I raised in this group over a year 
>>> and a half ago, here.
>>> I suggested that Clojure should include a function with map's syntax but 
>>> one executed only for side-effects, without constructing sequences.  No one 
>>> else was interested--no problem.   It's still bugging me.
>>>
>>> It's not map syntax that I care about at this point.  What's bugging me 
>>> is that there's no standard, built-in way to process multiple sequences for 
>>> side effects without (a) constructing unnecessary sequences or (b) rolling 
>>> my own function with loop/recur or something else.
>>>
>>> If I want to process multiple sequences for side-effects in the way 
>>> that 'for' does, Clojure gives me 'doseq'.  Beautiful.  I can operate on 
>>> the cross product of the sequences, or filter them in various ways.
>>>
>>> If I want to process multiple sequences by applying an n-ary function to 
>>> the first element of each of n sequences, then to the second element of 
>>> each sequence, and so on, I can use 'map' or 'mapv', but that means 
>>> constructing unnecessary collections.
>>>
>>> Or I can splice my n sequences together, and make a single sequence of 
>>> n-tuples, and use doseq to process it.  (There's an example like this on 
>>> the doseq doc page.)  Or I can process such a sequence with 'reduce'.  More 
>>> unnecessary collections, though.
>>>
>>> Or I can use 'dotimes', and index into each of the collections, which is 
>>> OK if

Re: parallel sequence side-effect processor

2016-09-24 Thread Dragan Djuric
Francis,

The times you got also heavily depend on the actual side-effect function, 
which in this case is much faster when called with one arg instead of with 
varargs, which fluokitten needs here.

If we give fluokitten a function that does not create a sequence for 
multiple arguments, it is much faster than the fastest example that you 
give (although it does not do any numerical optimizations).

For example, your code runs the following times on my machine (measured by 
criterium):

(def v1 (vec (range 1)))

(def v2 (vec (range 1)))

(def side-effect (constantly nil))

(defn side-effect-2 [a b] nil)

(with-progress-reporting (quick-bench  (foldmap op nil side-effect-2 v1 v2
)))

Execution time mean : 290.822794 µs

(with-progress-reporting (quick-bench  (run! side-effect (mapv vector v1 v2))))


Execution time mean : 847.292924 µs

(defn side-effect-3 [a] nil)

(with-progress-reporting (quick-bench  (run! side-effect-3 (mapv vector v1 
v2))))

 Execution time mean : 847.554524 µs

So, in your previous example, all the "normal" Clojure code seems to be 
optimally written, but fluokitten was handicapped by the side effect's use 
of varargs for only 2 arguments. When the side effect is written "better", 
fluokitten is 3 times faster than the fastest "vanilla" solution (of those 
mentioned).

Of course, most of the time in real life, except with numerics, the 
side-effect function would be much slower than in this example, so I expect 
all of those examples to have similar performance. The actual issue becomes 
memory allocation, and it seems to me that Mars0i asked the question with 
this in mind.

On Saturday, September 24, 2016 at 4:59:35 PM UTC+2, Francis Avila wrote:
>
> BTW I noticed that sequence got hotter and eventually became the fastest, 
> but it took many more runs:
>
> (time (dorun (sequence (map side-effect) col1 col2)))
> "Elapsed time: 31.321698 msecs"
> => nil
> (time (dorun (sequence (map side-effect) col1 col2)))
> "Elapsed time: 15.492247 msecs"
> => nil
> (time (dorun (sequence (map side-effect) col1 col2)))
> "Elapsed time: 10.9549 msecs"
> => nil
> (time (dorun (sequence (map side-effect) col1 col2)))
> "Elapsed time: 9.122967 msecs"
> => nil
> (time (dorun (sequence (map side-effect) col1 col2)))
> "Elapsed time: 18.056823 msecs"
> => nil
> (time (dorun (sequence (map side-effect) col1 col2)))
> "Elapsed time: 9.381068 msecs"
> => nil
>
>
> This is likely close to the theoretical maximum (using loop+recur plus a 
> mutable iterator over multiple colls). The only thing faster would get rid 
> of the iterator, which would require using a memory-indexable data 
> structure like an array, and now you are in core.matrix and neanderthal 
> territory.
>
>
> On Thursday, September 22, 2016 at 11:02:09 PM UTC-5, Mars0i wrote:
>>
>> This is almost the same as an issue I raised in this group over a year 
>> and a half ago, here.
>> I suggested that Clojure should include a function with map's syntax but 
>> one executed only for side-effects, without constructing sequences.  No one 
>> else was interested--no problem.   It's still bugging me.
>>
>> It's not map syntax that I care about at this point.  What's bugging me 
>> is that there's no standard, built-in way to process multiple sequences for 
>> side effects without (a) constructing unnecessary sequences or (b) rolling 
>> my own function with loop/recur or something else.
>>
>> If I want to process multiple sequences for side-effects in the way 
>> that 'for' does, Clojure gives me 'doseq'.  Beautiful.  I can operate on 
>> the cross product of the sequences, or filter them in various ways.
>>
>> If I want to process multiple sequences by applying an n-ary function to 
>> the first element of each of n sequences, then to the second element of 
>> each sequence, and so on, I can use 'map' or 'mapv', but that means 
>> constructing unnecessary collections.
>>
>> Or I can splice my n sequences together, and make a single sequence of 
>> n-tuples, and use doseq to process it.  (There's an example like this on 
>> the doseq doc page.)  Or I can process such a sequence with 'reduce'.  More 
>> unnecessary collections, though.
>>
>> Or I can use 'dotimes', and index into each of the collections, which is 
>> OK if they're vectors, but ... ugh. why?
>>
>> Or I can construct my own function using first and rest or first and next 
>> on each of my sequences, via loop/recur, for example.   But that seems odd 
>> to me.  
>>
>> Isn't this a common use case?  Is processing multiple sequences for 
>> side-effects with corresponding elements in each application so unusual?  
>> (Am I the only one?)  Isn't it odd that we have doseq and map but nothing 
>> that processes multiple sequences for side-effects, in sequence, rather 
>> than as a cross-product?  
>>
>> (I admit that in my current use case, the sequences are small, so 
>> 

Re: parallel sequence side-effect processor

2016-09-24 Thread Dragan Djuric
That's why I still think that in this particular case, there is no new
sequence creation (by fluokitten). Yes, it does call first/next, but those
do not require (significant) new memory or copying. They reuse the memory
of the underlying vectors, if I understood that well. Or is there something
else that you referred to, Tim?

An additional note: this is the current implementation. If someone points
me to a better way to implement this case, I'll improve it. The API would
stay the same. foldmap, in any case, is more general than this use case,
and also does the folding (which is unneeded here, but very useful in
general).

And, as Francis said, if the priority is speed, I would use Neanderthal
here, or maybe even ClojureCL if side effects are also numerical and there
is enough data that GPU or CPU vectorization can make a difference.

On Saturday, September 24, 2016, Alex Miller  wrote:

> Just a general note of caution. range is a highly specialized (and
> optimized) impl for both reducing and seq traversal. Also note that seqs on
> vecs are about the next-most optimized seq (because they also don't
> allocate or cache - rather they just pull from the vector nodes as the
> backing store for elements). So just be cautious about making general
> conclusions about reduce and seq performance from these two specific cases.
>



Re: parallel sequence side-effect processor

2016-09-24 Thread Alex Miller
sequence taps into the underlying TransformerIterator, which applies a 
transducer while iterating across multiple iterators at the same time. 
sequence wraps that into a chunked sequence, but you could tap directly 
into TransformerIterator/createMulti without using the outer sequence. 

Nothing in Clojure does that right now. eduction is close but doesn't 
support multiple source colls. But you could either extend eduction or make 
something very similar that did the same and maybe that would be a 
reasonable extension in core.
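
[A hypothetical sketch of the approach Alex describes: driving 
TransformerIterator/createMulti directly so no output sequence is built. The 
name run-multi! is invented here, and the direct use of these internal 
clojure.lang classes is an assumption about their signatures, not a supported 
API.]

```clojure
;; Sketch only: apply f to corresponding elements of several colls, for
;; side effects, without wrapping the result in a sequence.
;; run-multi! is a made-up name; createMulti and RT/iter are internal.
(defn run-multi!
  [f & colls]
  (let [it (clojure.lang.TransformerIterator/createMulti
             (map f)                                   ; multi-arity transducer
             (mapv #(clojure.lang.RT/iter %) colls))]  ; one iterator per coll
    (while (.hasNext it)
      (.next it))
    nil))

;; (run-multi! println [1 2 3] [4 5 6]) would print the pairs in lockstep.
```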



Re: parallel sequence side-effect processor

2016-09-24 Thread Alex Miller
Just a general note of caution. range is a highly specialized (and 
optimized) impl for both reducing and seq traversal. Also note that seqs on 
vecs are about the next-most optimized seq (because they also don't 
allocate or cache - rather they just pull from the vector nodes as the 
backing store for elements). So just be cautious about making general 
conclusions about reduce and seq performance from these two specific cases.



Re: parallel sequence side-effect processor

2016-09-24 Thread Francis Avila
BTW I noticed that sequence got hotter and eventually became the fastest, 
but it took many more runs:

(time (dorun (sequence (map side-effect) col1 col2)))
"Elapsed time: 31.321698 msecs"
=> nil
(time (dorun (sequence (map side-effect) col1 col2)))
"Elapsed time: 15.492247 msecs"
=> nil
(time (dorun (sequence (map side-effect) col1 col2)))
"Elapsed time: 10.9549 msecs"
=> nil
(time (dorun (sequence (map side-effect) col1 col2)))
"Elapsed time: 9.122967 msecs"
=> nil
(time (dorun (sequence (map side-effect) col1 col2)))
"Elapsed time: 18.056823 msecs"
=> nil
(time (dorun (sequence (map side-effect) col1 col2)))
"Elapsed time: 9.381068 msecs"
=> nil


This is likely close to the theoretical maximum (using loop+recur plus a 
mutable iterator over multiple colls). The only thing faster would get rid 
of the iterator, which would require using a memory-indexable data 
structure like an array, and now you are in core.matrix and neanderthal 
territory.


On Thursday, September 22, 2016 at 11:02:09 PM UTC-5, Mars0i wrote:
>
> This is almost the same as an issue I raised in this group over a year and 
> a half ago, here.
> I suggested that Clojure should include a function with map's syntax but 
> one executed only for side-effects, without constructing sequences.  No one 
> else was interested--no problem.   It's still bugging me.
>
> It's not map syntax that I care about at this point.  What's bugging me is 
> that there's no standard, built-in way to process multiple sequences for 
> side effects without (a) constructing unnecessary sequences or (b) rolling 
> my own function with loop/recur or something else.
>
> If I want to process multiple sequences for side-effects in the way 
> that 'for' does, Clojure gives me 'doseq'.  Beautiful.  I can operate on 
> the cross product of the sequences, or filter them in various ways.
>
> If I want to process multiple sequences by applying an n-ary function to 
> the first element of each of n sequences, then to the second element of 
> each sequence, and so on, I can use 'map' or 'mapv', but that means 
> constructing unnecessary collections.
>
> Or I can splice my n sequences together, and make a single sequence of 
> n-tuples, and use doseq to process it.  (There's an example like this on 
> the doseq doc page.)  Or I can process such a sequence with 'reduce'.  More 
> unnecessary collections, though.
>
> Or I can use 'dotimes', and index into each of the collections, which is 
> OK if they're vectors, but ... ugh. why?
>
> Or I can construct my own function using first and rest or first and next 
> on each of my sequences, via loop/recur, for example.   But that seems odd 
> to me.  
>
> Isn't this a common use case?  Is processing multiple sequences for 
> side-effects with corresponding elements in each application so unusual?  
> (Am I the only one?)  Isn't it odd that we have doseq and map but nothing 
> that processes multiple sequences for side-effects, in sequence, rather 
> than as a cross-product?  
>
> (I admit that in my current use case, the sequences are small, so creating 
> a sequence of n-tuples would have only a trivial cost.  It just bugs me, 
> though. :-)
>
> I'd still be OK with something that had a map-like syntax, but my current 
> inclination is that it would be better for such a function to have 
> doseq-style syntax.  The name might be derived from "doseq"--maybe 
> "doseq*". 
>
> (I'd be willing to suggest this in a JIRA ticket, but that doesn't seem 
> reasonable unless there's a call for something like this from more than one 
> person.)
>
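
[The "roll my own function with loop/recur" option Mars0i describes above 
might look like the following sketch. The name domap is invented here for 
illustration; it is not an existing function.]

```clojure
;; Sketch of a hand-rolled, doseq*-style helper: applies f to the first
;; elements of each seq, then the seconds, and so on, purely for side
;; effects. Stops at the shortest input. Returns nil.
(defn domap
  [f & seqs]
  (loop [ss seqs]
    (when (every? seq ss)
      (apply f (map first ss))
      (recur (map next ss)))))

;; (domap println [1 2 3] [4 5 6]) prints "1 4", "2 5", "3 6".
```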



Re: parallel sequence side-effect processor

2016-09-24 Thread Francis Avila
Well, you pay a cost whenever you use seqs instead of reduce, so it's 
strange to think of doseq as "as fast as possible". If its argument is a 
true collection, its IReduce is usually faster than its seq. If it is a 
sequence already, its IReduce is usually just using seqs anyway.

Let's get some rough numbers here (which is what you should do on a 
case-by-case basis):

(require  '[uncomplicate.fluokitten.core :as f])
=> nil
(def side-effect (constantly nil))
=> #'user/side-effect
(def col1 (range 1))
=> #'user/col1
(def col2 (range 1))
=> #'user/col2
(def col2 (vec (range 1)))
=> #'user/col2
(def col1 (vec (range 1)))
=> #'user/col1

Single-coll:

(time (dotimes [_ 10] (doseq [x col1] (side-effect x))))
"Elapsed time: 13.916077 msecs"
=> nil
(time (dotimes [_ 10] (dorun (map side-effect col1))))
"Elapsed time: 5.707441 msecs"
=> nil
(time (dotimes [_ 10] (run! side-effect col1)))
"Elapsed time: 1.190621 msecs"
=> nil


Notice there seems to be some overhead to doseq that dorun doesn't have, so 
actually map+dorun is faster (for this particular coll). And run! is the 
fastest because a vector's IReduce is significantly faster than its seq.

Multi-coll:

(time (dorun (map side-effect col1 col2)))
"Elapsed time: 13.194375 msecs"
=> nil
(time (run! side-effect (map vector col1 col2)))
"Elapsed time: 17.54224 msecs"
=> nil
(time (run! side-effect (mapv vector col1 col2)))
"Elapsed time: 3.892984 msecs"
=> nil
(time (f/foldmap f/op nil side-effect col1 col2))
"Elapsed time: 12.454673 msecs"
=> nil
(time (dorun (sequence (map side-effect) col1 col2)))
"Elapsed time: 31.321698 msecs"
=> nil

Interesting results.

So dorun+map is still pretty good. foldmap wins out a little bit, but is 
still clearly using seqs underneath.

run! over a seq is not better than just consuming the seq. This is because 
the IReduce of a seq uses first/next internally, so there is no IReduce win.

However, run! over a vector is faster (for this collection, at this size) *even 
though we are creating a bunch more intermediate collections!*. The IReduce 
of a vector is really *that much faster* than its seq! So if you really 
need speed and don't care about laziness or memory, it may be faster just 
to make the intermediate collections-of-collections and reduce over it. 
Benchmark your particular hotspot!

And even though sequence uses iterators internally, there is clearly still 
some extra overhead involved.
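
[For steadier numbers than raw `time`, the criterium approach used elsewhere 
in this thread applies to any of the variants above. A minimal sketch; the 
collection sizes here are placeholders, not the ones used in the thread.]

```clojure
;; Minimal criterium sketch for benchmarking one variant. quick-bench runs
;; the expression many times and reports a mean with variance, avoiding the
;; JIT-warmup noise visible in the raw `time` calls above.
(require '[criterium.core :refer [quick-bench]])

(def side-effect (constantly nil))
(def col1 (vec (range 100000)))  ; placeholder size
(def col2 (vec (range 100000)))

(quick-bench (run! side-effect (mapv vector col1 col2)))
```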


On Friday, September 23, 2016 at 10:58:42 PM UTC-5, Mars0i wrote:
>
> Thanks very much Francis.  
>
> So it sounds as if, if my data would already be in a vector, list or 
> sequence (maybe a lazy sequence), doseq will process that structure as fast 
> as possible, but there are other structures that might have faster internal 
> reduce operations.
>



Re: parallel sequence side-effect processor

2016-09-24 Thread Timothy Baldridge
Francis is correct, and I want to make a point about something he said. If
you are mapping over two collections at once, you will either be using Java
iterators, or you will be creating lazy sequences when you walk over the
input collections. Sure, how many lazy sequences are create here and there
can be tweaked, but it is *impossible* on the JVM to walk two data
structures at the same time and not build sequences unless you drop to Java
iterators. So I have to say yet again "No extra sequence building
involved," is not technically true. Fluokitten's functions are calling
next/first on a vector which causes the creation of a lazy sequence, sure
there may be one less allocation (no extra sequence is being produced as
the output of this operation), but the statement that it does "NOT create
intermediate collections" is false.

There are ways we could speed up iteration over collections, but in the end
I recommend benchmarking. Unless your collection is a set of primitive
numbers, the overhead of other operations (assoc, get, etc.) is going to be
much more expensive than simply walking a lazy seq. In my experience, and
depending on the domain, optimizing lazy seq creation is almost always
optimizing in the wrong place. There are places where switching to the
iteration methods improves performance, but benchmark your code and see
before adopting any non-standard methods wholesale.
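
[Dropping to Java iterators, as described above, might look like this sketch; 
the name doiter2 is invented for illustration.]

```clojure
;; Sketch: walk two collections in lockstep via their Java iterators,
;; calling f for side effects only, with no lazy-seq allocation.
;; Assumes both arguments implement java.lang.Iterable (vectors, lists,
;; and most Clojure collections do).
(defn doiter2
  [f ^Iterable c1 ^Iterable c2]
  (let [i1 (.iterator c1)
        i2 (.iterator c2)]
    (while (and (.hasNext i1) (.hasNext i2))
      (f (.next i1) (.next i2)))))

;; (doiter2 println [1 2 3] [4 5 6]) prints "1 4", "2 5", "3 6".
```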

Timothy

On Sat, Sep 24, 2016 at 1:00 AM, Dragan Djuric  wrote:

> Now I am on the REPL, and the solution is straightforward:
>
> (foldmap op nil println [1 2 3] [4 5 6])
>
> gives:
>
> 1 4
> 2 5
> 3 6
> nil
>
> The first function is a folding function. In this case we can use op, a
> monoid operation. Since nil is also a monoid, everything will be folded to
> nil. The second part is a function that maps inputs to the monoid that we
> need (nil in this case). println does just this, and does the side effect
> that you asked for. No extra sequence building involved, and we do not need
> to write extra functions.
>
> On Saturday, September 24, 2016 at 2:42:16 AM UTC+2, Mars0i wrote:
>>
>> Thanks everyone--I'm very much appreciating the discussion, though I'm
>> not sure I follow every point.
>>
>> Dragan, thank you very much for suggesting (and writing) foldmap.  Very
>> nice.  I certainly don't mind using a library, though I still think there
>> ought to be something like what I've described in the language core.
>>
>> Francis, thanks for separating out the different types of intermediate
>> collections.  I'm not entirely clear on type 1.  It sounds like those are
>> just collections that are already there before processing them.  Or are you
>> saying that Clojure has to convert them to a seq as such?  Why would doseq
>> have to do that, for example?
>>
>> It's an understatement to say that my understanding of monads, monoids,
>> etc. is weak.   Haven't used Haskell in years, and I never understood
>> monads, etc.  They seem trivial and/or deeply mysterious.  One day I'll
>> have to sit down and figure it out.  For foldmap, I don't understand what
>> the first function argument to foldmap is supposed to do, but it's easy
>> enough to put something there that does nothing.
>>



-- 
“One of the main causes of the fall of the Roman Empire was that–lacking
zero–they had no way to indicate successful termination of their C
programs.”
(Robert Firth)



Re: parallel sequence side-effect processor

2016-09-24 Thread Dragan Djuric
Now I am on the REPL, and the solution is straightforward:

(foldmap op nil println [1 2 3] [4 5 6])

gives:

1 4
2 5
3 6
nil

The first function is a folding function. In this case we can use op, a 
monoid operation. Since nil is also a monoid, everything will be folded to 
nil. The second part is a function that maps inputs to the monoid that we 
need (nil in this case). println does just this, and does the side effect 
that you asked for. No extra sequence building involved, and we do not need 
to write extra functions. 

On Saturday, September 24, 2016 at 2:42:16 AM UTC+2, Mars0i wrote:
>
> Thanks everyone--I'm very much appreciating the discussion, though I'm not 
> sure I follow every point.  
>
> Dragan, thank you very much for suggesting (and writing) foldmap.  Very 
> nice.  I certainly don't mind using a library, though I still think there 
> ought to be something like what I've described in the language core.
>
> Francis, thanks for separating out the different types of intermediate 
> collections.  I'm not entirely clear on type 1.  It sounds like those are 
> just collections that are already there before processing them.  Or are you 
> saying that Clojure has to convert them to a seq as such?  Why would doseq 
> have to do that, for example?
>
> It's an understatement to say that my understanding of monads, monoids, 
> etc. is weak.   Haven't used Haskell in years, and I never understood 
> monads, etc.  They seem trivial and/or deeply mysterious.  One day I'll
> have to sit down and figure it out.  For foldmap, I don't understand what 
> the first function argument to foldmap is supposed to do, but it's easy 
> enough to put something there that does nothing.
>



Re: parallel sequence side-effect processor

2016-09-23 Thread Mars0i
Thanks very much Francis.  

So it sounds as if, when my data is already in a vector, list, or 
sequence (maybe a lazy sequence), doseq will process that structure as 
fast as possible, but other structures might have faster internal 
reduce operations.



Re: parallel sequence side-effect processor

2016-09-23 Thread Francis Avila

> Francis, thanks for separating out the different types of intermediate 
> collections.  I'm not entirely clear on type 1.  It sounds like those are 
> just collections that are already there before processing them.  Or are you 
> saying that Clojure has to convert them to a seq as such?  Why would doseq 
> have to do that, for example?

There are two abstractions for working over a collection in Clojure: sequences 
(i.e., something that implements first and rest) and reducibles (something that 
implements IReduce or CollReduce).

Some functions use sequences as their basic interface to a collection: this 
includes map, doseq, etc. (all the "traditional" Clojure functions). Anything 
that calls first, rest, next, or seq during its operation (including most 
hand-rolled loop-recurs!) is using sequences. If the type is not already a 
sequence, then a sequence object needs to be created over it which knows how to 
walk it. This is an extra intermediate collection. What precisely happens 
depends on the input type, but it almost always involves additional object 
allocations.

Other functions use reduction as their abstraction via IReduce, which allows a 
collection to reduce over itself. This is generally much more efficient because 
the type can walk over its own internal structures efficiently and does not 
need to provide the laziness or immutability guarantees of sequences: it can 
use (invisible) mutation internally. But you lose control over the "pace" of 
the reduction: you can't pause it in the middle or access any specific item.

Finally, Java has an Iterator interface, which is not used much in clojure 
because it is mutable.
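The distinction above can be sketched in a few lines (my sketch, not code from 
the thread):

```clojure
;; Sequence-style traversal: first/next walk a seq object step by step,
;; allocating one even over a vector.
(defn sum-via-seq [coll]
  (loop [s (seq coll)
         acc 0]
    (if s
      (recur (next s) (+ acc (first s)))
      acc)))

;; Reduce-style traversal: the collection reduces over itself
;; (IReduce/CollReduce), so a vector can walk its own internal nodes
;; without allocating a seq.
(defn sum-via-reduce [coll]
  (reduce + 0 coll))
```

Both produce the same result; the difference is only in how the collection is 
walked, which is why the reduce path is usually cheaper.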



Re: parallel sequence side-effect processor

2016-09-23 Thread Mars0i
Thanks everyone--I'm very much appreciating the discussion, though I'm not 
sure I follow every point.  

Dragan, thank you very much for suggesting (and writing) foldmap.  Very 
nice.  I certainly don't mind using a library, though I still think there 
ought to be something like what I've described in the language core.

Francis, thanks for separating out the different types of intermediate 
collections.  I'm not entirely clear on type 1.  It sounds like those are 
just collections that are already there before processing them.  Or are you 
saying that Clojure has to convert them to a seq as such?  Why would doseq 
have to do that, for example?

It's an understatement to say that my understanding of monads, monoids, 
etc. is weak.  I haven't used Haskell in years, and I never understood 
monads; they seem trivial and/or deeply mysterious.  One day I'll 
have to sit down and figure it out.  For foldmap, I don't understand what 
the first function argument to foldmap is supposed to do, but it's easy 
enough to put something there that does nothing.



Re: parallel sequence side-effect processor

2016-09-23 Thread Dragan Djuric
There are a few typos, and defn is missing from fb in my last message, but I 
hope it is still readable. Sorry, I am typing this on a mobile device while 
watching the last episode of The Man in the High Castle :) Also, I am 
talking about code I wrote years ago, off the top of my head and without 
access to the REPL :)

On Saturday, September 24, 2016 at 1:44:10 AM UTC+2, Dragan Djuric wrote:
>
> A couple of things:
> 1. How fold/foldmap and any other function works, depends on the actual 
> type. For example, if you look at 
> https://github.com/uncomplicate/neanderthal/blob/master/src/clojure/uncomplicate/neanderthal/impl/fluokitten.clj#L396
>  
> you can see that there no intermediate allocations, and everything is 
> primitive.
> 2. Now, if you give foldmap a sequence (or a vector), it goes to the 
> implementation that you pointed to. Now, the difference from map: if I 
> understand well, map would produce an unnecessary resulting sequence. 
> foldmap does not. your accumulating function does need to return 1, or 
> sequences - why not return nil? Also, the accumulator is nil, and can use 
> any dummy function that just nils everything. There is only a matter of 
> calling first/next. Do they really produce any new instance objects? That 
> depends on the implementation of seq, I believe, but it's the same even if 
> we used loop/recur, I believe? 
> 3. The sequences in your printout results are the result of how clojure 
> treat varargs, or I am missing something. So, if I give it a function such 
> as (fb [_ a b] (println a b)), what is exactly allocated, that is not 
> allocated even when using loop/recur directly with first/next?
>
> On Saturday, September 24, 2016 at 1:25:14 AM UTC+2, tbc++ wrote:
>>
>> Yeah, I have to call you out on this one Dragan. I ran the following 
>> code: 
>>
>> (ns fold-test
>> (:require [uncomplicate.fluokitten.core :refer [foldmap]]))
>>
>> (defn fa [& args]
>> (println "fa " args)
>> 1)
>>
>> (defn fb [& args]
>> (println "fb " args)
>> 1)
>>
>> (defn test-fold []
>> (foldmap fa nil fb [1 2 3] [4 5 6]))
>>
>> (test-fold)
>>
>>
>> This code produced: 
>>
>> fb  (1 4)
>> fa  (nil 1)
>> fb  (2 5)
>> fa  (1 1)
>> fb  (3 6)
>> fa  (1 1)
>>
>> So I put a breakpoint in `fb` and ran it again. The stacktrace says it 
>> ends up in algo/collection-foldmap which we can see here: 
>> https://github.com/uncomplicate/fluokitten/blob/master/src/uncomplicate/fluokitten/algo.clj#L415-L443
>>
>> That function is creating seqs out of all its arguments! So it really is 
>> not better than clojure.core/map as far as allocation is concerned. 
>>
>> Timothy
>>
>> On Fri, Sep 23, 2016 at 5:15 PM, Francis Avila  
>> wrote:
>>
>>> There are a few intermediate collections here:
>>>
>>>
>>>1. The source coll may produce a seq object. How costly this is 
>>>depends on the type of coll and the quality of its iterator/ireduce/seq 
>>>implementations.
>>>2. You may need to collect multiple source colls into a tuple-like 
>>>thing to produce a single object for the side-effecting function
>>>3. You may have an intermediate seq/coll of these tuple-like things.
>>>4. You may have a useless seq/coll of "output" from the 
>>>side-effecting function
>>>
>>> In the single-coll case:
>>>
>>> (map f col1) pays 1,4.
>>> (doseq [x col1] (f x)) pays 1.
>>> (run! f col1) pays 1 if coll has an inefficient IReduce, otherwise it 
>>> pays nothing.
>>> (fold f col1) is the same (using reducers r/fold protocol for vectors, 
>>> which ultimately uses IReduce)
>>>
>>> In the multi-coll case:
>>>
>>> (map f coll1 col2) pays all four. 
>>> (run! (fn [[a b]] (f a b)) (map vector col1 col2)) pays 1, 2, and 3.
>>> (doseq [[a b] (map vector col1 col2)] (f a b)) pays 1, 2, 3.
>>> (fold f col1 col2) pays 1 from what I can see? (It uses first+next to 
>>> walk over the items stepwise? There's a lot of indirection so I'm not 100% 
>>> sure what the impl is for vectors that actually gets used.)
>>>
>>> There is no way to avoid 1 in the multi-step case (or 2 if you are fully 
>>> variadic), all you can do is use the most efficient-possible intermediate 
>>> object to track the traversal. Iterators are typically cheaper than seqs, 
>>> so the ideal case would be a loop-recur over multiple iterators.
>>>
>>> In the multi-coll case there is also no way IReduce can help. IReduce is 
>>> a trade: you give up the power to see each step of iteration in order to 
>>> allow the collection to perform the overall reduction operation more 
>>> efficiently. However with multi-coll you really do need to control the 
>>> iteration so you can get all the items at an index together.
>>>
>>> The ideal for multi-collection would probably be something that 
>>> internally looks like clojure.core/sequence but doesn't accumulate the 
>>> results. (Unfortunately some of the classes necessary to do this 
>>> (MultiIterator) are private.)
>>>
>>> Fluokitten could probably do it with some tweaking to its 
>>> algo/collectio

Re: parallel sequence side-effect processor

2016-09-23 Thread Dragan Djuric
A couple of things:
1. How fold/foldmap and any other function works depends on the actual 
type. For example, if you look 
at 
https://github.com/uncomplicate/neanderthal/blob/master/src/clojure/uncomplicate/neanderthal/impl/fluokitten.clj#L396
 
you can see that there are no intermediate allocations, and everything is 
primitive.
2. Now, if you give foldmap a sequence (or a vector), it goes to the 
implementation that you pointed to. Now, the difference from map: if I 
understand well, map would produce an unnecessary resulting sequence; 
foldmap does not. Your accumulating function doesn't need to return 1 or 
sequences - why not return nil? Also, the accumulator is nil, and you can 
use any dummy function that just nils everything. It is only a matter of 
calling first/next. Do they really produce any new object instances? That 
depends on the implementation of seq, I believe, but it's the same even if 
we used loop/recur, I believe? 
3. The sequences in your printout results are the result of how Clojure 
treats varargs, or I am missing something. So, if I give it a function such 
as (fb [_ a b] (println a b)), what exactly is allocated that is not 
allocated even when using loop/recur directly with first/next?
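Point 3 can be illustrated with a small (hypothetical) pair of definitions: a 
variadic fn makes Clojure package its arguments into a fresh seq on every 
call - the very seqs that appeared in the printout - while a fixed-arity fn 
receives its arguments directly:

```clojure
;; Variadic: every call allocates a fresh ArraySeq to hold args.
(defn fb-varargs [& args]
  (count args))

;; Fixed arity, as suggested above: no per-call argument seq.
(defn fb-fixed [_ a b]
  [a b])
```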

On Saturday, September 24, 2016 at 1:25:14 AM UTC+2, tbc++ wrote:
>
> Yeah, I have to call you out on this one Dragan. I ran the following code: 
>
> (ns fold-test
> (:require [uncomplicate.fluokitten.core :refer [foldmap]]))
>
> (defn fa [& args]
> (println "fa " args)
> 1)
>
> (defn fb [& args]
> (println "fb " args)
> 1)
>
> (defn test-fold []
> (foldmap fa nil fb [1 2 3] [4 5 6]))
>
> (test-fold)
>
>
> This code produced: 
>
> fb  (1 4)
> fa  (nil 1)
> fb  (2 5)
> fa  (1 1)
> fb  (3 6)
> fa  (1 1)
>
> So I put a breakpoint in `fb` and ran it again. The stacktrace says it 
> ends up in algo/collection-foldmap which we can see here: 
> https://github.com/uncomplicate/fluokitten/blob/master/src/uncomplicate/fluokitten/algo.clj#L415-L443
>
> That function is creating seqs out of all its arguments! So it really is 
> not better than clojure.core/map as far as allocation is concerned. 
>
> Timothy
>
> On Fri, Sep 23, 2016 at 5:15 PM, Francis Avila  > wrote:
>
>> There are a few intermediate collections here:
>>
>>
>>1. The source coll may produce a seq object. How costly this is 
>>depends on the type of coll and the quality of its iterator/ireduce/seq 
>>implementations.
>>2. You may need to collect multiple source colls into a tuple-like 
>>thing to produce a single object for the side-effecting function
>>3. You may have an intermediate seq/coll of these tuple-like things.
>>4. You may have a useless seq/coll of "output" from the 
>>side-effecting function
>>
>> In the single-coll case:
>>
>> (map f col1) pays 1,4.
>> (doseq [x col1] (f x)) pays 1.
>> (run! f col1) pays 1 if coll has an inefficient IReduce, otherwise it 
>> pays nothing.
>> (fold f col1) is the same (using reducers r/fold protocol for vectors, 
>> which ultimately uses IReduce)
>>
>> In the multi-coll case:
>>
>> (map f coll1 col2) pays all four. 
>> (run! (fn [[a b]] (f a b)) (map vector col1 col2)) pays 1, 2, and 3.
>> (doseq [[a b] (map vector col1 col2)] (f a b)) pays 1, 2, 3.
>> (fold f col1 col2) pays 1 from what I can see? (It uses first+next to 
>> walk over the items stepwise? There's a lot of indirection so I'm not 100% 
>> sure what the impl is for vectors that actually gets used.)
>>
>> There is no way to avoid 1 in the multi-step case (or 2 if you are fully 
>> variadic), all you can do is use the most efficient-possible intermediate 
>> object to track the traversal. Iterators are typically cheaper than seqs, 
>> so the ideal case would be a loop-recur over multiple iterators.
>>
>> In the multi-coll case there is also no way IReduce can help. IReduce is 
>> a trade: you give up the power to see each step of iteration in order to 
>> allow the collection to perform the overall reduction operation more 
>> efficiently. However with multi-coll you really do need to control the 
>> iteration so you can get all the items at an index together.
>>
>> The ideal for multi-collection would probably be something that 
>> internally looks like clojure.core/sequence but doesn't accumulate the 
>> results. (Unfortunately some of the classes necessary to do this 
>> (MultiIterator) are private.)
>>
>> Fluokitten could probably do it with some tweaking to its 
>> algo/collection-foldmap to use iterators where possible instead of 
>> first/next.
>>
>>
>> On Friday, September 23, 2016 at 5:23:51 PM UTC-5, Dragan Djuric wrote:
>>>
>>> fluokitten's fold is MUCH better than (map f a b) because it does NOT 
>>> create intermediate collections. just use (fold f a b) and it would fold 
>>> everything into one thing (in this case nil). If f is a function with side 
>>> effects, it will invoke them. No intermediate collection is created AND the 
>>> folding would be optimized per the type of a.
>>>
>>> On Frida

Re: parallel sequence side-effect processor

2016-09-23 Thread Francis Avila
Well, hold on, it is creating seqs for its input collections (allocation 
item 1 in my list), but it is still better than map because it is not 
making an intermediate lazy seq of output (item 4).

(I must amend my earlier message: (map f coll1 col2) only pays 1 and 4.)

On Friday, September 23, 2016 at 6:25:14 PM UTC-5, tbc++ wrote:
>
> Yeah, I have to call you out on this one Dragan. I ran the following code: 
>
> (ns fold-test
> (:require [uncomplicate.fluokitten.core :refer [foldmap]]))
>
> (defn fa [& args]
> (println "fa " args)
> 1)
>
> (defn fb [& args]
> (println "fb " args)
> 1)
>
> (defn test-fold []
> (foldmap fa nil fb [1 2 3] [4 5 6]))
>
> (test-fold)
>
>
> This code produced: 
>
> fb  (1 4)
> fa  (nil 1)
> fb  (2 5)
> fa  (1 1)
> fb  (3 6)
> fa  (1 1)
>
> So I put a breakpoint in `fb` and ran it again. The stacktrace says it 
> ends up in algo/collection-foldmap which we can see here: 
> https://github.com/uncomplicate/fluokitten/blob/master/src/uncomplicate/fluokitten/algo.clj#L415-L443
>
> That function is creating seqs out of all its arguments! So it really is 
> not better than clojure.core/map as far as allocation is concerned. 
>
> Timothy
>
> On Fri, Sep 23, 2016 at 5:15 PM, Francis Avila  > wrote:
>
>> There are a few intermediate collections here:
>>
>>
>>1. The source coll may produce a seq object. How costly this is 
>>depends on the type of coll and the quality of its iterator/ireduce/seq 
>>implementations.
>>2. You may need to collect multiple source colls into a tuple-like 
>>thing to produce a single object for the side-effecting function
>>3. You may have an intermediate seq/coll of these tuple-like things.
>>4. You may have a useless seq/coll of "output" from the 
>>side-effecting function
>>
>> In the single-coll case:
>>
>> (map f col1) pays 1,4.
>> (doseq [x col1] (f x)) pays 1.
>> (run! f col1) pays 1 if coll has an inefficient IReduce, otherwise it 
>> pays nothing.
>> (fold f col1) is the same (using reducers r/fold protocol for vectors, 
>> which ultimately uses IReduce)
>>
>> In the multi-coll case:
>>
>> (map f coll1 col2) pays all four. 
>> (run! (fn [[a b]] (f a b)) (map vector col1 col2)) pays 1, 2, and 3.
>> (doseq [[a b] (map vector col1 col2)] (f a b)) pays 1, 2, 3.
>> (fold f col1 col2) pays 1 from what I can see? (It uses first+next to 
>> walk over the items stepwise? There's a lot of indirection so I'm not 100% 
>> sure what the impl is for vectors that actually gets used.)
>>
>> There is no way to avoid 1 in the multi-step case (or 2 if you are fully 
>> variadic), all you can do is use the most efficient-possible intermediate 
>> object to track the traversal. Iterators are typically cheaper than seqs, 
>> so the ideal case would be a loop-recur over multiple iterators.
>>
>> In the multi-coll case there is also no way IReduce can help. IReduce is 
>> a trade: you give up the power to see each step of iteration in order to 
>> allow the collection to perform the overall reduction operation more 
>> efficiently. However with multi-coll you really do need to control the 
>> iteration so you can get all the items at an index together.
>>
>> The ideal for multi-collection would probably be something that 
>> internally looks like clojure.core/sequence but doesn't accumulate the 
>> results. (Unfortunately some of the classes necessary to do this 
>> (MultiIterator) are private.)
>>
>> Fluokitten could probably do it with some tweaking to its 
>> algo/collection-foldmap to use iterators where possible instead of 
>> first/next.
>>
>>
>> On Friday, September 23, 2016 at 5:23:51 PM UTC-5, Dragan Djuric wrote:
>>>
>>> fluokitten's fold is MUCH better than (map f a b) because it does NOT 
>>> create intermediate collections. just use (fold f a b) and it would fold 
>>> everything into one thing (in this case nil). If f is a function with side 
>>> effects, it will invoke them. No intermediate collection is created AND the 
>>> folding would be optimized per the type of a.
>>>
>>> On Friday, September 23, 2016 at 10:56:00 PM UTC+2, tbc++ wrote:

 How is fluokitten's fold any better than using seqs like (map f a b) 
 would? Both create intermediate collections.

 On Fri, Sep 23, 2016 at 11:40 AM, Dragan Djuric  
 wrote:

> If you do not insist on vanilla clojure, but can use a library, fold 
> from fluokitten might enable you to do this. It is similar to reduce, but 
> accepts multiple arguments. Give it a vararg folding function that prints 
> what you need and ignores the first parameter, and you'd get what you 
> asked 
> for.
>
>
> On Friday, September 23, 2016 at 7:15:42 PM UTC+2, Mars0i wrote:
>>
>> On Friday, September 23, 2016 at 11:11:07 AM UTC-5, Alan Thompson 
>> wrote:
>>>
>>> ​Huh.  I was also unaware of the run! function.​
>>>
>>> I suppose you could always write it like this:
>>>
>>> (def x (vec (range 3)))
>>> (def

Re: parallel sequence side-effect processor

2016-09-23 Thread Timothy Baldridge
Yeah, I have to call you out on this one Dragan. I ran the following code:

(ns fold-test
  (:require [uncomplicate.fluokitten.core :refer [foldmap]]))

(defn fa [& args]
  (println "fa " args)
  1)

(defn fb [& args]
  (println "fb " args)
  1)

(defn test-fold []
  (foldmap fa nil fb [1 2 3] [4 5 6]))

(test-fold)


This code produced:

fb  (1 4)
fa  (nil 1)
fb  (2 5)
fa  (1 1)
fb  (3 6)
fa  (1 1)

So I put a breakpoint in `fb` and ran it again. The stacktrace says it ends
up in algo/collection-foldmap which we can see here:
https://github.com/uncomplicate/fluokitten/blob/master/src/uncomplicate/fluokitten/algo.clj#L415-L443

That function is creating seqs out of all its arguments! So it really is
not better than clojure.core/map as far as allocation is concerned.

Timothy

On Fri, Sep 23, 2016 at 5:15 PM, Francis Avila  wrote:

> There are a few intermediate collections here:
>
>
>1. The source coll may produce a seq object. How costly this is
>depends on the type of coll and the quality of its iterator/ireduce/seq
>implementations.
>2. You may need to collect multiple source colls into a tuple-like
>thing to produce a single object for the side-effecting function
>3. You may have an intermediate seq/coll of these tuple-like things.
>4. You may have a useless seq/coll of "output" from the side-effecting
>function
>
> In the single-coll case:
>
> (map f col1) pays 1,4.
> (doseq [x col1] (f x)) pays 1.
> (run! f col1) pays 1 if coll has an inefficient IReduce, otherwise it pays
> nothing.
> (fold f col1) is the same (using reducers r/fold protocol for vectors,
> which ultimately uses IReduce)
>
> In the multi-coll case:
>
> (map f coll1 col2) pays all four.
> (run! (fn [[a b]] (f a b)) (map vector col1 col2)) pays 1, 2, and 3.
> (doseq [[a b] (map vector col1 col2)] (f a b)) pays 1, 2, 3.
> (fold f col1 col2) pays 1 from what I can see? (It uses first+next to walk
> over the items stepwise? There's a lot of indirection so I'm not 100% sure
> what the impl is for vectors that actually gets used.)
>
> There is no way to avoid 1 in the multi-step case (or 2 if you are fully
> variadic), all you can do is use the most efficient-possible intermediate
> object to track the traversal. Iterators are typically cheaper than seqs,
> so the ideal case would be a loop-recur over multiple iterators.
>
> In the multi-coll case there is also no way IReduce can help. IReduce is a
> trade: you give up the power to see each step of iteration in order to
> allow the collection to perform the overall reduction operation more
> efficiently. However with multi-coll you really do need to control the
> iteration so you can get all the items at an index together.
>
> The ideal for multi-collection would probably be something that internally
> looks like clojure.core/sequence but doesn't accumulate the results.
> (Unfortunately some of the classes necessary to do this (MultiIterator) are
> private.)
>
> Fluokitten could probably do it with some tweaking to its
> algo/collection-foldmap to use iterators where possible instead of
> first/next.
>
>
> On Friday, September 23, 2016 at 5:23:51 PM UTC-5, Dragan Djuric wrote:
>>
>> fluokitten's fold is MUCH better than (map f a b) because it does NOT
>> create intermediate collections. just use (fold f a b) and it would fold
>> everything into one thing (in this case nil). If f is a function with side
>> effects, it will invoke them. No intermediate collection is created AND the
>> folding would be optimized per the type of a.
>>
>> On Friday, September 23, 2016 at 10:56:00 PM UTC+2, tbc++ wrote:
>>>
>>> How is fluokitten's fold any better than using seqs like (map f a b)
>>> would? Both create intermediate collections.
>>>
>>> On Fri, Sep 23, 2016 at 11:40 AM, Dragan Djuric 
>>> wrote:
>>>
 If you do not insist on vanilla clojure, but can use a library, fold
 from fluokitten might enable you to do this. It is similar to reduce, but
 accepts multiple arguments. Give it a vararg folding function that prints
 what you need and ignores the first parameter, and you'd get what you asked
 for.


 On Friday, September 23, 2016 at 7:15:42 PM UTC+2, Mars0i wrote:
>
> On Friday, September 23, 2016 at 11:11:07 AM UTC-5, Alan Thompson
> wrote:
>>
>> ​Huh.  I was also unaware of the run! function.​
>>
>> I suppose you could always write it like this:
>>
>> (def x (vec (range 3)))
>> (def y (vec (reverse x)))
>>
>> (run!
>>   (fn [[x y]] (println x y))
>>
>>   (map vector x y))
>>
>>
>>  > lein run
>> 0 2
>> 1 1
>> 2 0
>>
>>
> Yes.  But that's got the same problem.  Doesn't matter with a toy
> example, but the (map vector ...) could be undesirable with large
> collections in performance-critical code.
>
> although the plain old for loop with dotimes looks simpler:
>>
>> (dotimes [i (count x) ]
>>   (println (x i) (y i)))
>>
>>

Re: parallel sequence side-effect processor

2016-09-23 Thread Francis Avila
There are a few intermediate collections here:


   1. The source coll may produce a seq object. How costly this is depends 
   on the type of coll and the quality of its iterator/ireduce/seq 
   implementations.
   2. You may need to collect multiple source colls into a tuple-like thing 
   to produce a single object for the side-effecting function
   3. You may have an intermediate seq/coll of these tuple-like things.
   4. You may have a useless seq/coll of "output" from the side-effecting 
   function

In the single-coll case:

(map f col1) pays 1,4.
(doseq [x col1] (f x)) pays 1.
(run! f col1) pays 1 if coll has an inefficient IReduce, otherwise it pays 
nothing.
(fold f col1) is the same (using reducers r/fold protocol for vectors, 
which ultimately uses IReduce)

In the multi-coll case:

(map f coll1 col2) pays all four. 
(run! (fn [[a b]] (f a b)) (map vector col1 col2)) pays 1, 2, and 3.
(doseq [[a b] (map vector col1 col2)] (f a b)) pays 1, 2, 3.
(fold f col1 col2) pays 1 from what I can see? (It uses first+next to walk 
over the items stepwise? There's a lot of indirection so I'm not 100% sure 
what the impl is for vectors that actually gets used.)

There is no way to avoid 1 in the multi-coll case (or 2 if you are fully 
variadic); all you can do is use the most efficient possible intermediate 
object to track the traversal. Iterators are typically cheaper than seqs, 
so the ideal case would be a loop-recur over multiple iterators.

In the multi-coll case there is also no way IReduce can help. IReduce is a 
trade: you give up the power to see each step of iteration in order to 
allow the collection to perform the overall reduction operation more 
efficiently. However with multi-coll you really do need to control the 
iteration so you can get all the items at an index together.

The ideal for multi-collection would probably be something that internally 
looks like clojure.core/sequence but doesn't accumulate the results. 
(Unfortunately some of the classes necessary to do this (MultiIterator) are 
private.)

Fluokitten could probably do it with some tweaking to its 
algo/collection-foldmap to use iterators where possible instead of 
first/next.
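The loop-recur-over-multiple-iterators idea could look something like this (my 
sketch, not anything from clojure.core; the name run2! is made up):

```clojure
;; Walk two collections in lockstep via their java.util.Iterators and
;; call a side-effecting f on each pair: no tuples, no intermediate
;; seqs, only the two iterator objects themselves.
(defn run2! [f ^Iterable c1 ^Iterable c2]
  (let [it1 (.iterator c1)
        it2 (.iterator c2)]
    (loop []
      (when (and (.hasNext it1) (.hasNext it2))
        (f (.next it1) (.next it2))
        (recur)))))
```

For example, (run2! println [0 1 2] [2 1 0]) prints each pair in order, 
stopping at the end of the shorter collection.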


On Friday, September 23, 2016 at 5:23:51 PM UTC-5, Dragan Djuric wrote:
>
> fluokitten's fold is MUCH better than (map f a b) because it does NOT 
> create intermediate collections. just use (fold f a b) and it would fold 
> everything into one thing (in this case nil). If f is a function with side 
> effects, it will invoke them. No intermediate collection is created AND the 
> folding would be optimized per the type of a.
>
> On Friday, September 23, 2016 at 10:56:00 PM UTC+2, tbc++ wrote:
>>
>> How is fluokitten's fold any better than using seqs like (map f a b) 
>> would? Both create intermediate collections.
>>
>> On Fri, Sep 23, 2016 at 11:40 AM, Dragan Djuric  
>> wrote:
>>
>>> If you do not insist on vanilla clojure, but can use a library, fold 
>>> from fluokitten might enable you to do this. It is similar to reduce, but 
>>> accepts multiple arguments. Give it a vararg folding function that prints 
>>> what you need and ignores the first parameter, and you'd get what you asked 
>>> for.
>>>
>>>
>>> On Friday, September 23, 2016 at 7:15:42 PM UTC+2, Mars0i wrote:

 On Friday, September 23, 2016 at 11:11:07 AM UTC-5, Alan Thompson wrote:
>
> ​Huh.  I was also unaware of the run! function.​
>
> I suppose you could always write it like this:
>
> (def x (vec (range 3)))
> (def y (vec (reverse x)))
>
> (run!
>   (fn [[x y]] (println x y))
>
>   (map vector x y))
>
>
>  > lein run
> 0 2
> 1 1
> 2 0
>
>
 Yes.  But that's got the same problem.  Doesn't matter with a toy 
 example, but the (map vector ...) could be undesirable with large 
 collections in performance-critical code.

 although the plain old for loop with dotimes looks simpler:
>
> (dotimes [i (count x) ]
>   (println (x i) (y i)))
>
>
> maybe that is the best answer? It is hard to beat the flexibility of a 
> a loop and an explicit index.
>

 I agree that this is clearer, but it kind of bothers me to index 
 through a vector sequentially in Clojure.  We need indexing In Clojure 
 because sometimes you need to access a vector more arbitrarily.  If you're 
 just walking the vector in order, we have better methods--as long as we 
 don't want to walk multiple vectors in the same order for side effects.

 However, the real drawback of the dotimes method is that it's not 
 efficient for the general case; it could be slow on lists, lazy sequences, 
 etc. (again, on non-toy examples).  Many of the most convenient Clojure 
 functions return lazy sequences.  Even the non-lazy sequences returned by 
 transducers aren't efficiently indexable, afaik.  Of course you can always 
 throw any sequence into 'vec' and g

Re: parallel sequence side-effect processor

2016-09-23 Thread Dragan Djuric
fluokitten does not have anything to do with Haskell, or how Haskell does 
those things. It is completely tailored to Clojure. Regarding Mars0i's 
task, now I see that, actually, foldmap is what should be used, and not 
fold. So it's something like (foldmap dummy-nil-accumulating-fn nil 
side-effect-fn a b).

It does not create intermediate allocations.

If you use neanderthal vectors/matrices for the data, even better! They 
support fluokitten and keep everything primitive, so even objects are kept 
in check.

I think the best way is to try it in a simple project and see how it 
works. Maybe I am missing something, but it seems to me this is more or 
less what Mars0i is looking for.
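Spelled out, the suggested call might look like this (assuming fluokitten is on 
the classpath; the helper names here are placeholders, not library functions):

```clojure
(require '[uncomplicate.fluokitten.core :refer [foldmap]])

;; Dummy accumulating fn: ignore the mapped result, keep the nil accumulator.
(defn nil-acc [acc _] acc)

;; The side-effecting "map" step, one argument per input collection.
(defn print-pair [a b] (println a b))

;; Invokes print-pair on each pair of items; the fold itself stays nil.
(foldmap nil-acc nil print-pair [1 2 3] [4 5 6])
```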

On Saturday, September 24, 2016 at 12:52:48 AM UTC+2, tbc++ wrote:
>
> How is this done, I searched the source, but there was so much indirection 
> I can't find it. I'm familiar with how Haskell does rfold vs lfold, but one 
> of those does create allocations, they may be in the form of closures, but 
> they are allocations none the less. So does fold use iterators or something?
>
> Timothy
>
> On Fri, Sep 23, 2016 at 4:23 PM, Dragan Djuric  > wrote:
>
>> fluokitten's fold is MUCH better than (map f a b) because it does NOT 
>> create intermediate collections. just use (fold f a b) and it would fold 
>> everything into one thing (in this case nil). If f is a function with side 
>> effects, it will invoke them. No intermediate collection is created AND the 
>> folding would be optimized per the type of a.
>>
>> On Friday, September 23, 2016 at 10:56:00 PM UTC+2, tbc++ wrote:
>>>
>>> How is fluokitten's fold any better than using seqs like (map f a b) 
>>> would? Both create intermediate collections.
>>>
>>> On Fri, Sep 23, 2016 at 11:40 AM, Dragan Djuric  
>>> wrote:
>>>
 If you do not insist on vanilla clojure, but can use a library, fold 
 from fluokitten might enable you to do this. It is similar to reduce, but 
 accepts multiple arguments. Give it a vararg folding function that prints 
 what you need and ignores the first parameter, and you'd get what you 
 asked 
 for.


 On Friday, September 23, 2016 at 7:15:42 PM UTC+2, Mars0i wrote:
>
> On Friday, September 23, 2016 at 11:11:07 AM UTC-5, Alan Thompson 
> wrote:
>>
>> Huh.  I was also unaware of the run! function.
>>
>> I suppose you could always write it like this:
>>
>> (def x (vec (range 3)))
>> (def y (vec (reverse x)))
>>
>> (run!
>>   (fn [[x y]] (println x y))
>>
>>   (map vector x y))
>>
>>
>>  > lein run
>> 0 2
>> 1 1
>> 2 0
>>
>>
> Yes.  But that's got the same problem.  Doesn't matter with a toy 
> example, but the (map vector ...) could be undesirable with large 
> collections in performance-critical code.
>
> although the plain old for loop with dotimes looks simpler:
>>
>> (dotimes [i (count x) ]
>>   (println (x i) (y i)))
>>
>>
>> maybe that is the best answer? It is hard to beat the flexibility of 
>> a loop and an explicit index.
>>
>
> I agree that this is clearer, but it kind of bothers me to index 
> through a vector sequentially in Clojure.  We need indexing in Clojure 
> because sometimes you need to access a vector more arbitrarily.  If 
> you're 
> just walking the vector in order, we have better methods--as long as we 
> don't want to walk multiple vectors in the same order for side effects.
>
> However, the real drawback of the dotimes method is that it's not 
> efficient for the general case; it could be slow on lists, lazy 
> sequences, 
> etc. (again, on non-toy examples).  Many of the most convenient Clojure 
> functions return lazy sequences.  Even the non-lazy sequences returned by 
> transducers aren't efficiently indexable, afaik.  Of course you can 
> always 
> throw any sequence into 'vec' and get out a vector, but that's an 
> unnecessary transformation if you just want to iterate through the 
> sequences element by element.
>
> If I'm writing a function that will plot points or that will write 
> data to a file, it shouldn't be a requirement for the sake of efficiency 
> that the data come in the form of vectors.  I should be able to pass in 
> the 
> data in whatever form is easiest.  Right now, if I wanted efficiency for 
> walking through sequences in the same order, without creating unnecessary 
> data structures, I'd have to write the function using loop/recur.  On the 
> other hand, if I wanted the cross product of the sequences, I'd use doseq 
> and be done a lot quicker with clearer code.
>

Re: parallel sequence side-effect processor

2016-09-23 Thread Timothy Baldridge
How is this done, I searched the source, but there was so much indirection
I can't find it. I'm familiar with how Haskell does foldr vs foldl, but one
of those does create allocations, they may be in the form of closures, but
they are allocations none the less. So does fold use iterators or something?

Timothy

On Fri, Sep 23, 2016 at 4:23 PM, Dragan Djuric  wrote:

> fluokitten's fold is MUCH better than (map f a b) because it does NOT
> create intermediate collections. just use (fold f a b) and it would fold
> everything into one thing (in this case nil). If f is a function with side
> effects, it will invoke them. No intermediate collection is created AND the
> folding would be optimized per the type of a.
>
> On Friday, September 23, 2016 at 10:56:00 PM UTC+2, tbc++ wrote:
>>
>> How is fluokitten's fold any better than using seqs like (map f a b)
>> would? Both create intermediate collections.
>>
>> On Fri, Sep 23, 2016 at 11:40 AM, Dragan Djuric 
>> wrote:
>>
>>> If you do not insist on vanilla clojure, but can use a library, fold
>>> from fluokitten might enable you to do this. It is similar to reduce, but
>>> accepts multiple arguments. Give it a vararg folding function that prints
>>> what you need and ignores the first parameter, and you'd get what you asked
>>> for.
>>>
>>>
>>> On Friday, September 23, 2016 at 7:15:42 PM UTC+2, Mars0i wrote:

 On Friday, September 23, 2016 at 11:11:07 AM UTC-5, Alan Thompson wrote:
>
> Huh.  I was also unaware of the run! function.
>
> I suppose you could always write it like this:
>
> (def x (vec (range 3)))
> (def y (vec (reverse x)))
>
> (run!
>   (fn [[x y]] (println x y))
>
>   (map vector x y))
>
>
>  > lein run
> 0 2
> 1 1
> 2 0
>
>
 Yes.  But that's got the same problem.  Doesn't matter with a toy
 example, but the (map vector ...) could be undesirable with large
 collections in performance-critical code.

 although the plain old for loop with dotimes looks simpler:
>
> (dotimes [i (count x) ]
>   (println (x i) (y i)))
>
>
> maybe that is the best answer? It is hard to beat the flexibility of a
> loop and an explicit index.
>

 I agree that this is clearer, but it kind of bothers me to index
through a vector sequentially in Clojure.  We need indexing in Clojure
 because sometimes you need to access a vector more arbitrarily.  If you're
 just walking the vector in order, we have better methods--as long as we
 don't want to walk multiple vectors in the same order for side effects.

 However, the real drawback of the dotimes method is that it's not
 efficient for the general case; it could be slow on lists, lazy sequences,
 etc. (again, on non-toy examples).  Many of the most convenient Clojure
 functions return lazy sequences.  Even the non-lazy sequences returned by
 transducers aren't efficiently indexable, afaik.  Of course you can always
 throw any sequence into 'vec' and get out a vector, but that's an
 unnecessary transformation if you just want to iterate through the
 sequences element by element.

 If I'm writing a function that will plot points or that will write data
 to a file, it shouldn't be a requirement for the sake of efficiency that
 the data come in the form of vectors.  I should be able to pass in the data
 in whatever form is easiest.  Right now, if I wanted efficiency for walking
 through sequences in the same order, without creating unnecessary data
 structures, I'd have to write the function using loop/recur.  On the other
 hand, if I wanted the cross product of the sequences, I'd use doseq and be
 done a lot quicker with clearer code.

>>>
>>
>>
>>
>> --
>> “One of the main causes of the fall of the Roman Empire was that–lacking
>> zero–they had no way to indicate successful termination of their C
>> programs.”
>> (Robert Firth)
>>

Re: parallel sequence side-effect processor

2016-09-23 Thread Dragan Djuric
fluokitten's fold is MUCH better than (map f a b) because it does NOT 
create intermediate collections. Just use (fold f a b) and it would fold 
everything into one thing (in this case nil). If f is a function with side 
effects, it will invoke them. No intermediate collection is created AND the 
folding would be optimized per the type of a.

On Friday, September 23, 2016 at 10:56:00 PM UTC+2, tbc++ wrote:
>
> How is fluokitten's fold any better than using seqs like (map f a b) 
> would? Both create intermediate collections.
>
> On Fri, Sep 23, 2016 at 11:40 AM, Dragan Djuric  > wrote:
>
>> If you do not insist on vanilla clojure, but can use a library, fold from 
>> fluokitten might enable you to do this. It is similar to reduce, but 
>> accepts multiple arguments. Give it a vararg folding function that prints 
>> what you need and ignores the first parameter, and you'd get what you asked 
>> for.
>>
>>
>> On Friday, September 23, 2016 at 7:15:42 PM UTC+2, Mars0i wrote:
>>>
>>> On Friday, September 23, 2016 at 11:11:07 AM UTC-5, Alan Thompson wrote:

Huh.  I was also unaware of the run! function.

 I suppose you could always write it like this:

 (def x (vec (range 3)))
 (def y (vec (reverse x)))

 (run!
   (fn [[x y]] (println x y))

   (map vector x y))


  > lein run
 0 2
 1 1
 2 0


>>> Yes.  But that's got the same problem.  Doesn't matter with a toy 
>>> example, but the (map vector ...) could be undesirable with large 
>>> collections in performance-critical code.
>>>
>>> although the plain old for loop with dotimes looks simpler:

 (dotimes [i (count x) ]
   (println (x i) (y i)))


 maybe that is the best answer? It is hard to beat the flexibility of a 
loop and an explicit index.

>>>
>>> I agree that this is clearer, but it kind of bothers me to index through 
>>> a vector sequentially in Clojure.  We need indexing in Clojure because 
>>> sometimes you need to access a vector more arbitrarily.  If you're just 
>>> walking the vector in order, we have better methods--as long as we don't 
>>> want to walk multiple vectors in the same order for side effects.
>>>
>>> However, the real drawback of the dotimes method is that it's not 
>>> efficient for the general case; it could be slow on lists, lazy sequences, 
>>> etc. (again, on non-toy examples).  Many of the most convenient Clojure 
>>> functions return lazy sequences.  Even the non-lazy sequences returned by 
>>> transducers aren't efficiently indexable, afaik.  Of course you can always 
>>> throw any sequence into 'vec' and get out a vector, but that's an 
>>> unnecessary transformation if you just want to iterate through the 
>>> sequences element by element.
>>>
>>> If I'm writing a function that will plot points or that will write data 
>>> to a file, it shouldn't be a requirement for the sake of efficiency that 
>>> the data come in the form of vectors.  I should be able to pass in the data 
>>> in whatever form is easiest.  Right now, if I wanted efficiency for walking 
>>> through sequences in the same order, without creating unnecessary data 
>>> structures, I'd have to write the function using loop/recur.  On the other 
>>> hand, if I wanted the cross product of the sequences, I'd use doseq and be 
>>> done a lot quicker with clearer code.
>>>
>>
>
>
>
> -- 
> “One of the main causes of the fall of the Roman Empire was that–lacking 
> zero–they had no way to indicate successful termination of their C 
> programs.”
> (Robert Firth) 
>


Re: parallel sequence side-effect processor

2016-09-23 Thread Timothy Baldridge
But that would print 1 to 32 items depending on a ton of things. No, the
correct way to write code like that is

(doseq [s coll]
  (println s))

If you need to iterate over multiple items then use (map f a b).

Also I would suggest benchmarking things, sure (map f coll) is slower than
a transducer, but benchmark your app code and I think you'll be surprised
how fast map is for normal use cases.
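As an illustration of that advice, the plain-Clojure lockstep walk can be written with map plus dorun, which forces the lazy seq for its side effects without retaining it:

```clojure
;; Walk two seqs in lockstep for side effects with plain clojure.core:
;; map builds the pairing lazily, dorun realizes it and discards it.
(dorun (map (fn [a b] (println a b))
            (range 3)
            (reverse (range 3))))
;; prints:
;; 0 2
;; 1 1
;; 2 0
;; and returns nil
```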

On Fri, Sep 23, 2016 at 4:04 PM, William la Forge 
wrote:

> I've run into this when I wanted to print each item in a seq. Since
> println returns nil, I just did (first (keep println coll)). --Needed the
> first here because keep returns a lazy seq.
>
>
> On Friday, September 23, 2016 at 12:02:09 AM UTC-4, Mars0i wrote:
>>
>> This is almost the same as an issue I raised in this group over a year
>> and a half ago, here.
>> I suggested that Clojure should include a function with map's syntax but that
>> was executed only for side-effects, without constructing sequences.  No one
>> else was interested--no problem.   It's still bugging me.
>>
>> It's not map syntax that I care about at this point.  What's bugging me
>> is that there's no standard, built-in way to process multiple sequences for
>> side effects without (a) constructing unnecessary sequences or (b) rolling
>> my own function with loop/recur or something else.
>>
>> If I want to process multiple sequences for side-effects in the way
>> that 'for' does, Clojure gives me 'doseq'.  Beautiful.  I can operate on
>> the cross product of the sequences, or filter them in various ways.
>>
>> If I want to process multiple sequences by applying an n-ary function to
>> the first element of each of n sequences, then to the second element of
>> each sequence, and so on, I can use 'map' or 'mapv', but that means
>> constructing unnecessary collections.
>>
>> Or I can splice my n sequences together, and make a single sequence of
>> n-tuples, and use doseq to process it.  (There's an example like this on
>> the doseq doc page.)  Or I can process such a sequence with 'reduce'.  More
>> unnecessary collections, though.
>>
>> Or I can use 'dotimes', and index into each of the collections, which is
>> OK if they're vectors, but ... ugh. why?
>>
>> Or I can construct my own function using first and rest or first and next
>> on each of my sequences, via loop/recur, for example.   But that seems odd
>> to me.
>>
>> Isn't this a common use case?  Is processing multiple sequences for
>> side-effects with corresponding elements in each application so unusual?
>> (Am I the only one?)  Isn't it odd that we have doseq and map but nothing
>> that processes multiple sequences for side-effects, in sequence, rather
>> than as a cross-product?
>>
>> (I admit that in my current use case, the sequences are small, so
>> creating a sequence of n-tuples would have only a trivial cost.  It just
>> bugs me, though. :-)
>>
>> I'd still be OK with something that had a map-like syntax, but my current
>> inclination is that it would be better for such a function to have
>> doseq-style syntax.  The name might be derived from "doseq"--maybe
>> "doseq*".
>>
>> (I'd be willing to suggest this in a JIRA ticket, but that doesn't seem
>> reasonable unless there's a call for something like this from more than one
>> person.)
>>
>



-- 
“One of the main causes of the fall of the Roman Empire was that–lacking
zero–they had no way to indicate successful termination of their C
programs.”
(Robert Firth)

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: parallel sequence side-effect processor

2016-09-23 Thread William la Forge
I've run into this when I wanted to print each item in a seq. Since println 
returns nil, I just did (first (keep println coll)). --Needed the first 
here because keep returns a lazy seq.
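A side note (mine, not from the message): since Clojure 1.7 the single-seq case is covered directly by run!, which reduces a side-effecting fn over the collection and returns nil:

```clojure
;; run! exists exactly for "apply f to each element for side effects":
(run! println [1 2 3])
;; prints:
;; 1
;; 2
;; 3
;; and returns nil
```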

On Friday, September 23, 2016 at 12:02:09 AM UTC-4, Mars0i wrote:
>
> This is almost the same as an issue I raised in this group over a year and 
> a half ago, here.
> I suggested that Clojure should include a function with map's syntax but that 
> was executed only for side-effects, without constructing sequences.  No one 
> else was interested--no problem.   It's still bugging me.
>
> It's not map syntax that I care about at this point.  What's bugging me is 
> that there's no standard, built-in way to process multiple sequences for 
> side effects without (a) constructing unnecessary sequences or (b) rolling 
> my own function with loop/recur or something else.
>
> If I want to process multiple sequences for side-effects in the way 
> that 'for' does, Clojure gives me 'doseq'.  Beautiful.  I can operate on 
> the cross product of the sequences, or filter them in various ways.
>
> If I want to process multiple sequences by applying an n-ary function to 
> the first element of each of n sequences, then to the second element of 
> each sequence, and so on, I can use 'map' or 'mapv', but that means 
> constructing unnecessary collections.
>
> Or I can splice my n sequences together, and make a single sequence of 
> n-tuples, and use doseq to process it.  (There's an example like this on 
> the doseq doc page.)  Or I can process such a sequence with 'reduce'.  More 
> unnecessary collections, though.
>
> Or I can use 'dotimes', and index into each of the collections, which is 
> OK if they're vectors, but ... ugh. why?
>
> Or I can construct my own function using first and rest or first and next 
> on each of my sequences, via loop/recur, for example.   But that seems odd 
> to me.  
>
> Isn't this a common use case?  Is processing multiple sequences for 
> side-effects with corresponding elements in each application so unusual?  
> (Am I the only one?)  Isn't it odd that we have doseq and map but nothing 
> that processes multiple sequences for side-effects, in sequence, rather 
> than as a cross-product?  
>
> (I admit that in my current use case, the sequences are small, so creating 
> a sequence of n-tuples would have only a trivial cost.  It just bugs me, 
> though. :-)
>
> I'd still be OK with something that had a map-like syntax, but my current 
> inclination is that it would be better for such a function to have 
> doseq-style syntax.  The name might be derived from "doseq"--maybe 
> "doseq*". 
>
> (I'd be willing to suggest this in a JIRA ticket, but that doesn't seem 
> reasonable unless there's a call for something like this from more than one 
> person.)
>



Re: parallel sequence side-effect processor

2016-09-23 Thread Timothy Baldridge
How is fluokitten's fold any better than using seqs like (map f a b) would?
Both create intermediate collections.

On Fri, Sep 23, 2016 at 11:40 AM, Dragan Djuric  wrote:

> If you do not insist on vanilla clojure, but can use a library, fold from
> fluokitten might enable you to do this. It is similar to reduce, but
> accepts multiple arguments. Give it a vararg folding function that prints
> what you need and ignores the first parameter, and you'd get what you asked
> for.
>
>
> On Friday, September 23, 2016 at 7:15:42 PM UTC+2, Mars0i wrote:
>>
>> On Friday, September 23, 2016 at 11:11:07 AM UTC-5, Alan Thompson wrote:
>>>
>>> Huh.  I was also unaware of the run! function.
>>>
>>> I suppose you could always write it like this:
>>>
>>> (def x (vec (range 3)))
>>> (def y (vec (reverse x)))
>>>
>>> (run!
>>>   (fn [[x y]] (println x y))
>>>
>>>   (map vector x y))
>>>
>>>
>>>  > lein run
>>> 0 2
>>> 1 1
>>> 2 0
>>>
>>>
>> Yes.  But that's got the same problem.  Doesn't matter with a toy
>> example, but the (map vector ...) could be undesirable with large
>> collections in performance-critical code.
>>
>> although the plain old for loop with dotimes looks simpler:
>>>
>>> (dotimes [i (count x) ]
>>>   (println (x i) (y i)))
>>>
>>>
>>> maybe that is the best answer? It is hard to beat the flexibility of a
>>> loop and an explicit index.
>>>
>>
>> I agree that this is clearer, but it kind of bothers me to index through
>> a vector sequentially in Clojure.  We need indexing in Clojure because
>> sometimes you need to access a vector more arbitrarily.  If you're just
>> walking the vector in order, we have better methods--as long as we don't
>> want to walk multiple vectors in the same order for side effects.
>>
>> However, the real drawback of the dotimes method is that it's not
>> efficient for the general case; it could be slow on lists, lazy sequences,
>> etc. (again, on non-toy examples).  Many of the most convenient Clojure
>> functions return lazy sequences.  Even the non-lazy sequences returned by
>> transducers aren't efficiently indexable, afaik.  Of course you can always
>> throw any sequence into 'vec' and get out a vector, but that's an
>> unnecessary transformation if you just want to iterate through the
>> sequences element by element.
>>
>> If I'm writing a function that will plot points or that will write data
>> to a file, it shouldn't be a requirement for the sake of efficiency that
>> the data come in the form of vectors.  I should be able to pass in the data
>> in whatever form is easiest.  Right now, if I wanted efficiency for walking
>> through sequences in the same order, without creating unnecessary data
>> structures, I'd have to write the function using loop/recur.  On the other
>> hand, if I wanted the cross product of the sequences, I'd use doseq and be
>> done a lot quicker with clearer code.
>>
>



-- 
“One of the main causes of the fall of the Roman Empire was that–lacking
zero–they had no way to indicate successful termination of their C
programs.”
(Robert Firth)



Re: parallel sequence side-effect processor

2016-09-23 Thread Dragan Djuric
If you do not insist on vanilla clojure, but can use a library, fold from 
fluokitten might enable you to do this. It is similar to reduce, but 
accepts multiple arguments. Give it a vararg folding function that prints 
what you need and ignores the first parameter, and you'd get what you asked 
for.

On Friday, September 23, 2016 at 7:15:42 PM UTC+2, Mars0i wrote:
>
> On Friday, September 23, 2016 at 11:11:07 AM UTC-5, Alan Thompson wrote:
>>
>> Huh.  I was also unaware of the run! function.
>>
>> I suppose you could always write it like this:
>>
>> (def x (vec (range 3)))
>> (def y (vec (reverse x)))
>>
>> (run!
>>   (fn [[x y]] (println x y))
>>
>>   (map vector x y))
>>
>>
>>  > lein run
>> 0 2
>> 1 1
>> 2 0
>>
>>
> Yes.  But that's got the same problem.  Doesn't matter with a toy example, 
> but the (map vector ...) could be undesirable with large collections in 
> performance-critical code.
>
> although the plain old for loop with dotimes looks simpler:
>>
>> (dotimes [i (count x) ]
>>   (println (x i) (y i)))
>>
>>
>> maybe that is the best answer? It is hard to beat the flexibility of a
>> loop and an explicit index.
>>
>
> I agree that this is clearer, but it kind of bothers me to index through a 
> vector sequentially in Clojure.  We need indexing in Clojure because 
> sometimes you need to access a vector more arbitrarily.  If you're just 
> walking the vector in order, we have better methods--as long as we don't 
> want to walk multiple vectors in the same order for side effects.
>
> However, the real drawback of the dotimes method is that it's not 
> efficient for the general case; it could be slow on lists, lazy sequences, 
> etc. (again, on non-toy examples).  Many of the most convenient Clojure 
> functions return lazy sequences.  Even the non-lazy sequences returned by 
> transducers aren't efficiently indexable, afaik.  Of course you can always 
> throw any sequence into 'vec' and get out a vector, but that's an 
> unnecessary transformation if you just want to iterate through the 
> sequences element by element.
>
> If I'm writing a function that will plot points or that will write data to 
> a file, it shouldn't be a requirement for the sake of efficiency that the 
> data come in the form of vectors.  I should be able to pass in the data in 
> whatever form is easiest.  Right now, if I wanted efficiency for walking 
> through sequences in the same order, without creating unnecessary data 
> structures, I'd have to write the function using loop/recur.  On the other 
> hand, if I wanted the cross product of the sequences, I'd use doseq and be 
> done a lot quicker with clearer code.
>



Re: parallel sequence side-effect processor

2016-09-23 Thread Mars0i
On Friday, September 23, 2016 at 11:11:07 AM UTC-5, Alan Thompson wrote:
>
> Huh.  I was also unaware of the run! function.
>
> I suppose you could always write it like this:
>
> (def x (vec (range 3)))
> (def y (vec (reverse x)))
>
> (run!
>   (fn [[x y]] (println x y))
>
>   (map vector x y))
>
>
>  > lein run
> 0 2
> 1 1
> 2 0
>
>
Yes.  But that's got the same problem.  Doesn't matter with a toy example, 
but the (map vector ...) could be undesirable with large collections in 
performance-critical code.

although the plain old for loop with dotimes looks simpler:
>
> (dotimes [i (count x) ]
>   (println (x i) (y i)))
>
>
> maybe that is the best answer? It is hard to beat the flexibility of a 
> loop and an explicit index.
>

I agree that this is clearer, but it kind of bothers me to index through a 
vector sequentially in Clojure.  We need indexing in Clojure because 
sometimes you need to access a vector more arbitrarily.  If you're just 
walking the vector in order, we have better methods--as long as we don't 
want to walk multiple vectors in the same order for side effects.

However, the real drawback of the dotimes method is that it's not efficient 
for the general case; it could be slow on lists, lazy sequences, etc. 
(again, on non-toy examples).  Many of the most convenient Clojure 
functions return lazy sequences.  Even the non-lazy sequences returned by 
transducers aren't efficiently indexable, afaik.  Of course you can always 
throw any sequence into 'vec' and get out a vector, but that's an 
unnecessary transformation if you just want to iterate through the 
sequences element by element.

If I'm writing a function that will plot points or that will write data to 
a file, it shouldn't be a requirement for the sake of efficiency that the 
data come in the form of vectors.  I should be able to pass in the data in 
whatever form is easiest.  Right now, if I wanted efficiency for walking 
through sequences in the same order, without creating unnecessary data 
structures, I'd have to write the function using loop/recur.  On the other 
hand, if I wanted the cross product of the sequences, I'd use doseq and be 
done a lot quicker with clearer code.
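The loop/recur version mentioned above takes only a few lines; this is one possible shape (walk2! is a made-up name for illustration):

```clojure
;; Walks two seqs in lockstep, applying f for side effects only.
;; Stops at the end of the shorter input; returns nil.
(defn walk2!
  [f xs ys]
  (loop [xs (seq xs) ys (seq ys)]
    (when (and xs ys)
      (f (first xs) (first ys))
      (recur (next xs) (next ys)))))

(walk2! println (range 3) (reverse (range 3)))
;; prints:
;; 0 2
;; 1 1
;; 2 0
```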



Re: parallel sequence side-effect processor

2016-09-23 Thread Alan Thompson
Huh.  I was also unaware of the run! function.

I suppose you could always write it like this:

(def x (vec (range 3)))
(def y (vec (reverse x)))

(run!
  (fn [[x y]] (println x y))
  (map vector x y))


 > lein run
0 2
1 1
2 0


although a plain old loop with dotimes looks simpler:

(dotimes [i (count x) ]
  (println (x i) (y i)))


maybe that is the best answer? It is hard to beat the flexibility of a
loop and an explicit index.
Alan



On Fri, Sep 23, 2016 at 12:11 AM, Mars0i  wrote:

> On Friday, September 23, 2016 at 1:47:45 AM UTC-5, Francis Avila wrote:
>>
>> It's not crystal clear to me what you are after, either from this post or
>> the one you link to. I think you want a map that does not produce
>> intermediate collections and accepts multiple colls as input at a time?
>
>
> Yes.
>
>
>> Do you have some pseudocode example so we can be precise?
>
>
> What I was imagining most recently was something like this:
>
> (doseq* [x (range 3)
>   y (reverse (range 3))]
>   (println x y))
>
> which would display:
>
> 0 2
> 1 1
> 2 0
>
> as opposed to
>
> (doseq [x (range 3)
>  y (reverse (range 3))]
>   (println x y))
>
> which displays
>
> 0 2
> 0 1
> 0 0
> 1 2
> 1 1
> 1 0
> 2 2
> 2 1
> 2 0
>
> What about (run! my-side-effect-fn coll)
>>
>> This doesn't handle multiple coll at a time like the sequence function,
>> but you can tupleize coll with (map vector coll1 coll2) at some loss of
>> efficiency. Or you can roll your own.
>>
>
> Very nice--I didn't know about run! (!)  Yes, that would do it, except
> that it requires constructing unnecessary collections when you want to
> process multiple sequences.  We were able to do the same tupleize trick
> with doseq, before run! appeared.
>
> However, since we have run!, which is exactly what I wanted a year and a
> half ago except that it doesn't handle multiple collections, I would
> suggest simply enhancing run! to allow multiple sequence arguments:
>
>  (run! (fn [x y] (println x y))
>  (range 3)
>  (reverse (range 3)))
>
> which would produce:
>
> 0 2
> 1 1
> 2 0
>



Re: parallel sequence side-effect processor

2016-09-23 Thread Mars0i
On Friday, September 23, 2016 at 1:47:45 AM UTC-5, Francis Avila wrote:
>
> It's not crystal clear to me what you are after, either from this post or 
> the one you link to. I think you want a map that does not produce 
> intermediate collections and accepts multiple colls as input at a time? 


Yes.
 

> Do you have some pseudocode example so we can be precise?


What I was imagining most recently was something like this:

(doseq* [x (range 3) 
  y (reverse (range 3))]
  (println x y))

which would display:

0 2
1 1
2 0

as opposed to 
 
(doseq [x (range 3) 
 y (reverse (range 3))]
  (println x y))

which displays 

0 2
0 1
0 0
1 2
1 1
1 0
2 2
2 1
2 0

What about (run! my-side-effect-fn coll)
>
> This doesn't handle multiple coll at a time like the sequence function, 
> but you can tupleize coll with (map vector coll1 coll2) at some loss of 
> efficiency. Or you can roll your own.
>

Very nice--I didn't know about run! (!)  Yes, that would do it, except that 
it requires constructing unnecessary collections when you want to process 
multiple sequences.  We were able to do the same tupleize trick with doseq, 
before run! appeared.

However, since we have run!, which is exactly what I wanted a year and a 
half ago except that it doesn't handle multiple collections, I would 
suggest simply enhancing run! to allow multiple sequence arguments:

 (run! (fn [x y] (println x y))
 (range 3) 
 (reverse (range 3)))

which would produce:

0 2
1 1
2 0

 
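
One way to sketch that enhanced run! on top of existing core functions (run-all! is a hypothetical name; a real core implementation would presumably use reduce-style iteration rather than realizing a lazy seq only to discard it):

```clojure
;; Hypothetical multi-collection variant of run!, built from
;; dorun + map. This shows the intended semantics, not an
;; optimized implementation: map's lazy seq is still allocated.
(defn run-all!
  [proc & colls]
  (dorun (apply map proc colls)))

(run-all! (fn [x y] (println x y))
          (range 3)
          (reverse (range 3)))
;; prints:
;; 0 2
;; 1 1
;; 2 0
```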



parallel sequence side-effect processor

2016-09-22 Thread Francis Avila
It's not crystal clear to me what you are after, either from this post or the 
one you link to. I think you want a map that does not produce intermediate 
collections and accepts multiple colls as input at a time? Do you have some 
pseudocode example so we can be precise?

What about (run! my-side-effect-fn coll)

This doesn't handle multiple coll at a time like the sequence function, but you 
can tupleize coll with (map vector coll1 coll2) at some loss of efficiency. Or 
you can roll your own.



parallel sequence side-effect processor

2016-09-22 Thread Mars0i
This is almost the same as an issue I raised in this group over a year and 
a half ago, here.  I suggested that Clojure should include a function with 
map's syntax but that was executed only for side-effects, without 
constructing sequences.  No one else was interested--no problem.  It's 
still bugging me.

It's not map syntax that I care about at this point.  What's bugging me is 
that there's no standard, built-in way to process multiple sequences for 
side effects without (a) constructing unnecessary sequences or (b) rolling 
my own function with loop/recur or something else.

If I want to process multiple sequences for side-effects in the way that 
'for' does, Clojure gives me 'doseq'.  Beautiful.  I can operate on the 
cross product of the sequences, or filter them in various ways.

If I want to process multiple sequences by applying an n-ary function to 
the first element of each of n sequences, then to the second element of 
each sequence, and so on, I can use 'map' or 'mapv', but that means 
constructing unnecessary collections.

Or I can splice my n sequences together, and make a single sequence of 
n-tuples, and use doseq to process it.  (There's an example like this on 
the doseq doc page.)  Or I can process such a sequence with 'reduce'.  More 
unnecessary collections, though.
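
Concretely, the tuple-splicing versions look something like this (both allocate a pair vector per step, which is the unnecessary-collection cost in question):

```clojure
;; Splice the sequences into pairs, then destructure in doseq.
(doseq [[x y] (map vector (range 3) (reverse (range 3)))]
  (println x y))

;; Or walk the same pairs with reduce, ignoring the accumulator.
(reduce (fn [_ [x y]] (println x y))
        nil
        (map vector (range 3) (reverse (range 3))))
```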

Or I can use 'dotimes', and index into each of the collections, which is OK 
if they're vectors, but ... ugh. why?

Or I can construct my own function using first and rest or first and next 
on each of my sequences, via loop/recur, for example.   But that seems odd 
to me.  

Isn't this a common use case?  Is processing multiple sequences for 
side-effects with corresponding elements in each application so unusual?  
(Am I the only one?)  Isn't it odd that we have doseq and map but nothing 
that processes multiple sequences for side-effects, in sequence, rather 
than as a cross-product?  

(I admit that in my current use case, the sequences are small, so creating 
a sequence of n-tuples would have only a trivial cost.  It just bugs me, 
though. :-)

I'd still be OK with something that had a map-like syntax, but my current 
inclination is that it would be better for such a function to have 
doseq-style syntax.  The name might be derived from "doseq"--maybe 
"doseq*". 

(I'd be willing to suggest this in a JIRA ticket, but that doesn't seem 
reasonable unless there's a call for something like this from more than one 
person.)
