Re: transducer (+ closures) efficiency question

2015-06-24 Thread Leon Grapenthin
Correction: The result in the example should be [1 2 3] of course.

On Wednesday, June 24, 2015 at 6:34:20 PM UTC+2, Leon Grapenthin wrote:

 Using a set as filter predicate is idiomatic Clojure.

 If you don't know whether your filter coll is a set, you can write your 
 filter-contains as

 (def filter-contains (comp filter set))

 E. g. use as

 (into [] (filter-contains [1 2 3]) [1 2 3 4]) 
 ;- [4]

 I am not up to date about optimizations in the PersistentHashSet world, 
 but I'd assume for very small filter colls you can gain speed with a linear 
 scan instead of contains?.

 If you are ok with input, output and filter being a set, I'd recommend 
 looking at clojure.set/intersection for that task. It optimizes by reducing 
 the smaller set. Depending on the sizes of the collections you are dealing 
 with, the set conversion cost may be worth the speed up you can gain. 


 On Wednesday, June 24, 2015 at 12:07:06 AM UTC+2, Sam Raker wrote:

 Let's say that, as part of an xf, I want to filter out everything in a 
 sequence that's also in some other sequence. Here are some ways of doing 
 that:

 (defn filter-contains1 [edn-file] (remove (partial contains? (set (read-
 edn-file edn-file)

 (defn filter-contains2 [coll] (remove (partial contains? (set coll

 (def filter-contains3 [coll] (let [coll-as-set (set coll)] (remove (
 partial contains? (set coll)

 I have the strong suspicion that `filter-contains3` is the best of the 3, 
 and `filter-contains1` the worst. The internal mechanics of transduce are a 
 bit of a mystery to me, however: if `filter-contains2` were to be used on a 
 collection of, say, a million items, would `coll` be cast to a set a 
 million times, or is Clojure/the JVM smarter than that? I'm also wondering 
 if anyone has any best practices (or whatever) they can share relating to 
 this kind of intersection of transducers/xfs and closures. It seems to me, 
 for example, that something like

 (defn my-thing [coll  stuff]
   (let [s (set coll)]
   ...
   (comp
 ...
(map foo)
(filter bar)
(remove (partial contains? s))
...

 is awkward, but that a lot of limited-use transducer factory functions 
 (like the ones above) aren't exactly optimal, either.



-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
Clojure group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: transducer (+ closures) efficiency question

2015-06-24 Thread Herwig Hochleitner
2015-06-24 18:19 GMT+02:00 Ben Wolfson wolf...@gmail.com:


 I meant a difference in speed.


A var-root fetch is a volatile read, which can have several adverse
performance effects. Most importantly, it hinders inlining, since the JIT
has to place a guard for every inlining site of a volatile read, thus it
won't so readily inline.
Even when inlined, there must be a memory barrier for the guard, which can
lead to pipeline flushes, bus communication and other stalls in execution.

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
Clojure group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: transducer (+ closures) efficiency question

2015-06-24 Thread Ben Wolfson
On Wed, Jun 24, 2015 at 8:29 AM, Herwig Hochleitner hhochleit...@gmail.com
wrote:



 2015-06-24 1:25 GMT+02:00 Ben Wolfson wolf...@gmail.com:

 On Tue, Jun 23, 2015 at 4:17 PM, Ghadi Shayban gshay...@gmail.com
 wrote:


 Tangentially:
 (remove even?)
 Will be faster than
 (remove (fn [i] (even? i)))
 because in the first case the dereference of the var 'even?' happens
 only once and the value inside the var will be passed to `remove` at the
 outset.  In the second example the var dereference happens for every single
 item (though it's very cheap).  The second example is equivalent to writing 
 (remove
 #'even?)


 I can't seem to observe a difference between the two cases.


 A simple var reference like even? compiles to a fetch of its root val:
 @#'even?
 In the first case, that happens only once, passing the resulting fn object
 to remove. In the second case, the deref happens every time the fn is
 called, thus allowing for redefinition of even?.


I meant a difference in speed.

-- 
Ben Wolfson
Human kind has used its intelligence to vary the flavour of drinks, which
may be sweet, aromatic, fermented or spirit-based. ... Family and social
life also offer numerous other occasions to consume drinks for pleasure.
[Larousse, Drink entry]

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
Clojure group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: transducer (+ closures) efficiency question

2015-06-24 Thread Leon Grapenthin
Using a set as filter predicate is idiomatic Clojure.

If you don't know whether your filter coll is a set, you can write your 
filter-contains as

(def filter-contains (comp filter set))

E. g. use as

(into [] (filter-contains [1 2 3]) [1 2 3 4]) 
;- [4]

I am not up to date about optimizations in the PersistentHashSet world, but 
I'd assume for very small filter colls you can gain speed with a linear 
scan instead of contains?.

If you are ok with input, output and filter being a set, I'd recommend 
looking at clojure.set/intersection for that task. It optimizes by reducing 
the smaller set. Depending on the sizes of the collections you are dealing 
with, the set conversion cost may be worth the speed up you can gain. 


On Wednesday, June 24, 2015 at 12:07:06 AM UTC+2, Sam Raker wrote:

 Let's say that, as part of an xf, I want to filter out everything in a 
 sequence that's also in some other sequence. Here are some ways of doing 
 that:

 (defn filter-contains1 [edn-file] (remove (partial contains? (set (read-
 edn-file edn-file)

 (defn filter-contains2 [coll] (remove (partial contains? (set coll

 (def filter-contains3 [coll] (let [coll-as-set (set coll)] (remove (
 partial contains? (set coll)

 I have the strong suspicion that `filter-contains3` is the best of the 3, 
 and `filter-contains1` the worst. The internal mechanics of transduce are a 
 bit of a mystery to me, however: if `filter-contains2` were to be used on a 
 collection of, say, a million items, would `coll` be cast to a set a 
 million times, or is Clojure/the JVM smarter than that? I'm also wondering 
 if anyone has any best practices (or whatever) they can share relating to 
 this kind of intersection of transducers/xfs and closures. It seems to me, 
 for example, that something like

 (defn my-thing [coll  stuff]
   (let [s (set coll)]
   ...
   (comp
 ...
(map foo)
(filter bar)
(remove (partial contains? s))
...

 is awkward, but that a lot of limited-use transducer factory functions 
 (like the ones above) aren't exactly optimal, either.


-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
Clojure group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: transducer (+ closures) efficiency question

2015-06-24 Thread Herwig Hochleitner
2015-06-24 1:25 GMT+02:00 Ben Wolfson wolf...@gmail.com:

 On Tue, Jun 23, 2015 at 4:17 PM, Ghadi Shayban gshay...@gmail.com wrote:


 Tangentially:
 (remove even?)
 Will be faster than
 (remove (fn [i] (even? i)))
 because in the first case the dereference of the var 'even?' happens only
 once and the value inside the var will be passed to `remove` at the
 outset.  In the second example the var dereference happens for every single
 item (though it's very cheap).  The second example is equivalent to writing 
 (remove
 #'even?)


 I can't seem to observe a difference between the two cases.


A simple var reference like even? compiles to a fetch of its root val:
@#'even?
In the first case, that happens only once, passing the resulting fn object
to remove. In the second case, the deref happens every time the fn is
called, thus allowing for redefinition of even?.

Have you ever passed passed #'handler to a server-start function (e.g. in a
ring app), so that you might redefine the handler fn during run time of the
server?

Ad. thread topic: As detailed, clojure's evaluation semantics already help
with transducers. Even better: Built transducer stacks should be pretty
accessible to the JIT, since they contain no more var references (in the
wiring). Also, since transducers are a kind of lexically closed pipe from
input effect to output effect, hotspot's escape analysis should also kick
in nicely.

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
Clojure group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: transducer (+ closures) efficiency question

2015-06-24 Thread Alan Shaw
As with all discussions involving interoperability, I'd love to see the
same questions answered for the JS target too. (This would of course apply
to books such as the new Clojure Applied. )
On Jun 24, 2015 9:48 AM, Herwig Hochleitner hhochleit...@gmail.com
wrote:



 2015-06-24 18:19 GMT+02:00 Ben Wolfson wolf...@gmail.com:


 I meant a difference in speed.


 A var-root fetch is a volatile read, which can have several adverse
 performance effects. Most importantly, it hinders inlining, since the JIT
 has to place a guard for every inlining site of a volatile read, thus it
 won't so readily inline.
 Even when inlined, there must be a memory barrier for the guard, which can
 lead to pipeline flushes, bus communication and other stalls in execution.

  --
 You received this message because you are subscribed to the Google
 Groups Clojure group.
 To post to this group, send email to clojure@googlegroups.com
 Note that posts from new members are moderated - please be patient with
 your first post.
 To unsubscribe from this group, send email to
 clojure+unsubscr...@googlegroups.com
 For more options, visit this group at
 http://groups.google.com/group/clojure?hl=en
 ---
 You received this message because you are subscribed to the Google Groups
 Clojure group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to clojure+unsubscr...@googlegroups.com.
 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
Clojure group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: transducer (+ closures) efficiency question

2015-06-23 Thread Sam Raker
Oh fantastic! I was 100% wrong in literally the best way.


Thanks!

On Tue, Jun 23, 2015 at 7:17 PM, Ghadi Shayban gshay...@gmail.com wrote:

 Good question.

 Clojure's evaluation semantics dictate that the arguments are evaluated
 (computed) *before* calling the function. So(set coll) is computed before
 being passed to `partial`.  Partial receives a function (a value) and
 arguments (also values) and returns back a new function that saves those
 original arguments (which happen to be stuffed away in Java final fields).

 All three of your filters return a transducer.  All three of the inputs to
 `remove` are a partial function.  Each of the arguments to the call to
 partial is a set.  They are all essentially equivalent, and should perform
 the same.  (Except filter3 happens to create the set twice; once in the
 let, and once as the arg to partial).  So over a billion item collection,
 the set in your examples will only be computed once, once, and twice
 respectively.

 Note however that sets *are* functions that evaluate whether the argument
 is in the set. This means you could remove the call to partial and shorten
 to:

 (defn filter-contains1 [edn-file]
   (remove (set (read-edn-file edn-file


 Tangentially:
 (remove even?)
 Will be faster than
 (remove (fn [i] (even? i)))
 because in the first case the dereference of the var 'even?' happens only
 once and the value inside the var will be passed to `remove` at the
 outset.  In the second example the var dereference happens for every single
 item (though it's very cheap).  The second example is equivalent to writing 
 (remove
 #'even?)

 On Tuesday, June 23, 2015 at 6:07:06 PM UTC-4, Sam Raker wrote:

 Let's say that, as part of an xf, I want to filter out everything in a
 sequence that's also in some other sequence. Here are some ways of doing
 that:

 (defn filter-contains1 [edn-file] (remove (partial contains? (set (read-
 edn-file edn-file)

 (defn filter-contains2 [coll] (remove (partial contains? (set coll

 (def filter-contains3 [coll] (let [coll-as-set (set coll)] (remove (
 partial contains? (set coll)

 I have the strong suspicion that `filter-contains3` is the best of the 3,
 and `filter-contains1` the worst. The internal mechanics of transduce are a
 bit of a mystery to me, however: if `filter-contains2` were to be used on a
 collection of, say, a million items, would `coll` be cast to a set a
 million times, or is Clojure/the JVM smarter than that? I'm also wondering
 if anyone has any best practices (or whatever) they can share relating to
 this kind of intersection of transducers/xfs and closures. It seems to me,
 for example, that something like

 (defn my-thing [coll  stuff]
   (let [s (set coll)]
   ...
   (comp
 ...
(map foo)
(filter bar)
(remove (partial contains? s))
...

 is awkward, but that a lot of limited-use transducer factory functions
 (like the ones above) aren't exactly optimal, either.

  --
 You received this message because you are subscribed to the Google
 Groups Clojure group.
 To post to this group, send email to clojure@googlegroups.com
 Note that posts from new members are moderated - please be patient with
 your first post.
 To unsubscribe from this group, send email to
 clojure+unsubscr...@googlegroups.com
 For more options, visit this group at
 http://groups.google.com/group/clojure?hl=en
 ---
 You received this message because you are subscribed to a topic in the
 Google Groups Clojure group.
 To unsubscribe from this topic, visit
 https://groups.google.com/d/topic/clojure/H0zoF6jlY-c/unsubscribe.
 To unsubscribe from this group and all its topics, send an email to
 clojure+unsubscr...@googlegroups.com.
 For more options, visit https://groups.google.com/d/optout.




-- 
.

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
Clojure group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


transducer (+ closures) efficiency question

2015-06-23 Thread Sam Raker
Let's say that, as part of an xf, I want to filter out everything in a 
sequence that's also in some other sequence. Here are some ways of doing 
that:

(defn filter-contains1 [edn-file] (remove (partial contains? (set 
(read-edn-file 
edn-file)

(defn filter-contains2 [coll] (remove (partial contains? (set coll

(def filter-contains3 [coll] (let [coll-as-set (set coll)] (remove (partial 
contains? (set coll)

I have the strong suspicion that `filter-contains3` is the best of the 3, 
and `filter-contains1` the worst. The internal mechanics of transduce are a 
bit of a mystery to me, however: if `filter-contains2` were to be used on a 
collection of, say, a million items, would `coll` be cast to a set a 
million times, or is Clojure/the JVM smarter than that? I'm also wondering 
if anyone has any best practices (or whatever) they can share relating to 
this kind of intersection of transducers/xfs and closures. It seems to me, 
for example, that something like

(defn my-thing [coll  stuff]
  (let [s (set coll)]
  ...
  (comp
...
   (map foo)
   (filter bar)
   (remove (partial contains? s))
   ...

is awkward, but that a lot of limited-use transducer factory functions 
(like the ones above) aren't exactly optimal, either.

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
Clojure group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: transducer (+ closures) efficiency question

2015-06-23 Thread Ghadi Shayban
Good question.

Clojure's evaluation semantics dictate that the arguments are evaluated 
(computed) *before* calling the function. So(set coll) is computed before 
being passed to `partial`.  Partial receives a function (a value) and 
arguments (also values) and returns back a new function that saves those 
original arguments (which happen to be stuffed away in Java final fields).

All three of your filters return a transducer.  All three of the inputs to 
`remove` are a partial function.  Each of the arguments to the call to 
partial is a set.  They are all essentially equivalent, and should perform 
the same.  (Except filter3 happens to create the set twice; once in the 
let, and once as the arg to partial).  So over a billion item collection, 
the set in your examples will only be computed once, once, and twice 
respectively.

Note however that sets *are* functions that evaluate whether the argument 
is in the set. This means you could remove the call to partial and shorten 
to:

(defn filter-contains1 [edn-file]
  (remove (set (read-edn-file edn-file


Tangentially:
(remove even?)
Will be faster than
(remove (fn [i] (even? i)))
because in the first case the dereference of the var 'even?' happens only 
once and the value inside the var will be passed to `remove` at the outset. 
 In the second example the var dereference happens for every single item 
(though it's very cheap).  The second example is equivalent to writing (remove 
#'even?)

On Tuesday, June 23, 2015 at 6:07:06 PM UTC-4, Sam Raker wrote:

 Let's say that, as part of an xf, I want to filter out everything in a 
 sequence that's also in some other sequence. Here are some ways of doing 
 that:

 (defn filter-contains1 [edn-file] (remove (partial contains? (set (read-
 edn-file edn-file)

 (defn filter-contains2 [coll] (remove (partial contains? (set coll

 (def filter-contains3 [coll] (let [coll-as-set (set coll)] (remove (
 partial contains? (set coll)

 I have the strong suspicion that `filter-contains3` is the best of the 3, 
 and `filter-contains1` the worst. The internal mechanics of transduce are a 
 bit of a mystery to me, however: if `filter-contains2` were to be used on a 
 collection of, say, a million items, would `coll` be cast to a set a 
 million times, or is Clojure/the JVM smarter than that? I'm also wondering 
 if anyone has any best practices (or whatever) they can share relating to 
 this kind of intersection of transducers/xfs and closures. It seems to me, 
 for example, that something like

 (defn my-thing [coll  stuff]
   (let [s (set coll)]
   ...
   (comp
 ...
(map foo)
(filter bar)
(remove (partial contains? s))
...

 is awkward, but that a lot of limited-use transducer factory functions 
 (like the ones above) aren't exactly optimal, either.


-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
Clojure group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: transducer (+ closures) efficiency question

2015-06-23 Thread Ben Wolfson
On Tue, Jun 23, 2015 at 4:17 PM, Ghadi Shayban gshay...@gmail.com wrote:


 Tangentially:
 (remove even?)
 Will be faster than
 (remove (fn [i] (even? i)))
 because in the first case the dereference of the var 'even?' happens only
 once and the value inside the var will be passed to `remove` at the
 outset.  In the second example the var dereference happens for every single
 item (though it's very cheap).  The second example is equivalent to writing 
 (remove
 #'even?)


I can't seem to observe a difference between the two cases.

-- 
Ben Wolfson
Human kind has used its intelligence to vary the flavour of drinks, which
may be sweet, aromatic, fermented or spirit-based. ... Family and social
life also offer numerous other occasions to consume drinks for pleasure.
[Larousse, Drink entry]

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
Clojure group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.