Re: Randomly select an element from a sorted-set (rand-nth (sorted-set ..))

2011-09-26 Thread Michael Gardner
On Sep 26, 2011, at 8:12 AM, Paul Richards wrote:

 How can I efficiently pick a random element from a sorted-set?

If your sorted set is densely packed (if most possible values do appear in the 
set), then you could just pick a random value, see if it's in the set, and 
repeat until you find a match.

Otherwise, you could use a sorted-vector. I don't know if there's a good 
implementation out there, though, so you might have to roll your own.

Or you could use a sorted-set that can do a random binary search, where it 
chooses between the first and second halves at random rather than by comparison 
with a particular value. Maybe you could subclass the Clojure sorted-set to do 
this, though I'm guessing it's not made for subclassing.

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Randomly select an element from a sorted-set (rand-nth (sorted-set ..))

2011-09-26 Thread Benny Tsai
The reason that (rand-nth (seq (sorted-set 1 2 3))) performs badly on large 
sets is probably because nth is O(n) on sequences.  nth is much much faster 
on vectors, so I would suggest trying out (rand-nth (vec (sorted-set 1 2 
3))) and see if that works for your application.

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Randomly select an element from a sorted-set (rand-nth (sorted-set ..))

2011-09-26 Thread Jeremy Heiler
On Mon, Sep 26, 2011 at 9:12 AM, Paul Richards paul.richa...@gmail.com wrote:
 Hi,
 How can I efficiently pick a random element from a sorted-set?

 If I try rand-nth I get this:
 user= (rand-nth (sorted-set 1 2 3))
 java.lang.UnsupportedOperationException: nth not supported on this
 type: PersistentTreeSet (NO_SOURCE_FILE:0)

 I can get this expression to work if I naively apply seq:
 user= (rand-nth (seq (sorted-set 1 2 3)))
 1

 However this performs very badly on large sets.  Is there a more
 efficient way to do this?

 (I need to keep my elements in a sorted-set for other parts of the
 application, where I rely on subseq.)

Try just getting the value with rand-int directly. The sorted-set uses
a tree map underneath, so look up time is consistent with a map. Also,
count is O(1).

(get foo (rand-int (count foo)))

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Randomly select an element from a sorted-set (rand-nth (sorted-set ..))

2011-09-26 Thread Paul Richards
On Sep 26, 2:12 pm, Paul Richards paul.richa...@gmail.com wrote:
 Hi,
 How can I efficiently pick a random element from a sorted-set?


I've come up with a bit of a hack, which relies on me not caring about
non-uniform distributions.  If I create a custom comparator with a
random backdoor, I can select random elements from the sorted set:

(defn my-comp [a b]
(if (or (= :random a) (= :random b))
(- (* 2 (rand-int 2)) 1)
(compare a b)))

; Create test collection (of strings) using the special comparator
(def coll (apply sorted-set-by my-comp (map str (range 1 1000

(map #(first (subseq coll  %)) [500 200 700 :random :random])
; Result: (501 201 701 626 286)



Bonus question..  What is the distribution of the random selection?

(I suspect elements will be chosen with some probability like (1/2)^H
(where H is the height of the element within the red-black tree).  I
should probably generate a histogram to find out..)

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Randomly select an element from a sorted-set (rand-nth (sorted-set ..))

2011-09-26 Thread Paul Richards
On Sep 26, 6:13 pm, Jeremy Heiler jeremyhei...@gmail.com wrote:
 On Mon, Sep 26, 2011 at 9:12 AM, Paul Richards paul.richa...@gmail.com 
 wrote:
  Hi,
  How can I efficiently pick a random element from a sorted-set?

  If I try rand-nth I get this:
  user= (rand-nth (sorted-set 1 2 3))
  java.lang.UnsupportedOperationException: nth not supported on this
  type: PersistentTreeSet (NO_SOURCE_FILE:0)

  I can get this expression to work if I naively apply seq:
  user= (rand-nth (seq (sorted-set 1 2 3)))
  1

  However this performs very badly on large sets.  Is there a more
  efficient way to do this?

  (I need to keep my elements in a sorted-set for other parts of the
  application, where I rely on subseq.)

 Try just getting the value with rand-int directly. The sorted-set uses
 a tree map underneath, so look up time is consistent with a map. Also,
 count is O(1).

 (get foo (rand-int (count foo)))

I feel I picked a bad example.  My sorted-set does not contain
integers, the elements are (collections of) strings.  Trying this
approach leads to a different failure:

user= (get (sorted-set a b c) 1)
java.lang.ClassCastException: java.lang.String cannot be cast to
java.lang.Number (NO_SOURCE_FILE:0)

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Randomly select an element from a sorted-set (rand-nth (sorted-set ..))

2011-09-26 Thread Paul Richards
On Sep 26, 6:04 pm, Benny Tsai benny.t...@gmail.com wrote:
 The reason that (rand-nth (seq (sorted-set 1 2 3))) performs badly on large
 sets is probably because nth is O(n) on sequences.  nth is much much faster
 on vectors, so I would suggest trying out (rand-nth (vec (sorted-set 1 2
 3))) and see if that works for your application.

This will replace an O(n) call to nth with an O(n) call to copy into
a vector, so still leaving me with O(n).

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Randomly select an element from a sorted-set (rand-nth (sorted-set ..))

2011-09-26 Thread Benny Tsai
On Monday, September 26, 2011 2:58:59 PM UTC-6, Paul Richards wrote:

 This will replace an O(n) call to nth with an O(n) call to copy into 
 a vector, so still leaving me with O(n). 


Oops, right :)  If you're getting random elements out of the same sorted set 
multiple times, then it might be still be worth it to pay the cost of 
vectorizing the set once in exchange for faster subsequent random 
selections.  But if not, then I guess not so much.

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Randomly select an element from a sorted-set (rand-nth (sorted-set ..))

2011-09-26 Thread Alan Malloy
On Sep 26, 2:12 pm, Paul Richards paul.richa...@gmail.com wrote:
 On Sep 26, 2:12 pm, Paul Richards paul.richa...@gmail.com wrote:

  Hi,
  How can I efficiently pick a random element from a sorted-set?

 I've come up with a bit of a hack, which relies on me not caring about
 non-uniform distributions.  If I create a custom comparator with a
 random backdoor, I can select random elements from the sorted set:

 (defn my-comp [a b]
     (if (or (= :random a) (= :random b))
         (- (* 2 (rand-int 2)) 1)
         (compare a b)))

 ; Create test collection (of strings) using the special comparator
 (def coll (apply sorted-set-by my-comp (map str (range 1 1000

 (map #(first (subseq coll  %)) [500 200 700 :random :random])
 ; Result: (501 201 701 626 286)

 Bonus question..  What is the distribution of the random selection?

 (I suspect elements will be chosen with some probability like (1/2)^H
 (where H is the height of the element within the red-black tree).  I
 should probably generate a histogram to find out..)

I think it's complicated by what you expect chosen to mean. If you
were randomly selecting a single element, you would only ever choose
one at the bottom of the tree; the internal nodes would never be
selected, right? Of course, you can't do that, since `get` does an
equality check before returning the item. And if you're selecting a
subseq, it depends on what test(s) you supply, and how many items you
take, and...

Anyway, this is sorta fun to generalize to wrap any other comparator:

(defn hackable-comparator [f]
  (fn [ xs]
(if (some #{:random} xs)
  (- (* 2 (rand-int 2)) 1)
  (apply f xs

(def my-comp (hackable-comparator compare))

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en