[
https://issues.apache.org/jira/browse/MAHOUT-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13040977#comment-13040977
]
Lance Norskog edited comment on MAHOUT-676 at 5/30/11 2:43 AM:
---------------------------------------------------------------
bq. Normally slice samplers are used in the sense that Radford Neal proposed in
his 2003 (I think) paper.
The Wikipedia entry is my base: [Slice
Sampling|http://en.wikipedia.org/wiki/Slice_sampling] and yes, it's Neal 2003.
Yes, the normal use of slice sampling is to efficiently draw a set of samples
distributed according to the PDF (the area under the curve). That would also be
a useful implementation.
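For reference, the usual form of slice sampling can be sketched roughly as
follows. This is a minimal univariate version with stepping-out and shrinkage,
written from the Wikipedia description. The class name SliceSamplerSketch, the
width parameter and the use of java.util.Random are assumptions made for
illustration, not code from the attached patches.

```java
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

/** Univariate slice sampler with stepping-out and shrinkage (illustrative sketch). */
final class SliceSamplerSketch {
  private final DoubleUnaryOperator density; // unnormalized, non-negative
  private final double width;                // initial bracket width (a tuning assumption)
  private final Random random;
  private double current;                    // current point; its density is always > 0

  SliceSamplerSketch(DoubleUnaryOperator density, double start, double width, Random random) {
    if (!(density.applyAsDouble(start) > 0.0) || !(width > 0.0)) {
      throw new IllegalArgumentException("start needs positive density, width must be positive");
    }
    this.density = density;
    this.width = width;
    this.random = random;
    this.current = start;
  }

  /** Advances the chain by one step and returns the new sample. */
  double next() {
    // 1. Pick a height uniformly under the curve at the current point.
    double u;
    do {
      u = random.nextDouble();
    } while (u == 0.0);
    double height = u * density.applyAsDouble(current);

    // 2. Place a bracket of the initial width at a random offset around the
    //    current point, then step each end out until it leaves the slice.
    double left = current - width * random.nextDouble();
    double right = left + width;
    while (density.applyAsDouble(left) > height) {
      left -= width;
    }
    while (density.applyAsDouble(right) > height) {
      right += width;
    }

    // 3. Draw uniformly from the bracket; on a miss, shrink the bracket
    //    toward the current point and try again.
    while (true) {
      double candidate = left + (right - left) * random.nextDouble();
      if (density.applyAsDouble(candidate) > height) {
        current = candidate;
        return candidate;
      }
      if (candidate < current) {
        left = candidate;
      } else {
        right = candidate;
      }
    }
  }
}
```

Successive calls to next() form a Markov chain, so neighbouring samples are
correlated. The density only needs to be known up to a constant factor.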
bq. you can use bisection until you get a unique result.
This patch's Sampler interface gives one decision per call, so the naive
implementation here seems the cleanest.
I also have a bisection implementation that uses the windowing algorithm from
the wiki page.
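To illustrate what "one decision per call" could mean, here is a minimal
sketch in which each call answers keep or drop for a single value. The
interface DecisionSampler, the class WeightedDecisionSampler and the maxWeight
parameter are hypothetical names chosen for this example. They are not the
Sampler interface from the patch, whose signature is not shown in this thread.

```java
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical one-decision-per-call interface; not the one in the patch. */
interface DecisionSampler {
  boolean keep(double value);
}

/** Keeps a value with probability weight(value) / maxWeight. */
final class WeightedDecisionSampler implements DecisionSampler {
  private final DoubleUnaryOperator weight; // desired relative weight of each value
  private final double maxWeight;           // assumed upper bound on weight(value)
  private final Random random;

  WeightedDecisionSampler(DoubleUnaryOperator weight, double maxWeight, Random random) {
    if (!(maxWeight > 0.0)) {
      throw new IllegalArgumentException("maxWeight must be positive");
    }
    this.weight = weight;
    this.maxWeight = maxWeight;
    this.random = random;
  }

  @Override
  public boolean keep(double value) {
    // Draw a height under the bounding box and keep the value only if the
    // height falls under the weight curve at that value.
    return random.nextDouble() * maxWeight < weight.applyAsDouble(value);
  }
}
```

Each call is independent and needs no search over the curve, which is the
appeal of the simple approach. The cost is that many values may be rejected
when the weight is far below maxWeight.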
bq. What is the need being satisfied here?
See the description.
was (Author: lancenorskog):
bq. Normally slice samplers are used in the sense that Radford Neal
proposed in his 2003 (I think) paper.
The Wikipedia entry is my base: [Slice
Sampling|http://en.wikipedia.org/wiki/Slice_sampling] and yes, it's Neal 2003.
Yes, the normal use of slice sampling is to efficiently find a set of samples
corresponding to the PDF/area under curve. That would also be a useful
implementation.
bq. you can use bisection until you get a unique result.
I have a bisection implementation using the windowing algorithm in the wiki
page. Bu
This patch's Sampler interface gives one decision per call. The stupid
implementation here seems the cleanest.
> Random samplers in a modular library
> ------------------------------------
>
> Key: MAHOUT-676
> URL: https://issues.apache.org/jira/browse/MAHOUT-676
> Project: Mahout
> Issue Type: New Feature
> Components: Math
> Reporter: Lance Norskog
> Priority: Minor
> Attachments: MAHOUT-676.patch, Sampler.patch
>
>
> This is a modular suite of samplers. It supplies the ability to throw away
> samples in a useful way.
> Here is a use case: for my recommendations, I want user activity to decide
> the amount of influence on the results. Grouping users by the number of
> movies they watch: 1-5 is 20%, 6-15 is 50%, 15-30 is 30%, and users who watch
> over 30 movies are not useful.
> * If I know the input distribution, I can supply a function to the Slice
> sampler to give this distribution.
> * If I don't know the distribution, I can create a Reservoir sampler for each
> of the three buckets. After reading the whole set, I check the sizes of the
> various buckets and solve for my distribution. This gives the number of users
> to pull from each bucket.
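The second bullet in the description can be sketched as follows: one uniform
reservoir per bucket, followed by a small calculation that turns the bucket
sizes into per-bucket quotas. The names ReservoirSketch, BucketQuotaSketch,
bucketOf and quotas are hypothetical and are not taken from the attached
patches. The description lists 15 in two ranges, so this sketch assumes a
count of 15 belongs to the middle bucket.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Fixed-capacity uniform reservoir (Algorithm R). */
final class ReservoirSketch<T> {
  private final List<T> items = new ArrayList<>();
  private final int capacity;
  private final Random random;
  private long seen; // number of items offered so far

  ReservoirSketch(int capacity, Random random) {
    this.capacity = capacity;
    this.random = random;
  }

  void offer(T item) {
    seen++;
    if (items.size() < capacity) {
      items.add(item);
      return;
    }
    // Keep the new item with probability capacity / seen.
    long slot = (long) (random.nextDouble() * seen);
    if (slot < capacity) {
      items.set((int) slot, item);
    }
  }

  List<T> sample() {
    return new ArrayList<>(items);
  }
}

final class BucketQuotaSketch {
  /** Maps a movie count to bucket 0, 1 or 2, or -1 when the user is dropped. */
  static int bucketOf(int movies) {
    if (movies < 1 || movies > 30) return -1;
    if (movies <= 5) return 0;
    if (movies <= 15) return 1; // assumption: 15 belongs to the middle bucket
    return 2;
  }

  /** Largest per-bucket counts that match the target shares without exceeding what is available. */
  static int[] quotas(int[] available, double[] share) {
    double total = Double.POSITIVE_INFINITY;
    for (int i = 0; i < available.length; i++) {
      total = Math.min(total, available[i] / share[i]);
    }
    int[] quota = new int[available.length];
    for (int i = 0; i < available.length; i++) {
      // The small epsilon guards against floating-point results such as 59.999999.
      quota[i] = Math.min(available[i], (int) Math.floor(total * share[i] + 1e-9));
    }
    return quota;
  }
}
```

After the whole input has been read, the sizes of the three reservoirs would be
passed to quotas() together with the shares 0.2, 0.5 and 0.3, and that many
users would then be taken from each reservoir.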
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira