Barry, I understand your objections below, but do you have a better approach?
Assigning random numbers to entities is guaranteed to be worse. If you are worried about an entity being deleted and opening a gap in the sequence, imagine the thousand-fold gaps you will see with random ID generation (e.g. 1, 10001, 10002, 20000, ...). See below.

On Jul 10, 3:03 pm, Barry Hunter <[email protected]> wrote:
> On 10/07/2009, Devel63 <[email protected]> wrote:
> >
> > The best way is to assign a one-up counter to each record as you
> > create it, then call random.randint(1, max_counter) to determine the
> > desired record.
> >
> > To retrieve multiple random entities in a query, do a
> > filter('IN ', [my random nums]).
>
> doesn't work that well once records start getting deleted (get the same
> issue, non-uniform distribution)

If an app needs to support entity deletion, you can still ensure uniformity by running a periodic cron job to compress the counter sequence.

> nor does it work if you're filtering at the same time :(

Correct, in that the distribution is no longer uniform. But this is also true of the random ID approach. I admit that the random ID approach seems appealing at first, but when you actually look into it, you'll find that some results are guaranteed to be 3X more likely than others, or worse. It IS better in the case where you want to randomize based on time of entity creation, but there are other ways to deal with that.

I would love it if someone could come up with a good way to get truly random results from an arbitrary query set!

> > Note that behind the scenes this generates multiple queries, so you're
> > not saving much time.
> >
> > On Jul 10, 7:34 am, Wooble <[email protected]> wrote:
> > > Highly non-optimal solution: have a cron job assign new random
> > > numbers to your entities often enough to simulate randomness. Even
> > > just re-assigning numbers to entities that have been previously
> > > selected might work.
> > > This involves a lot more CPU, as you'd be doing writes, but it
> > > shifts the work from request time to a background process, so your
> > > users don't see the added latency of doing N queries.
> > >
> > > Another possible solution would be to fetch keys only for X*N
> > > entities (where greater X's produce more apparent randomness), then
> > > choose N of those keys to actually fetch entities.
> > >
> > > On Jul 9, 12:33 pm, aloo <[email protected]> wrote:
> > > > Hi all,
> > > >
> > > > I'm trying to write a GQL query that returns N random records of a
> > > > specific kind. My current implementation works but requires N calls
> > > > to the datastore. I'd like to make it 1 call to the datastore if
> > > > possible.
> > > >
> > > > I currently assign a random number to every entity that I put into
> > > > the datastore. When I query for a random record, I generate another
> > > > random number and query for records > rand ORDER BY asc LIMIT 1.
> > > >
> > > > This works; however, it only returns 1 record, so I need to do N
> > > > queries. Any ideas on how to make this one query? Thanks.

You received this message because you are subscribed to the Google Groups "Google App Engine" group.
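The one-up-counter idea from the thread can be sketched in plain Python. This is a minimal simulation, not App Engine code: the `Store` class and its dict-backed storage are hypothetical stand-ins for datastore entities, and the `random.sample` call plays the role of the `filter('IN ', [my random nums])` query.

```python
import random

class Store:
    """Hypothetical in-memory stand-in for the datastore (illustration only)."""

    def __init__(self):
        self.max_counter = 0   # one-up counter, incremented on every put
        self.by_counter = {}   # counter value -> entity

    def put(self, entity):
        # Assign the next counter value at creation time.
        self.max_counter += 1
        self.by_counter[self.max_counter] = entity

    def random_entities(self, n):
        # Pick n distinct counters uniformly from 1..max_counter, then look
        # them up -- the analogue of an IN filter on the counter property.
        counters = random.sample(range(1, self.max_counter + 1), n)
        return [self.by_counter[c] for c in counters]

store = Store()
for name in ["a", "b", "c", "d", "e"]:
    store.put(name)

picked = store.random_entities(3)
```

As long as the counter sequence has no gaps, every entity is equally likely to be chosen; deletions break that guarantee until the sequence is compacted.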

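The "compress the counter sequence" cron job mentioned above amounts to renumbering the surviving entities. A minimal sketch, again in plain Python with a dict standing in for the datastore (`compress` is a hypothetical helper, not an App Engine API):

```python
def compress(by_counter):
    # After deletions the counter sequence has gaps (e.g. 1, 3, 7).
    # Reassign dense counters 1..K to the survivors, preserving order,
    # so random.randint(1, K) is uniform over live entities again.
    survivors = [by_counter[c] for c in sorted(by_counter)]
    return {i + 1: entity for i, entity in enumerate(survivors)}

# Suppose the entities at counters 2, 4, 5 and 6 were deleted, leaving gaps.
dense = compress({1: "a", 3: "b", 7: "c"})
```

In the real datastore this would be a batched write over every entity, which is why it belongs in a background cron job rather than on the request path.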