Hi Ed,
Here's an idea. I would suggest using as the key of the CarJourney entity
the CarId and the StartDate (in a lexicographical manner). For example, for
car 123 and start date Monday Feb. 14, 2011 at 5:13:33 pm (if time of day is
not important it could be dropped from key), the CarJourney key would be
123:2011:02:14:17:13:33 and I would add a DESCENDING index on the __key__
property since these are not added automatically like other properties.
Then, you can query for the entities that meet your specified criteria using
a key range like this with a limit of 1.

    select CarJourney where __key__ >= MAKEKEY(Car + StartDate) && __key__
<= MAKEKEY(Car + EndDate) order by __key__ DESC

You should be able to fire off these asynchronously for multiple cars. As
another optimization, you might think about making these queries a keys only
query and only retrieving the key and seeing first if you have the
CarJourney in memcache and if not then retrieving the full entity from the
datastore and putting the entity into memcache.

Hope this helps,
Steve

On Mon, Feb 14, 2011 at 7:59 AM, Edward Hartwell Goose
<[email protected]>wrote:

> Hi,
>
> I was wondering if anyone can help me with this problem. We have an
> idea we'd like to implement, and we're currently unable to do this
> efficiently.
>
> I've anonymised the data as best as possible, but the structure is the
> same.
>
> We have two entities, Car and CarJourney.
>
> Each Car has 0 to many CarJourney's.
>
> Each Car Journey has (amongst other properties) a date associated with
> it - the date the journey was started.
>
> I wish to query by time over car journeys. I'll have two times, a
> start date and an end date, where start date <= endDate, and I want to
> receive the most recently started journey in that period.
>
> So, if I had a particular car in mind, say car 123, I'd write a query
> that limits by Car.key and Car.startDate, where Car.key == 123 and
> Journey.startDate >= startDate and Journey.startDate <= endDate with
> an ordering on Journey.startDate descending and a limit of 1.
>
> e.g. Car A has 3 journeys, taken on 1st, 2nd and the 3rd of the month.
> The query start date is 1st and the query end date is the 2nd. The
> result of this query would be one Car journey, the 2nd.
>
> Once the result of that query is returned, a very small amount of
> processing is done to return a result to the user.
>
> That's the easy bit.
>
> But, instead of over 1 Car, I want a list of cars, where the list
> contains N keys to cars.
>
> So, I want to run the above query N times, once for every car. And I
> want the latest journey for each car.
>
> Because the time range is flexible (and thus can't be known
> beforehand) we can't implement a "isMostRecent" flag, because while it
> might be the most recent for now, it might not be the most recent for
> the specified date parameters.
>
> We also need to ensure that this returns promptly (current queries are
> around the 3-5 second mark for a small set of data) as this goes
> straight back to the user. This means that we can't use task queues,
> and because the specified dates are arbitrary we can't implement mass
> indexing of "isWithinDate" fields.
>
> We tried using an async query, but because the amount of processing is
> negligible the bottleneck is still the queries on the datastore
> (because the async api still sends the requests synchronously, it just
> doesn't block).
>
> Ideally, we'd implement this as a select on car journeys ordered by
> startDate where the Car.key is distinct, but we can't seem to pull
> this off in GAE.
>
> There are numerous small optimisations we can make (for example, some
> MemCaching of repeated queries) but none have made a significant dent
> in our query time. And MemCaching can only help for a maximum of 1-2
> minutes (due to the inevitable forward march of time!)
>
> Any ideas are most welcome and highly appreciated.
>
> Thanks,
> Ed
>
>
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "Google App Engine" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/google-appengine?hl=en.
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en.

Reply via email to