Re: Pagination
Status code 303 (Status.REDIRECT_SEE_OTHER in Restlet) exists for the POST/Redirect/GET case. I'm not sure there's any "right" way vis-a-vis returning an entity directly from a POST -- certainly that's useful too in straightforward cases. P/R/G is really helpful when the result is more complex and may really consist of multiple GETable resources, as in pagination.

On Wed, Jun 10, 2009 at 11:54 AM, Dustin N. Jenkins <dustin.jenk...@nrc-cnrc.gc.ca> wrote:
> I assume the POST/Redirect/GET pattern is the Client POSTing to the
> Resource, and instead of filling the Response's Entity with the
> Representation of the change, one simply redirects the Client to the GET
> representation. Is this the desired behaviour? I was under the
> impression that populating the Response's Entity in the POST was
> improper practice, but we often lazily do it.

-- http://restlet.tigris.org/ds/viewMessage.do?dsForumId=4447&dsMessageId=2360979
Re: Pagination
I assume the POST/Redirect/GET pattern is the Client POSTing to the Resource, and instead of filling the Response's Entity with the Representation of the change, one simply redirects the Client to the GET representation. Is this the desired behaviour? I was under the impression that populating the Response's Entity in the POST was improper practice, but we often lazily do it.

One more thing I thought of that might be handy is the ability to go directly to the First or Last page. Going to the First page would be easy, I think, as one would simply omit the AFTER clause, but going to the Last page seems like a different story. The type of Pagination we've been talking about is simply a "Go to Next Page" and "Go to Previous Page" design, which should be sufficient, I guess, although I think users typically would like to see how many pages there are in the result set. Having said that, I suppose if the Searching service were asked to simply return everything, and the Resource consumed only what it wanted based on, say, the User's preferred page size, then one could simply do the math on the Collection size.

Anyway, the scope of Pagination seems to be growing...

Thanks again for your help, Rob. I really like this approach to RESTful Pagination.

Dustin

Rob Heittman wrote:
> [...]
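As a rough sketch of the "do the math on the Collection size" idea from the message above: given a total hit count and the user's preferred page size, the page count and the offset of the Last page fall out of integer arithmetic. The class and method names here are hypothetical and not part of Restlet or Hibernate.

```java
// Hypothetical helper for illustration only; not a Restlet or Hibernate API.
final class PageMath {
    private PageMath() {}

    /** Number of pages needed to show totalItems at pageSize items per page. */
    static int pageCount(int totalItems, int pageSize) {
        if (totalItems < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("totalItems must be >= 0 and pageSize > 0");
        }
        // Ceiling division written so it cannot overflow for large totals.
        return totalItems / pageSize + (totalItems % pageSize == 0 ? 0 : 1);
    }

    /** Zero-based offset of the first item on the last page (0 for an empty result). */
    static int lastPageOffset(int totalItems, int pageSize) {
        int pages = pageCount(totalItems, pageSize);
        return pages == 0 ? 0 : (pages - 1) * pageSize;
    }
}
```

With the 9,995-hit example used elsewhere in this thread and a page size of 10, this gives 1,000 pages, the last of which starts at offset 9,990 and holds 5 items. Note that this only works when the total is known, which the pure "after" style of paging does not provide by itself.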
Re: Pagination
Yes, keeping the state bookmarkable for a complex search is also an interesting challenge. I know one web site (also a science app) that has several hundred variables that can be incorporated in a query, more than a bookmark would easily store. They keep a permanent cache of search queries in the database and return a "minified" URL, like bit.ly, using the Post/Redirect/Get pattern. They mark position in pagination using query params:

http://{science-app}/search/afq1z?start=1564&extent=20

but I like the "after" approach better:

http://{science-app}/search/afq1z?after=Sula+Nebouxi,Ecuador&extent=20

There are also nice properties of the minified query identifier URL, in that it lends itself to subsequent RESTful interrogation in other ways:

http://{science-app}/search/afq1z/sql -- retrieve SQL query definition
http://{science-app}/search/afq1z/export/csv -- dump entire data set to CSV

or fun using Variants ... etc ...

On Tue, Jun 9, 2009 at 11:44 AM, Dustin N. Jenkins <dustin.jenk...@nrc-cnrc.gc.ca> wrote:
> [...]
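A minimal sketch of the "minified" saved-query idea, assuming an in-memory map stands in for the permanent database cache mentioned above. The class name, the base-36 counter used for IDs, and the /search/{id} path layout are all assumptions made for illustration; the real site's scheme is not known. In a POST/Redirect/GET flow, the returned path would be what the server puts in the Location header of the 303 response.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory registry of saved queries; a real service would
// persist these so that bookmarks keep working across restarts.
final class SavedQueries {
    private final Map<String, String> queryById = new HashMap<String, String>();
    private final Map<String, String> idByQuery = new HashMap<String, String>();
    private int nextId = 0;

    /** Stores the query if it is new and returns the path to redirect the client to. */
    String save(String query) {
        String id = idByQuery.get(query);
        if (id == null) {
            id = Integer.toString(nextId++, 36); // short, URL-safe identifier
            idByQuery.put(query, id);
            queryById.put(id, query);
        }
        return "/search/" + id;
    }

    /** Returns the stored query for an ID, or null if the ID is unknown. */
    String lookup(String id) {
        return queryById.get(id);
    }
}
```

Because identical queries map to the same ID here, two users running the same search end up with the same bookmarkable URL, and paging parameters such as start, after, and extent can then be appended to it as query params.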
Re: Pagination
Hi Rob,

Thank you very much for the detailed post. It's very useful.

My Persistence Layer uses Hibernate, which in turn uses ehcache as the second-level cache, but I've always had it turned off, so now would be a good time to experiment with it, I suppose.

A stable search result is not required in my case, and I would happily go back to the Persistence Layer each time, as I deal with Scientific results that are updated all the time. A user wouldn't necessarily get lost while moving from page to page.

In reference to Josh's solution, I really like the idea of going by the sorted results and asking for the data after the last known item. I deal with a multi-field form, with upwards of a dozen fields to search on, so passing data back and forth may not be viable all the time, especially with a GET given the known character limitation. However, do users commonly bookmark a search result with a page number? I could definitely see it. Perhaps the bookmark would encapsulate the AFTER clause in the URL.

Thanks again, Rob. It is an interesting problem.

Dustin

Rob Heittman wrote:
> [...]
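A small sketch of the "ask for the data after the last known item" approach discussed above, assuming the results are already sorted ascending on a unique key. The class and method names are hypothetical, and the linear scan over an in-memory list is only a stand-in: against a database one would more likely express the same thing as a query that filters on key greater than the last seen value, orders by that key, and limits the row count.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of "after"-style paging over a sorted list.
final class AfterPager {
    private AfterPager() {}

    /**
     * Returns up to extent items that sort strictly after the given key.
     * A null key means "start from the first page". The input list must
     * already be sorted in ascending order.
     */
    static List<String> pageAfter(List<String> sorted, String after, int extent) {
        List<String> page = new ArrayList<String>();
        for (String item : sorted) {
            if (page.size() == extent) {
                break;
            }
            if (after == null || item.compareTo(after) > 0) {
                page.add(item);
            }
        }
        return page;
    }
}
```

One property worth noting: the key does not have to exist in the data any more. If the last item a user saw is deleted between requests, the next page still starts at the right place, which an offset-based page number cannot guarantee.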
Re: Pagination
Ah, pagination. One of the great programming tradeoffs :-) Have a look at this comment thread from Ohloh a while back.

http://www.ohloh.net/forums/3491/topics/1056

Josh Triplett proposes a good solution that is lightweight for paging non-critical data without server state.

You can guarantee a stable search result for the duration of the browse by caching the entire result set server side and providing a means of moving through it ... that might scale to hundreds of results, but not so much to millions. Still, that's the usual Session idiom.

Here's what I usually do ... send an HTML or XML representation with sufficient information about how to repeat the search and page thru it, but keep no server side state per se. I just make sure the data layer is smart enough about caching result sets to avoid unnecessary work.

Example: say I am exposing a fulltext search over a collection of 10,000 documents and someone searches on "the".

Client hits Resource (stateless, short lived) by POST to /search with something like

  the

Resource submits the search to a query service.
Query service hits fulltext index and gets 9,995 hits. Caches this result, "the"="{set of 9,995 hits}"
Resource consumes first 10 results from query service. Sends something like:

  9995
  1
  10
  The first of many
  ...
  The tenth of many

Let's say client wants the next page of results. It immediately sends back:

  the
  11

Resource asks query service for "the" again.
Query service fetches "the"="{set of 9,995 hits}" out of cache.
Resource consumes results 11-20 from query service ... etc.

This approach is not guaranteed stable. If the result set expires from cache and also changes in between paginated queries you might end up missing some results, ending up at 9,997 results instead of 9,995, etc. When you do soft stuff like Google searches, this happens all the time (wait, my blog was on page 1 when I started ...)

But if you were searching for credit card transactions, the instability would be unacceptable. Here, your search result set must be stored uniquely and guaranteed to iterate forward and backward to the same conclusion until the user goes away from it.

So instead of just repeating the search for "the" I would need to return a unique key for the result set, that the server saved for me:

Client hits Resource (stateless, short lived) by POST to /search with something like

  the

Resource submits the search to a query service.
Query service hits fulltext index and gets 9,995 hits. Creates a new stable ID, "abcdefg123". Caches this result, "abcdefg123"="{set of 9,995 hits}"
Resource consumes first 10 results from query service. Sends something like:

  abcdefg123
  9995
  1
  10
  The first of many
  ...
  The tenth of many

Let's say client wants the next page of results. It immediately sends back:

  the
  abcdefg123
  11

Resource asks query service for "abcdefg123" again.
Query service fetches "abcdefg123"="{set of 9,995 hits}" out of cache.
Resource consumes results 11-20 from query service ... etc.

Here, if a second user queries on "the" they would get their own guaranteed stable result set, unless you can be really smart about knowing which result sets are identical and can share an ID.

If the client waits too long before changing pages, and abcdefg123 goes out of cache, the server can either return some sort of error, or repeat the search and send back the response with some kind of flag to indicate that the result set has changed. (I like this last behavior, along with the above "really smart about knowing which result sets are identical")

I like ehcache a lot for all this. I can trivially implement memory sensitive caches with disk backing to hold server side resources representing result sets and other goodies. This can be done very close to the Representation level to avoid duplicative work, and the general usage doesn't vary much between different kinds of data layers -- relational queries, Web service queries, Lucene queries, XML document searches, etc.

Finally, if you're dealing with "dumb" HTML clients (that work page by page and can't be bothered to keep state like what they queried for in the first place), instead of the XML example above, you can incorporate a form that repeats the search (with the appropriate pagination) directly into your HTML result. This isn't much of a problem with AJAX or desktop clients that can maintain their own state better.

Any of that useful?

- R

On Fri, Jun 5, 2009 at 2:14 PM, Dustin N. Jenkins <dustin.jenk...@nrc-cnrc.gc.ca> wrote:
> [...]
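The stable-ID variant described in this message could be sketched roughly as follows. Everything here is an assumption made for illustration: the class, the Search interface standing in for the query service, the "rs" ID scheme, and the plain HashMap standing in for a real cache such as ehcache (with evict simulating expiry). It implements the "repeat the search and flag that the result set has changed" behavior for an expired ID rather than returning an error.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: result sets are snapshotted under a stable ID.
final class StableResults {
    /** Stand-in for the query service (fulltext index, database, ...). */
    interface Search {
        List<String> run(String query);
    }

    /** One page of hits plus what the client needs to ask for the next page. */
    static final class Page {
        final String resultSetId;
        final int total;
        final List<String> items;
        final boolean regenerated; // true if the requested ID had expired

        Page(String resultSetId, int total, List<String> items, boolean regenerated) {
            this.resultSetId = resultSetId;
            this.total = total;
            this.items = items;
            this.regenerated = regenerated;
        }
    }

    private final Map<String, List<String>> cache = new HashMap<String, List<String>>();
    private final Search search;
    private int counter = 0;

    StableResults(Search search) {
        this.search = search;
    }

    /** start is 1-based, as in the example above; id is null on the first request. */
    Page page(String query, String id, int start, int pageSize) {
        List<String> hits = (id == null) ? null : cache.get(id);
        boolean regenerated = false;
        if (hits == null) {
            regenerated = (id != null); // client held an ID the cache no longer has
            hits = new ArrayList<String>(search.run(query)); // snapshot the hits
            id = "rs" + (counter++);
            cache.put(id, hits);
        }
        int from = Math.min(Math.max(start - 1, 0), hits.size());
        int to = Math.min(from + pageSize, hits.size());
        return new Page(id, hits.size(), new ArrayList<String>(hits.subList(from, to)), regenerated);
    }

    /** Simulates the result set falling out of cache. */
    void evict(String id) {
        cache.remove(id);
    }
}
```

The Resource itself stays stateless: the client sends back the query, the ID, and the start position on every request, and only the cache holds anything between calls. Sharing one ID between users whose result sets are identical is left out of this sketch.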
Pagination
I'm using the 2.0M3 version of Restlet with JDK 1.6 in a Fedora Core 8 environment. My View Layer uses the included FreeMarker.

I'd like to be able to paginate my search results in my Restlet Web Application, as the user can easily return hundreds of results. I've searched around and come across the Value List Holder Design Pattern, but I'm not sure it meets my needs. The fundamental problem I see is caching the entire result set somewhere. The alternative is to hit the database each time, which in my case is plausible since Hibernate has built-in pagination, but then I'm relying on my Persistence Layer to do pagination.

Has anyone successfully done RESTful pagination in Restlet, or Java in general? I know Ruby on Rails has a way of caching it, but I don't know if it's the same as, say, how a Session would do it in J2EE.

Many thanks!
Dustin

--
Dustin N. Jenkins | Tel/Tél: 250.363.3101 | dustin.jenk...@nrc-cnrc.gc.ca
facsimile/télécopieur: (250) 363-0045
National Research Council Canada | 5071 West Saanich Rd, Victoria BC. V9E 2E7
Conseil national de recherches Canada | 5071, ch. West Saanich, Victoria (C.-B) V9E 2E7
Government of Canada | Gouvernement du Canada

-- http://restlet.tigris.org/ds/viewMessage.do?dsForumId=4447&dsMessageId=2359784