I've created a new patch which holds an in-memory object of JSONObjects.
It adds the results to the object as it scans the children.
If there are too many children it responds with the following:
Status: 300
Header Content-Type: application/json
Body ["/path/to/node.4.json"]
In this case, /path/to/node.4.json is the URL which will result in a safe dump of the tree.
Otherwise it outputs the object with a normal 200.
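Roughly, the decision looks like this (a sketch with invented names, not the actual classes in the patch; it assumes the Sling commons JSON classes):

    import java.io.IOException;
    import javax.servlet.http.HttpServletResponse;
    import org.apache.sling.commons.json.JSONArray;
    import org.apache.sling.commons.json.JSONException;
    import org.apache.sling.commons.json.JSONObject;

    // Sketch only: method and parameter names are invented for this mail,
    // they are not the actual code in the patch.
    void sendResult(HttpServletResponse response, JSONObject tree, int childCount,
            int maxChildren, String safeDumpUrl) throws IOException, JSONException {
        if (childCount > maxChildren) {
            // too many children: point the client at a URL that is safe to dump
            response.setStatus(HttpServletResponse.SC_MULTIPLE_CHOICES); // 300
            response.setContentType("application/json");
            response.getWriter().write(new JSONArray().put(safeDumpUrl).toString());
        } else {
            // small enough: serialize the buffered JSONObject with a normal 200
            response.setStatus(HttpServletResponse.SC_OK);
            response.setContentType("application/json");
            tree.write(response.getWriter());
        }
    }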
Pushed the patch to: http://codereview.appspot.com/186167
WDYT?
Simon
On 14 Jan 2010, at 10:02, Dominik Süß wrote:
Sure, this would require buffering the result rather than streaming it directly. But that would always be necessary if the response code is to depend on the determined data. My idea was to collect the data (prepare the response without writing to it) and check the depth in the same run.
For HTTP this would be a 302, since the data may change at any time and should always be accessed via the original URI.
Dominik
On Thu, Jan 14, 2010 at 10:24 AM, Ian Boston <[email protected]> wrote:
I think if you wanted to modify a response which is being streamed, and may have already been committed, then you would need to know in advance. However, if the indication that the results were truncated were in the response itself (i.e. add a continuation URL), then a single pass is probably all that's required.
The only downside of a continuation URL in the response is that it changes the HTTP API.
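To make the in-response continuation concrete, something like this could do it in a single pass (the "truncated"/"continuation" keys and the helper are invented for illustration, not an agreed format):

    import java.io.IOException;
    import java.io.Writer;
    import org.apache.sling.commons.json.JSONException;
    import org.apache.sling.commons.json.JSONObject;

    // Hypothetical in-band truncation marker; the key names are illustrative
    // only and not part of any existing API.
    void writeWithContinuation(Writer out, JSONObject tree, boolean truncated,
            String nodePath, int safeDepth) throws IOException, JSONException {
        if (truncated) {
            tree.put("truncated", true);
            tree.put("continuation", nodePath + "." + safeDepth + ".json");
        }
        tree.write(out);
    }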
Ian
On 14 Jan 2010, at 09:19, Dominik Süß wrote:
Why would this require two runs? The depth data could be collected while iterating through the levels. It would require some adjustments, like prefetching the current level to check whether the result would get too big, but appending that list to the other list should be a cheap operation.
Best regards
Dominik
On Wed, Jan 13, 2010 at 9:54 AM, Felix Meschberger <[email protected]> wrote:
Hi,
First off, I think scanning the (sub-)tree twice (once for checking, once for sending) is not a good idea performance-wise anyway.
That aside, the check part could scan breadth-first, keeping a record of the number of items visited after each level. As soon as the threshold has been reached, the maximum supported level is known. The rendering part can then use that level to actually render the data, which is done in a depth-first manner.
We might be able to combine the two approaches by building an in-memory representation of the JSON data (a JSONObject) and, when the threshold has been reached, just serializing the JSONObject.
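Something along these lines, perhaps (just a sketch with invented names to illustrate the idea, error handling omitted):

    import java.util.LinkedList;
    import java.util.Queue;
    import javax.jcr.Node;
    import javax.jcr.NodeIterator;
    import javax.jcr.RepositoryException;
    import org.apache.sling.commons.json.JSONException;
    import org.apache.sling.commons.json.JSONObject;

    // Build the JSONObject breadth-first and stop adding children once the
    // threshold has been reached; the caller can then serialize whatever was
    // collected. Illustrative only.
    JSONObject buildTree(Node root, int threshold)
            throws RepositoryException, JSONException {
        JSONObject rootJson = new JSONObject();
        Queue<Node> nodes = new LinkedList<Node>();
        Queue<JSONObject> jsons = new LinkedList<JSONObject>();
        nodes.add(root);
        jsons.add(rootJson);
        int visited = 0;
        while (!nodes.isEmpty() && visited < threshold) {
            Node node = nodes.remove();
            JSONObject json = jsons.remove();
            for (NodeIterator it = node.getNodes();
                    it.hasNext() && visited < threshold; ) {
                Node child = it.nextNode();
                JSONObject childJson = new JSONObject();
                json.put(child.getName(), childJson);
                nodes.add(child);
                jsons.add(childJson);
                visited++;
            }
        }
        return rootJson;
    }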
Regards
Felix
On 13.01.2010 00:51, Simon Gaeremynck wrote:
Ok,
Is the following approach better? Consider node.10.json:
Check if the response will contain more than 200 nodes. If not, proceed the way it is now and send the resources along with a 200 response code.
If it does, check if node.0.json results in a set bigger than 200 nodes. If not, check node.1.json, then node.2.json, and so on. Basically, keep increasing the level until the number of resources is bigger than 200. This would give the highest recursion level you can request. The server would then respond with a 300 and (I think?) a 'Location' header with the highest level.
The thing of course is that you would have to loop over all those nodes again and again. Jackrabbit will have caches for those nodes, but I'm not really sure what the impact on performance would be.
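For illustration, the probing could look roughly like this (names invented, not actual patch code); it also makes the repeated counting obvious:

    import javax.jcr.Node;
    import javax.jcr.NodeIterator;
    import javax.jcr.RepositoryException;

    // Probe increasing levels until the count exceeds the limit; the last level
    // that stayed under the limit is the highest one a client may request.
    // Returns -1 if even level 0 is already too big. Assumes the caller already
    // knows the full requested dump exceeds the limit, so the loop terminates.
    int highestSafeLevel(Node root, int maxResources) throws RepositoryException {
        int level = 0;
        while (countNodes(root, level) <= maxResources) {
            level++;
        }
        return level - 1;
    }

    // Counts the node itself plus everything down to the given depth. Each
    // probe walks the same nodes again, which is the repeated work mentioned
    // above.
    int countNodes(Node node, int depth) throws RepositoryException {
        int count = 1;
        if (depth > 0) {
            for (NodeIterator it = node.getNodes(); it.hasNext(); ) {
                count += countNodes(it.nextNode(), depth - 1);
            }
        }
        return count;
    }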
Simon
On 12 Jan 2010, at 00:53, Roy T. Fielding wrote:
On Jan 11, 2010, at 10:01 AM, Simon Gaeremynck wrote:
Yes, I guess that could work.
But then you can still do node.1000000.json, which results in the same thing.
I took the liberty of writing a patch which checks how many resources will be in the result.
If the result is bigger than a pre-defined OSGi property (e.g. 200 resources), it will send a 206 Partial Content with a dump of 200 resources and ignore the rest.
It can be found at http://codereview.appspot.com/186072
Simon
Sorry, that would violate HTTP. Consider what impact it has on caching by intermediaries.
Generally speaking, treating HTTP as if it were a database is a bad design. If the server has a limit on responses, then it should only provide identifiers that remain under that limit and forbid any identifiers that would imply a larger limit.
An easy way to avoid this is to respond with 300 and an index of available resources whenever the resource being requested would be too big. The client can then retrieve the individual (smaller) resources from that index.
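A minimal sketch of what such a response could look like for a node listing (the index format and names here are invented for illustration, not a proposal for the exact API):

    import java.io.IOException;
    import javax.jcr.Node;
    import javax.jcr.NodeIterator;
    import javax.jcr.RepositoryException;
    import javax.servlet.http.HttpServletResponse;
    import org.apache.sling.commons.json.JSONArray;

    // Respond with 300 and an index of the smaller resources the client can
    // fetch individually. Illustrative only.
    void sendIndex(HttpServletResponse response, Node node)
            throws IOException, RepositoryException {
        JSONArray index = new JSONArray();
        for (NodeIterator it = node.getNodes(); it.hasNext(); ) {
            index.put(it.nextNode().getPath() + ".json");
        }
        response.setStatus(HttpServletResponse.SC_MULTIPLE_CHOICES); // 300
        response.setContentType("application/json");
        response.getWriter().write(index.toString());
    }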
....Roy