On Friday, August 9, 2013 12:00 PM, Bill Moseley <mose...@hank.org> wrote:


>
>
>On Thu, Aug 8, 2013 at 9:27 PM, John Napiorkowski <jjn1...@yahoo.com> wrote:
>
>
>>
>>https://github.com/perl-catalyst/CatalystX-Proposals-REStandContentNegotiation
> 
>
>
>I currently extend HTTP::Body in a similar way to what you describe.  But I
>have them as separate classes, so I just do:
>
>
>use My::MultiPart;
>
>then in that I hack in my type for HTTP::Body:
>
>
>
>package My::MultiPart;
>use HTTP::Body;
>$HTTP::Body::TYPES->{'multipart/form-data'} = 'My::MultiPart';

Yes, I do this as well.  It feels a bit icky, but it is a possible approach
for us too.

>As you propose, mapping a mime type to a sub seems pretty flexible.  I assume 
>the sub could return a filehandle.   File uploads still need to stream the 
>uploads to disk while making the parameters available as HTTP::Body does now.
>

Yes, my original thinking is that the subref would return something that would
populate body_data.  I suppose that could be an object or a file handle.  This
is one good reason to distinguish body_data from body_parameters (which is
expected to be a hash ref).
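To make that concrete, here's a minimal sketch of the lazy body_data idea (the
registry and the request fields are provisional illustrations, not settled
API; JSON::PP is used just because it ships with core Perl):

```perl
use strict;
use warnings;
use JSON::PP ();

# Hypothetical registry: content type => subref.  The subref sees the
# buffered body in $_ and returns whatever body_data should be
# (hashref, arrayref, object, file handle...).
my %data_parsers = (
    'application/json' => sub { JSON::PP->new->decode($_) },
);

# Lazy body_data: nothing is parsed until the first time it's asked for,
# and the result is cached on the request afterward.
sub body_data {
    my ($req) = @_;
    $req->{_body_data} //= do {
        my $parser = $data_parsers{ $req->{content_type} }
            or die "No body_data parser for $req->{content_type}\n";
        local $_ = $req->{raw_body};
        $parser->();
    };
}
```

So a request carrying `{"page":2}` as application/json would hand back the
decoded hashref on first access and the cached copy after that.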

I'm not really sure what the best thing to do here is with multipart, and
particularly with the canonical file upload type 'multipart/form-data'.  My
thinking is that if someone wants to roll their own, they can, and everything
goes to body_data; otherwise one can simply use the existing built-in support
for this, which really isn't broken.

Now, I am not sure what to do when the multipart contains JSON, for example
(as in a file upload, but instead of a binary file plus urlencoded params,
the params are application/json).  I'm very tempted to say this is out of
scope for the first version.  Really, this does point to some underlying
inflexibility in the existing design.

>
>I like the regex mimetype keys, but should they be an array instead of a hash
>so one can set the order in which they are checked?
>

I guess I was thinking to prevent people from accidentally listing
'application/json' more than once.  And there are some issues around the best
way to merge (for example, you might parse differently in dev than in
production...).

What's the use case you have in mind?  Something like first checking for
'application/vnd.mycompany.user+json' and then falling back to
'application/(?:vnd\..+\+)?json' if you don't find it?  Is that an actual case
you've come across?
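If ordering did matter, the obvious shape is an array of pattern/handler pairs
checked first-match-wins (a sketch only; the media types are just the
hypothetical ones above, and parser_for is an invented name):

```perl
use strict;
use warnings;

# Ordered matchers: most specific pattern first, generic fallback last.
my @type_matchers = (
    [ qr{^application/vnd\.mycompany\.user\+json$} => sub { 'user parser'    } ],
    [ qr{^application/(?:vnd\.[^+]+\+)?json$}      => sub { 'generic parser' } ],
);

# First pattern to match the Content-Type wins.
sub parser_for {
    my ($content_type) = @_;
    for my $pair (@type_matchers) {
        my ($re, $parser) = @$pair;
        return $parser if $content_type =~ $re;
    }
    return undef;
}
```

A hash can't express that precedence, since its keys come back in effectively
random order.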

I guess I was also thinking that we want to keep the global functionality a
bit limited, so that in the next iteration we can support something more
fine-grained, at the controller level perhaps.  I'd hate to introduce such
terrible globalness to Catalyst, which in general has been decent at avoiding
that.

>
>I think we must consider large inputs/streams.    You say $_ is the request 
>content.  Is that the full request with headers?   Is the request already 
>de-chunked, for example?  Or am I thinking too low level?

We've spoken before about parsing large incoming and chunked request data.  I
would love to address this, but right now it seems like something we need
better agreement on at the PSGI level.  For example, since Starman already
buffers incoming input, it feels silly to me to have Catalyst then try to
re-stream it.  You've already paid the full price of buffering in terms of
memory and performance, right?  Or am I not understanding?

I'd really like to have something at the Catalyst level that sanely achieves
this end, but I think part of the price we paid when going to PSGI at the core
is that most of the popular Plack handlers preload and buffer input, even
large request input.  This seems to be an area where it might behoove us to
work with the PSGI group to find something stable.  Even the optional psgix.io
isn't always going to work out, since some people don't want to support it in
the handler (it's a somewhat vague definition, I guess, and makes people
uncomfortable).

Until then, or until someone helps me understand that my thinking is totally
wrong on this score, it seems the best thing to do is to put this out of scope
for now.  That way we can move on to supporting a goodly number of real use
cases.

I intended to say that $_ equals a string that is the buffered request body.
That way we can reserve the other args for handling the future streaming case.
I was actually pondering something where the subref returns another subref
that gets called over and over to do the parse.
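Very roughly, the shape I'm pondering for that streaming case (a pure sketch;
nothing like this exists yet, and JSON::PP here just stands in for a real
incremental parser):

```perl
use strict;
use warnings;
use JSON::PP ();

# Hypothetical: the registered subref returns a closure that is fed one
# chunk per call, then called once with undef to finalize and produce
# the parsed result.
my $make_json_parser = sub {
    my $buffer = '';
    return sub {
        my ($chunk) = @_;
        if (defined $chunk) {
            $buffer .= $chunk;    # a real parser would parse incrementally
            return;               # not finished yet
        }
        return JSON::PP->new->decode($buffer);
    };
};

# Feeding it chunk by chunk would look like:
#   my $step = $make_json_parser->();
#   $step->($_) for @chunks;
#   my $data = $step->(undef);
```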

>
>In some apps I'm using Catalyst::Action::REST and in others I have some
>custom code where I use HTTP::Body to parse JSON.  I have mixed feelings
>about having the request data end up in $c->req->parameters vs $c->req->data.
>I don't really see why form-data/urlencoded should be treated differently
>than other encodings like JSON.
>

I think my idea with using a new request attribute 'body_data' was that we'd
cleanly separate it from classic-style params.  Also, there is no certainty or
expectation that body_data contains a hash ref; it could be an object or an
arrayref (think of someone uploading a CSV file, for example), so there's no
easy way to map this to body_parameters.  I know the Rails people tried to do
this, and it did lead to some security problems, if I recall correctly.
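A toy example of why body_data can't be forced into the hashref shape of
body_parameters: a text/csv body most naturally inflates to an arrayref of
rows (a real parser would use Text::CSV; this just splits on commas):

```perl
use strict;
use warnings;

# A text/csv parser in the proposed style: buffered body in $_,
# arrayref of row arrayrefs out.  No sane hashref mapping exists.
my $csv_parser = sub {
    [ map { [ split /,/ ] } split /\n/, $_ ];
};

local $_ = "id,name\n1,alice\n2,bob";
my $rows = $csv_parser->();
# $rows->[0] is the header row ['id','name']; $rows->[2] is ['2','bob']
```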

For a long time I did think it would be ideal for them to be one and the same,
but my thinking 'evolved' based on the issues described.  For what it is
worth, 'body_data' was suggested by mst, if that carries any weight.


>
>I'm not quite sure about $c->res->body( \%data );  I think body should be the
>raw body.  What does $c->res->body return?  The serialized JSON?  The
>original hashref?
>
>

I'm not sure I like it either.  I would say body returns whatever you set it
to, until the point where encoding happens.  It does feel a bit flaky, but I
can't actually put my finger on a real code smell here.

Any other suggestions?  This is certainly a part of the proposal that is going
to raise doubts, but I can't think of anything better, nor can I assemble
problematic use cases in my head over it either.

One thought I had here was to just overload response->format to allow for
passing a hashref or arrayref.  Then ->body would get populated immediately
based on that (and you'd get a decent place to catch exceptions as well).
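That "encode at set-time" variant might look roughly like this (set_body is a
stand-in name, and JSON::PP stands in for whatever serializer the format
selects):

```perl
use strict;
use warnings;
use JSON::PP ();

# Sketch: if the body is handed a reference, serialize it immediately,
# so any serialization error is thrown right here, where the caller can
# still catch it; plain strings pass through untouched.
sub set_body {
    my ($res, $value) = @_;
    if (ref $value) {
        # canonical() fixes key order so output is deterministic
        $value = JSON::PP->new->canonical->encode($value);
    }
    $res->{body} = $value;
}
```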

Any other thoughts?  I think the idea here was to start with the simplest
thing I can think of and make it more complex as needed.

>
>
>If a parser dies, what kind of exception is thrown?  You say they would not
>set any response status, but wouldn't we want to catch the error and then set
>a 400?  (I use exception objects that carry an HTTP status, a message to
>return in the body, and a message used for logging at a given level.)
>

How people do exceptions in Perl tends to be nearly religious, and I didn't
want to hold this up on figuring that stuff out :)  I was thinking to just
raise an exception and let the existing Catalyst machinery do its thing.  I'm
just thinking not to add anything special for this type of error, but to keep
the existing behavior, for better or worse.
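For contrast, the exception-object style Bill describes might look like this
(a purely illustrative class; nothing in the proposal requires it):

```perl
package My::HTTPError;    # hypothetical exception class
use strict;
use warnings;

sub new {
    my ($class, %args) = @_;
    return bless {
        status  => $args{status}  // 500,
        message => $args{message} // 'Internal error',
    }, $class;
}
sub throw   { my $class = shift; die $class->new(@_) }
sub status  { $_[0]{status} }
sub message { $_[0]{message} }

package main;
# A parser would die with one of these; the dispatcher could then
# inspect $@->status instead of pattern-matching a bare string.
```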

>
>I'm not sure what to think of the Provides and Consumes controller attributes. 
>  It's one thing to realize early that the client cannot handle any response 
>I'm willing to provide (json, yaml, whatever), but I'm worried this would blur 
>separation of concerns.  That is, by the time the controller runs we would 
>have already decoded the request and want to just work with the data provided. 
>  And would we want different data returned if the client is asking for json 
>vs yaml?
>
>
>Maybe I'm missing the point of that.
>

I think the main thing is to do the match early, rather than perform the
entire action or action chain and then find out that in the end we don't have
an agreed-upon data exchange format.

Since request->body_data is intended to be lazy, we won't run the parse code
until you ask for the data.  We don't need to parse the body to do the basic
match here; it is based on the HTTP metadata, not the actual content.  I think
for common cases this is fine (I realize that, yet again, this might not be
the best approach for multipart uploads...)

I think we do want to take a baby step toward dispatch based not just on path
args and HTTP method, but one that allows real content negotiation via the
action's metadata.  That way, if you want to return content differently based
on whether the request accepts JSON versus some other format, we can take
advantage of Catalyst's built-in dispatching for that.
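The early-match step itself is cheap, since it only compares header metadata
against the action's declared list; the body is never touched.  Something like
(a hypothetical helper, not the proposed API):

```perl
use strict;
use warnings;

# Early Consumes-style match: compare the request Content-Type against
# an action's declared list.  An empty declaration matches anything.
sub action_matches {
    my ($consumes, $content_type) = @_;
    return 1 unless @$consumes;
    return (grep { $_ eq $content_type } @$consumes) ? 1 : 0;
}
```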

For example, you might format the data differently if the request is Atom
versus JSON (or you might set your HTTP headers differently to make up for how
some types of encoding lack information; for example, it's common in REST
circles to use the HTTP Link header with certain formats).

I do think there is a strong use case for saying that the data returned can
vary based on what the request accepts.

>
>
>
>BTW - in practice I find it pretty handy to be able to specify/override the
>response encoding via a query param.

Yup, I know that is common, but I think I'd prefer to handle that as Plack
middleware rather than baking it into Catalyst.  I actually baked in all those
HTTP header overrides for the HTTP method when I did the support for dispatch
based on Method, and I regret it (it will need to be removed and refactored
eventually)
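As a sketch of that middleware approach (all names invented here), a plain
PSGI wrapper can rewrite the Accept header from a query param before Catalyst
ever sees the request:

```perl
use strict;
use warnings;

# Hypothetical PSGI wrapper: let ?format=json override the Accept
# header before the wrapped app runs.
my %format_types = (
    json => 'application/json',
    xml  => 'application/xml',
);

sub wrap_with_format_param {
    my ($app) = @_;
    return sub {
        my ($env) = @_;
        if (($env->{QUERY_STRING} // '') =~ /(?:^|&)format=([a-z]+)/) {
            my $type = $format_types{$1};
            $env->{HTTP_ACCEPT} = $type if $type;
        }
        return $app->($env);
    };
}
```

The app underneath then does pure Accept-based negotiation and never knows the
override exists.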

>
>-- 
>Bill Moseley
>mose...@hank.org

_______________________________________________
List: Catalyst@lists.scsys.co.uk
Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/catalyst@lists.scsys.co.uk/
Dev site: http://dev.catalyst.perl.org/
