[jira] [Issue Comment Edited] (STANBOL-481) Multi ContentPart RESTful API extensions

Rupert Westenthaler (Issue Comment Edited) (JIRA) Tue, 14 Feb 2012 06:03:28 -0800

    [ 
https://issues.apache.org/jira/browse/STANBOL-481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203610#comment-13203610
 ]


Rupert Westenthaler edited comment on STANBOL-481 at 2/14/12 2:01 PM:
----------------------------------------------------------------------

This is a suggested change to the original proposal for the RESTful API
extensions that support ContentParts. Feedback is welcome.

## MultiPart ContentItem RESTful API

Users that pass parameters that cause Stanbol to send multiple ContentParts
MUST either

 * NOT use an Accept header: in this case the result will default to
"multipart/form-data", or
 * set the Accept header to "multipart/form-data".

Users that want to send multiple ContentItem parts (such as metadata and/or
alternate content versions) MUST set the Content-Type header to
"multipart/form-data" and follow the specification as defined in the above
comments.
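
The Accept header rule for multi-part responses can be sketched as follows. This is only an illustration in Python, not Stanbol code: the function name `response_media_type` is hypothetical, and rejecting every other Accept value (including wildcards such as `*/*`) is an assumption the proposal does not spell out.

```python
from typing import Optional

MULTIPART = "multipart/form-data"


def response_media_type(accept: Optional[str]) -> str:
    """Return the media type used for a response with multiple ContentParts.

    Hypothetical helper: a missing Accept header defaults to
    multipart/form-data, an explicit multipart/form-data is accepted,
    and anything else is rejected (assumption of this sketch).
    """
    if accept is None or accept.strip() == "":
        return MULTIPART
    # An Accept header may list several types with parameters,
    # e.g. "multipart/form-data; q=0.8, text/plain".
    offered = [item.split(";")[0].strip().lower() for item in accept.split(",")]
    if MULTIPART in offered:
        return MULTIPART
    raise ValueError(
        "multi-part responses require no Accept header or " + MULTIPART
    )
```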

### Removal of "inputWithMetadata"

In my opinion this parameter is not needed.

Rationale: If the sent content is "multipart/form-data", content negotiation
can be used to automatically detect that a ContentItem is uploaded. If the
first part then has the name "metadata", this means that metadata is provided.
If the first part is "content", then only content (but possibly in different
alternate versions) is provided.
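
A minimal sketch of this detection logic, assuming the request has already been split into named parts. The function `classify_upload` and its return labels are made up for illustration and are not part of Stanbol.

```python
from typing import List


def classify_upload(content_type: str, part_names: List[str]) -> str:
    """Classify an upload from its Content-Type and the names of its parts.

    Hypothetical helper; the returned labels are only for illustration.
    """
    media_type = content_type.split(";")[0].strip().lower()
    if media_type != "multipart/form-data":
        # Anything else is treated as plain content to enhance.
        return "plain-content"
    if not part_names:
        raise ValueError("multipart request without any parts")
    first = part_names[0]
    if first == "metadata":
        return "content-item-with-metadata"
    if first == "content":
        return "content-item-content-only"
    raise ValueError("unexpected first part: " + first)
```

With such a check the server needs no extra query parameter: the Content-Type header and the name of the first part carry the same information.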

### Output ContentParts to Responses

__Original Proposal:__ Optional outputWithContentParts[=<section-ordinal>] ->
the result is multipart (instead of rdf) containing rdf as the first section
and the parts in the second section. If there is more than one part, this
second section is itself multipart. This argument might be repeated to have
different sections.

This proposal suggests the usage of the index (an integer number) to select
ContentParts to be included. However, for users it will be very hard to know
such indexes because they depend on the order in which components add
ContentParts to the ContentItem.

As an example: As a result of STANBOL-431 the EnhancementJobManager now adds
an MGraph with the ExecutionMetadata to the ContentItem. Because the
EnhancementJobManager does this before the first Engine is called, this
ContentPart will always be the second contentPart (index '1') of a
ContentItem. An alternate version (e.g. a "text/plain" version created by the
MetaxaEngine) will most likely be found at index '2'.
As soon as this issue is implemented, users will be able to directly send
alternate versions of the content in the request. This will again change the
ordering, because such parts will be added before the ExecutionMetadata.
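
The instability of the indexes can be shown with a toy model of the part ordering described above. The list of strings only stands in for a ContentItem; the labels and the function `build_parts` are invented for this sketch.

```python
from typing import List


def build_parts(uploaded_alternates: List[str]) -> List[str]:
    """Model the order in which ContentParts end up in a ContentItem."""
    parts = ["original content"]
    # Alternate versions sent with the request are added first ...
    parts.extend(uploaded_alternates)
    # ... then the EnhancementJobManager adds the ExecutionMetadata
    # before the first engine is called ...
    parts.append("ExecutionMetadata")
    # ... and engines append the parts they create afterwards.
    parts.append("engine-created part")
    return parts


without_upload = build_parts([])
with_upload = build_parts(["uploaded text/plain version"])
# The same ContentPart is found at a different index in each case.
index_without = without_upload.index("ExecutionMetadata")
index_with = with_upload.index("ExecutionMetadata")
```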

In addition, contentPart is a rather abstract concept. I would argue that most
users would be more interested in specifying which alternate versions of the
original content they would like to be included in the response. This calls
for an interface that allows specifying a list of MediaTypes that should be
included, and maybe an additional switch to exclude the original (sent)
content.

Based on [1] there is also a second type of contentPart that is typically
identified by well-known URIs. Currently this is used for the
ExecutionMetadata, but it could also be used to include the original
responses of remote services such as Zemanta, OpenCalais ... For this kind of
contentPart users would need the possibility to include contentParts
based on the URI.

Based on that I suggest the following API:

* __outputContentType=[mediaType]:__ This should include all Blobs with a
mediaType that is compatible with the passed value (e.g. '*' ... all, 'text/*'
... all text versions, 'text/plain' ... only the plain text version of the
sent content). May be used multiple times to pass several values.
* __omitParsed=[true/false]:__ Exclude all versions sent in the request from
the response. Default is 'false'.
* __outputContentPart=[uri/'*']:__ This will include the ContentPart with that
URI. The value of '*' indicates that all contentParts (other than Blobs) should
be included. May be used multiple times to pass several URIs.
* __rdfFormat=[rdfMimeType]:__ This allows requests that result in
multipart/form-data encoded responses to specify the RDF serialization
format to use. Supported formats and defaults are the same as for normal
Enhancer requests.
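
One possible reading of "compatible" for outputContentType is simple wildcard matching on type and subtype, sketched below. The helpers `is_compatible` and `select_blobs` are hypothetical, and treating '*' the same as '*/*' is an assumption of this sketch.

```python
from typing import Iterable, List


def is_compatible(requested: str, actual: str) -> bool:
    """Check whether a Blob media type matches a requested pattern."""
    requested = requested.strip().lower()
    # Ignore parameters such as "; charset=UTF-8" on the Blob's type.
    actual = actual.split(";")[0].strip().lower()
    if requested in ("*", "*/*"):
        return True
    req_type, _, req_sub = requested.partition("/")
    act_type, _, act_sub = actual.partition("/")
    if req_type != act_type:
        return False
    return req_sub in ("*", act_sub)


def select_blobs(requested: Iterable[str], blob_types: Iterable[str]) -> List[str]:
    """Return the Blob media types matched by any requested pattern."""
    patterns = list(requested)
    return [t for t in blob_types if any(is_compatible(p, t) for p in patterns)]
```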

NOTE: Even if only a single Blob is serialized in the response, the Multipart
MIME part "content" will still be a "mime/alternate", but then with only a
single entry.
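
To illustrate how a client might combine these parameters, here is a sketch that only builds the request URL. The base URL and the content part URI in the usage lines are placeholders, and the function `enhancer_url` is hypothetical. Only the parameter names come from the proposal.

```python
from typing import Optional, Sequence
from urllib.parse import parse_qs, urlencode, urlsplit


def enhancer_url(
    base: str,
    output_content_types: Sequence[str] = (),
    output_content_parts: Sequence[str] = (),
    omit_parsed: bool = False,
    omit_metadata: bool = False,
    rdf_format: Optional[str] = None,
) -> str:
    """Build a request URL using the proposed query parameters."""
    params = []
    # Repeatable parameters appear once per value.
    params += [("outputContentType", t) for t in output_content_types]
    params += [("outputContentPart", u) for u in output_content_parts]
    if omit_parsed:
        params.append(("omitParsed", "true"))
    if omit_metadata:
        params.append(("omitMetadata", "true"))
    if rdf_format is not None:
        params.append(("rdfFormat", rdf_format))
    query = urlencode(params)
    return base + "?" + query if query else base


# Placeholder endpoint and URI, chosen only for this example.
url = enhancer_url(
    "http://localhost:8080/enhancer",
    output_content_types=["text/plain", "text/html"],
    output_content_parts=["urn:example:execution-metadata"],
    omit_parsed=True,
)
decoded = parse_qs(urlsplit(url).query)
```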


### Omit Metadata in the Response

No changes to the original proposal

* __omitMetadata=[true/false]:__ If enabled, no metadata is included in the
result. This only makes sense together with the outputContentParts argument;
the result will correspond to the second section of the multipart returned
without this argument.

### Returning a single ContentPart

Requests that use an __"Accept"__ header AND __omitMetadata=true__ are 
interpreted like

* outputContent={accept-header-value}

However, instead of using "multipart/form-data", the content parts selected by
this request are serialized directly to the response.
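
A small sketch of this decision, under the assumption that an Accept header of "multipart/form-data" still selects the multipart response (the proposal does not address that combination). The function `response_mode` and its return values are invented for illustration.

```python
from typing import Optional, Tuple


def response_mode(accept: Optional[str], omit_metadata: bool) -> Tuple[str, Optional[str]]:
    """Decide how the response is serialized.

    Returns ("direct", media_type) when a single content version is written
    straight to the response body, otherwise ("multipart", None).
    """
    if accept is not None:
        media_type = accept.split(";")[0].strip().lower()
        # Assumption: asking for multipart/form-data keeps the multipart form.
        if omit_metadata and media_type and media_type != "multipart/form-data":
            return ("direct", media_type)
    return ("multipart", None)
```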


[1] 
http://stanbol.staging.apache.org/stanbol/docs/trunk/enhancer/contentitem.html#contentparts

                
                  
> Multi ContentPart RESTful API extensions
> ----------------------------------------
>
>                 Key: STANBOL-481
>                 URL: https://issues.apache.org/jira/browse/STANBOL-481
>             Project: Stanbol
>          Issue Type: Sub-task
>          Components: Enhancer
>            Reporter: Rupert Westenthaler
>            Assignee: Rupert Westenthaler
>         Attachments: contentItemAsMultipartMime.txt
>
>
> Sub-task about the implementation of the RESTful API extensions related to 
> multipart content items
> Copied from the main Issue:
> - query params:
> Optional inputWithMetadata -> expects multipart/mime with 2 sections of which 
> the first is rdf
> Optional outputWithContentParts[=<section-ordinal>] -> the result is 
> multipart (instead of rdf) containing rdf as the first section and the parts 
> in the second section, if there is more than one part this second section is 
> itself multipart, this argument might be repeated to have different sections
> Optional omitMetadata -> no metadata in the result, makes only sense with 
> outputContentParts argument, the result will correspond to the second section 
> of the multipart returned without this argument 

