Re: Taxonomy linking

srecko joksimovic Mon, 09 Jan 2012 23:29:22 -0800

Thank you very much!

That was the information I needed.


Best

On Tue, Jan 10, 2012 at 8:11 AM, Rupert Westenthaler <
[email protected]> wrote:

>
> On 09.01.2012, at 23:31, Srecko Joksimovic wrote:
>
> > Hello Rupert,
> >
> > Could you please give me an example of annotating various types of
> > documents?
> > As I understood from
> > http://incubator.apache.org/stanbol/docs/trunk/enhancer/engines/metaxaengine.html
> >
> > And
> >
> > curl -i -X POST -H "Content-Type:text/html" -T testpage.html http://localhost:8080/engines
> >
> > The MIME type should match the document type. But (maybe this is going to
> > be a stupid question)… when I annotated text, I called the method like this:
> > IOUtils.write(_string_to_annotate, out);
> > IOUtils.closeQuietly(out);
> >
> > For a document of any type, I should probably convert the document content
> > to a byte array, and then call a similar method?
> > I’m asking this because I didn’t see a way to provide a document URL and
> > get results. I suppose that this would be the only way?
> >
>
> Generally the MIME type of the content MUST match the value passed in the
> Content-Type header. Maybe the Metaxa engine also has some way to detect
> the MIME type based on the content, but I do not know if this is the case.
>
> It is also true that for binary documents you need to use the byte-oriented
> methods of IOUtils. However, I would also consider "streaming" the data
> directly from the file to the OutputStream of the POST request to avoid
> loading the whole content into memory.
>
> Note that for textual content you should also set the charset correctly.
> If you use a charset other than "UTF-8" you need to set it as a parameter
> of the "Content-Type" header, such as
>
>    Content-Type: text/plain; charset=UTF-16
>
> best
> Rupert
>
>
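
The streaming and charset advice quoted above could be sketched roughly as follows. This is only an illustration using plain JDK classes, not Stanbol client code: the class name EnhancerPost and its three helper methods are hypothetical, and it assumes the server accepts a POST with a fixed Content-Length.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class; the names here are illustrative only.
class EnhancerPost {

    /** Builds the Content-Type value; pass a null charset for binary content. */
    static String contentType(String mimeType, Charset charset) {
        return charset == null ? mimeType : mimeType + "; charset=" + charset.name();
    }

    /** Copies in to out through a fixed 8 KiB buffer and returns the byte count. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    /** Streams the file to the endpoint and returns the HTTP status code. */
    static int post(URL endpoint, Path file, String mimeType, Charset charset)
            throws IOException {
        HttpURLConnection con = (HttpURLConnection) endpoint.openConnection();
        con.setRequestMethod("POST");
        con.setDoOutput(true);
        // Declaring the length up front stops HttpURLConnection from
        // buffering the whole request body in memory.
        con.setFixedLengthStreamingMode(Files.size(file));
        con.setRequestProperty("Content-Type", contentType(mimeType, charset));
        try (InputStream in = Files.newInputStream(file);
             OutputStream out = con.getOutputStream()) {
            copy(in, out);
        }
        return con.getResponseCode();
    }
}
```

With the endpoint from the curl example in this thread, a call might look like EnhancerPost.post(new URL("http://localhost:8080/engines"), Paths.get("testpage.html"), "text/html", StandardCharsets.UTF_8). For binary documents the charset argument would be null, so no charset parameter is added to the header.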
