Thank you very much! That was the information I needed.
Best

On Tue, Jan 10, 2012 at 8:11 AM, Rupert Westenthaler <[email protected]> wrote:

> On 09.01.2012, at 23:31, Srecko Joksimovic wrote:
>
> > Hello Rupert,
> >
> > Could you please give me an example of annotating various types of
> > documents?
> >
> > As I understood from
> > http://incubator.apache.org/stanbol/docs/trunk/enhancer/engines/metaxaengine.html
> >
> > and
> >
> > curl -i -X POST -H "Content-Type:text/html" -T testpage.html http://localhost:8080/engines
> >
> > the MIME type should match the document type. But (maybe this is going
> > to be a stupid question)… when I annotated text, I called the method
> > like this:
> >
> > IOUtils.write(_string_to_annotate, out);
> > IOUtils.closeQuietly(out);
> >
> > For a document of any type, I should probably convert the document
> > content to a byte array and then call a similar method?
> > I'm asking this because I didn't see a way to provide a document URL
> > and get results. I suppose that this would be the only way?
>
> Generally the MIME type of the content MUST match the value passed in
> the Content-Type header. Maybe the Metaxa engine also has some way to
> detect the MIME type based on the content, but I do not know if this is
> the case.
>
> It is also true that for binary documents you need to use the
> byte-oriented methods of IOUtils. However, I would also consider
> "streaming" the data directly from the file to the OutputStream of the
> POST request, to avoid loading the whole content into memory.
>
> Note that for textual content you should also set the charset
> correctly. If you use a charset other than "UTF-8", you need to set it
> as a parameter of the "Content-Type" header you pass, such as
>
> Content-Type: text/plain; charset=UTF-16
>
> best
> Rupert
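For anyone finding this thread later, here is a minimal sketch of the streaming approach Rupert describes, using only the JDK rather than IOUtils. The class and method names (EnhancerUpload, contentType, stream, post) are made up for illustration and are not part of Stanbol or Commons IO. It assumes the endpoint accepts a plain POST with the document as the request body, as in the curl call above.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class; the names here are illustrative only.
class EnhancerUpload {

    /** Builds a Content-Type value; the charset parameter is added only when one is given. */
    static String contentType(String mimeType, Charset charset) {
        return charset == null ? mimeType : mimeType + "; charset=" + charset.name();
    }

    /** Copies the file to the stream without loading it fully into memory; returns the bytes written. */
    static long stream(Path file, OutputStream out) throws IOException {
        return Files.copy(file, out);
    }

    /** POSTs the file as the request body and returns the HTTP status code. */
    static int post(URL endpoint, Path file, String contentType) throws IOException {
        HttpURLConnection con = (HttpURLConnection) endpoint.openConnection();
        con.setRequestMethod("POST");
        con.setDoOutput(true);
        con.setRequestProperty("Content-Type", contentType);
        // Chunked mode keeps the connection from buffering the whole body first.
        con.setChunkedStreamingMode(0);
        try (OutputStream out = con.getOutputStream()) {
            stream(file, out);
        }
        return con.getResponseCode();
    }
}
```

With the values from the curl example, a call would look like post(new URL("http://localhost:8080/engines"), Paths.get("testpage.html"), contentType("text/html", null)). For UTF-16 text, contentType("text/plain", StandardCharsets.UTF_16) produces the header value shown in Rupert's reply.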
