Hello,

I would like to introduce one more contribution for Apache Stanbol.

It is not an engine but an HTTP API for Stanbol that pre-processes and
submits analysis tasks, then returns the result synchronously to the
consumer. It aims to simplify the development of integrations and to
provide a powerful pre-processing API for the analysis of URLs.

It uses the *Readability* library to support URL submissions by:
 - loading contents from remote URLs and
 - stripping away all the surrounding noise.

Readability is the same library that powers the *Reader* feature of Safari,
which many users already know.
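
To give a rough idea of what "cleaning up the noise" means, here is a minimal
sketch in Python. It is a crude stand-in, not Readability's actual algorithm
(which scores content blocks rather than simply dropping tags). The tag list
and the names `NOISE_TAGS`, `MainTextExtractor` and `strip_noise` are
assumptions made for illustration only.

```python
from html.parser import HTMLParser

# Tags whose contents are treated as noise in this simplified sketch.
# Assumption for illustration: Readability uses far richer heuristics.
NOISE_TAGS = {"script", "style", "nav", "header", "footer", "aside"}


class MainTextExtractor(HTMLParser):
    """Collects text that is not nested inside any noise tag."""

    def __init__(self):
        super().__init__()
        self._noise_depth = 0  # how many noise tags we are currently inside
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in NOISE_TAGS:
            self._noise_depth += 1

    def handle_endtag(self, tag):
        if tag in NOISE_TAGS and self._noise_depth > 0:
            self._noise_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        # Keep only non-empty text found outside every noise tag.
        if text and self._noise_depth == 0:
            self._chunks.append(text)

    def text(self):
        return " ".join(self._chunks)


def strip_noise(html):
    """Return the visible text of `html` with noise tags removed."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return parser.text()
```

Calling `strip_noise()` on a page keeps paragraph text and drops whatever sits
inside menus, scripts and footers.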

To summarize:

   - extremely simple APIs to ease prototyping, integration and usage
   - support for textual contents
   - support for URLs
   - *for URLs, preprocessing of HTML pages to capture the actual page
   content while skipping noise such as ads, menus and so forth*
   - synchronous access (for asynchronous access see idntik.it)
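
As an illustration of the synchronous usage summarized above, a client call
might look like the Python sketch below. Everything specific in it is
hypothetical: the endpoint URL, the JSON payload with `text` or `url` keys,
and the JSON response are assumptions, not the facade's documented interface.
Please check the repository for the real request format.

```python
import json
import urllib.request


def build_analysis_request(endpoint, text=None, url=None):
    """Build a POST request carrying either plain text or a URL to analyze.

    The payload shape ({"text": ...} or {"url": ...}) is a hypothetical
    example, not the facade's confirmed API.
    """
    if (text is None) == (url is None):
        raise ValueError("provide exactly one of text or url")
    payload = {"text": text} if text is not None else {"url": url}
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


def analyze(endpoint, text=None, url=None, timeout=30):
    """Send the request and block until the analysis result comes back."""
    request = build_analysis_request(endpoint, text=text, url=url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Assumes the (hypothetical) service answers with a JSON body.
        return json.loads(response.read().decode("utf-8"))
```

With a running instance, `analyze("http://localhost:8080/facade", url="http://example.org/article")`
would block until the result is returned; the host, port and path here are
placeholders.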

You can find more information and the source code here:
https://github.com/insideout10/stanbol-facade

Shall I open a JIRA issue to discuss a possible integration into trunk?

BR,
David Riccitelli

-- check the Swagger for WordLift <http://bit.ly/VtoM5H>
********************************************************************************
InsideOut10 s.r.l.
P.IVA: IT-11381771002
Fax: +39 0110708239
---
LinkedIn: http://it.linkedin.com/in/riccitelli
Twitter: ziodave
---
Layar Partner Network <http://www.layar.com/publishing/developers/list/?page=1&country=&city=&keyword=insideout10&lpn=1>
********************************************************************************
