On 20.10.2011, at 14:04, Florent André wrote:
>
>
> On 10/20/2011 01:31 PM, Rupert Westenthaler wrote:
>> Hi
>>
>>> On 10/20/2011 10:35 AM, florent andré wrote:
>>>> With a Camel route you have a splitter [2] built in and, as its
>>>> counterpart, an aggregator [3].
>>>>
>>>> For both you can define particular split/aggregate business logic.
>>
>> So you use this to send the different parts of an email to different
>> Stanbol Instances and after that you merge the enhancement results
>> together?
>
> The point is that I added the Camel framework *inside* Stanbol, as an
> implementation of JobManager.
> So all EIP routing capabilities are available inside Stanbol, as a process
> chain endpoint (e.g. engines/chain1, engines/chain2, ...).
>
> For example, you can build a chain like this:
> from("direct://chain2").to("org.apache.stanbol.engine.MyEngine1").to("org.apache.stanbol.engine.MyEngine2");
> ==> The usual CI output is produced.
>
> Or have Stanbol poll content from one of Camel's many components [1] and
> send the result out through another one.
> For example:
> from("imap://imap.my.mail?login=toto&pass=tata").to("org.apache.stanbol.engine.MyEngine1").to("org.apache.stanbol.engine.MyEngine2").to("http://mySite/addContent");
> ==> Fetch content from IMAP, process it and send the result to the CMS in
> an HTTP request.
>
That looks really great, especially for users who want to use Stanbol to
enhance content from different enterprise information sources (e.g. mail, CMS,
RSS feeds …).
Is there a UI or script language to configure such workflows, or do you need to
write such things in Java?
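
To make the split / aggregate idea from the quoted mails a bit more concrete, here is a rough sketch in plain Java. It deliberately does not use the Camel or Stanbol APIs: the class SplitAggregateSketch, the Part type and the "chains" map are all made-up names, and each "chain" is just a function that stands in for a real enhancement chain. It only illustrates the flow: split the mail into parts, route each part by content type, then collect the results.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Plain-Java stand-in for a split / enhance / aggregate workflow.
// No Camel or Stanbol API is used; every name here is hypothetical.
final class SplitAggregateSketch {

    // One part of a mail: its MIME type and its content.
    static final class Part {
        final String mimeType;
        final String content;

        Part(String mimeType, String content) {
            this.mimeType = mimeType;
            this.content = content;
        }
    }

    // Walks over the parts of the mail, sends each part to the chain
    // registered for its MIME type and aggregates the results, keyed by
    // MIME type. Parts without a registered chain are skipped.
    static Map<String, List<String>> process(
            List<Part> mail, Map<String, Function<String, String>> chains) {
        Map<String, List<String>> aggregated = new LinkedHashMap<>();
        for (Part part : mail) {                        // "splitter"
            Function<String, String> chain = chains.get(part.mimeType);
            if (chain == null) {
                continue;
            }
            String result = chain.apply(part.content);  // "enhancement chain"
            aggregated.computeIfAbsent(part.mimeType, k -> new ArrayList<>())
                      .add(result);                     // "aggregator"
        }
        return aggregated;
    }

    public static void main(String[] args) {
        Map<String, Function<String, String>> chains = new LinkedHashMap<>();
        chains.put("text/plain", text -> "plain-chain(" + text + ")");
        chains.put("application/pdf", pdf -> "pdf-chain(" + pdf + ")");

        List<Part> mail = new ArrayList<>();
        mail.add(new Part("text/plain", "mail body"));
        mail.add(new Part("application/pdf", "attachment bytes"));

        System.out.println(process(mail, chains));
    }
}
```

In a real setup the map of functions would presumably be replaced by the configured chains, and the aggregation step would merge RDF graphs rather than strings.
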
> [1] http://camel.apache.org/components.html
>
>>
>> On Thu, Oct 20, 2011 at 11:08 AM, florent andré
>> <[email protected]> wrote:
>>> Maybe this one: http://www.semanticdesktop.org/ontologies/nmo/
>>> What do you think about it? Are others more suitable?
>>
>> In the case of E-Mails the semanticdesktop NMO ontology looks ok.
>>
>> I think that the decision on how to model relations between
>> ContentItems should be up to the Stanbol user. Stanbol returns an RDF
>> graph that connects all enhancements to the ContentItems they are
>> extracted from. Users can then use any ontology they like to link
>> such ContentItems together (e.g. in the business logic of the
>> aggregator).
>
> The thing is that in this case, the mix can occur in Stanbol...
> And IMO, as Stanbol offers a way to store ContentItems, it would be nice if
> Stanbol also offered a way to link these CIs when suitable.
>
>>
>>
>> Also note that this is related to the following two topics:
>>
>> 1. Content Adapter Pattern: (User sends PDF; Enhancement Engine asks
>> the ContentAdapter to get the Text version of the PDF). The
>> ContentAdapter could not only support the conversion of Format A to
>> Format B but also - as in the case of e-mails - know that there is
>> already a text AND an HTML version.
>
> Yep, that's a point where I have to dig deeper into Camel...
> They have the type converter mechanism [2], which could also answer the
> question: how to convert a CI in order to send it via CMIS?
>
> [2] http://camel.apache.org/type-converter.html
>
>>
>> 2. Definition of the Stanbol Enhancement Structure (see STANBOL-351)
>
> That's a hot topic!
>
> ++
>
>> [1]. Here one could argue that Stanbol should support parent child
>> relations between ContentItems.
>>
>> best
>> Rupert
>
>
>
>>
>>> ++
>>>
>>>>
>>>>
>>>> This idea will then not be so hard to implement:
>>>> >> One could also add some additional triples that link the attachment
>>>> >> with the Mail and that the content of the Mail is available as a text
>>>> >> and html version.
>>>>
>>>> Is there a particular / recommended / standard type of triple to
>>>> describe that:
>>>> - the attachment graph is linked to the Mail graph
>>>> - the content is available as text and html
>>>> ?
>>>>
>>>> Thanks.
>>>>
>>>> [1] : http://camel.apache.org/enterprise-integration-patterns.html
>>>> [2] : http://camel.apache.org/splitter.html
>>>> [3] : http://camel.apache.org/aggregator2.html
>>>>
>>>> On 10/20/2011 09:07 AM, Fabian Christ wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> if I remember correctly, we had the idea to allow different chains of
>>>>> enhancement engines to be configured under different URLs. Maybe
>>>>> Florent's use case is interesting for this. Florent could create an
>>>>> engine that is able to split the different content types and then
>>>>> start enhancement with different chains for each content type. If
>>>>> chains can call other chains, it would be possible to define such
>>>>> complex workflows for content enhancement.
>>>>>
>>>>> Best,
>>>>> - Fabian
>>>>>
>>>>> 2011/10/19 Rupert Westenthaler<[email protected]>:
>>>>>>
>>>>>> Hi florent
>>>>>>
>>>>>> I would use two enhancement requests
>>>>>>
>>>>>> 1. for the Text and
>>>>>> 2. for the Attachment.
>>>>>>
>>>>>> and then merge the returned RDF graphs with the enhancements. One
>>>>>> could also add some additional triples that link the attachment with
>>>>>> the Mail and that the content of the Mail is available as a text and
>>>>>> html version.
>>>>>>
>>>>>> best
>>>>>> Rupert
>>>>>>
>>>>>> On Wed, Oct 19, 2011 at 6:22 PM, florent andré
>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi Stanbolers !
>>>>>>>
>>>>>>> Imagine a typical html mail with an attachment.
>>>>>>> This mail is in fact composed of (at least) 3 parts:
>>>>>>> * text/plain mail body
>>>>>>> * html mail body
>>>>>>> * attachment.
>>>>>>>
>>>>>>> One html mail + attachment can be considered as one CI - one piece of
>>>>>>> information/knowledge sent by one person.
>>>>>>>
>>>>>>> In fact, text/plain and html will have (pretty much*) the same
>>>>>>> metadata, and keeping both is interesting:
>>>>>>> - text/plain for processing and annotation positions
>>>>>>> - html to keep the source and to be able to enhance the html with
>>>>>>> rdfa, links, ...
>>>>>>>
>>>>>>> And the attachment will mostly have different metadata, but this
>>>>>>> metadata is in a way related to that of the mail body...
>>>>>>>
>>>>>>> It would be a pity - IMO - to manage attachment and mail body
>>>>>>> metadata in a totally disconnected way (i.e. as two different
>>>>>>> Content Items).
>>>>>>>
>>>>>>> Note that this use case also matches CMS articles with files (pdf,
>>>>>>> odt...) to download for further reading.
>>>>>>>
>>>>>>> And now the real question :
>>>>>>> How can we nicely manage this kind of "composed things"?
>>>>>>>
>>>>>>> Insights are very welcome ! :)
>>>>>>> Have a good day
>>>>>>> ++
>>>>>>>
>>>>>>>
>>>>>>> * pretty much, because we can imagine being able to extract some more
>>>>>>> metadata from html (color, font size, rdfa, ...)
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> | Rupert Westenthaler [email protected]
>>>>>> | Bodenlehenstraße 11 ++43-699-11108907
>>>>>> | A-5500 Bischofshofen
>>>>>>
>>>>>
>>>>>
>>>>>
>>>
>>
>>
>>
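
On the suggestion at the start of the thread (two enhancement requests, then merge the returned graphs and add linking triples), the following is a minimal sketch of what that merge could look like. It is a toy model, not a real RDF API: Triple and GraphMergeSketch are hypothetical, the urn: identifiers are invented, and the NMO namespace and hasAttachment property are written from memory as an assumption and should be checked against the ontology before use.

```java
import java.util.LinkedHashSet;
import java.util.Objects;
import java.util.Set;

// Toy model of merging two enhancement graphs and linking them.
// Not a real RDF API; Triple and all URIs below are hypothetical.
final class GraphMergeSketch {

    // Assumed NMO namespace and property; verify against the ontology.
    static final String NMO_HAS_ATTACHMENT =
            "http://www.semanticdesktop.org/ontologies/2007/03/22/nmo#hasAttachment";

    // A single subject / predicate / object statement, all as plain strings.
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Triple)) {
                return false;
            }
            Triple t = (Triple) other;
            return subject.equals(t.subject)
                    && predicate.equals(t.predicate)
                    && object.equals(t.object);
        }

        @Override
        public int hashCode() {
            return Objects.hash(subject, predicate, object);
        }

        @Override
        public String toString() {
            return "<" + subject + "> <" + predicate + "> <" + object + ">";
        }
    }

    // Returns the union of both graphs plus one triple that links the mail
    // to its attachment. The input sets are not modified.
    static Set<Triple> mergeAndLink(Set<Triple> mailGraph, Set<Triple> attachmentGraph,
                                    String mailUri, String attachmentUri) {
        Set<Triple> merged = new LinkedHashSet<>(mailGraph);
        merged.addAll(attachmentGraph);
        merged.add(new Triple(mailUri, NMO_HAS_ATTACHMENT, attachmentUri));
        return merged;
    }

    public static void main(String[] args) {
        Set<Triple> mailGraph = new LinkedHashSet<>();
        mailGraph.add(new Triple("urn:example:enh1", "urn:example:extractedFrom", "urn:example:mail"));
        Set<Triple> attachmentGraph = new LinkedHashSet<>();
        attachmentGraph.add(new Triple("urn:example:enh2", "urn:example:extractedFrom", "urn:example:pdf"));

        for (Triple t : mergeAndLink(mailGraph, attachmentGraph, "urn:example:mail", "urn:example:pdf")) {
            System.out.println(t);
        }
    }
}
```

The same step could also carry the triples saying that the mail content exists as both a text and an html version; which properties to use for that is left open here, as in the thread.
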