Thanks for that detailed answer. Don't worry, I understand that the notion of 
yard is specific to the EntityHub-- I was just using it as an analogy. 

I do have a further question about general architecture: there is currently 
a CMSAdaptor service that aims to map content repositories and semantic stores 
onto each other. Is there any intention to unify the abstractions for content 
storage and semantic storage across these two efforts (ContentHub and 
CMSAdaptor)? I realize that the intentions of each are distinct, but it seems 
that much of the underlying machinery might be reusable, and it might help 
integrators (like me {grin}) if they both could be addressed with a minimum of 
"glue", in cases where the semantic store is suitable for both indexing and 
ontology storage. If it's not obvious, I'd like to imagine a design in which a 
content repository is both indexed and typed by Stanbol services into the same 
semantic storage.

I have one other question about this specific effort: in IndexingSource I find 
the important method:

Item get(String uri) throws StoreException

so it seems that this interface is meant to be used synchronously, in cases 
where get() does not block for long while a large datum is transferred or slow 
storage produces results. To use this machinery when get() may block for a long 
time, would it be necessary to rewrite the upper-level component "Content 
Create/Update"? Or could one expect to create a kind of queuing component and 
wire it between "Content Create/Update" and "Content Item Storage", maintaining 
synchronous behavior in the upper level of the architecture?
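
A minimal sketch of the second option (a queuing component that keeps a 
synchronous get() for the upper level) might look like the following. 
BlockingSource, QueuedSource and the StoreException class defined here are 
hypothetical stand-ins that mirror only the get() signature quoted above; they 
are not the actual Stanbol types, and the thread-pool size and timeout are 
arbitrary assumptions.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical stand-in for the exception named in the get() signature. */
class StoreException extends Exception {
    StoreException(String message, Throwable cause) {
        super(message, cause);
    }
}

/** Hypothetical stand-in exposing only the method quoted above. */
interface BlockingSource<Item> {
    Item get(String uri) throws StoreException;
}

/** Queues reads on a worker pool but still presents a synchronous get(). */
class QueuedSource<Item> implements BlockingSource<Item> {
    private final BlockingSource<Item> delegate;
    private final ExecutorService workers;
    private final long timeoutMillis;

    QueuedSource(BlockingSource<Item> delegate, int threads, long timeoutMillis) {
        this.delegate = delegate;
        this.workers = Executors.newFixedThreadPool(threads);
        this.timeoutMillis = timeoutMillis;
    }

    /** Asynchronous entry point for callers that can work with a Future. */
    Future<Item> submit(final String uri) {
        Callable<Item> task = () -> delegate.get(uri);
        return workers.submit(task);
    }

    /** Synchronous entry point: waits for the queued read, up to the timeout. */
    @Override
    public Item get(String uri) throws StoreException {
        Future<Item> pending = submit(uri);
        try {
            return pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            pending.cancel(true); // interrupt the slow read instead of leaking it
            throw new StoreException("Timed out fetching " + uri, e);
        } catch (ExecutionException e) {
            throw new StoreException("Fetch failed for " + uri, e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new StoreException("Interrupted fetching " + uri, e);
        }
    }

    /** Stops accepting new reads; already queued reads still run. */
    void shutdown() {
        workers.shutdown();
    }
}
```

With this shape, an upper-level caller would keep calling get() unchanged and 
see slow storage only as a bounded wait or a StoreException, while a caller 
that can tolerate asynchrony could use submit() directly.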

---
A. Soroka
Software & Systems Engineering :: Online Library Environment
the University of Virginia Library

On Oct 12, 2012, at 11:27 AM, Suat Gonul wrote:

> Hi,
> 
> On 10/12/2012 5:57 PM, [email protected] wrote:
>> Do I understand rightly that one eventual consequence of this architecture 
>> will be that content items might be stored in some external service with a 
>> standardized interface (say, a JCR repository) and semantic-indexed into 
>> another external service?
> 
> Exactly, this kind of Store implementation would wrap the nodes in a JCR
> repository as ContentItems and those ContentItems would be indexed in
> different SemanticIndexes. We have a plan to provide a basic JCR
> compliant Store implementation in the following months.
> 
>> 
>> The diagram attached to the main issue shows Solr as the implementing 
>> component for the semantic-index. Is there expected to be a possibility to 
>> use an RDF store in that role? (In the way that one can choose Solr or 
>> Clerezza to back a yard in the EntityHub?)
> 
> This is also correct. We are already working on a Clerezza based
> SemanticIndex implementation, although it is not a yard managed by the
> Entityhub.
> 
>> 
>> Lastly, can you point me to the interfaces that will actually be used to 
>> store an item? The reason I am asking is that I am wondering about 
>> asynchronizing behavior (for example, for very large content items or very 
>> high-latency storage).
> 
> Sure. We have three main interfaces for the time being:
> 
>  * IndexingSource[1]: Read-only indexing source to be used by the
>    SemanticIndex implementations.
>  * Store[2]: An extension for the IndexingSource interface providing
>    create and delete operations.
>  * SemanticIndex[3]: The interface describing methods to semantically
>    index items
> 
> You can find the current implementations of these interfaces for the
> ContentItem type in the following links:
> 
>  * FileStore[4]: This Store implementation serializes ContentItems into
>    zip files and stores them in the file system.
>  * SolrSemanticIndex[5]: This is a Solr based SemanticIndex
>    implementation.
> 
> Please note that the current state of these won't compile successfully
> since I didn't have time to adjust dependency versions according to
> latest changes in the trunk.
> 
> Best,
> Suat
> 
> [1]
> https://svn.apache.org/repos/asf/stanbol/branches/contenthub-two-layered-structure/commons/semanticindex/servicesapi/src/main/java/org/apache/stanbol/commons/semanticindex/store/IndexingSource.java
> [2]
> https://svn.apache.org/repos/asf/stanbol/branches/contenthub-two-layered-structure/commons/semanticindex/servicesapi/src/main/java/org/apache/stanbol/commons/semanticindex/store/Store.java
> [3]
> https://svn.apache.org/repos/asf/stanbol/branches/contenthub-two-layered-structure/commons/semanticindex/servicesapi/src/main/java/org/apache/stanbol/commons/semanticindex/index/SemanticIndex.java
> [4]
> https://svn.apache.org/repos/asf/stanbol/branches/contenthub-two-layered-structure/contenthub/store/file/src/main/java/org/apache/stanbol/contenthub/store/file/FileStore.java
> [5]
> https://svn.apache.org/repos/asf/stanbol/branches/contenthub-two-layered-structure/contenthub/index/src/main/java/org/apache/stanbol/contenthub/index/solr/SolrSemanticIndex.java
> 
>> This looks like really excellent work!
>> 
>> ---
>> A. Soroka
>> Software & Systems Engineering :: Online Library Environment
>> the University of Virginia Library
>> 
>> On Oct 12, 2012, at 10:35 AM, Suat Gönül wrote:
>> 
>>> Hi Stephane,
>>> 
>>> The parent issue for this structure is STANBOL-471[1]. You can find an
>>> image within that issue representing the general structure offered by the
>>> 2-layered approach. The parent issue has some sub-issues. In particular,
>>> STANBOL-498 and STANBOL-499 contain detailed information about the two
>>> layers. I previously described this structure in a mail at [2]. Please
>>> note that some of the class names have changed since that mail.
>>> 
>>> The main purpose of this approach is to separate the storage and indexing
>>> functionalities of Contenthub. However, it seems that these changes can be
>>> applied throughout Stanbol. For instance, Rupert has already developed
>>> some Store implementations in the scope of Entityhub (see STANBOL-704),
>>> although they are not in their final state yet. This separation will make it
>>> possible to build different SemanticIndex implementations for different use
>>> cases on top of the same Store holding the items. There can also be different
>>> Store implementations. For instance a Clerezza graph can be used as a Store
>>> or another Store implementation can be implemented as a bridge between
>>> Stanbol and a real content management system, etc.
>>> 
>>> As for the Solr version, we use the one specified in Stanbol's parent
>>> pom.xml, which is currently 3.2.0.
>>> 
>>> Hope this helps, best,
>>> Suat
>>> 
>>> [1] https://issues.apache.org/jira/browse/STANBOL-471
>>> [2] http://markmail.org/message/o4quthsuubhlswtz
>>> 
>>> On Fri, Oct 12, 2012 at 4:07 PM, Stéphane Gamard <
>>> [email protected]> wrote:
>>> 
>>>> Hi Suat,
>>>> 
>>>> Can I ask you to point me to some doc about the 2-layer service? What
>>>> is its purpose? And another question is about the solr version used,
>>>> which one is it?
>>>> 
>>>> Cheers,
>>>> 
>>>> Stephane
>>>> 
>>>> Sent from my iPhone
>>>> 
>>>> On Oct 12, 2012, at 1:00 PM, Suat Gonul <[email protected]> wrote:
>>>> 
>>>>> Hi Fabian,
>>>>> 
>>>>> I am planning to make a release for Contenthub. We have made some
>>>>> updates to this component in the "trunk" since the 0.9.0-incubating
>>>>> release. However, as you know, there is also a new structure for
>>>>> Contenthub in the "contenthub-two-layered-branch". Although the work in
>>>>> the branch is still in progress and the changes in the trunk are modest,
>>>>> I would like to make a release of Contenthub before merging that
>>>>> branch into the trunk.
>>>>> 
>>>>> Currently, we are doing some improvements on Contenthub in the trunk.
>>>>> Once that job is done, we can prepare a release. WDYT?
>>>>> 
>>>>> Best,
>>>>> Suat
>>>>> 
>>>>> On 10/12/2012 12:10 AM, Fabian Christ wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> I am investigating the components that should go for a release in the
>>>>>> near future. We should try to bring as many components to a 1.0 status
>>>>>> as possible. After graduation it would be a good sign to start and
>>>>>> establish a release cycle.
>>>>>> 
>>>>>> My first release candidate would be the Enhancer with all Enhancement
>>>>>> Engines. So I will check whether the Enhancer and engines meet all
>>>>>> requirements for a release (license, POMs, etc).
>>>>>> 
>>>>>> What about other components? Please, make suggestions as I do not have
>>>>>> a detailed overview of the status of all the code parts and branches.
>>>>>> 
>>>>>> Best,
>>>>>> - Fabian
>> 
> 
