On 09.01.2013 at 21:55, Neal R Lewis <[email protected]> wrote:

> Uses:
>     Maintain independence of annotations with respect to independent pipelines
>     Archival Storage
>     Temporary storage between independent pipelines
>     Modify Type System 
>     Contain Meta Data about CAS 
>     Perform fine or coarse grained CRUD operations on a CAS 
>         
> 
> Functionality: 
>     Insert / Delete complete CAS(es)
>     Insert / Delete fragments of CAS(es) (individual SOFAs or FSes [annotations])
>     Assemble CAS with SOFA and all / some / none of the feature structures
>         - This would help reduce the size of the CAS to its necessary components
>           before passing it to independent pipelines. It would also require the
>           construction of valid CASes for use in Analysis Engines, complete with
>           valid Views
>          
>     Update CASes within the store (i.e., inserting annotations):
>         - This would allow for adding deltas from AEs 
> 
> 
> Some of these might seem redundant, but I hope they give a general overview.
> Does this seem to summarize it well?

One requirement on my list is rather technical, and it is missing from your
summary:

The CAS storage should be embeddable in an application. Having to deploy an
extra server causes additional effort/pain when rolling out an application, in
particular web applications that ship as WARs. Being embeddable would also
make it easier to ship the storage as an Eclipse plugin at some point in the
future. For me, talking to the storage via direct native method invocation is
also preferable to talking to it via any kind of remote interface or via any
kind of query language that ends up being encoded as Strings in Java.

In addition, it would be good to be able to run multiple storages at the same
time, with their data kept in different directories on the file system.
Consider a parameter sweeping experiment in which every run gets its own
storage, saved in the output folder for that run along with the other data
generated by that run. To archive the run, I can just back up that folder.
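
To make the two points above concrete, here is a minimal sketch of what I have
in mind. Everything in it is an assumption for illustration only: the class
name EmbeddedCasStore, its methods, and the ".cas" file naming are hypothetical,
CASes are treated as already-serialized byte arrays, and plain files stand in
for whatever backend the real storage would use. It only shows the shape: no
server, typed method calls instead of query strings, and one independent store
per directory.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical embedded store: runs inside the application, one instance per directory.
public class EmbeddedCasStore {
    private final Path dir;

    private EmbeddedCasStore(Path dir) {
        this.dir = dir;
    }

    // Opens a store rooted at the given directory, creating the directory if needed.
    public static EmbeddedCasStore open(Path dir) throws IOException {
        Files.createDirectories(dir);
        return new EmbeddedCasStore(dir);
    }

    // Stores an already-serialized CAS under an id (ids are assumed to be file-name safe).
    public void put(String casId, byte[] serializedCas) throws IOException {
        Files.write(dir.resolve(casId + ".cas"), serializedCas);
    }

    // Returns the serialized CAS stored under the id; throws if it does not exist.
    public byte[] get(String casId) throws IOException {
        return Files.readAllBytes(dir.resolve(casId + ".cas"));
    }

    // Deletes the CAS stored under the id; returns false if there was none.
    public boolean delete(String casId) throws IOException {
        return Files.deleteIfExists(dir.resolve(casId + ".cas"));
    }

    public static void main(String[] args) throws IOException {
        // One output folder per experiment run, each with its own store inside.
        Path output = Files.createTempDirectory("sweep");
        EmbeddedCasStore run1 = EmbeddedCasStore.open(output.resolve("run-1").resolve("cas-store"));
        EmbeddedCasStore run2 = EmbeddedCasStore.open(output.resolve("run-2").resolve("cas-store"));

        // The same document id can live in both stores without interference.
        run1.put("doc1", "CAS from run 1".getBytes(StandardCharsets.UTF_8));
        run2.put("doc1", "CAS from run 2".getBytes(StandardCharsets.UTF_8));

        System.out.println(new String(run1.get("doc1"), StandardCharsets.UTF_8));
        System.out.println(new String(run2.get("doc1"), StandardCharsets.UTF_8));
    }
}
```

Archiving a run then means copying its "run-N" folder, store included.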

Both of these aspects are inspired by the way Apache Lucene and HSQLDB can be
used: embedded in the application, with their data kept in a directory of your
choice.

Is it a requirement for you that the storage must be able to run as a server
that you talk to over the network?

Cheers,

-- Richard

-- 
------------------------------------------------------------------- 
Richard Eckart de Castilho
Technical Lead
Ubiquitous Knowledge Processing Lab (UKP-TUD) 
FB 20 Computer Science Department      
Technische Universität Darmstadt 
Hochschulstr. 10, D-64289 Darmstadt, Germany 
phone [+49] (0)6151 16-7477, fax -5455, room S2/02/B117
[email protected] 
www.ukp.tu-darmstadt.de 
Web Research at TU Darmstadt (WeRC) www.werc.tu-darmstadt.de
-------------------------------------------------------------------
