:)

Btw, the indexing system in UIMA didn't appear extensible to me the last
time I checked. Suppose somebody introduced an x/y coordinate scheme for
image data. This would call for some spatial index, e.g. a k-d tree.
While it is possible to define different indexes of the bag, set, and
sorted kinds, it is not possible to add a new kind of index. I think
this would be quite a useful feature, also for linguistic data, e.g. an
index to efficiently navigate the dominance relations in syntactic tree
structures.
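
As a rough illustration of the kind of query a spatial index would have to
answer, here is a minimal static 2-d tree in plain Java. It is only a sketch
and not part of the UIMA API: the class name KdTree2D and its range method
are made up for this example, and it assumes image annotations can be
reduced to integer (x, y) points.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not UIMA API: a static 2-d tree over {x, y} points.
final class KdTree2D {
    private static final class Node {
        final int[] point; // {x, y}
        final Node left;   // points with a smaller or equal coordinate on the split axis
        final Node right;  // points with a larger or equal coordinate on the split axis

        Node(int[] point, Node left, Node right) {
            this.point = point;
            this.left = left;
            this.right = right;
        }
    }

    private final Node root;

    KdTree2D(List<int[]> points) {
        int[][] pts = points.toArray(new int[0][]); // copy, the input list is not modified
        root = build(pts, 0, pts.length, 0);
    }

    // Sort the slice on the current axis and use the median as the split node.
    private static Node build(int[][] pts, int from, int to, int depth) {
        if (from >= to) {
            return null;
        }
        final int axis = depth % 2; // alternate between x (0) and y (1)
        Arrays.sort(pts, from, to, (a, b) -> Integer.compare(a[axis], b[axis]));
        int mid = (from + to) >>> 1;
        return new Node(pts[mid],
                build(pts, from, mid, depth + 1),
                build(pts, mid + 1, to, depth + 1));
    }

    // All points with minX <= x <= maxX and minY <= y <= maxY.
    List<int[]> range(int minX, int minY, int maxX, int maxY) {
        List<int[]> hits = new ArrayList<>();
        collect(root, 0, new int[] {minX, minY}, new int[] {maxX, maxY}, hits);
        return hits;
    }

    private static void collect(Node n, int depth, int[] min, int[] max, List<int[]> hits) {
        if (n == null) {
            return;
        }
        int axis = depth % 2;
        int[] p = n.point;
        if (p[0] >= min[0] && p[0] <= max[0] && p[1] >= min[1] && p[1] <= max[1]) {
            hits.add(p);
        }
        // Descend only into subtrees that can still intersect the query box.
        if (min[axis] <= p[axis]) {
            collect(n.left, depth + 1, min, max, hits);
        }
        if (max[axis] >= p[axis]) {
            collect(n.right, depth + 1, min, max, hits);
        }
    }
}
```

A call such as new KdTree2D(points).range(4, 0, 9, 5) returns the points
inside that box without scanning all of them, which is the kind of lookup
that, as far as I can tell, none of the bag, set, or sorted indexes provides.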

At the UIMA@GSCL 2013 workshop, Nicolas Hernandez [1] provided a nice
summary of the kinds of navigation that would be useful to have in UIMA
but are currently not supported. His work, alas, focuses on text. I
imagine that processing audio and video data brings up a whole new
batch of desirable types of navigation and indexing.

Although in UIMA anchoring and annotation have been conflated into
the same thing (e.g. Annotation), it is not uncommon to consider
anchoring an entirely different aspect from annotation (cf. [2-4]).
This recognizes that each kind of anchoring (different kinds of
discrete/continuous x-dimensional spaces, identifiable segments,
alignments, etc.) comes with its own considerations, in particular
with respect to navigation and relations.
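
To make the distinction a bit more concrete, here is a minimal sketch in
plain Java of what keeping anchors apart from annotations could look like.
None of this is UIMA API, nor taken from [2-4]: the names Anchor, Span,
ImageBox, and AnchoredNote are invented for the example, and it only
assumes two kinds of anchoring, 1-d offsets and 2-d boxes.

```java
// Hypothetical sketch, not UIMA API: anchors say *where* in a sofa,
// annotations say *what* is asserted about that location.
interface Anchor {
    // Overlap is only defined between anchors of the same kind.
    boolean overlaps(Anchor other);
}

// Discrete 1-d anchor: character offsets in text, or frame offsets in audio.
final class Span implements Anchor {
    final int begin; // inclusive
    final int end;   // exclusive

    Span(int begin, int end) {
        this.begin = begin;
        this.end = end;
    }

    @Override
    public boolean overlaps(Anchor other) {
        if (!(other instanceof Span)) {
            return false;
        }
        Span s = (Span) other;
        return begin < s.end && s.begin < end;
    }
}

// 2-d anchor: an axis-aligned box in an image, corners inclusive.
final class ImageBox implements Anchor {
    final int x1, y1, x2, y2;

    ImageBox(int x1, int y1, int x2, int y2) {
        this.x1 = x1;
        this.y1 = y1;
        this.x2 = x2;
        this.y2 = y2;
    }

    @Override
    public boolean overlaps(Anchor other) {
        if (!(other instanceof ImageBox)) {
            return false;
        }
        ImageBox b = (ImageBox) other;
        return x1 <= b.x2 && b.x1 <= x2 && y1 <= b.y2 && b.y1 <= y2;
    }
}

// The annotation carries only its label and points at some anchor.
final class AnchoredNote {
    final Anchor anchor;
    final String label;

    AnchoredNote(Anchor anchor, String label) {
        this.anchor = anchor;
        this.label = label;
    }
}
```

With this split, navigation logic such as overlap tests lives with the
anchor type, so a new kind of anchoring can bring its own notion of
navigation without touching the annotation types.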

Cheers,

-- Richard

[1] http://ceur-ws.org/Vol-1038/paper_11.pdf
[2] http://dl.acm.org/citation.cfm?id=1273097
[3] http://www.doaj.org/doaj?func=fulltext&aId=812876
[4] http://dl.acm.org/citation.cfm?id=1642059.1642060

On 04.12.2013, at 17:16, Marshall Schor <[email protected]> wrote:

> Echoing Richard,
> 
> 1) It would perhaps make more sense to be more direct about each of the
> different types of data.  UIMA "built-in" only the most "popular" things - and
> Annotation was one of them.
> 
> Annotation derives from AnnotationBase, which just defines an associated
> Sofa / view.
> 
> So it would make more sense to define different kinds of highest-level
> abstractions for your project, related to the different kinds of views/sofas. 
> Audio might entail a begin / end style of offsets;  Images might entail a pair
> of x-y coordinates, to describe a (rectangular) subset of an image.  Video
> might do something like audio, or something more complex...
> 
> UIMA's use of the AnnotationBase includes ensuring that when you add-to-indexes
> (an operation that implicitly takes a "view" - and adds a FS to that view),
> if the FS is a subtype of AnnotationBase, then the FS must be indexed in the
> associated view to which that FS "belongs"; if you try to add-to-index in a
> view other than the one the FS was created in, you get this kind of error:
> 
> Error - the Annotation "{0}" is over view "{1}" and cannot be added to indexes
> associated with the different view "{2}".
> 
> The logic behind this restriction is:  an Annotation (or, more generally, an
> object having a supertype of AnnotationBase) is (by definition) associated
> with a particular Sofa/View, and it is most likely an error if that
> annotation is indexed with a sofa it doesn't belong to.
> 
> Of course, Feature Structures which are not Annotations (or more generally,
> not derived from AnnotationBase), can be indexed in multiple views.
> 
> 2) By keeping separate notions for pointers-into-the-Sofa, you can define
> algorithmic mappings for these that make the best sense for your project,
> including notions of fuzziness, time-shift (imagine the audio is out-of-sync
> with the video, like lots of YouTube things seem to be), etc.
> 
> -Marshall
> 
> 
> On 12/4/2013 9:31 AM, Jens Grivolla wrote:
>> Hi, we're now starting the EUMSSI project, which deals with integrating
>> annotation layers coming from audio, video and text analysis.
>> 
>> We're thinking of basing it all on UIMA, having different views with separate
>> audio, video, transcribed text, etc. sofas.  In order to align the different
>> views we need to have a common offset specification that allows us to map
>> e.g. character offsets to the corresponding timestamps.
>> 
>> In order to avoid float timestamps (which would mean we can't derive from
>> Annotation) I was thinking of using audio/video frames with e.g. 100 or 1000
>> frames/second.  Annotation has begin and end defined as signed 32 bit ints,
>> leaving sufficient room for very long documents even at 1000 fps, so I don't
>> think we're going to run into any limits there.  Is there anything that could
>> become problematic when working with offsets that are probably quite a bit
>> larger than what is typically found with character offsets?
>> 
>> Also, can I have several indexes on the same annotations in order to work
>> with character offsets for text analysis, but then efficiently query for
>> overlapping annotations from other views based on frame offsets?
>> 
>> Btw, if you're interested in the project we have a writeup (condensed from
>> the project proposal) here:
>> https://dl.dropboxusercontent.com/u/4169273/UIMA_EUMSSI.pdf and there will
>> hopefully soon be some content on http://eumssi.eu/
>> 
>> Thanks,
>> Jens
