Edward,

Pycas looks like a pretty complete interface to the CAS, very nice.
Given the interface with the XMI format, it wouldn't take too much
more effort to create a pycas annotator that would be interoperable
with the new uima-as framework extension in Java. Uima-as uses the XMI
format CAS for the service interface, compliant with the Oasis
standards work. Connectivity would be via ActiveMQ's python client,
see http://activemq.apache.org/python.html. Among many other things in
ActiveMQ, we really liked the extensive language support for clients.
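
As a rough sketch of the request/reply step such a service would perform, assuming the XMI CAS arrives as the text body of a message: the names handle_request and mark_tokens, the Token element, and the sample namespace below are all made up for illustration, and the ActiveMQ transport and any message headers uima-as expects are deliberately left out.

```python
import xml.etree.ElementTree as ET


def handle_request(xmi_text, annotate):
    """Parse an XMI CAS, let `annotate` modify it in place, return XMI text."""
    root = ET.fromstring(xmi_text)
    annotate(root)
    return ET.tostring(root, encoding="unicode")


def mark_tokens(root):
    # Hypothetical annotator: flag every element whose local name is Token.
    for elem in root.iter():
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "Token":
            elem.set("seen", "true")


# Tiny hand-written XMI document standing in for a real CAS (assumed shape).
SAMPLE = (
    '<xmi:XMI xmlns:xmi="http://www.omg.org/XMI" '
    'xmlns:my="http:///org/mydomain.ecore" xmi:version="2.0">'
    '<my:Token xmi:id="1" begin="0" end="5"/>'
    '<my:Token xmi:id="2" begin="6" end="11"/>'
    '</xmi:XMI>'
)

reply = handle_request(SAMPLE, mark_tokens)
```

In a real service the body of handle_request would presumably go through pycas rather than raw ElementTree, and the reply string would be sent back on the reply queue.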

Is having an interoperable pycas service of interest, or do you see
the main utility of pycas being offline post processing of CAS files?
More information on uima-as is at
http://cwiki.apache.org/UIMA/uimaasdoc.html

From your questions in the documentation, going through an
implementation of the CAS from scratch must have given you a good
perspective on UIMA design issues. I'll get the discussion going
with some of them below.

    *   Does the term 'feature structure' apply to all instances of
TOP and its subclasses, or just to instances that are neither
primitives nor arrays? If the latter, then what term (if any) is used
to describe instances of TOP & its subclasses?

I think TOP should be analogous to Object in Java, which would
make TOP a feature structure. However, primitive attributes like
Integer and Float also derive from TOP. This seems like something that
could be cleaned up in the UIMA type system.

    * Exactly which types should be considered inheritance final and
feature final?
    * From the Java code, it looks like uima.cas.String is inheritance
final; but clearly, it is possible to inherit from it: that's the
whole point of the <allowedValues> element in the typeDescription!

Probably more of the same: another area of the type system that could
be cleaned up.

    * What should ViewCAS.get_view_name() do for a view with no sofa?

Only _InitialView is allowed to have no Sofa. All other views always
have Sofas created when the view is created.
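
One way to express that rule, as a sketch only: the View and Sofa classes below are hypothetical stand-ins, not pycas's actual classes, and the sketch assumes a view's name is simply the ID of its Sofa.

```python
INITIAL_VIEW_NAME = "_InitialView"


class Sofa:
    """Hypothetical stand-in for a subject of analysis."""

    def __init__(self, sofa_id):
        self.sofa_id = sofa_id


class View:
    """Hypothetical stand-in for a CAS view."""

    def __init__(self, sofa=None):
        self.sofa = sofa

    def get_view_name(self):
        # Only the initial view may lack a Sofa, so a Sofa-less view
        # can only be the initial view.
        if self.sofa is None:
            return INITIAL_VIEW_NAME
        # Assumption: every other view is named after its Sofa.
        return self.sofa.sofa_id
```

With this, View().get_view_name() gives the initial view's name, and a view built on a Sofa reports that Sofa's ID.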

    * Are subclasses of String considered primitive or not? It appears
from the Java code like they would return false for
type.isPrimitive().

Primitive sounds right to me.
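
A sketch of what that would mean in code, assuming primitiveness is inherited down the type tree: the Type class and the org.mydomain type names are hypothetical, and this is not how the Java implementation currently behaves according to your reading of it.

```python
class Type:
    """Hypothetical minimal model of a node in the UIMA type tree."""

    def __init__(self, name, supertype=None, primitive=False):
        self.name = name
        self.supertype = supertype
        self._primitive = primitive

    def is_primitive(self):
        # Walk up the supertype chain: a type counts as primitive if it,
        # or any of its ancestors, is declared primitive.
        current = self
        while current is not None:
            if current._primitive:
                return True
            current = current.supertype
        return False


TOP = Type("uima.cas.TOP")
STRING = Type("uima.cas.String", TOP, primitive=True)
# Hypothetical String subtype of the kind declared with <allowedValues>.
COLOR = Type("org.mydomain.Color", STRING)
# Hypothetical ordinary feature-structure type.
THING = Type("org.mydomain.Thing", TOP)
```

Here COLOR.is_primitive() is true because it inherits from uima.cas.String, while TOP and THING are not primitive.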

Congratulations on getting pycas to this point.
Eddie

On Feb 18, 2008 5:12 PM, Edward Loper <[EMAIL PROTECTED]> wrote:
> I played around with the existing python support for uima, and wasn't
> really satisfied with it.  It's all done through a swig interface to
> c++, and the result isn't exactly easy to use.  So I put together a
> pure-python package that provides support for reading and writing UIMA
> CAS data files. The main motivation behind writing this package was to
> allow UIMA data to be read and written by Python programs in a manner
> that is natural to the Python language.  Here's a very simple example
> use case:
>
> >>> import pycas
> >>> # Load a CAS from an XMI or an XCAS file:
> >>> cas = pycas.xml.load_cas('myDocument.xml', 'myTypeSystem.xml')
> >>> # Look up a type object from myTypeSystem.xml:
> >>> Token = cas.type_system['org.mydomain.Token']
> >>> # Iterate over all instances of that type, and perform some work:
> >>> for token in cas.get_annotation_index(Token):
> ...     token.someProperty = func(token.someOtherProperty)
> >>> # Write the modified CAS to an XMI file:
> >>> pycas.xml.save_cas(cas, 'myModifiedDocument.xml')
>
> I put up a temporary webpage for it:
>
> http://www.cis.upenn.edu/~edloper/pycas/
>
> I'd like to release it as an open source project, but wanted to get
> feedback from the good uima folks at apache & ibm first.  Some
> possibilities include: (a) releasing it as a standalone project; (b)
> incorporating it into the main UIMA project; and (c) adding it under
> the "corpus reader" subpackage of nltk (http://nltk.org).  (The name
> "pycas" could be changed as well -- I picked it by analogy with jcas.)
>
> n.b.: pycas does not attempt to provide support for many of the
> "framework" features of UIMA, including the ability to combine
> processing components together to create applications. It focuses only
> on providing access to the data structures that UIMA uses to manage
> annotations.  If someone else wants to extend what I've done, that's
> fine, but all I really wanted was convenient read/write access to UIMA
> data files.
>
> -Edward
>
