AW: General question about UimaFIT

2016-09-09 Thread Armin.Wegner
Hi Asher! As a workaround, you can use an empty type system, TypeSystemDescription tsd = TypeSystemDescriptionFactory.createTypeSystemDescription("EmptyTypeSystem"); add types programmatically, tsd.addType(typeName, null, CAS.TYPE_NAME_ANNOTATION); and get them later with Type type = cas.ge

CPE processors and analysis engines

2016-09-06 Thread Armin.Wegner
Hi! What's the best practice to combine analysis engines into CAS processors? Should every analysis engine become its own CAS processor? Should analysis engines be combined to aggregates which become CAS processors? What are the conditions for doing so: technical, semantic, logical? Best, Ar

AW: CPE memory usage

2016-08-28 Thread Armin.Wegner
Hi Jens, I just want to confirm your information. As you said, the query gets slower the larger start is, even using filters. The best solution is to get all ids first (may take some time), and then to get each document by id successively. There is a request handler (get) and a Java API method
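The two-step approach described in this message (collect all ids first, then fetch each document by id) can be sketched in plain Java. This is only an illustration: the `DocumentStore` interface, its methods, and the in-memory store are hypothetical stand-ins for the Solr request handler and client calls, which the truncated message does not show.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a document store such as Solr; this is not the SolrJ API.
interface DocumentStore {
    List<String> listAllIds();   // step 1: one (possibly slow) pass that returns every id
    String getById(String id);   // step 2: lookup of a single document by its id
}

class IdFirstFetcher {
    // Fetches documents one by one by id instead of paging with a growing start offset.
    static List<String> fetchAll(DocumentStore store) {
        List<String> texts = new ArrayList<>();
        for (String id : store.listAllIds()) {
            texts.add(store.getById(id));
        }
        return texts;
    }

    // Small in-memory store, used only for demonstration.
    static DocumentStore inMemory(Map<String, String> docs) {
        return new DocumentStore() {
            public List<String> listAllIds() { return new ArrayList<>(docs.keySet()); }
            public String getById(String id) { return docs.get(id); }
        };
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("1", "first text");
        docs.put("2", "second text");
        System.out.println(fetchAll(inMemory(docs)));
    }
}
```

In a real reader the second step would probably hand out one document per call rather than collect everything in a list; the list here only keeps the sketch short.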

AW: Ruta 2.4.0 - High memory needs

2016-08-18 Thread Armin.Wegner
Hello Peter, I found it thanks to your help. There was another Ruta script maliciously hiding in the pipeline setting up test annotations and therefore using all of Ruta's defaults. I discovered it as I used your code from the unit test which, of course, works perfectly fine. I will create Ruta

AW: Ruta 2.4.0 - High memory needs

2016-08-18 Thread Armin.Wegner
Hi Peter, it doesn't work like that for me. I've removed DefaultSeeder and added my own seeder implementing RutaAnnotationSeeder. Now, I have all of Ruta's standard tokens plus my own tokenization at the same time. Cheers, Armin -Original Message- From: Peter Klügl [mailto:peter.klu

AW: Ruta 2.4.0 - High memory needs

2016-08-18 Thread Armin.Wegner
Hello Peter! Please correct me if I'm wrong. My understanding of how Ruta works is as follows. 1. The RutaBasic annotations are always created. RETAINTYPE and FILTERTYPE have no influence on annotation creation. They only influence the use of those types in rules. 2. The configuration param

Ruta 2.4.0 - Setting parameters using Maven target

2016-08-17 Thread Armin.Wegner
Hi, how can I set configuration parameters of analysis engines created with the Maven plugin, before they are created of course? Can the parameters be configured from within the pom? Cheers, Armin

AW: CPE memory usage

2016-08-16 Thread Armin.Wegner
Hi Jens, nice tips. I will try that one with the filters, first. I just need to make a few changes. Thank you, Armin -Original Message- From: j...@grivolla.net [mailto:j...@grivolla.net] On behalf of Jens Grivolla Sent: Tuesday, 16 August 2016 13:34 To: user@uima.apache.o

AW: Ruta 2.4.0 - High memory needs

2016-08-10 Thread Armin.Wegner
Hi Peter, I will give it a try and report back in a few days. Thanks a lot, Armin -Original Message- From: Peter Klügl [mailto:peter.klu...@averbis.com] Sent: Wednesday, 10 August 2016 14:50 To: user@uima.apache.org Subject: Re: Ruta 2.4.0 - High memory needs Hi, 18MB of

Ruta 2.4.0 - High memory needs

2016-08-09 Thread Armin.Wegner
Hello again! One down, one to go. Are there best practices or tricks to reduce Ruta's memory needs? I tried to use the following script to merge names. Document{->GREEDYANCHORING(true)}; First+ Full {->MARK(Full)}; Full Last+ {->MARK(Full)}; First+ Last+ {->MARK(Full)}; Document{->GREEDYANCHORI

AW: CPE memory usage

2016-08-09 Thread Armin.Wegner
Hi! Finally, it looks like Solr causes the high memory consumption. The SolrClient isn't meant to be used the way I used it. But that isn't documented either. The Solr documentation is very bad. I just happened to find a solution on the web by accident. Thanks, Armin -Ursprüngliche Nach

AW: CPE memory usage

2016-08-08 Thread Armin.Wegner
Hi Richard! I've changed the document reader to a kind of no-op reader that always sets the document text to an empty string: same behavior, but a much slower increase in memory usage. Cheers, Armin -Original Message- From: Richard Eckart de Castilho [mailto:r...@apache.org] Gese

AW: CPE memory usage

2016-08-08 Thread Armin.Wegner
Hello Richard! No, I can't change the reader. It's reading from Solr. The response documents are put in a queue. The querying logic is done in hasNext(). hasNext() returns true if the queue is not empty. If the queue is empty, hasNext() sends a request to Solr and puts the response documents in
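The reader logic described in this message (a queue of response documents that hasNext() refills when it runs empty) can be sketched roughly as follows. The message does not include the actual reader, so the class name, the `fetchPage` function and the paging by offset are assumptions made for illustration; the Solr request is replaced by a plain function.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.function.IntFunction;

// Sketch of a reader that buffers one page of results at a time.
class QueueBackedReader {
    private final Deque<String> queue = new ArrayDeque<>();
    // Hypothetical page fetcher: maps a start offset to the next page of documents.
    private final IntFunction<List<String>> fetchPage;
    private int offset = 0;
    private boolean exhausted = false;

    QueueBackedReader(IntFunction<List<String>> fetchPage) {
        this.fetchPage = fetchPage;
    }

    // Returns true if a document is buffered; if the queue is empty, requests the next page first.
    boolean hasNext() {
        if (queue.isEmpty() && !exhausted) {
            List<String> page = fetchPage.apply(offset);
            if (page.isEmpty()) {
                exhausted = true;          // an empty page means there is nothing left to read
            } else {
                queue.addAll(page);
                offset += page.size();
            }
        }
        return !queue.isEmpty();
    }

    // Hands out the next buffered document; callers are expected to check hasNext() first.
    String next() {
        return queue.remove();
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("d1", "d2", "d3");
        // Demo fetcher: pages of at most two documents taken from an in-memory list.
        QueueBackedReader reader = new QueueBackedReader(
            start -> all.subList(Math.min(start, all.size()), Math.min(start + 2, all.size())));
        while (reader.hasNext()) {
            System.out.println(reader.next());
        }
    }
}
```

With this shape the queue never holds more than one page, so the reader's own buffer stays bounded regardless of collection size.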

CPE memory usage

2016-08-07 Thread Armin.Wegner
Hi! I'm using uimaFIT 2.2.0 and uimaj 2.8.1. The collection processing engine is slowly eating up all memory until it gets killed by the system. This happens even when I'm just running a collection reader and no other components (no analysis at all). Has anyone experienced a similar be

AW: Collection processing engine remove annotations

2016-07-15 Thread Armin.Wegner
Hello Johannes, that was exactly what I was looking for. It works fine, now. Thanks a lot, Armin -Original Message- From: Johannes Darms [mailto:johannes.da...@scai.fraunhofer.de] Sent: Friday, 15 July 2016 09:35 To: user@uima.apache.org Subject: Re: Collection processing e

Collection processing engine remove annotations

2016-07-15 Thread Armin.Wegner
Hi! How can I remove annotations in a collection processing engine? Doing it in process() of an annotator failed. Is this even possible? Best, Armin

AW: Ruta Maven Plugin

2015-10-06 Thread Armin.Wegner
Hi Peter, I've created a new ruta file with only a single line and nothing else. TYPESYSTEM org.apache.uima.ruta.engine.BasicTypeSystem; In the generated type system file you read if using importByName. Without importByName there is no BasicTypeSystem at all. But the types seem to

AW: Ruta Maven Plugin

2015-10-06 Thread Armin.Wegner
Hi Peter, I'd like to add something to my last post. I can force that exception to occur by setting resolveImports to true in the plain Ruta project. There's no Java yet. Regards, Armin -Original Message- From: armin.weg...@bka.bund.de [mailto:armin.weg...@bka.bund.de] Sent: Di

AW: Ruta Maven Plugin

2015-10-06 Thread Armin.Wegner
Hi Peter, I'm using 2.3.1, now. I set importByName to true and added the DateTypeSystem.xml to uimaFIT's types.txt. It now throws InvalidXMLException: An import could not be resolved. No file with the name "BasicTypeSystem.xml" was found in the class path or data path. The types.txt file cre

AW: Ruta Maven Plugin

2015-10-06 Thread Armin.Wegner
Hi Peter, I thought that you could tell me where this temp dir is coming from. But as you did not, I now suspect that it's related to the CPE. The CPE builder is materializing all descriptors into a temp dir. So the whole problem is most likely caused by some resources referenced in a descripto

AW: Ruta Maven Plugin

2015-10-06 Thread Armin.Wegner
Hi Peter, this helped a little bit, but it is still not running. I had to add the resources section to the pom. ... src/main/ruta s

Ruta Maven Plugin

2015-10-05 Thread Armin.Wegner
Hi, how is ruta-maven-plugin supposed to be used? Is there a detailed step by step description? I've created a new empty maven project, added a script in the source folder src/main/ruta and a text file containing a list of words to src/main/resources. mvn package builds a ...Engine.xml and a .

AW: DKPro NamedEntity ClassCastException

2015-07-30 Thread Armin.Wegner
Hi Richard! No, I don't use initialize() without args directly. I use initialize(UimaContext context) and call super.initialize(context). Best, Armin -Original Message- From: Richard Eckart de Castilho [mailto:r...@apache.org] Sent: Thursday, 30 July 2015 14:46 To: user@ui

AW: DKPro NamedEntity ClassCastException

2015-07-30 Thread Armin.Wegner
Hello Richard and Peter! The problem is solved but not really understood. a) Downgrading from uimaFIT 2.1.0 to uimaFIT 2.0.0 helped. So, it must be connected to 2.1.0 in some way. b) Adding an annotator with cas.getJCas() or jcas.getCas().getJCas() in process() to the beginning of the pipeline do

AW: DKPro NamedEntity ClassCastException

2015-07-30 Thread Armin.Wegner
Hello Peter! That works but doesn't solve the underlying problem. The line is from DKPro's StanfordNamedEntityRecognizer. Using your solution, I get the same error with ClearTK-TimeML. There must be something wrong elsewhere. If I remember correctly, Richard said that it may be the initialization

DKPro NamedEntity ClassCastException

2015-07-30 Thread Armin.Wegner
Hello! I'm getting a java.lang.ClassCastException: org.apache.uima.cas.impl.AnnotationImpl cannot be cast to de.tudarmstadt.ukp.dkpro.core.api.ner.type.NamedEntity using the annotator below in a CPE. It's a Maven project using de.tudarmstadt.ukp.dkpro.core.stanfordnlp-gpl:1.6.1, de.tudarmsta

Ruta - UIMA-4062

2015-07-21 Thread Armin.Wegner
Hi Peter! The change request UIMA-4062 is implemented, isn't it? So how does an end user use it? How can I read a wordlist as a UIMA external resource once and use it with Ruta.apply() and MARKFAST on every CAS? Thanks, Armin

log4j

2015-06-29 Thread Armin.Wegner
Hi! How can I use log4j with UIMA? Specifying -Dlog4j.configuration=file: and -Dorg.apache.uima.logger.class=org.apache.uima.util.impl.Log4jLogger_impl on the VM command line yields a lot of INFO messages from a lot of *_impl classes I do not want to see. These messages are not logged with java.uti

AW: Marking consecutive tokens with RUTA

2015-06-10 Thread Armin.Wegner
Hi, yeah, that once hit me, too. It has something to do with the internal sorting of annotations with the same start offset. I annotated some meta data for the whole document in an annotation with start offset 0 and end offset 0. That's not good. The end offset must be the length of the documen

AW: Developing UIMA Ruta Rules in Maven Projects

2015-04-02 Thread Armin.Wegner
Hi Peter! There is no sarcasm at all. I really want to use Maven. It works fine for me. Convention over configuration is a nice thing. And I prefer programming/typing over clicking. It gives me more control, is more stable and such... Cheers, Armin -Original Message- From: Pete

AW: Developing UIMA Ruta Rules in Maven Projects

2015-04-01 Thread Armin.Wegner
Hi Peter! > Experienced developers would maybe only use maven-based ruta project in > future and would not rely on the old Workspace projects at all. Exactly, let me program it... > I assume that a user could convert a ruta project to a maven project and > do the building by configuring the po

AW: RUTA and shared resources

2015-01-27 Thread Armin.Wegner
Hi! Looks good, but it is not part of the current release. It's not urgent enough to deviate from the current stable release. Any idea when 2.3.0 will be released? Thanks, Armin -Original Message- From: Silvestre Losada [mailto:silvestre.los...@gmail.com] Sent: Sunday, 25. Janua

AW: Using OpenNLP type annotations with UIMAfit

2015-01-26 Thread Armin.Wegner
Hi Aleksandar! For full flexibility I use CAS (not JCas). It's a bit inelegant to use, but you can introduce new types at runtime. Together with UIMAfit it is very nice in JUnit tests. And you can set types (type names) as annotator parameters. For example, you can choose the input and output t

AW: RUTA and shared resources

2015-01-23 Thread Armin.Wegner
Hi Peter! Thanks for your help. I will look at it. At least for now, greedy anchoring and markfast work as expected. But I've used only short word lists with simple entries. Cheers, Armin -Original Message- From: Peter Klügl [mailto:pklu...@uni-wuerzburg.de] Sent: Donners

RUTA and shared resources

2015-01-22 Thread Armin.Wegner
Hello! This is a very short and simple gazetteer using RUTA. Document{->GREEDYANCHORING(true)}; %s*{->MARKFAST(%s,'%s')}; where the first %s is replaced using String.format() by the name of the source type, the second %s is replaced by the target type name, and the third %s is replaced by the URL
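The template substitution described in this message can be sketched in plain Java. The script template is taken from the message; the class name, the example type names and the example URL are made up, and what the URL points to is an assumption (presumably the word list), since the message is cut off at that point.

```java
// Builds the Ruta gazetteer script from the template quoted in the message.
class GazetteerScript {
    // Placeholders in order: source type, target type, URL (presumably of the word list).
    static final String TEMPLATE =
        "Document{->GREEDYANCHORING(true)}; %s*{->MARKFAST(%s,'%s')};";

    static String build(String sourceType, String targetType, String url) {
        return String.format(TEMPLATE, sourceType, targetType, url);
    }

    public static void main(String[] args) {
        // Example values only; real type names and locations depend on the project.
        System.out.println(build("Token", "Person", "file:names.txt"));
    }
}
```

The resulting string would then be handed to Ruta; how that is done is not shown in the snippet and is left out here.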

A simple CAS Consumer for populating a Solr Index

2014-11-27 Thread Armin.Wegner
Hi Rob! This simple code example sends annotations of type Person, Location and Organization to a Solr server. The fields text, person, location, and organization must be defined in Solr as well. You need the org.apache.solr:solr-solrj jar, version 4.9.0 or higher. Regards, Armin public class Sol

AW: Exception handling

2014-11-11 Thread Armin.Wegner
Hello Sumit and Richard! I've used CpeBuilder from uimaFIT as basis for my own. Essentially, I removed setAnalysisEngine() and made addProcessor() public. Doing so I can set everything I need so far. But it's a very specific and therefore limited first try. Nothing to be contributed yet. There

AW: AW: Exception handling

2014-11-07 Thread Armin.Wegner
Hi Sumit! Setting dropCasOnException works. org.apache.uima.fit.cpe.CpeBuilder::createProcessor() is private and ACTION_ON_MAX_ERROR a static field. It seems to me that the CpeBuilder is missing some methods. There should be a method to create a new processor from a given analysis engine with

AW: Exception handling

2014-11-07 Thread Armin.Wegner
Hi Sumit, I've got a CPE up and running. But where do the interfaces CpeIntegratedCasProcessor and CpeCasProcessors fit in? Can you point me to some documentation or example source code, please? Thanks, Armin -Original Message- From: Sumit Madan [mailto:sumit.ma...@scai.fraunhofe

AW: Filter Cas from UIMA fit pipeline

2014-11-07 Thread Armin.Wegner
Hi Carsten, I've never used it, but according to the documentation you can do this with a flow controller. The bad thing is, Richard told me a while ago that it is not so easy to build your own flow controller. Cheers, Armin -Original Message- From: Carsten Schnober [mailto:schn

Exception handling

2014-11-06 Thread Armin.Wegner
Hi! An exception in an analysis engine causes the whole pipeline to crash. That's not what I want. The processing of the current document should stop. To do so, I can catch the exception in that engine. But afterwards no more engines should touch that document. The pipeline should continue with

PearPackagingMavenPlugin and CVS

2014-10-23 Thread Armin.Wegner
Hi! PearPackagingMavenPlugin copies the CVS subdirs to the PEAR. Can this be changed? How? Cheers, Armin

AW: AW: Lucas

2014-08-26 Thread Armin.Wegner
Hi Erik and Jörn, I've used Solr in the meantime. It is so easy to quickly write a CAS consumer that sends documents to a Solr web service. Writing to a Lucene index is minimally more work. Could this be the reason why nobody cares about the outdated version? Is there really a need for Lucas an

AW: Lucas

2014-08-12 Thread Armin.Wegner
Hi Renauld, that's nice, thank you. Are you using Lucene 4.x or an older version? It's been a while since I asked that question and I didn't get much response. Is the project dead? Is it just too easy to code a simple annotator for Lucene or Solr to justify the effort maintaining Lucas and So

Lucas

2014-07-28 Thread Armin.Wegner
Hi! Is someone using Lucas? It seems to be slightly outdated. It depends on Lucene 2.9.3. Lucene is at version 4.9.0 right now. Is there an alternative? Regards, Armin

AW: DKpro StanfordNamedEntityRecognizer ClassCastException

2014-07-25 Thread Armin.Wegner
Hi Richard, Nailed it. The pipeline with DKPro's StanfordNamedEntityRecognizer does work with uimafit-core:jar:2.0.0 and uimaj-core:jar:2.4.2 but it does not work with uimafit-core:jar:2.1.0 and uimaj-core:jar:2.6.0. It runs with uimafit-core:2.0.0 and uimaj-core:2.6.0, too. Thanks a lot, Arm

AW: DKpro StanfordNamedEntityRecognizer ClassCastException

2014-07-24 Thread Armin.Wegner
Hi Richard! It looks like you're absolutely right. I have changed all JCas stuff in the consumer's resource to pure CAS and it works. But why is JCas support not initialized? The reader calls an annotator for document meta data that uses JCas. It says "new DocumentMetaData(cas.getJCas())" and ac

AW: DKpro StanfordNamedEntityRecognizer ClassCastException

2014-07-24 Thread Armin.Wegner
Hello Richard! Your fix doesn't change anything. So I tried to narrow down the problem. At least, I can tell that it is not a problem specific to DKPro. I have the same kind of exception when not using DKPro at all. My guess now is that it maybe has something to do with chaining resources. I tr

AW: uimaFIT - types.txt

2014-07-24 Thread Armin.Wegner
Thank you, Richard. It works. -Original Message- From: Richard Eckart de Castilho [mailto:r...@apache.org] Sent: Tuesday, 22 July 2014 17:23 To: user@uima.apache.org Subject: Re: uimaFIT - types.txt Rather see these instructions in the latest version of the uimaFIT refere

uimaFIT - types.txt

2014-07-22 Thread Armin.Wegner
Hi, The final runnable jar contains the META-INF/org.apache.uima.fit/types.txt from a maven dependency and not from the project itself. Can something be done about this? Cheers Armin

AW: Ruta - MARKFAST

2014-06-30 Thread Armin.Wegner
Hi, Peter! I got that. I restricted MARKFAST on segments. It works just nearly perfect. How does MARKFAST match things? Using Document{->MARKFAST(MyType, { "a", "b", "a b" }); on a b yields "a b" and "b" but not "a". I would like to have "a" as well. Can this be done? By the way: I love R

Ruta - MARKFAST

2014-06-30 Thread Armin.Wegner
Hello! On which annotation type does MARKFAST work? Can I restrict MARKFAST to a single annotation type, say my own token type? It would be nice to restrict a Ruta script to a set of annotations by giving that set of annotations explicitly, like Document{-> INPUT(Token, Organization, Location)

AW: Restricting an aggregate engine to a substring or mention

2014-06-25 Thread Armin.Wegner
Hi Richard, you're right. I have to use new CASes or views. Or I can use the same CAS and restrict the analysis engine to a substring. But that would imply having parameters for the substring's begin and end offsets in the analysis engine: Oh, wait a minute, wasn't that my original question? C

Annotation sets?

2014-06-23 Thread Armin.Wegner
Hello! Are there annotation sets in UIMA? With annotation sets you can group annotations. For example, you may have named entity annotations in a gold standard set and the actual named entities found by an analysis engine in another set. In both sets the location entities are named Location,

AW: Restricting an aggregate engine to a substring or mention

2014-06-22 Thread Armin.Wegner
Hello! I've got another maybe not so good idea. Why not pass an aggregate analysis engine as a parameter? First, build an aggregate analysis engine the usual way. Second, serialize it to an XML-string. Third, pass that string to the SegmentProcessingAE as String parameter together with another

AW: Restricting an aggregate engine to a substring or mention

2014-06-20 Thread Armin.Wegner
Hi Oli! If I get it right, the ability for restricting processing to mentions of given types is inherited from a base class. So every analysis engine that should do this must inherit from that base class. Sure, that's one way of doing it. But it's part of the analysis engine. Thanks, Armin -

AW: uimafit - String[] parameter in Resource_ImplBase

2014-06-04 Thread Armin.Wegner
Hello Richard! I would like to have a writer that writes all mentions of a given type. The type is given by name as an AE parameter. The way the mentions are formatted should be interchangeable. So the formatter varies and should be encapsulated as an AE resource (or maybe not?). public class A

WG: uimafit - String[] parameter in Resource_ImplBase

2014-06-03 Thread Armin.Wegner
Hi, I cancelled it. Actually, I don't have a resource. I just tried to modularize my code a little bit. But uimafit's use of injection makes this difficult and no fun at all. Some people consider using injection to be a good programming style. I personally hate it. It kills my highly modulariz

AW: uimafit - String[] parameter in Resource_ImplBase

2014-05-19 Thread Armin.Wegner
Hi Richard, I will try that and report back. Thanks Armin -Original Message- From: Richard Eckart de Castilho [mailto:r...@apache.org] Sent: Monday, 19 May 2014 11:11 To: user@uima.apache.org Subject: Re: uimafit - String[] parameter in Resource_ImplBase Hi Armin, UIMA onl

uimafit - String[] parameter in Resource_ImplBase

2014-05-19 Thread Armin.Wegner
Hi! I can't use a configuration parameter of type String[] in a class extending Resource_ImplBase. A cast exception is thrown. A simple String works fine. Arrays don't. Is it even possible to use an array of String as parameter with Resource_ImplBase? Cheers Armin

AW: Working with very large text documents

2013-10-18 Thread Armin.Wegner
Dear Jens, dear Richard, Looks like I have to use a log file specific pipeline. The problem was that I did not know it before the process crashed. It would be so nice to have a general approach. Thanks, Armin -Original Message- From: Richard Eckart de Castilho [mailto:r...@apache.

AW: Working with very large text documents

2013-10-18 Thread Armin.Wegner
Hi Jens, It's a log file. Cheers, Armin -Original Message- From: Jens Grivolla [mailto:j+...@grivolla.net] Sent: Friday, 18 October 2013 11:05 To: user@uima.apache.org Subject: Re: Working with very large text documents On 10/18/2013 10:06 AM, Armin Wegner wrote: > What a

AW: Working with very large text documents

2013-10-18 Thread Armin.Wegner
Hi Richard, As far as I know, Java strings cannot be longer than 2 GB on 64bit VMs. Armin -Original Message- From: Richard Eckart de Castilho [mailto:r...@apache.org] Sent: Friday, 18 October 2013 10:43 To: user@uima.apache.org Subject: Re: Working with very large text doc

Working with very large text documents

2013-10-18 Thread Armin.Wegner
Hi, What are you doing with very large text documents in a UIMA pipeline, for example 9 GB in size? A. I expect that you split the large file before putting it into the pipeline. Or do you use a multiplier in the pipeline to split it? Anyway, where do you split the input file? You can not jus

AW: HashMap as type feature

2013-10-17 Thread Armin.Wegner
Hi Thomas, thanks for your answer. Using HashMap, does the n-th element of keySet() always correspond to the n-th element of values()? Is this a defined behavior in Java? Cheers, Armin -Original Message- From: Thomas Ginter [mailto:thomas.gin...@utah.edu] Sent: Wednesday, 16.
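One way to sidestep the ordering question raised here is to iterate over entrySet(), which yields each key together with its value, so the n-th key and the n-th value match by construction. The following is a plain-Java sketch under that assumption; the class and method names are made up, and how the thread actually stored the map in a CAS is not shown in this snippet.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MapToParallelArrays {
    // Copies keys and values into two arrays with matching positions by walking entrySet() once.
    static String[][] toParallelArrays(Map<String, String> map) {
        String[] keys = new String[map.size()];
        String[] values = new String[map.size()];
        int i = 0;
        for (Map.Entry<String, String> entry : map.entrySet()) {
            keys[i] = entry.getKey();
            values[i] = entry.getValue();
            i++;
        }
        return new String[][] { keys, values };
    }

    public static void main(String[] args) {
        Map<String, String> map = new LinkedHashMap<>();
        map.put("lang", "de");
        map.put("source", "mail");
        String[][] arrays = toParallelArrays(map);
        for (int i = 0; i < arrays[0].length; i++) {
            System.out.println(arrays[0][i] + " = " + arrays[1][i]);
        }
    }
}
```

Two parallel string arrays like these are one possible shape for carrying map content around; if a stable iteration order matters as well, LinkedHashMap or TreeMap gives one, whereas HashMap makes no promise about order.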

AW: HashMap as type feature

2013-10-17 Thread Armin.Wegner
Dear Richard, to use StringStringMapEntry, needn't it subclass TOP or FeatureStructure? Is it possible to store an arbitrary object into a CAS? Cheers, Armin -Original Message- From: Richard Eckart de Castilho [mailto:r...@apache.org] Sent: Wednesday, 16 October 2013 18:02 To:

AW: Processing a List of Strings with UIMA Addons components

2013-08-06 Thread Armin.Wegner
Dear Marshall, Consider an input text from which only some parts should be processed. After processing the text should be there in one piece again. Let A denote parts of no interest and let b denote parts to analyse further. XAX is split up into X, A, and X. There is nothing to do for the X seg

AW: Java level prerequisite upgrade?

2013-07-28 Thread Armin.Wegner
No, not for me. You can even switch to Java 7. Armin -Original Message- From: Marshall Schor [mailto:m...@schor.com] Sent: Sunday, 28 July 2013 16:05 To: uima-user Subject: Java level prerequisite upgrade? Dear Users, The UIMA developers would like to be able to start using

Ruta with Eclipse Kepler installation problem

2013-07-25 Thread Armin.Wegner
Hi, Ruta Workbench 2.0.1 cannot be installed on Eclipse Kepler because the dependency on DLTK is not satisfied. Any ideas? Armin

AW: UIMA 2.4.0 - AnnotationIndex.Subiterator

2013-06-19 Thread Armin.Wegner
Hello David, that works perfectly well. Thank you, Armin -Original Message- From: David Garcia Narbona [mailto:david.garc...@barcelonamedia.org] Sent: Wednesday, 19 June 2013 13:01 To: user@uima.apache.org Subject: Re: UIMA 2.4.0 - AnnotationIndex.Subiterator Sorry, I meant

AW: uimafit - configuration parameter used twice

2013-06-19 Thread Armin.Wegner
Hi Richard, I'm using uimafit's Resource_Impl, now. It is even easier to use than Initializable. Thanks for all your fast help, Armin -Original Message- From: Richard Eckart de Castilho [mailto:richard.eck...@gmail.com] Sent: Friday, 14 June 2013 09:40 To: user@uima.apache.o

Java objects living outside and inside of a pipeline

2013-06-19 Thread Armin.Wegner
Hi, I'd like to use Java objects in a pipeline which are constructed before the pipeline is run and which are still there after the pipeline has finished its job. Is this even possible? Cheers, Armin

UIMA 2.4.0 - AnnotationIndex.Subiterator

2013-06-19 Thread Armin.Wegner
Hi, Using this code AnnotationIndex indexA = cas.getAnnotationIndex(typeA); FSIterator itA = indexA.iterator(); // outer loop while (itA.hasNext()) { AnnotationFS annotA = itA.next(); FSIterator itB = indexA.subiterator(annotA); // inner loop while (itB.hasNext())

Uimafit source browser

2013-06-17 Thread Armin.Wegner
Hi, the uimafit source browser at code.google shows only HTML source code. Cheers, Armin

uimafit - configuration parameter used twice

2013-06-13 Thread Armin.Wegner
Hi, the following code uses a file namer class of class org.uimafit.factory.initializable.Initializable to create a CAS consumer. aggregateBuilder.add(AnalysisEngineFactory.createPrimitiveDescription(Writer.class, Writer.PARAM_OUTPUT_DIRECTORY_PATH, "output", Writer.PARAM_FILE_NAMER_CLASS_NAM

TikaAnnotator 2.3.2-SNAPSHOT - Tika 0.8

2013-06-10 Thread Armin.Wegner
Hi, TikaAnnotator depends on Tika 0.8. The current version of Tika is 1.3. Is there a newer version of TikaAnnotator which does run with Tika 1.3? Cheers, Armin

UIMA Dictionary Annotator - Change Request

2013-05-29 Thread Armin.Wegner
Hi, What is the right way to make a change request for Apache UIMA Dictionary Annotator? Cheers, Armin

AW: Ruta - MARKFAST

2013-05-23 Thread Armin.Wegner
Hello Jörn, absolutely right. But for now I'm still a newbie. That's why I'm asking so much. Cheers, Armin -Original Message- From: Jörn Kottmann [mailto:kottm...@gmail.com] Sent: Thursday, 23 May 2013 14:24 To: user@uima.apache.org Subject: Re: Ruta - MARKFAST On 05/23/2

Ruta - Optional First Token Problem

2013-05-23 Thread Armin.Wegner
Hi! In Ruta 2.0.2-SNAPSHOT, rules with an optional first element do not work. The optional part seems to be mandatory. Using DECLARE Test; "a"? "b" "c"{->MARK(Test, 1, 3)}; on a b c x b c marks "a b c" (0, 5) but not "b c" (8, 11). Cheers, Armin

AW: AW: Ruta - MARKFAST

2013-05-23 Thread Armin.Wegner
Hello Peter, Now that I understand it, it's a nice feature. By the way, where can I find good documentation of Ruta? I only know of http://people.apache.org/~pkluegl/site/textmarker-current/tools.textmarker.book.html and http://tmwiki.informatik.uni-wuerzburg.de/. A more detailed description

AW: AW: Ruta 2.0.2-SNAPSHOT - Eclipse Plugin Installation

2013-05-22 Thread Armin.Wegner
Hello Peter, I used the Eclipse update site you provided. Thanks. But I also tried to make the Maven build. "mvn clean install" on trunk succeeded. "mvn clean package" on ruta-eclipse-update-site failed because ant can't find eclipse.home (ECLIPSE_HOME). Regards, Armin -Ursprüngliche Nac

AW: Ruta 2.0.2-SNAPSHOT - Eclipse Plugin Installation

2013-05-22 Thread Armin.Wegner
Hi Peter, the Eclipse update site does not work. Eclipse does not recognize it as an update site. Cheers, Armin -Original Message- From: Peter Klügl [mailto:pklu...@uni-wuerzburg.de] Sent: Tuesday, 21 May 2013 13:24 To: user@uima.apache.org Subject: Re: Ruta 2.0.2-SNAPSHO

AW: Ruta - MARKFAST

2013-05-22 Thread Armin.Wegner
Hi Peter, your example does work perfectly fine. But try this as word list and input document: nach Christus nach der Zeitenwende n. C. n.C. nC. n. Chr. n. d. Z. n.d.Z. unserer Zeit unserer Zeitrechnung u. Z. u.Z. v. C. v.C. vC. v. Chr. v. d. Z. v.d.Z. vor Christus vor der Zeitenwende vor unsere

Ruta - MARKFAST

2013-05-21 Thread Armin.Wegner
Hello! Is there any possibility to match strings like nC. v. Chr. with MARKFAST? Cheers, Armin

AW: Ruta - Token Order

2013-05-21 Thread Armin.Wegner
Hi Peter, I think that the rule doesn't matter. But I tried to find calendar dates. To find out what was going wrong I reduced the original more complex rule to DECLARE Date; Document{->RETAINTYPE(BREAK, SPACE)}; NUM{REGEXP("\\d\\d")->MARK(Date, 1, 2)} PERIOD; on the input text 12. Mai 1803

Ruta - Token Order

2013-05-21 Thread Armin.Wegner
Hi, In Ruta 2.0.2-SNAPSHOT a token with begin offset 0 and end offset 2 comes before a token with begin offset 0 and end offset 0. The token order is not as I expected. Thus in my case, SourceDocumentAnnotation was the second token in the token sequence and the rule didn't match. It took me som
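The order observed here is consistent with sorting by begin offset ascending and, for equal begin offsets, by end offset descending, which is how the built-in UIMA annotation index orders annotations as far as I know (type priorities as a further tie-breaker are left out). The sketch below reproduces only that ordering rule in plain Java; the `Span` class is a made-up stand-in for an annotation and nothing here calls Ruta or UIMA.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SpanOrder {
    // Minimal stand-in for an annotation: just begin and end offsets.
    static final class Span {
        final int begin;
        final int end;
        Span(int begin, int end) { this.begin = begin; this.end = end; }
        @Override public String toString() { return "(" + begin + "," + end + ")"; }
    }

    // Begin ascending, then end descending: for the same start offset the longer span comes first.
    static final Comparator<Span> ORDER =
        Comparator.comparingInt((Span s) -> s.begin)
                  .thenComparing(Comparator.comparingInt((Span s) -> s.end).reversed());

    static List<Span> sorted(List<Span> spans) {
        List<Span> copy = new ArrayList<>(spans);
        copy.sort(ORDER);
        return copy;
    }

    public static void main(String[] args) {
        // A zero-length span at offset 0 ends up after the span (0,2).
        System.out.println(sorted(Arrays.asList(new Span(0, 0), new Span(0, 2), new Span(3, 5))));
    }
}
```

Under this rule a zero-length annotation at offset 0 is never the first element when another annotation also starts at 0, which would explain why a rule anchored on the first token did not match.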

AW: Ruta 2.0.2 - Grouping Problems

2013-05-21 Thread Armin.Wegner
Hi Peter, It works. Thank you very much. Armin -Original Message- From: Peter Klügl [mailto:pklu...@uni-wuerzburg.de] Sent: Friday, 17 May 2013 17:19 To: user@uima.apache.org Subject: Re: Ruta 2.0.2 - Grouping Problems Hi, the problem should be fixed now. Best, Peter

Ruta 2.0.2-SNAPSHOT - Eclipse Plugin Installation

2013-05-21 Thread Armin.Wegner
Hi! I've checked out Ruta 2.0.2-SNAPSHOT with svn checkout https://svn.apache.org/repos/asf/uima/sandbox/ruta/trunk and built it successfully with mvn clean install. Now, how do I install the Eclipse plugins? Is there a local repository or update site for Eclipse? Or, which files need to be copied ma

Ruta 2.0.2 - Grouping Problems

2013-05-17 Thread Armin.Wegner
Hello! Let A, B, C, D and F denote type names. Then, A B? C D{->MARK(F, 1, 4)} works. A (B)? C D{->MARK(F, 1, 4)} causes a NullPointerException. (A B)? C D{->MARK(F, 1, 4)} causes an ArrayIndexOutOfBoundsException: -1. Any ideas? Cheers, Armin

Ruta - RETAINTYPE

2013-05-16 Thread Armin.Wegner
Hello! In Ruta SNAPSHOT-2.0.1 Document{->RETAINTYPE(ALL)}; does not retain SPACE. It is the same with ANY and WS. By the way, where can I get the newest version of Ruta (jar, svn, etc)? Cheers, Armin

Textmarker/Ruta - negative look behind/ahead

2013-05-07 Thread Armin.Wegner
Hello! Is there a mechanism like negative look behind or negative look ahead in Ruta? Cheers, Armin

AW: AW: Textmarker - Qualification of Types

2013-05-07 Thread Armin.Wegner
Hi Peter, That was really helpful. Thanks again, Armin -Original Message- From: Peter Klügl [mailto:pklu...@uni-wuerzburg.de] Sent: Monday, 6 May 2013 15:39 To: user@uima.apache.org Subject: Re: AW: Textmarker - Qualification of Types Hi, I should have mentioned that you c

AW: AW: Textmarker - Qualification of Types

2013-05-07 Thread Armin.Wegner
Hi Peter, I've uninstalled Textmarker and installed Ruta from your update sites. It seems to work. Thank you, Armin -Original Message- From: Peter Klügl [mailto:pklu...@uni-wuerzburg.de] Sent: Monday, 6 May 2013 10:56 To: user@uima.apache.org Subject: Re: AW: Textmarker - Q

Ruta 2.0.1-SNAPSHOT - CVS directory exception

2013-05-07 Thread Armin.Wegner
Hello Peter, in 2.0.1-SNAPSHOT the RutaLauncher complains about the CVS directory in the input folder: Exception in thread "main" java.io.FileNotFoundException: /input/CVS (Is a directory) at java.io.FileInputStream.open(Native Method) at java.io.FileInputStream.(FileInputStream

AW: How to create and use a repository for UIMA annotators?

2013-05-06 Thread Armin.Wegner
Hi, In my opinion, the best way to do it, is to use an empty type system with the collection reader. You can create one with TypeSystemDescriptionFactory.createTypeSystemDescription(). If one of your annotators needs a type system, add it there, e. g. AnalysisEngineFactory.createPrimitiveDescr

AW: Textmarker - Qualification of Types

2013-05-05 Thread Armin.Wegner
Hi Peter, That is fine. I'm using the 2.0.0 core jar from Maven Central. Can you give me a snapshot update site, please? Thank you, Armin -Original Message- From: Peter Klügl [mailto:pklu...@uni-wuerzburg.de] Sent: Friday, 3 May 2013 15:04 To: user@uima.apache.org Subject: Re:

Textmarker - Qualification of Types

2013-05-03 Thread Armin.Wegner
Hi, I'm running Textmarker on a CAS XMI file with a lot of annotations from different annotators and different type systems. There are some type names used more than once, but with different namespaces. All types are defined in the type system included with TYPESYSTEM. Prepending the namespace

AW: uimaFit way of creating an analysis description from an XML descriptor file

2013-04-30 Thread Armin.Wegner
Hi Richard, The file is in the file system. But I don't want to create an AnalysisEngine with AnalysisEngineFactory.createAnalysisEngineFromPath(). I'd rather like to have an AnalysisEngineDescription. But there is no method createAnalysisEngineDescriptionFromPath(). Is there an easy way to ge

AW: uimaFit way of creating an analysis description from an XML descriptor file

2013-04-30 Thread Armin.Wegner
Hi Richard, I'm not talking about type system descriptors, but about analysis engine descriptors. I would like to create an AnalysisEngineDescription from an analysis engine descriptor file, e. g. like the one the Textmarker Workbench created in a Textmarker Eclipse project. I'd like to add this Anal

AW: Getting the effective type system

2013-04-30 Thread Armin.Wegner
Hello Richard, using your second suggestion, I've written a very simple CAS consumer like the one in [2]. It's a one-liner and works fine: public final void process(final CAS cas) throws AnalysisEngineProcessException { try { TypeSystemUtil.typeSystem2TypeSystemDescripti
