Re: Limiting the memory used by an annotator ?

2017-04-29 Thread Thilo Goetz
In situations like these, I usually limit the size of the input documents. There are various policies you can adopt. You can refuse to handle long documents; you can cut off long documents at an arbitrary point; or you can split long documents at more or less sensible positions (try to find a p
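The third policy above (splitting at more or less sensible positions) could be sketched as follows. This is only an illustration under assumptions: `DocumentSplitter` and `maxChars` are hypothetical names, not part of UIMA, and "sensible" is taken to mean a blank line first, then a line break, then an arbitrary cut.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of UIMA: caps the size of each input chunk. */
final class DocumentSplitter {

    /**
     * Splits text into chunks of at most maxChars chars. A chunk is cut after
     * a blank line if one ends inside the window, else after a line break,
     * else at the size limit. Concatenating the chunks gives back the input.
     */
    static List<String> split(String text, int maxChars) {
        if (maxChars <= 0) {
            throw new IllegalArgumentException("maxChars must be positive");
        }
        List<String> chunks = new ArrayList<>();
        int start = 0;
        while (text.length() - start > maxChars) {
            int limit = start + maxChars;                   // exclusive upper bound
            int para = text.lastIndexOf("\n\n", limit - 2); // last blank line ending in window
            int line = text.lastIndexOf('\n', limit - 1);   // last line break in window
            int cut;
            if (para >= start) {
                cut = para + 2;   // keep the blank line with the earlier chunk
            } else if (line >= start) {
                cut = line + 1;
            } else {
                cut = limit;      // arbitrary cut as a last resort
            }
            chunks.add(text.substring(start, cut));
            start = cut;
        }
        if (start < text.length()) {
            chunks.add(text.substring(start));              // remainder fits in one chunk
        }
        return chunks;
    }
}
```

Each chunk would then be processed as its own document. Annotation offsets are relative to the chunk, so the running start offset has to be kept if results need to be mapped back to the original text. The last-resort cut works on `char` positions and can in principle fall between the two halves of a surrogate pair.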

Re: Right to Left Languages in CVD and Document Analyzer

2015-10-26 Thread Thilo Goetz
Hi Davood, unfortunately, that is not possible. --Thilo On 10/25/2015 04:51 PM, d.heidarp...@ut.ac.ir wrote: Hi Thilo, There's no problem in displaying the Farsi Text from right to left for me, actually I need right alignment for the text. Is there a way to change the alignment from left to ri

Re: AW: Working with very large text documents

2013-10-18 Thread Thilo Goetz
Don't you have a hadoop cluster you can use? Hadoop would handle the file splitting for you, and if your UIMA analysis is well-behaved, you can deploy it as a M/R job, one record at a time. --Thilo On 10/18/2013 12:25 PM, armin.weg...@bka.bund.de wrote: Hi Jens, It's a log file. Cheers, Ar

Re: Designing collection readers: Reading multiple XML files containing multiple CASes

2013-10-07 Thread Thilo Goetz
I just want to point out that there is an alternative. I never use collection readers and cas consumers myself. Instead, I do the reading of the input and the aggregation of the output outside the framework, where I have more control over things. Just my opinion though. See http://uima.apa

Re: custom FSRepository(?)

2013-07-19 Thread Thilo Goetz
On 07/19/2013 11:03 AM, Ingo Thon wrote: Hi List-Members, I'm using UIMA in a very large project. For two reasons I would like to store annotations /partly and the SofaText/SofaStream: 1.) The workflow of our application is roughly as follows: First, UIMA AE is used to add Meta Data to the arti

Re: Multiple References to an Array

2013-07-02 Thread Thilo Goetz
On 07/01/2013 07:39 PM, John David Osborne (Campus) wrote: Thanks Thilo, that was helpful. Is this (1.0) the standard you were referring to? http://docs.oasis-open.org/uima/v1.0/uima-v1.0.html -John Yes. On 6/20/13 7:09 AM, "Thilo Goetz" wrote: On 06/19/2013 10:14 PM,

Re: Multiple References to an Array

2013-06-20 Thread Thilo Goetz
On 06/19/2013 10:14 PM, John David Osborne (Campus) wrote: Does anybody know the underlying reason why this WARNING is generated? WARNING: Warning: multiple references to an array. Reference identity will not be preserved in XMI. 6/19/13 2:22:12 PM - 11: org.apache.uima.cas.impl.XmiCasS

Re: Ruta - Token Order

2013-05-21 Thread Thilo Goetz
On 05/21/2013 01:37 PM, Peter Klügl wrote: Hi, On 21.05.2013 12:47, armin.weg...@bka.bund.de wrote: Hi, In Ruta 2.0.2-SNAPSHOT a token with begin offset 0 and end offset 2 comes before a token with begin offset 0 and end offset 0. The token order is not as I expected. Thus in my case, Source
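The order reported above (0-2 before 0-0) is consistent with the usual sort order of the UIMA annotation index, which, as far as I know, sorts by ascending begin offset and then by descending end offset, so that longer spans come first. The sketch below only illustrates that comparison with plain Java; `SpanOrder` and `Span` are hypothetical stand-ins, not UIMA or Ruta classes, and whether Ruta applies exactly this order is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SpanOrder {

    /** Hypothetical stand-in for an annotation: only begin and end offsets. */
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }

        @Override
        public String toString() {
            return begin + "-" + end;
        }
    }

    /** Ascending begin; for equal begin, descending end (longer span first). */
    static final Comparator<Span> INDEX_ORDER = (a, b) -> {
        if (a.begin != b.begin) {
            return Integer.compare(a.begin, b.begin);
        }
        return Integer.compare(b.end, a.end);
    };

    /** Returns a sorted copy; the input list is left unchanged. */
    static List<Span> sort(List<Span> spans) {
        List<Span> copy = new ArrayList<>(spans);
        copy.sort(INDEX_ORDER);
        return copy;
    }
}
```

With this comparator, a token spanning 0-2 sorts before an empty token at 0-0, which matches the behaviour described in the message.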

Re: Is it possible to add Feature(s) to Top?

2012-10-16 Thread Thilo Goetz
On 16/10/12 01:21, Shahim Essaid wrote: > Hi All, > > Does the UIMA API provide a way to add features to the base type > system? I see that the default TS is created and locked in CASImpl so > I am assuming that there is no API way for doing this. Can I add > additional features in the CASImpl co

Re: PEAR Classpath issues

2012-07-27 Thread Thilo Goetz
Hi Erik, On 27/07/12 10:51, Erik Fäßler wrote: > Hi Thilo! > > Thanks for your answer! Some comments and further questions below: > > Am 27.07.2012 um 08:24 schrieb Thilo Goetz: > >>> I did intentionally not include my libraries in the classpath here, because

Re: PEAR Classpath issues

2012-07-26 Thread Thilo Goetz
On 26/07/12 18:17, Erik Fäßler wrote: > Hi all, > > I know this question has been asked a few times in the past years but I > didn't really come to a definite answer or a solution to the problem. The > issue is to use a PEAR packaged simple AE in a pipeline without setting the > classpath manua

Re: I don't understand the benefits of CAS

2012-07-19 Thread Thilo Goetz
Another way of saying this is that if the framework owns the data, it can provide services for that data (such as serialization and network transport) seamlessly and transparently. If you pass any kind of objects between components, this is generally not possible. --Thilo On 19/07/12 00:00, Mars

Re: UTF8 Encoded documents processing

2012-05-27 Thread Thilo Goetz
On 27/05/12 16:59, Seid Muhie wrote: > Dear Thilo Goetz > Thank you for your response > > I have already tried different ways of reading text files with different > encodings. > > For example using commons IO FileUtils class, I tried as follows > > ...
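For comparison, here is a minimal sketch of reading a file with an explicitly named encoding using only the JDK. It is not the code from the message (which is elided above); `Utf8Reader` and its methods are hypothetical names. The point is that the charset is stated explicitly instead of being taken from the platform default.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class Utf8Reader {

    /** Reads the whole file and decodes its bytes as UTF-8. */
    static String readUtf8(Path file) {
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo only: writes text to a temp file as UTF-8 and reads it back. */
    static String roundTrip(String text) {
        try {
            Path tmp = Files.createTempFile("utf8-demo", ".txt");
            try {
                Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
                return readUtf8(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Once the text is a correctly decoded Java `String`, it can be handed to the CAS as the document text. If the file starts with a byte order mark, that character would still be present in the string and may need to be stripped separately.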

Re: UTF8 Encoded documents processing

2012-05-26 Thread Thilo Goetz
On 26/05/12 23:13, Seid Muhie wrote: > Dear all > I have a Unicode document I want to process. > Following the tutorial at > this, > the code gets stuck at the last line. > > File taeDescriptor = new > File("desc\\

Re: Repackaging an unpackaged pear file

2012-04-27 Thread Thilo Goetz
On 27/04/12 00:52, Mike O'Leary wrote: > Mike O'Learyy writes: > >> >> Thilo Goetz writes: >> >>> >>> On 26/04/12 18:10, Marshall Schor wrote: >>>> Thanks Thilo. >>>> >>>> Could you unzip the pear with an un

Re: Repackaging an unpackaged pear file

2012-04-26 Thread Thilo Goetz
On 26/04/12 18:10, Marshall Schor wrote: > Thanks Thilo. > > Could you unzip the pear with an unzipper, and do the change to fix the > file path and then zip it back up again? That way the variable > replacement stuff wouldn't run. > > -Marshall > Yes but you need the original pear to do that.

Re: Repackaging an unpackaged pear file

2012-04-26 Thread Thilo Goetz
On 25/04/12 23:20, Marshall Schor wrote: > I hope it's trivial :-) (But I haven't tried it...). It's not trivial, because the pear installer destructively replaces variables with local paths on installation. If you don't know what you're doing, it will be much easier to ask the other team to get

Re: InlineXMLCasConsumer fails depending on locale

2012-02-21 Thread Thilo Goetz
On 21/02/12 16:15, Jens Grivolla wrote: > On 02/21/2012 04:08 PM, Thilo Goetz wrote: >> On 21/02/12 15:59, Jens Grivolla wrote: >>> it appears that InlineXMLCasConsumer depends on the system locale for >>> some internal transformations. The output appears to be written i

Re: InlineXMLCasConsumer fails depending on locale

2012-02-21 Thread Thilo Goetz
On 21/02/12 15:59, Jens Grivolla wrote: > Hi, > > it appears that InlineXMLCasConsumer depends on the system locale for > some internal transformations. The output appears to be written in UTF8 > (outStream.write(xmlAnnotations.getBytes("UTF-8"));) but when used on a > machine with a locale of ASC
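The message is cut off, so the exact failure is not visible here. As a general illustration of why locale-dependent conversions are risky, the sketch below contrasts encoding with an explicitly named charset against encoding with a charset that cannot represent the text. `CharsetDemo` is a hypothetical name; US-ASCII merely stands in for whatever default charset a machine might have.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetDemo {

    /** Encodes with an explicit charset; independent of the system locale. */
    static byte[] encodeUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /**
     * Encodes and decodes with the same charset. Characters the charset
     * cannot represent are replaced during encoding (with '?' for US-ASCII).
     */
    static String roundTrip(String s, Charset cs) {
        return new String(s.getBytes(cs), cs);
    }
}
```

`String.getBytes()` and `new String(bytes)` without a charset argument use the platform default, so any step in a pipeline that relies on them can lose non-ASCII characters on a machine with an ASCII default, even if the final write uses UTF-8.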

Re: Having mutliple instances of an AE writing in the same output file - thread safe

2012-01-26 Thread Thilo Goetz
On 26/01/12 10:26, Alexander Klenner wrote: > Hi there, > > is there a tutorial for the problem mentioned above? We have multiple > instances of an AE that produce output that has to be collected in one final > output file (all instances ought to share this file via e.g. UIMAContext), > the
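The reply is not shown in this listing, so the following is not necessarily what was recommended in the thread. One common way to let several threads in the same JVM append to one output is to route all writes through a single object whose write method is synchronized. `SharedOutput` is a hypothetical name, and how the AE instances obtain the shared object is left out of the sketch.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

final class SharedOutput {

    private final Writer out;

    SharedOutput(Writer out) {
        this.out = out;
    }

    /** Writes one record plus newline atomically with respect to other callers. */
    synchronized void writeRecord(String record) {
        try {
            out.write(record);
            out.write('\n');
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo only: several threads write concurrently; returns the count of intact lines. */
    static int demo(int threads, int recordsPerThread) {
        StringWriter sink = new StringWriter();
        SharedOutput shared = new SharedOutput(sink);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int r = 0; r < recordsPerThread; r++) {
                    shared.writeRecord("t" + id + "-r" + r);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        int intact = 0;
        for (String line : sink.toString().split("\n")) {
            if (line.matches("t\\d+-r\\d+")) {
                intact++;
            }
        }
        return intact;
    }
}
```

This only helps when all instances run in one JVM and share the same `SharedOutput` object. Instances in separate processes or on separate machines would need a different mechanism, such as collecting the output outside the framework.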

Re: AW: Annotation/Feature creation, changing types

2011-12-07 Thread Thilo Goetz
Forgot to add a link to the docs: http://uima.apache.org/d/uimaj-2.3.1/references.html#ugr.ref.cas On 07/12/11 14:07, Thilo Goetz wrote: > On 07/12/11 07:44, armin.weg...@bka.bund.de wrote: >> Hello Tomas, >> >> try this in your annotator: >> >> // cas is a C

Re: AW: Annotation/Feature creation, changing types

2011-12-07 Thread Thilo Goetz
ike the CVD. --Thilo > > -Original Message- > From: Tomas By [mailto:t...@cmu.edu] > Sent: Wednesday, 7 December 2011 07:21 > To: user@uima.apache.org > Subject: Re: Annotation/Feature creation, changing types > > Hi, > > Thanks for the reply. >

Re: Annotation/Feature creation, changing types

2011-12-06 Thread Thilo Goetz
On 06/12/11 20:47, Tomas By wrote: > Hi all, > > I am wondering if it is possible to (for example) first create an annotation > or a feature that has no type, and then set the type in a second step. > > From looking at the docs, it seems there is no obvious way to do this. Correct, there is no w

Re: CAS Visual Debugger ignoring logger properties

2010-07-30 Thread Thilo Goetz
On 7/30/2010 20:40, Marshall Schor wrote: On 7/30/2010 11:55 AM, Thilo Goetz wrote: On 7/30/2010 12:14, armin.weg...@bka.bund.de wrote: Hello, The CAS Visual Debugger ignores the logger properties file. It does not log to the console and writes uima.log into my home directory. The VM arguments

Re: CAS Visual Debugger ignoring logger properties

2010-07-30 Thread Thilo Goetz
On 7/30/2010 12:14, armin.weg...@bka.bund.de wrote: Hello, The CAS Visual Debugger ignores the logger properties file. It does not log to the console and writes uima.log into my home directory. The VM arguments in Eclipse's Run Configuration say: "-Djava.util.logging.config.file=${project_loc}/co

Re: determining whether a feature has been set

2010-05-03 Thread Thilo Goetz
On 5/3/2010 15:15, Klaus Rothenhäusler wrote: > Hi, > is there any way to determine whether a feature of a primitive > numerical type has been set in a particular feature structure? The > methods like getIntValue(feat), getFloatValue(feat), etc. all return a > zero value if the feature hasn't been

Re: Restrictions on sofa data array

2010-04-27 Thread Thilo Goetz
On 4/27/2010 16:17, Eddie Epstein wrote: > Hi, > > On Mon, Apr 26, 2010 at 9:56 AM, Klaus Rothenhäusler > wrote: >> That's the way I'm using UIMA right now. However, as practically all >> downstream annotators work on tokens, I would find it much more >> intuitive if I could assign annotations a

Re: Best way to compare FSIndexes from different CASes on the same Sofa

2010-04-21 Thread Thilo Goetz
On 4/20/2010 18:22, Bart Mellebeek wrote: > Hi, > > I was wondering about the following question: what is the best way to > compare FSIndexes from different CASes on the same Sofa? > I have two different CASes on the same SofaString: CAS1 was obtained as > the result of running an Annotator; CAS2

Re: New UIMA website

2010-04-19 Thread Thilo Goetz
It is explained here: http://uima.apache.org/mail-lists.html On 4/19/2010 18:01, Crystal Glasgow wrote: > Do you know how I get off this email list? > > Thanks, > > Crystal > > > > On Mon, Apr 19, 2010 at 8:55 AM, Thilo Goetz wrote: > >> Hi all,

New UIMA website

2010-04-19 Thread Thilo Goetz
Hi all, as part of our move to top-level project, the UIMA website has moved from incubator.apache.org/uima to uima.apache.org. The old address will continue to work, but redirect to the new one. Please use the new address in future. Please let us know if you notice any broken links or factual e

Re: Has the maling list moved to another address?

2010-04-15 Thread Thilo Goetz
Yes, as part of our graduation, the list has moved. The website still has the old info, though. It will take a bit until it's all sorted out. The new lists are: user@uima.apache.org d...@uima.apache.org comm...@uima.apache.org On 4/15/2010 19:06, Manuel Fiorelli wrote: > Hello list, > I receiv