Hi Armin,

Apologies for the late response. I was able to load a data table as an external resource; I think the example shown in the comment is self-explanatory. If you have any issues loading it, please contact me.

Kind regards.

On 23 January 2015 at 08:59, <[email protected]> wrote:

> Hi Peter!
>
> Thanks for your help. I will look at it.
> At least for now, greedy anchoring and markfast work as expected. But I've
> used only short word lists with simple entries.
>
> Cheers,
> Armin
>
>
> -----Original Message-----
> From: Peter Klügl [mailto:[email protected]]
> Sent: Thursday, 22 January 2015 11:24
> To: [email protected]
> Subject: Re: RUTA and shared resources
>
> Hi,
>
> On 22.01.2015 at 09:20, [email protected] wrote:
> > Hello!
> >
> > This is a very short and simple gazetteer using RUTA.
> >
> > Document{->GREEDYANCHORING(true)};
> > %s*{->MARKFAST(%s,'%s')};
>
> First of all, I am sorry that I was not yet able to implement the greedy
> matching for the gazetteers/wordlists. I have not forgotten it.
> Just curious: does the rule perform as you expect/intend? I mean the
> combination of greedy anchoring and the windowed stream caused by the
> matching condition.
>
> > where the first %s is replaced using String.format() by the name of
> > the source type, the second %s is replaced by the target type name, and
> > the third %s is replaced by the URL of a word list. Doing so, it's a
> > little bit more flexible. This is done once in
> > CasAnnotator_ImplBase.initialize().
> >
> > Then the script is executed with Ruta.apply(cas, script) in process().
> > But that means that the word list is read again for every CAS processed.
> > Is there any way to have RUTA use the word list as a
> > SharedResourceObject, so that it is read once only?
>
> The problem is that Ruta.apply() creates a new descriptor and a new
> analysis engine. You could integrate the Ruta analysis engine in your
> analysis engine as a field or something and call its process() in your
> process() method (and initialize()). Then, the word lists should not be
> reloaded for each process().
>
> As for SharedResourceObject: This should be done, but it was never at
> the top of my todo list. I hope I will find the time sometime.
>
> You may want to take a look at UIMA-4062 and UIMA-4074, especially
> Silvestre's comment on UIMA-4062 (29/Oct/14 19:12) where he loads a
> table using external resources. Could also work for you maybe. Maybe
> Silvestre can share his experiences?
>
> Best,
>
> Peter
>
> > Regards,
> > Armin
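
For reference, Peter's suggestion in the thread (keep the engine as a field instead of creating a new one per CAS) could be sketched roughly as follows. This is only an illustration of the pattern in plain Java, not the actual UIMA or Ruta API: StubEngine, GazetteerAnnotatorSketch, ReuseEngineDemo and the example values "Token", "Name" and "names.txt" are all hypothetical. In a real annotator, the stub would presumably be replaced by a Ruta analysis engine created once in initialize().

```java
import java.util.Locale;

/** Hypothetical stand-in for an expensive engine; NOT a UIMA or Ruta class. */
final class StubEngine {
    /** Number of engines created so far, i.e. how often the word list was "read". */
    static int loadCount = 0;

    private final String script;

    StubEngine(String script) {
        this.script = script;
        loadCount++; // simulates the costly word-list load at engine creation
    }

    /** Pretends to apply the script; returns a short summary string. */
    String process(String document) {
        return "applied " + script.split("\n").length + " rule line(s) to: " + document;
    }
}

/** Hypothetical annotator: builds the script and the engine once, then reuses both. */
final class GazetteerAnnotatorSketch {
    private StubEngine engine;

    /** Fills the two-rule template from the thread via String.format(). */
    static String buildScript(String sourceType, String targetType, String wordListUrl) {
        return String.format(Locale.ROOT,
                "Document{->GREEDYANCHORING(true)};\n%s*{->MARKFAST(%s,'%s')};",
                sourceType, targetType, wordListUrl);
    }

    /** Called once: the engine is created here and kept in a field. */
    void initialize(String sourceType, String targetType, String wordListUrl) {
        engine = new StubEngine(buildScript(sourceType, targetType, wordListUrl));
    }

    /** Called per document: reuses the engine instead of creating a new one. */
    String process(String document) {
        return engine.process(document);
    }
}

final class ReuseEngineDemo {
    public static void main(String[] args) {
        GazetteerAnnotatorSketch annotator = new GazetteerAnnotatorSketch();
        annotator.initialize("Token", "Name", "names.txt"); // example values only
        System.out.println(annotator.process("first document"));
        System.out.println(annotator.process("second document"));
        System.out.println("engines created: " + StubEngine.loadCount);
    }
}
```

The point of the sketch is only that the costly step happens in initialize() and that process() touches nothing but the stored field, so the number of word-list loads no longer grows with the number of documents.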
