Thanks Michael.

Dictionary processing time is reasonable; it's the document analyzer execution time that is the bottleneck. I will merge the dictionaries and compile them as you suggested. However, I am not sure which command line tool you are referring to. Do you mean org.apache.uima.conceptMapper.dictionaryCompiler.CompileDictionary.java?

Thanks for the vacation heads up.

Ahmed

On Mon, Jun 23, 2008 at 2:37 PM, Michael Tanenblatt <[EMAIL PROTECTED]> wrote:

> The short answer is "no". Not yet, anyway.
>
> But here are some things that might help. First, if dictionary loading
> times are long, you can use the command line tool supplied in the package to
> compile the dictionary, and use the compiled dictionary. If you do this,
> remember that you will need to change the AE descriptors to use the correct
> implementation of the dictionary loader, e.g.:
>
> <externalResource>
>   ...
>   <implementationName>org.apache.uima.conceptMapper.support.dictionaryResource.CompiledDictionaryResource_impl</implementationName>
>   ...
> </externalResource>
>
> That said, if you are using 13 dictionaries, that means you are running 13
> copies of ConceptMapper in your pipeline, which means that you are
> traversing each file's text 13 times just for your ConceptMapper
> invocations. If you could merge the dictionaries into one, you should see a
> marked speedup. Clearly, a near-term enhancement of ConceptMapper would
> be to enable the loading of multiple dictionaries, which get merged at
> initialization time.
>
> One side note: I am going to be on vacation starting on June 25 and will
> only have occasional access to email until I return on July 12. I will try
> to answer questions during that time when I do have access, but I really
> have no idea how often that will be.
>
> On Jun 23, 2008, at 2:19 PM, Ahmed Abdeen Hamed wrote:
>
>> Hello UIMA members,
>>
>> I am using the document analyzer example to analyze large files from
>> multiple dictionaries. One of the raw files is 7.5MB. The number of
>> dictionaries is 13, and each is 1MB in size. Is there some sort of a
>> matrix that you can use to predict the execution time? Has anyone
>> written a paper on the performance analysis of ConceptMapper?
>>
>> Please let me know if you can.
>>
>> Best wishes,
>> --------------------------------------------------------
>> Ahmed Abdeen Hamed
>> Scientific Informatics Project Leader
>> MBLWHOI Library
>> Marine Biological Laboratory
>> 7 MBL Street Woods Hole, MA 02543 USA
>> +1 508 289 7676
>> --
>> email: [EMAIL PROTECTED]
>> --
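
For anyone following the thread, the suggestion above to merge the 13 dictionaries into one could be sketched roughly as below. This is only an illustration under stated assumptions: it assumes each dictionary is an XML file whose entries are direct child elements of a single root element, and that all files share the same root element name. The class name DictionaryMerger and its command line are hypothetical and are not part of ConceptMapper. The sketch does not detect duplicate entries across dictionaries.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

/** Hypothetical helper (not part of ConceptMapper): merges XML dictionaries. */
final class DictionaryMerger {

    /**
     * Appends a copy of every child element of each later document's root
     * to the root of the first document, and returns the first document.
     * Assumes all inputs share the same root element name.
     */
    static Document merge(List<Document> dictionaries) {
        if (dictionaries.isEmpty()) {
            throw new IllegalArgumentException("no dictionaries given");
        }
        Document merged = dictionaries.get(0);
        Element mergedRoot = merged.getDocumentElement();
        for (Document other : dictionaries.subList(1, dictionaries.size())) {
            Element otherRoot = other.getDocumentElement();
            if (!otherRoot.getTagName().equals(mergedRoot.getTagName())) {
                throw new IllegalArgumentException(
                    "root element mismatch: " + otherRoot.getTagName());
            }
            for (Node child = otherRoot.getFirstChild(); child != null;
                 child = child.getNextSibling()) {
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    // importNode makes a deep copy owned by the merged document,
                    // so the source document is left unchanged.
                    mergedRoot.appendChild(merged.importNode(child, true));
                }
            }
        }
        return merged;
    }

    /** Usage: java DictionaryMerger merged.xml dict1.xml dict2.xml ... */
    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.err.println("usage: DictionaryMerger <output> <input1> <input2> ...");
            System.exit(1);
        }
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        List<Document> inputs = new ArrayList<>();
        for (int i = 1; i < args.length; i++) {
            inputs.add(builder.parse(new File(args[i])));
        }
        Document merged = merge(inputs);
        Transformer writer = TransformerFactory.newInstance().newTransformer();
        writer.setOutputProperty(OutputKeys.INDENT, "yes");
        writer.transform(new DOMSource(merged), new StreamResult(new File(args[0])));
    }
}
```

The merged file would then be the single dictionary that gets compiled and loaded by one ConceptMapper instance, so the text is traversed once instead of 13 times. Whether entries from different dictionaries need to stay distinguishable after merging depends on how the dictionaries are used downstream, which this sketch does not address.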
