Dan McCreary wrote:
> Hello,
>
> I am somewhat new to UIMA, so I apologize if I misunderstand some things.
> But this is a very interesting question for me.
>
> I see Lucene as a very widely adopted but *Java-only framework* of tools for
> building and maintaining keyword *indexes* on many types of documents.
> Lucene also has great support for Hadoop and MapReduce-type scalability. But
> Lucene is also designed to work with many front-end tools, like the POI
> libraries, to extract text from Microsoft Word, Excel, PowerPoint, etc.
>
> I see Apache UIMA as a general-purpose *analytic pipeline architecture* with
> the strengths of a very advanced common in-memory processing model.
>
> I think there is a huge win-win for both projects if we can make UIMA enrich
> text documents with entities before they are indexed by Lucene, and also make
> these tools much easier to install and work together. You should not have
> to be a Java developer just to install these tools and have them index and
> search our file systems.
>
> I have spent many hours trying to get UIMA to work, without success. Perhaps
> it has to do with trying to get it to work on 64-bit Vista.... :-O
>
We have UIMA running on 64-bit Linuxes. Please consider starting another
thread about issues around getting it working on 64-bit Vista - that could be
quite useful to the community.
-Marshall
> - Dan
>
>
> On Thu, Dec 4, 2008 at 12:12 PM, Greg Holmberg <[EMAIL PROTECTED]> wrote:
>
>> Roberto--
>>
>> It does seem like there should be a close relationship between the two
>> groups.
>>
>> I don't know much about Lucene--can you educate me? For example, have you
>> given any thought to what to do with UIMA annotations? From what little
>> I've read about Lucene, they seem to have a thing called a document
>> analyzer, but they don't mean the same thing we mean by analysis in the NLP
>> community. They appear to mean something more like "tokenizer". So I
>> haven't yet found a place to put UIMA annotations, say for example, named
>> entities or parts of speech. I'm wondering if Lucene needs a major feature
>> enhancement before it's truly useful with UIMA?
>>
>> What are your thoughts on how to integrate the two? What functionality is
>> possible?
>>
>> Greg Holmberg
>>
>>
>> -------------- Original message ----------------------
>> From: "Roberto Franchini" <[EMAIL PROTECTED]>
>>
>>> Hi,
>>> I'm going to write a Lucene CAS consumer. The purpose is to create a
>>> Lucene document, or more than one, for each CAS.
>>> Last year (2007) the Jena University lab (JULIE Lab? is that right?)
>>> delivered such a component, named LUCAS. Then it disappeared.
>>> LUCAS seems to be a good piece of software.
>>> The Technische Universität Darmstadt developed one too:
>>> http://www.ukp.tu-darmstadt.de/projects/dkpro/. (I will write to
>>> them).
>>>
>>> Is anybody interested in sharing knowledge and/or code to build that
>>> component?
>>>
>>> I think that Lucene and UIMA can be very good friends :)
>>>
>>> Roberto
>>>
>>> PS: I apologize for my bad English.
>>>
>>> --
>>> Roberto Franchini
>>> http://www.celi.it
>>> http://www.blogmeter.it
>>> http://www.memesphere.it
>>> Tel +39-011-6600814
>>> jabber:[EMAIL PROTECTED] <[EMAIL PROTECTED]>
>>> skype:ro.franchini
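
One way to picture the idea discussed in this thread (one index document per CAS, with a place for annotations such as named entities) is to give each annotation type its own field next to the full text. The sketch below is plain Python and does not call Lucene or UIMA. The names `Annotation`, `cas_to_fields`, and the `annotation_` field prefix are hypothetical and chosen only for illustration. It shows the shape of the mapping, not how LUCAS or any existing CAS consumer actually works.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical stand-in for a UIMA annotation: a type plus a text span."""
    type_name: str  # e.g. "Person" or "Location"
    begin: int      # start offset into the document text (inclusive)
    end: int        # end offset into the document text (exclusive)


def cas_to_fields(text, annotations):
    """Flatten one analyzed document into field name -> list of values.

    The full text goes into a "text" field; the covered text of each
    annotation goes into a field named after its type, in document order.
    """
    fields = defaultdict(list)
    fields["text"].append(text)
    for ann in sorted(annotations, key=lambda a: (a.begin, a.end)):
        field_name = "annotation_" + ann.type_name.lower()
        fields[field_name].append(text[ann.begin:ann.end])
    return dict(fields)


if __name__ == "__main__":
    text = "Alice met Bob in Paris."
    annotations = [
        Annotation("Person", 0, 5),
        Annotation("Person", 10, 13),
        Annotation("Location", 17, 22),
    ]
    for name, values in cas_to_fields(text, annotations).items():
        print(name, values)
```

With a layout like this, a search for a person name can be restricted to the person field, which plain keyword tokenization alone does not give you. Whether separate fields are the right home for annotations, or something closer to the tokenizer is needed, is the open question Greg raises above.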
