On 13 Mar 2012, at 18:07, Marshall Schor wrote:

> Hi Tommaso,
>
> Thanks for this pointer. Is there anything we can say about this? Is it
> just in trunk, or has it been already in some releases? What does it do?

It's only in trunk (version 4.0-SNAPSHOT). It lets Lucene tokenizers use annotations extracted by UIMA AEs (e.g. org.apache.uima.TokenAnnotation or org.apache.uima.SentenceAnnotation annotations) to create a Lucene token stream. One just has to specify a reference to the AE descriptor and the annotation type; optionally, one can also give a featurePath so that a TypeAttribute is added to the created tokens, for example to carry PoS tags.

More generally, we can say Lucene now has UIMA-based tokenizers.

Regards,
Tommaso

p.s.: the code base for that was developed for the 'Natural Language Search in Solr' talk at LuceneEurocon 2011 [1]

[1] : http://www.lucidimagination.com/devzone/events/conferences/ApacheLuceneEurocon2011/natural-language-search-solr

>
> Thanks. -Marshall
>
> On 3/13/2012 1:00 PM, Tommaso Teofili wrote:
>> we may add mention to Lucene-UIMA analysis module [1].
>> Tommaso
>>
>> [1] : http://svn.apache.org/repos/asf/lucene/dev/trunk/modules/analysis/uima/
>>
>>
>> On 13 Mar 2012, at 16:46, Peter Klügl wrote:
>>
>>> Nothing to add from my side.
>>>
>>> Peter
>>>
>>> On 13.03.2012 16:41, Marshall Schor wrote:
>>>> Here's the board report fleshed out - I plan to post this tonight.
>>>>
>>>> Board report for Apache UIMA, for March 2012.
>>>>
>>>> Apache UIMA's mission: the creation and maintenance of open-source
>>>> software related to the analysis of unstructured data, guided by the
>>>> UIMA Oasis Standard.
>>>>
>>>> Releases:
>>>> No releases since last report, but had 4 release candidates for the
>>>> C++ version of the UIMA framework, and it's pretty close to a release.
>>>> UIMA-AS (Asynchronous Scaleout) is being worked on for release.
>>>>
>>>> Activity:
>>>> Typical mailing list activities for this quarter. We did notice that
>>>> the approximate downloads (as reported here:
>>>> http://people.apache.org/~vgritsenko/stats/projects/uima.html#Downloads-N1008F)
>>>> seems to have doubled in the last 2 quarters.
>>>>
>>>> Community:
>>>> We added Peter Klügl to the PMC.
>>>>
>>>> Issues: No Board level issues at this time
>>>>
>>>> Trademarks/Branding:
>>>> This is now complete:
>>>> Branding checklist:
>>>> Project Website Basics - done
>>>> Website Navigation Links - done
>>>> Trademark Attributions - done
>>>> Logos and Graphics - done
>>>> Project Metadata - done
>>>> Read PMC Branding Responsibilities - done,
>>>> all PMC members have confirmed
>>>>
>>>>
>>>> -Marshall
>>>>
>>>> On 3/6/2012 11:25 AM, Marshall Schor wrote:
>>>>> It's time for the quarterly status report to the board, covering mid Dec
>>>>> - mid March.
>>>>>
>>>>> We have Peter K. joining the PMC.
>>>>> We have 4 UIMA CPP release candidates - no release yet...
>>>>> We have work toward preparing a 2.4.0 release of UIMA-AS.
>>>>> We have continuing work on several components.
>>>>>
>>>>> Please reply with additional items for the board report :-)
>>>>>
>>>>> -Marshall
>>>>>
>>>
>>> --
>>> ---------------------------------------------------------------------
>>> Dipl.-Inf. Peter Klügl
>>> Universität Würzburg    Tel.: +49-(0)931-31-86741
>>> Am Hubland              Fax.: +49-(0)931-31-86732
>>> 97074 Würzburg          mail: [email protected]
>>> http://www.is.informatik.uni-wuerzburg.de/en/staff/kluegl_peter/
>>> ---------------------------------------------------------------------
>>>
>>
