Bringing ClearTK to UIMAv3?

2022-07-29 Thread Richard Eckart de Castilho
Hi folks,

I have set up a branch of ClearTK building against UIMA v3.3.0. 

It compiles.

Some tests seem to fail when building with tests from the command line.
When I ran at least one of those tests in Eclipse, it passed there.
I didn't investigate further.

Is this of interest to somebody? Would somebody like to help
finish this upgrade and get the tests to work?

https://github.com/ClearTK/cleartk/pull/443

Cheers,

-- Richard


Re: Apache cTAKES GitHub mirror is stuck in 2019 [EXTERNAL]

2022-06-28 Thread Richard Eckart de Castilho
Hi all,

> On 6. Jun 2022, at 16:09, Finan, Sean wrote:
> 
> Hi Kean,
> 
> Thank you for the suggestion and the link. I am really glad that people are 
> interested in this GitHub topic and taking it seriously. It would be great 
> if we could make it happen.
> 
> While definitely a possibility, the git LFS paradigm is something that I 
> would like to avoid. 
> 
> Like keeping our models on SVN, it would also require separating models from 
> code into two different repos, e.g. github and bitbucket. As opposed to 
> bitbucket, the apache svn repos are long established, familiar to and 
> supported by the apache infrastructure team. The same goes for the apache 
> foundation use of github. I like being able to lean on the apache infra team 
> for help.

So GitHub seems to have support for LFS [1]. What I do not know is whether the 
ASF's GitHub plan allows us to use it and, if so, whether there is a volume 
limit. We would have to ask INFRA about that.

The use of Git and GitHub is well supported by the INFRA team. For example, 
there is self-service for creating and managing repos. [2]

There is also the `.asf.yaml` mechanism for configuring GitHub repos and 
hooking them up with the ASF infrastructure including mailing lists, website 
publishing, etc. etc. [3]
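
For illustration, a minimal `.asf.yaml` could look like this (keys as documented by ASF INFRA; the concrete values below are placeholders, not cTAKES's actual configuration):

```yaml
github:
  description: "Apache cTAKES"
  homepage: https://ctakes.apache.org
notifications:
  commits: commits@ctakes.apache.org
  issues: dev@ctakes.apache.org
```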

> The apache Jenkins servers are linked to the svn repos, making continuous 
> integration easy - on the rare occasion when somebody does change something 
> in a model repo. While I expect anybody savvy enough to work on models to 
> also have the knowhow and wherewithal to work with a separate svn repo, I 
> don't want them to need to get out to jenkins and manually kick off snapshot 
> builds.

Jenkins also supports GitHub very well [4]. For example, in UIMA, we just drop 
a `Jenkinsfile` configuration file [5,6] into each repo and Jenkins picks it 
up; it even gives us support for pull requests [7].
I'm happy to help you set that up for cTAKES as well.

> Probably most important is the requirement of the client user to have the LFS 
> command line client. I think that there are enough hoops stuck in front of 
> getting ctakes installed/checked out/cloned/etc. and it seems to me that one 
> of the biggest reasons to use github is to make things easier for absolute 
> newbies to just pull down code and experiment.

It is an additional hoop to jump through indeed, but it is a one-time action to 
install LFS. Chances are that people may even already have it set up because 
they use it in other repos.
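
For reference, that one-time setup is small. A sketch, assuming the `git-lfs` client is installed; the tracked pattern is only an example:

```shell
# One-time, per machine: enable the LFS hooks for your git installation.
git lfs install

# Per repository: choose which files LFS should manage (example pattern).
git lfs track "*.bin"

# The tracking rules live in .gitattributes, which is versioned like any file.
git add .gitattributes
```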

> Keeping the models on a separate svn repo would mean that they aren't checked 
> out as code, but would be put in the .m2 maven area when a user runs maven 
> compile. While the total footprint of full ctakes would still be the same 
> size, it would essentially make the code directory smaller and initial 
> downloads/checkouts would be faster. Plus, if done properly maybe it could 
> "clean up" all of those nearly identically named modules in my intellij 
> project window and I'd stop clicking on the wrong one when I've had too much 
> coffee.

Nowadays, I fear that people may not have svn installed anymore ;) So requiring 
svn to download models and drop them into m2 might be an inconvenience. If the 
models live in a Maven Repository and can be dragged in as a normal dependency, 
that would seem most convenient.

Cheers,

-- Richard

[1] 
https://docs.github.com/en/repositories/working-with-files/managing-large-files/configuring-git-large-file-storage
[2] https://gitbox.apache.org
[3] https://s.apache.org/asfyaml
[4] https://builds.apache.org/job/UIMA/
[5] https://github.com/apache/uima-uimaj/blob/main/Jenkinsfile
[6] https://github.com/apache/uima-build-jenkins-shared-library 
[7] https://builds.apache.org/job/UIMA/job/uima-uimaj/view/change-requests/

Re: Apache cTAKES GitHub mirror is stuck in 2019 [EXTERNAL]

2022-06-02 Thread Richard Eckart de Castilho
On 2. Jun 2022, at 14:22, Finan, Sean wrote:
> 
> I don't know much about how this is done.  If anybody out there has knowledge 
> or experience that they can pass on, please share.

When we did this for UIMA, the steps were documented here:

  https://uima.apache.org/convert-to-git.html

Not 100% sure if this is still the way to go - INFRA may know more.

Basically, if the Git(Hub) mirror is working properly, then at some point you 
can tell Infra to make it the main repo and to put SVN into read-only.
But first, the Git(Hub) mirror needs to be up-to-date.

I'm hanging out on the ASF slack e.g. in the ComDev channel - feel free to ping 
me there.

Cheers,

-- Richard

Apache cTAKES GitHub mirror is stuck in 2019

2022-06-02 Thread Richard Eckart de Castilho
Hi,

it appears that the GitHub mirror of Apache cTAKES may be stuck.

When I check the svn log of https://svn.apache.org/repos/asf/ctakes/trunk/, I 
can see activity as recent as May 2022.

However, on GitHub, I can only see stale branches:

https://github.com/apache/ctakes/branches

Wouldn't it be good if the GitHub mirror were kept up-to-date?

Best,

-- Richard



End of the road for UIMAv2 - please upgrade to UIMAv3

2022-03-08 Thread Richard Eckart de Castilho
On 17. Aug 2021, at 22:08, Finan, Sean  wrote:
> 
> If you absolutely require uima 3 for some reason then I don't think that I 
> can help you.  You may want to ask the uima lists about mixing versions or 
> equivalent v2 solutions for your goals.

Besides connecting pipes through remote services, there is no way to combine 
UIMAv2 and UIMAv3.

Work on UIMAv2 has fully stopped.

UIMAv2 is very likely not going to get any more updates or bug fixes.
A very last uimaFIT 2.6.0 might still make it, but that's likely it.

I would strongly recommend that you upgrade to v3 as soon as possible.
If you have any trouble doing so, please let me know. The easiest way
is via the Apache UIMA users mailing list.

Best,

-- Richard 

(Apache UIMA PMC Chair)



Re: uimafit version and commit messages

2017-11-28 Thread Richard Eckart de Castilho
The uimaFIT releases mainly fix bugs and add smaller features.
It should be pretty safe to update to 2.4.0. I didn't face
any problems upgrading e.g. DKPro Core which is pretty large
and uses uimaFIT intensively.

Cheers,

-- Richard (atm maintaining uimaFIT)

> On 28.11.2017, at 01:45, David Kincaid  wrote:
> 
> Thanks for upgrading uimafit. I was thinking of giving it a try myself when
> I had a chance. I see that you upgraded to 2.3.0, but the most recent
> version is 2.4.0. Was there a problem with 2.4.0?
> 
> - Dave



Re: Travis for testing

2017-08-20 Thread Richard Eckart de Castilho
On 19.08.2017, at 16:34, Andrey Kurdumov  wrote:
> 
> Given the fact that cTakes is available over GitHub (
> https://github.com/apache/ctakes), I am interested in configuring Travis
> to run the existing test suite of cTakes.
> 
> That would give clear visibility of the workflow, and this investment in the
> infrastructure could help other people start faster.

The ASF runs a Jenkins server. It includes the necessary plugins to
build pull requests and to update the build status on GitHub.

Also, the "Embeddable Build Status" plugin is available which
can provide you with a "badge" that indicates the build status.

Travis offers a great *free* service to the OSS community, in
particular to smaller projects, to help them get started with
proper development infrastructure. But since the ASF runs proper
development infrastructure on its own resources, there is no
need to make use of this free service - IMHO we should leave the
free resources to others who do not have their own build infrastructure.

Cheers,

-- Richard


Re: jcas to json error

2017-06-07 Thread Richard Eckart de Castilho
On 07.06.2017, at 07:33, Kumar, Avanish  wrote:
> 
> Exception in thread "main" java.lang.NoSuchMethodError: 
> org.apache.uima.cas.impl.TypeImpl.getSuperType()Lorg/apache/uima/cas/impl/TypeImpl;

Could it be that you are mixing JARs from different versions of UIMA, e.g. 
uimaj-core 2.7.0 with uimaj-json 2.10.0 or something like that?
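
One quick way to check for such a mix is Maven's dependency report; a sketch, to be run in the project that produces the error:

```shell
# Show the resolved dependency tree, restricted to UIMA artifacts.
# Every org.apache.uima entry should carry the same version number.
mvn dependency:tree -Dincludes=org.apache.uima
```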

Cheers,

-- Richard

Re: UIMA 3.0.0

2017-02-14 Thread Richard Eckart de Castilho
> On 14.02.2017, at 17:35, David Kincaid  wrote:
> 
> I see there is a new release of UIMA in the works and it's labeled as
> 3.0.0. That jump seems to imply significant changes/updates. Is anyone in
> the ctakes community close enough to the UIMA project to know if there is
> anything beneficial to ctakes in there? Has anyone been bold enough to try
> ctakes with the newest version?


Hi all,

please mind that this is UIMA 3.0.0 *ALPHA*. This is meant for early
access to the new architecture so we (the UIMA project) can get feedback
on things that break, possibly on things that can still be improved (in
incompatible ways) before we move on to BETA and eventually to a general
availability release.

The main thing that is changing with UIMA v3 is the internal management
of the CAS. In v2, UIMA used its own memory management similar to the way
it is implemented in the UIMA C++ version. With UIMA v3, this changes
radically. The CAS and the feature structures in the CAS are now proper
Java objects subject to Java garbage collection. Preliminary testing
indicates that this change can yield some quite significant performance
improvements.

The completely rewritten CAS also means that JCas classes need to be
regenerated to be compatible with v3. This is probably the most significant
breaking change.

There is also a completely new API to retrieve annotations from the CAS.
It was inspired by the uimaFIT (J)CASUtil methods as well as the Java
Streaming API.
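
For illustration, a sketch of what such a query might look like in v3 (this fragment assumes uimaj-core 3.x plus a `Token`/`Sentence` type system, which are not defined here; it is illustrative, not taken from the release notes):

```java
// Select all Token annotations covered by a given sentence and print them.
// The select API supports fluent, Iterable/Stream-style access.
cas.select(Token.class)
   .coveredBy(sentence)
   .forEach(t -> System.out.println(t.getCoveredText()));
```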

It would be great if you find the time to have a look at UIMA v3.
We're happy to hear any feedback you might have and to help you 
overcome any rough parts you might hit - just leave a post on the
d...@uima.apache.org mailing list :)

Cheers,

-- Richard


Karma for Jira

2016-06-10 Thread Richard Eckart de Castilho
Hi all,

could somebody please add me to the cTAKES project in Jira
such that I can assign issues to myself?

Best,

-- Richard


Re: Welcome Richard Eckart de Castilho as a cTAKES committer

2016-05-28 Thread Richard Eckart de Castilho
Hi all,

cool, thanks! :) Looking forward to helping out with cleaning up
some aspects of the cTAKES codebase.

Best,

-- Richard

> On 27.05.2016, at 21:23, Pei Chen <chen...@apache.org> wrote:
> 
> The Apache cTAKES PMC is pleased to introduce Richard Eckart de
> Castilho as a new committer. We are very happy with the sustained
> growth of the project and look forward to continued contributions from
> the community and adding to the ranks of the cTAKES committers.
> 
> --Pei



cTAKES dirty on checkout

2016-05-13 Thread Richard Eckart de Castilho
Hi all,

when checking out the sources of cTAKES from SVN with Eclipse, most of the 
projects are dirty because the Eclipse settings (.classpath and jdt.core.prefs) 
are in the SVN. The particular difference is that on my machine, the projects 
are configured to use Java 8, while in SVN, they are configured for Java 7. 

The parent POM of cTAKES states Java 8 (compiler source and target level set 
to 1.8).

Since the Eclipse files in SVN are at least outdated, maybe it would be a good 
idea to drop the .classpath and jdt prefs files from SVN and prevent them from 
being committed?
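
The removal itself would only take a couple of svn operations; a sketch (paths taken from above, run at a project root):

```shell
# Delete the generated Eclipse files from version control, keeping local copies.
svn rm --keep-local .classpath .settings/org.eclipse.jdt.core.prefs

# Ignore them from now on so they are not committed again.
svn propset svn:ignore ".classpath
.settings" .

svn commit -m "Remove generated Eclipse settings from SVN"
```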

Cheers,

-- Richard

Re: Combining Knowledge- and Data-driven Methods for De-identification of Clinical Narratives

2015-10-08 Thread Richard Eckart de Castilho
As far as I know, you can convert as long as ALL original authors / copyright 
holders agree to the conversion. Only the original authors may assign new 
licenses to their work. You might also want to double-check that the codebase 
doesn't contain any copy/pasted code from third-party sources.

As a third party, you cannot convert GPL code to ASL.

Mind, I am not a lawyer.

If you need more advice, post to legal-discuss@asf.

Cheers,

-- Richard

On 08.10.2015, at 21:32, andy mcmurry  wrote:

> caution: Im not sure you can convert GPL3 to ASL2
> anyone know for sure?
> 
> On Thu, Oct 8, 2015 at 12:03 PM, Chen, Pei 
> wrote:
> 
>> This is great news!
>>> What is the current status and procedure? Is there an explicit
>> contribution to cTAKES? Is there an ICLA? What about the license of the
>> sourceforge project?
>> Jira has been opened to track this:
>> https://issues.apache.org/jira/browse/CTAKES-384
>> 
>> 1) Azad, would you be willing to switch licenses?  I believe it's
>> currently GNU3 -> ASL 2.0?
>> 2) Create a project/module in cTAKES sandbox for this
>> 3) Export/Import sourceforge and attach the code to the Jira initially.
>> One of the current cTAKES committers can commit it to the repo (Until folks
>> can commit directly to the ctakes repo directly going forward.)
>> 
>> -Original Message-
>> From: Peter Klügl [mailto:peter.klu...@averbis.com]
>> Sent: Thursday, October 08, 2015 8:06 AM
>> To: dev@ctakes.apache.org
>> Subject: Re: Combining Knowledge- and Data-driven Methods for
>> De-identification of Clinical Narratives
>> 
>> Hi,
>> 
>> I can offer my help here if required.
>> 
>> I have experience in translating JAPE rules to UIMA Ruta and already
>> worked with clinical notes, e.g., also concerning deidentification.
>> 
>> The problem is that I can only invest a few hours in the next two weeks.
>> I will have more time next month or even more next year.
>> 
>> What is the current status and procedure? Is there an explicit
>> contribution to cTAKES? Is there an ICLA? What about the license of the
>> sourceforge project?
>> 
>> Best,
>> 
>> Peter
>> 
>> Am 01.10.2015 um 16:20 schrieb Pei Chen:
>>> Hi Azad,
>>> This is awesome news.  Thanks for adding in the code that was
>>> referenced by the paper.  I'll create a Jira to track we need to port
>>> it over to UIMA/Ruta.
>>> 
>>> In the meantime, the link is at:
>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__sourceforge.net_p_
>>> 
>> clinical-2Ddeid_code_ci_master_tree_=BQICaQ=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU=huK2MFkj300qccT8OSuuoYhy_xEYujfPwiAxhPVz5WY=yjhqco4EH0XrR798kbkzfYcFQ8z8MR9UF8mMRSjKTH0=_k7AbwzkVrRwTrNC3LArZ5hQ5Q47eh06KCDla7UBugY=
>> for those who may be interested in helping out...
>>> 
>>> --Pei
>>> 
>>> Hello Pei,
>>> 
>>> I hope all is well.
>>> 
>>> I have now uploaded the source code for cDeid
>>> (https://urldefense.proofpoint.com/v2/url?u=http-3A__sourceforge.net_p
>>> _clinical-2Ddeid_code_ci_master_tree_=BQICaQ=qS4goWBT7poplM69zy_3x
>>> hKwEW14JZMSdioCoppxeFU=huK2MFkj300qccT8OSuuoYhy_xEYujfPwiAxhPVz5WY
>>> 
>> =yjhqco4EH0XrR798kbkzfYcFQ8z8MR9UF8mMRSjKTH0=_k7AbwzkVrRwTrNC3LArZ5hQ5Q47eh06KCDla7UBugY=
>> ) ; I have tried to make the code as portable and modular as possible with
>> some trade-off for performance. This should help with porting the code to
>> cTAKES/UIMA.
>>> 
>>> Once you let the community know I will try to get involved to help
>>> with translating JAPE to RUTA, etc.
>>> 
>>> Best,
>>> Azad
>> 
>> 



Re: ytex DBconsumer and groovy parser

2014-07-01 Thread Richard Eckart de Castilho
Hi John,

there is actually no grand difference between analysis engines and consumers.

By default, a UIMA runtime may create multiple instances of an analysis engine 
and run them in parallel (if the runtime supports that),
but a consumer must see all data going through the pipeline, so there can 
only be one instance.

The default value of the flag that allows or disallows multiple instances is 
the only real difference.

Basically, any analysis engine that only reads annotations from the CAS but 
does not add/change anything is a consumer. Consequently, a consumer can be 
added anywhere in the pipeline, not only at the end (I sometimes do that to 
see intermediate results).

If a component has the allow multiple instances flag set to false (which is 
usually what you want), then runtimes may react to that differently. E.g. the 
Collection Processing Engine (CPE) will single-thread all components (analysis 
engines or consumers) after it hits the first component with allow multiple 
instances set to false (which is typically a consumer). So to make optimal use 
of the CPE's multi-threading capabilities, such components should be towards the 
end of the CPE pipeline.

I believe there are Java interface declarations and base classes for 
CasConsumers in UIMA - I haven't used them in years. The uimaFIT API doesn't 
even support them, because everything can also be (and within uimaFIT is) 
nicely modeled using analysis engines and the allow-multiple-instances flag.
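
In uimaFIT terms, a "consumer" can be sketched as a plain analysis engine with that flag turned off. A sketch, assuming uimaFIT and uimaj-core on the classpath; the class itself is a made-up example:

```java
import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
import org.apache.uima.fit.descriptor.OperationalProperties;
import org.apache.uima.jcas.JCas;

// A read-only component: it consumes the CAS but never modifies it.
// The flag below is the only thing that marks it as a "consumer".
@OperationalProperties(multipleDeploymentAllowed = false)
public class SimpleWriter extends JCasAnnotator_ImplBase {
    @Override
    public void process(JCas jcas) throws AnalysisEngineProcessException {
        // Read-only access: e.g. report the document text length.
        System.out.println(jcas.getDocumentText().length());
    }
}
```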

Cheers,

-- Richard

On 02.07.2014, at 04:01, Masanz, James J. masanz.ja...@mayo.edu wrote:

 Hi John,
 
 Not positive this is the line you are referring to, but there is a line in 
 cTAKES_clinical_pipeline.groovy (which is not in sandbox, btw) that has a 
 comment about 
 
 createAnalysisEngineDescription  expects name to not end in .xml even though 
 filename actually does
 
 I am guessing the comment you see is trying to say the same thing. 
 
 cTAKES_clinical_pipeline.groovy is in  ctakes-core/scripts/groovy
 
 In that script, line 321 is where the writer is specified. There is no 
 separately defined consumer in the same sense that the CPE GUI has 
 consumers that are separate from annotators. The script just uses the last 
 annotator  as a consumer and convention is AFAIK to call them writers in 
 this case.
 
 Hope that helps,
 -- James
 
 -Original Message-
 From: John Green [mailto:john.travis.gr...@gmail.com] 
 Sent: Tuesday, July 01, 2014 7:15 PM
 To: dev@ctakes.apache.org
 Subject: ytex DBconsumer and groovy parser
 
 If someone has a free minute, which, judging from my own life is probably
 not the case - where in the groovy scrips in sandbox do you define the
 consumer to use? There is one comment that says dont put the .xml here
 then there is a path to the dictionary ae. Im working by ssh from the
 hospital a lot in my free time in the ICU and running gui CPEs isn't
 gonna cut it.
 
 Apropos the ytex dbconsumer - I should be able to just tack this on to the
 end of the ytex aggregate pipeline?
 
 I'm probably still asking very naive questions but to date I still haven't
 had the time to dive into UIMA's base very well, so I apologize.
 
 My goal is to run the full ytex pipeline from the command line with the
 ytex dbconsumer ...
 
 Thanks for everyone's patience,
 John



Re: suggestion for default pipelines

2014-04-16 Thread Richard Eckart de Castilho
It would be nice if uimaFIT provided a Maven plugin to automatically
generate descriptors for aggregates. If we come up with a
convention for factories (e.g. a class with static methods that
take no parameters and return descriptors, or methods that bear a
specific Java annotation, e.g. @AutoGenerateDescriptor),
it should be possible to implement such a Maven plugin.
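
The discovery step of such a plugin could be plain reflection. A minimal, self-contained sketch of the proposed convention (all class and method names here are made up; real factories would return AnalysisEngineDescription rather than String):

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

public class DescriptorScan {
    // Stand-in for a descriptor-producing factory class.
    static class ExamplePipelines {
        public static String getStandardPipeline() { return "standard"; }
        public static String getFullPipeline() { return "full"; }
        public String notAFactory() { return "no"; } // not static -> skipped
    }

    // Find public static no-arg methods - the convention a Maven plugin
    // could use to decide which descriptors to generate.
    static List<String> findFactories(Class<?> clazz) {
        List<String> names = new ArrayList<>();
        for (Method m : clazz.getDeclaredMethods()) {
            if (Modifier.isStatic(m.getModifiers())
                    && Modifier.isPublic(m.getModifiers())
                    && m.getParameterCount() == 0) {
                names.add(m.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(findFactories(ExamplePipelines.class));
    }
}
```

A real plugin would additionally invoke each discovered method and serialize the returned descriptor to XML.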

Cheers,

-- Richard

On 16.04.2014, at 05:21, Steven Bethard steven.beth...@gmail.com wrote:

 +1. And note that once you have a descriptor, you can generate the
 XML, so we should arrange to replace the current XML descriptors with
 ones generated automatically from the uimaFIT code. That should reduce
 some synchronization problems when the Java code was changed but the
 XML descriptor was not.
 
 Steve
 
 On Tue, Apr 15, 2014 at 8:52 AM, Miller, Timothy
 timothy.mil...@childrens.harvard.edu wrote:
 The discussion in the other thread with Abraham Tom gave me an idea I
 wanted to float to the list. We have been using some UIMAFit pipeline
 builders in the temporal project that maybe could be moved into
 clinical-pipeline. For example, look to this file:
 
 http://svn.apache.org/viewvc/ctakes/trunk/ctakes-temporal/src/main/java/org/apache/ctakes/temporal/pipelines/TemporalExtractionPipeline_ImplBase.java?view=markup
 
 with the static methods getPreprocessorAggregateBuilder() and
 getLightweightPreprocessorAggregateBuilder()   [no umls].
 
 So my idea would be to create a class in clinical-pipeline
 (CTakesPipelines) with static methods for some standard pipelines (to
 return AnalysisEngineDescriptions instead of AggregateBuilders?):
 
 getStandardUMLSPipeline()  -- builds pipeline currently in
 AggregatePlaintextUMLSProcessor.xml
 getFullPipeline() -- same as above but with SRL, constituency parsing,
 etc., every component in ctakes
 
 We could then potentially merge our entry points -- I think Abraham's
 experience points out that this is currently confusing, as well as
 probably not implemented optimally. For example, either
 ClinicalPipelineWithUmls or BagOfCUIsGenerator would use that static
 method to run a uimafit-style pipeline. Maybe we can slowly deprecate
 our xml descriptors too unless people feel strongly about keeping those
 around.
 
 Another benefit is that the cTAKES API is then trivial -- if you import
 ctakes into your pom file getting a UIMA pipeline is one UimaFit call:
 
 builder.add(CTAKESPipelines.getStandardUMLSPipeline());
 
 
 I think this would actually be pretty easy to implement, but hoping to
 get some feedback on whether this is a good direction.
 
 Tim



Re: Represent your project at ApacheCon

2014-01-29 Thread Richard Eckart de Castilho
Hi Andy,

you might find this interesting:

https://github.com/jimpil/clojuima

-- Richard

On 29.01.2014, at 12:24, andy mcmurry mcmurry.a...@gmail.com wrote:

 I'm hoping to attend.
 If I were to present, it would be using NLP to find evidence of DNA
 mutations causing disease. Interesting topic (for me at least) but I'm not
 sure if ApacheCON would be into it.
 
 PS: the radio silence is because I've been working on both the VM and a
 wrapper for REST that runs in the JVM (clojure). Clojure, having its
 origins in LISP, is a better fit for serious NLP work than Groovy however,
 groovy is probably easier for a novice to understand. (which is important).
 These days everyone understands REST, so the idea of providing a VM with
 REST support for NLP services is highly attractive (to me at least).
 
 Does this interest anyone else?
 
 --AndyMC



How are cTAKES resources distributed via Maven Central?

2014-01-27 Thread Richard Eckart de Castilho
Hi

I was looking into whether and how cTAKES distributes resources via Maven 
Central. I found some, but I am now quite confused.

There are the component-res artifacts [1], like ctakes-pos-tagger-res. These 
have a JAR and a sources-JAR. The JAR is practically empty, but the sources-JAR 
appears to contain the actual resources. Is there a special reason for this?

Additionally, there is the ctakes-resources-distribution which is distributed 
as a bin.zip via Maven Central. It appears to contain UMLS data. Has it been 
replaced by ctakes-resources-umls2011ab in 3.1.1? The 
ctakes-resources-umls2011ab JAR actually contains data, contrary to the 
component-res JARs mentioned above. Why is there data in this JAR, but not in 
the component-res JARs?

/me scratching head…

Please enlighten me :)

Cheers,

-- Richard

[1] http://search.maven.org/#search%7Cga%7C1%7Cctakes%20res

Re: scala and groovy

2013-12-13 Thread Richard Eckart de Castilho
 
 And those four lines still result in the following:
 
 Resolving dependency: org.cleartk#cleartk-util;0.9.2 {default=[default]}
 Preparing to download artifact org.cleartk#cleartk-util;0.9.2!cleartk-util.jar
 Preparing to download artifact org.apache.uima#uimaj-core;2.4.0!uimaj-core.jar
 Preparing to download artifact org.uimafit#uimafit;1.4.0!uimafit.jar
 Preparing to download artifact args4j#args4j;2.0.16!args4j.jar
 Preparing to download artifact com.google.guava#guava;13.0!guava.jar
 Preparing to download artifact com.carrotsearch#hppc;0.4.1!hppc.jar
 Preparing to download artifact commons-io#commons-io;2.4!commons-io.jar
 Preparing to download artifact commons-lang#commons-lang;2.4!commons-lang.jar
 Preparing to download artifact 
 org.apache.uima#uimaj-tools;2.4.0!uimaj-tools.jar
 Preparing to download artifact 
 org.springframework#spring-core;3.1.0.RELEASE!spring-core.jar
 Preparing to download artifact 
 org.springframework#spring-context;3.1.0.RELEASE!spring-context.jar
 Preparing to download artifact org.apache.uima#uimaj-cpe;2.4.0!uimaj-cpe.jar
 Preparing to download artifact 
 org.apache.uima#uimaj-document-annotation;2.4.0!uimaj-document-annotation.jar
 Preparing to download artifact 
 org.apache.uima#uimaj-adapter-vinci;2.4.0!uimaj-adapter-vinci.jar
 Preparing to download artifact org.apache.uima#jVinci;2.4.0!jVinci.jar
 Preparing to download artifact 
 org.springframework#spring-asm;3.1.0.RELEASE!spring-asm.jar
 Preparing to download artifact 
 commons-logging#commons-logging;1.1.1!commons-logging.jar
 Preparing to download artifact 
 org.springframework#spring-aop;3.1.0.RELEASE!spring-aop.jar
 Preparing to download artifact 
 org.springframework#spring-beans;3.1.0.RELEASE!spring-beans.jar
 Preparing to download artifact 
 org.springframework#spring-expression;3.1.0.RELEASE!spring-expression.jar
 Preparing to download artifact aopalliance#aopalliance;1.0!aopalliance.jar
 org.codehaus.groovy.control.MultipleCompilationErrorsException: startup 
 failed:
 General error during conversion: Error grabbing Grapes -- [download failed: 
 org.springframework#spring-asm;3.1.0.RELEASE!spring-asm.jar]
 
 java.lang.RuntimeException: Error grabbing Grapes -- [download failed: 
 org.springframework#spring-asm;3.1.0.RELEASE!spring-asm.jar]
 
 
 I tried deleting .groovy/grapes/org.springframework  but get the same error
 I don't see this as being friendly for new users if downloading dependencies 
 is not so simple.
 
 -Original Message-
 From: dev-return-2317-Masanz.James=mayo@ctakes.apache.org 
 [mailto:dev-return-2317-Masanz.James=mayo@ctakes.apache.org] On Behalf Of 
 Richard Eckart de Castilho
 Sent: Friday, December 13, 2013 12:16 PM
 To: dev@ctakes.apache.org
 Subject: Re: scala and groovy
 
 On 13.12.2013, at 15:27, Steven Bethard steven.beth...@gmail.com wrote:
 
 P.S. I've stayed out of this whole Groovy thing because we (at
 ClearTK) had some bad experiences with Groovy in the past. Mainly with
 Groovy scripts getting out of sync with the rest of the code base,
 just like XML descriptors, though perhaps the IDEs and Maven are
 better now and that's no longer a problem? But this whole grape
 thing instead of standard Maven isn't changing my mind. Not that I
 planned to switch away from Scala for my scripting anyway, but...
 
 
 I heard and read about your bad experiences with Groovy. I believe
 that the IDEs got somewhat better at handling Groovy. However, I think a
 difference needs to be made depending on the use case.
 
 Some people use the XML files as a format to exchange pipelines
 with each other. However, alone, these files are not of much use.
 One benefit of using Groovy as a pipeline-exchange format is, that
 it can actually get all its dependencies itself via Grape. The
 Groovy script is quite self-contained (although it relies on the
 Maven infrastructure for downloading its dependencies).
 Another is, that thanks to uimaFIT, the Groovy code is much less
 verbose than the XML descriptors.
 
 At the UKP Lab, we also use Groovy sometimes for high-level experiment
 logic. For us, it is a good compromise between inflexible and
 verbose XML files and flexible and verbose Java code. Groovy is flexible
 and concise and the IDE support is meanwhile reasonable.
 
Mind that the IDE support for Grapes (at least in Eclipse) is hilarious.
Grapes cause the IDE to become quite unresponsive, as the artifact resolution
is not well integrated into the IDE.
 
 So here is my summarized opinion when to use or not to use Groovy:
 
 == Examples / Exchange ==
 
 In order to get quick results for new users and to showcase the capabilities
 of a component collection such as DKPro Core or cTAKES, I think the Groovy 
 scripts
 are a convenient vehicle. At DKPro Core, we also packaged all the resources 
 (models)
 as Maven artifacts, which gives us an additional edge over the manual 
 downloading
 currently happening in the cTAKES Groovy prototypes.
 
 == High-level experiment orchestration ==
 
 Groovy

Re: scala and groovy

2013-12-13 Thread Richard Eckart de Castilho
I can understand your reservations. However, they appear to be similar to the 
reservations that some people have against using Maven (which also downloads 
things automatically, although it is aimed at developers) or against using 
web services (e.g. the UMLS service used by cTAKES).

A Groovy script is certainly no replacement for a full download, for all the 
reasons that you are describing. I think it can be a supplement for those who 
do not want to start out with the full download.

It may be possible to combine both approaches, though. E.g. use the same script 
in a scenario which does auto-downloading and in a scenario where the user has 
downloaded a distribution. In the second case, the distribution would have to 
come with proper configuration files that point the artifact resolution 
mechanism at the folders to which the distribution has been downloaded. It 
sounds reasonable, but it is probably much less straightforward than it 
sounds. But eventually, that is part of the idea: that you can trade 
convenience (auto-downloads) for control (pre-downloaded artifacts).

I believe, the script approach also shows where resource handling could be 
improved, e.g. by distributing certain resources as Maven artifacts and/or 
incorporating the ability of automatically downloading resources directly in 
analysis engines. IMHO, there shouldn't be any code which explicitly downloads 
resources.

In DKPro Core, we support both. If a resource is available on the classpath 
(e.g. by virtue of being a Maven dependency, by being referred to by a @Grab, 
or by having been downloaded as part of a distribution), it is used from there. 
Otherwise, our AEs try to automatically download the resource from our Maven 
repository (unless this is explicitly disabled).
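
That classpath-first logic can be sketched with plain JDK calls (a hypothetical helper, not DKPro Core's actual implementation; the fallback URI is a placeholder):

```java
import java.io.InputStream;
import java.net.URI;

public class ResourceResolver {
    // Try the classpath first (Maven dependency, @Grab, or unpacked
    // distribution); only fall back to downloading if nothing is found.
    public static InputStream resolve(String classpathName, URI fallback)
            throws Exception {
        InputStream in = ResourceResolver.class.getResourceAsStream(classpathName);
        if (in != null) {
            return in; // resource is already available locally
        }
        return fallback.toURL().openStream(); // the "auto-download" path
    }

    public static void main(String[] args) throws Exception {
        // Class files are always visible as classpath resources, so this
        // resolves locally and never touches the fallback URI.
        InputStream in = resolve("/java/lang/Object.class",
                URI.create("https://example.org/models/model.bin"));
        System.out.println(in == null ? "not found" : "resolved locally");
    }
}
```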

In my experience, using technologies like Maven or Grapes in a corporate 
environment should be supplemented by a private artifact repository run by the 
corporation, e.g. to reduce network issues when talking to external 
repositories, or to distribute proprietary artifacts (resources, analysis 
components, or other libraries). Corporate users should then use this 
repository as a general proxy to access any artifacts. E.g. at the UKP Lab, we 
run such an internal repository. All our users get all artifacts through there 
- it caches everything anybody ever used, so we can even continue to use 
artifacts should the remote repository be down temporarily, permanently, or if 
artifacts got deleted. We trust in the Maven infrastructure, but we like to 
have control over the artifacts.

Some things, like the Groovy scripts, we do only as a service to newbies or for 
doing small things, e.g. simple conversion pipelines. They are the result of 
trying to provide usable examples for people who have reservations against 
installing Eclipse, setting up Maven, etc. And they appear to be less 
intimidating than Java to people who know e.g. Python, because they are 
directly executable and quite readable. 

I'm not perfectly happy with them, because there is still stuff that is too 
technical, e.g. all the import statements. Eventually, a similar technology 
would be nice which consists only of the pipeline declaration (no @Grabs, no 
imports) but still functions in the same way (including auto-downloads). But 
that - just like the pre-deploy scenario - is future work ;)

Anyway, I would also like to thank you for experimenting with the idea and 
testing its implications in a corporate environment! 

-- Richard

On 13.12.2013, at 19:46, Masanz, James J. masanz.ja...@mayo.edu wrote:

 
 Thanks Richard for doing all that testing.
 
 But the idea that we cannot easily get at what is causing the issue, together 
 with the fact that Tim was able to reproduce one of my issues [1], leads me to 
 question using dynamic downloading of anything for our users.
 
 I would prefer to see a single download that a user extracts from, which I 
 see having the following advantages
 - no mysterious suspected network issues
 - user can be told how much space will be taken up
 - user has easy control where things will be put (rather than having to 
 configure where grapes will be stored, if user does not want them under their 
 home directory)
 
 That's my 2 cents.
 
 Yes, I am behind a firewall. And in fact I am VPN'd in to work. But I suspect 
 some of our users do that too.
 
 [1] http://markmail.org/message/lgo7eyruotl7nnix
 
 -- James
 
 
 -Original Message-
 From: dev-return-2322-Masanz.James=mayo@ctakes.apache.org 
 [mailto:dev-return-2322-Masanz.James=mayo@ctakes.apache.org] On Behalf Of 
 Richard Eckart de Castilho
 Sent: Friday, December 13, 2013 3:36 PM
 To: dev@ctakes.apache.org
 Subject: Re: scala and groovy
 
 Hi James,
 
 I enabled info on the grape resolving using
 
  export JAVA_OPTS="-Dgroovy.grape.report.downloads=true $JAVA_OPTS"
 
 Then I tried your script three times. 
 
 1) First, I just ran without any changes to my system (custom grapeConfig.xml 
 which avoids using .m2

Re: cTAKES Groovy...

2013-12-12 Thread Richard Eckart de Castilho
Might be a temporary network problem. The artifact is on Maven Central:

http://search.maven.org/#artifactdetails%7Cedu.mit.findstruct%7Cfindstructapi%7C0.0.1%7Cjar

-- Richard

On 12.12.2013, at 15:01, Masanz, James J. masanz.ja...@mayo.edu wrote:

 The story continues:
 
 The @GrabResolver line from Richard did the trick for jwnl.
 
 But I cleared my .groovy/grapes and  .m2/repository and tried running 
 parser.groovy and get the following:
 
 org.codehaus.groovy.control.MultipleCompilationErrorsException: startup 
 failed:
 General error during conversion: Error grabbing Grapes -- [download failed: 
 edu.mit.findstruct#findstructapi;0.0.1!findstructapi.jar]
 
 java.lang.RuntimeException: Error grabbing Grapes -- [download failed: 
 edu.mit.findstruct#findstructapi;0.0.1!findstructapi.jar]
 
 FYI. I will take a look but if anyone has any hints, don't be shy
 
 
 -Original Message-
 From: dev-return-2299-Masanz.James=mayo@ctakes.apache.org 
 [mailto:dev-return-2299-Masanz.James=mayo@ctakes.apache.org] On Behalf Of 
 Finan, Sean
 Sent: Friday, December 06, 2013 2:38 PM
 To: dev@ctakes.apache.org
 Subject: RE: cTAKES Groovy...
 
 Good stuff -  Thanks Richard
 
 -Original Message-
 From: Masanz, James J. [mailto:masanz.ja...@mayo.edu] 
 Sent: Friday, December 06, 2013 3:30 PM
 To: 'dev@ctakes.apache.org'
 Subject: RE: cTAKES Groovy...
 
 Thanks Richard! That did the trick
 
 I'll create a JIRA and update the script including adding a comment that that 
 @GrabResolver  is only needed for pre-OpenNLP 1.5.3 and should be removed 
 when we upgrade to 1.5.3+. and I'll update CTAKES-191 Update Apache OpenNLP 
 dependency to 1.5.3 with a  reminder to update the script.
 
 Trunk of cTAKES still uses 1.5.2-incubating
 
 -Original Message-
 From: dev-return-2297-Masanz.James=mayo@ctakes.apache.org 
 [mailto:dev-return-2297-Masanz.James=mayo@ctakes.apache.org] On Behalf Of 
 Richard Eckart de Castilho
 Sent: Friday, December 06, 2013 2:12 PM
 To: dev@ctakes.apache.org
 Subject: Re: cTAKES Groovy...
 
 On 06.12.2013, at 18:01, Masanz, James J. masanz.ja...@mayo.edu wrote:
 
  I have not solved my issues on my ubuntu server yet, where I get "Error 
  grabbing Grapes -- [unresolved dependency: jwnl#jwnl;1.3.3: not found]"
 
 This has also already been fixed in OpenNLP 1.5.3, so there must be some 
 dependency on OpenNLP 1.5.(1|2)-incubating.
 
 Anyway, you should be able to fix it by adding this to the beginning of your 
 Groovy script, in front of the Grapes:
 
 @GrabResolver(name='opennlp.sf.net', 
  root='http://opennlp.sourceforge.net/maven2')
 
 -- Richard
 



Re: cTAKES Groovy...

2013-12-12 Thread Richard Eckart de Castilho
I believe that Grape (like Maven) caches failures. It might be necessary to
delete any cached info on that artifact from your local grape repository 
before you try again. Btw. there you might (or might not) be able to find
additional information on why the download failed.

Check out the trouble-shooting section in the DKPro Core Groovy recipe page:

http://code.google.com/p/dkpro-core-asl/wiki/DKProGroovyCookbook#Trouble_shooting

on cache flushing and on enabling verbose information on Grape downloads.
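Concretely, the flush might look like the following shell commands (a sketch: these are Grape's and Maven's default cache locations, and the artifact path is the one from the error above; adjust if you configured grape.root or a custom local repository):

```shell
# Remove any cached (failed) resolution of the artifact before retrying.
# ~/.groovy/grapes is Grape's default cache; ~/.m2/repository is Maven's.
rm -rf ~/.groovy/grapes/edu.mit.findstruct
rm -rf ~/.m2/repository/edu/mit/findstruct

# Re-run with verbose download reporting to see why a download fails:
export JAVA_OPTS="-Dgroovy.grape.report.downloads=true $JAVA_OPTS"
```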

-- Richard

On 12.12.2013, at 15:22, Masanz, James J. masanz.ja...@mayo.edu wrote:

 Shouldn't be firewall - other grapes download fine.
 
 I created a short groovy script to just grab findstructapi - I copy/pasted 
 the @grab line from from the Groovy Grape section of 
 http://search.maven.org/#artifactdetails%7Cedu.mit.findstruct%7Cfindstructapi%7C0.0.1%7Cjar
 
 And I still get
 
 org.codehaus.groovy.control.MultipleCompilationErrorsException: startup 
 failed:
 General error during conversion: Error grabbing Grapes -- [download failed: 
 edu.mit.findstruct#findstructapi;0.0.1!findstructapi.jar]
 
 java.lang.RuntimeException: Error grabbing Grapes -- [download failed: 
 edu.mit.findstruct#findstructapi;0.0.1!findstructapi.jar]
 
 Very odd.
 
 My script is simply:
 
 #!/usr/bin/env groovy
 @Grab(group='edu.mit.findstruct', module='findstructapi', version='0.0.1') 
 import java.io.File;
 
   if(args.length < 1) {
     System.out.println("Please specify input directory");
     System.exit(1);
   }
   System.out.println("Input parm is: " + args[0]);
   System.exit(0);
 
 
 -Original Message-
 From: dev-return-2305-Masanz.James=mayo@ctakes.apache.org 
 [mailto:dev-return-2305-Masanz.James=mayo@ctakes.apache.org] On Behalf Of 
 William Karl Thompson
 Sent: Thursday, December 12, 2013 11:06 AM
 To: dev@ctakes.apache.org
 Subject: RE: cTAKES Groovy...
 
 Seems unlikely to be the source of your problem, but could it be a firewall 
 issue?
 
 -Original Message-
 From: Richard Eckart de Castilho [mailto:r...@apache.org] 
 Sent: Thursday, December 12, 2013 11:04 AM
 To: dev@ctakes.apache.org
 Subject: Re: cTAKES Groovy...
 
 Might be a temporary network problem. The artifact is on Maven Central:
 
 http://search.maven.org/#artifactdetails%7Cedu.mit.findstruct%7Cfindstructapi%7C0.0.1%7Cjar
 
 -- Richard
 
 On 12.12.2013, at 15:01, Masanz, James J. masanz.ja...@mayo.edu wrote:
 
 The story continues:
 
 The @GrabResolver line from Richard did the trick for jwnl.
 
 But I cleared my .groovy/grapes and  .m2/repository and tried running 
 parser.groovy and get the following:
 
 org.codehaus.groovy.control.MultipleCompilationErrorsException: startup 
 failed:
 General error during conversion: Error grabbing Grapes -- [download 
 failed: edu.mit.findstruct#findstructapi;0.0.1!findstructapi.jar]
 
 java.lang.RuntimeException: Error grabbing Grapes -- [download failed: 
 edu.mit.findstruct#findstructapi;0.0.1!findstructapi.jar]
 
 FYI. I will take a look but if anyone has any hints, don't be shy
 
 
 -Original Message-
 From: dev-return-2299-Masanz.James=mayo@ctakes.apache.org 
 [mailto:dev-return-2299-Masanz.James=mayo@ctakes.apache.org] On 
 Behalf Of Finan, Sean
 Sent: Friday, December 06, 2013 2:38 PM
 To: dev@ctakes.apache.org
 Subject: RE: cTAKES Groovy...
 
 Good stuff -  Thanks Richard
 
 -Original Message-
 From: Masanz, James J. [mailto:masanz.ja...@mayo.edu]
 Sent: Friday, December 06, 2013 3:30 PM
 To: 'dev@ctakes.apache.org'
 Subject: RE: cTAKES Groovy...
 
 Thanks Richard! That did the trick
 
 I'll create a JIRA and update the script including adding a comment that 
 that @GrabResolver  is only needed for pre-OpenNLP 1.5.3 and should be 
 removed when we upgrade to 1.5.3+. and I'll update CTAKES-191 Update Apache 
 OpenNLP dependency to 1.5.3 with a  reminder to update the script.
 
 Trunk of cTAKES still uses 1.5.2-incubating
 
 -Original Message-
 From: dev-return-2297-Masanz.James=mayo@ctakes.apache.org 
 [mailto:dev-return-2297-Masanz.James=mayo@ctakes.apache.org] On 
 Behalf Of Richard Eckart de Castilho
 Sent: Friday, December 06, 2013 2:12 PM
 To: dev@ctakes.apache.org
 Subject: Re: cTAKES Groovy...
 
 On 06.12.2013, at 18:01, Masanz, James J. masanz.ja...@mayo.edu wrote:
 
 I have not solved my issues on my ubuntu server yet, where I get "Error 
 grabbing Grapes -- [unresolved dependency: jwnl#jwnl;1.3.3: not found]"
 
 This has also already been fixed in OpenNLP 1.5.3, so there must be some 
 dependency on OpenNLP 1.5.(1|2)-incubating.
 
 Anyway, you should be able to fix it by adding this to the beginning of your 
 Groovy script, in front of the Grapes:
 
 @GrabResolver(name='opennlp.sf.net', 
 root='http://opennlp.sourceforge.net/maven2')
 
 -- Richard



Re: cTAKES Groovy...

2013-12-12 Thread Richard Eckart de Castilho
I tried with the small script (buh):

export JAVA_OPTS="-Dgroovy.grape.report.downloads=true $JAVA_OPTS"
HighFire-6:~ bluefire$ ./buh
Resolving dependency: edu.mit.findstruct#findstructapi;0.0.1 {default=[default]}
Preparing to download artifact 
edu.mit.findstruct#findstructapi;0.0.1!findstructapi.jar
Downloaded 13 Kbytes in 2326ms:
  [SUCCESSFUL ] edu.mit.findstruct#findstructapi;0.0.1!findstructapi.jar 
(2311ms)
Please specify input directory

looks ok to me.

-- Richard

On 12.12.2013, at 15:54, Tim Miller timothy.mil...@childrens.harvard.edu 
wrote:

 I was able to replicate the error after removing the findstruct directories 
 from my .groovy and .m2 repositories.
 
 On 12/12/2013 12:22 PM, Masanz, James J. wrote:
 Shouldn't be firewall - other grapes download fine.
 
 I created a short groovy script to just grab findstructapi - I copy/pasted 
 the @grab line from from the Groovy Grape section of
 http://search.maven.org/#artifactdetails%7Cedu.mit.findstruct%7Cfindstructapi%7C0.0.1%7Cjar
 
 And I still get
 
 org.codehaus.groovy.control.MultipleCompilationErrorsException: startup 
 failed:
 General error during conversion: Error grabbing Grapes -- [download failed: 
 edu.mit.findstruct#findstructapi;0.0.1!findstructapi.jar]
 
 java.lang.RuntimeException: Error grabbing Grapes -- [download failed: 
 edu.mit.findstruct#findstructapi;0.0.1!findstructapi.jar]
 
 Very odd.
 
 My script is simply:
 
 #!/usr/bin/env groovy
 @Grab(group='edu.mit.findstruct', module='findstructapi', version='0.0.1')
 import java.io.File;
 
  if(args.length < 1) {
    System.out.println("Please specify input directory");
    System.exit(1);
  }
  System.out.println("Input parm is: " + args[0]);
  System.exit(0);


Re: cTAKES user interface

2013-10-29 Thread Richard Eckart de Castilho
Maven allows you to do marvelous things on the CLI, provided you throw in an 
additional component: Groovy.

We did some amazing self-contained Groovy scripts with uimaFIT and DKPro Core 
which you might find interesting

  http://code.google.com/p/dkpro-core-asl/wiki/DKProGroovyCookbook

-- Richard

On 29.10.2013, at 23:09, Miller, Timothy 
timothy.mil...@childrens.harvard.edu wrote:

 I think this is also an area where Maven integration was a small step 
 backwards (I greatly appreciate the steps forward it allowed). I used to run 
 stuff from the command line and in scripts more often but it's slightly less 
 straightforward setting up the classpath with maven -- before you could put a 
simple "java -cp lib/*.jar <class name>" in a script, now I'm not sure how to 
 go about it using maven. I'm sure there's a way, but I am afraid of falling 
 down the maven rabbit hole.
 Tim
 
 
 On Oct 29, 2013, at 5:53 PM, Chen, Pei wrote:
 
 +1
 Pan, the short answer is yes- it can be done in CLI.  
 The problem is that most of us who are already familiar with the nitty 
 gritty are probably doing this with some sort of custom scripts or solution.
 Cc' the dev group to get a fresh perspective; not sure what the easiest 
 would be-- run the CPE via command line with default input/output 
 directories or running a Driver Main Class as part of examples.
 
 --Pei



Re: CTAKES-248- include original covered text of NEs which can't be recovered post if NE is from a disjoint span

2013-10-02 Thread Richard Eckart de Castilho
What benefit would it have to store a string with some separation character 
(which may mean that the separation character in the elements may need to be 
escaped), over using a feature of type FSArray<Token> pointing to the original 
segments?
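A small illustration of the escaping concern (the strings are invented examples, not actual cTAKES output):

```java
import java.util.Arrays;

// Storing matched segments as one delimiter-separated string breaks as soon
// as a segment itself contains the delimiter; an FSArray of feature
// structures sidesteps this because each segment stays a separate element.
public class DelimiterPitfall {
    public static void main(String[] args) {
        String ok = "Acute|Disease";               // two matched segments
        String bad = "B|T-cell lymphoma|Disease";  // first segment contains '|'
        System.out.println(Arrays.toString(ok.split("\\|")));   // [Acute, Disease]
        System.out.println(Arrays.toString(bad.split("\\|")));  // three parts; the segment boundary is lost
    }
}
```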

Not sure if that is what Karthik meant when referring to fetching the matched 
atom.

-- Richard

On 02.10.2013, at 01:46, Karthik Sarma ksa...@ksarma.com wrote:

 Hmm, couldn't you just fetch the matched atom and use that? Should be the
 same information (without, I suppose, the original ordering and split).
 
 --
 Karthik Sarma
 UCLA Medical Scientist Training Program Class of 20??
 Member, UCLA Medical Imaging & Informatics Lab
 Member, CA Delegation to the House of Delegates of the American Medical
 Association
 ksa...@ksarma.com
 gchat: ksa...@gmail.com
 linkedin: www.linkedin.com/in/ksarma
 
 
 On Tue, Oct 1, 2013 at 12:37 PM, Masanz, James J. 
 masanz.ja...@mayo.eduwrote:
 
 Yes, this would help address that multiple permutations example.  The new
 getOriginalText method would return something like "Acute|Disease".  Right
 now I'm thinking of just using vertical bar as delimiter, to start with at
 least, but think it should be configurable.
 
 -Original Message-
 From: dev-return-2067-Masanz.James=mayo@ctakes.apache.org [mailto:
 dev-return-2067-Masanz.James=mayo@ctakes.apache.org] On Behalf Of
 Chen, Pei
 Sent: Tuesday, October 01, 2013 9:38 AM
 To: dev@ctakes.apache.org
 Subject: CTAKES-248- include original covered text of NEs which can't be
 recovered post if NE is from a disjoint span
 
 This sounds pretty cool.
 James, will this address the multiple permutations lookup example:
 "Acute alcoholic liver disease".  There is a cui: C0001314: "Acute Disease",
 but if you getCoveredText() on the UMLSConcept, you would actually get the
 same "Acute alcoholic liver disease" instead of "Acute Disease".
 So, there is a new field called getOriginalText() that matched the hit?
 
 -Original Message-
 From: james-mas...@apache.org [mailto:james-mas...@apache.org]
 Sent: Monday, September 30, 2013 5:49 PM
 To: comm...@ctakes.apache.org
 Subject: svn commit: r1527792 - /ctakes/trunk/ctakes-type-
 system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSyst
 em.xml
 
 Author: james-masanz
 Date: Mon Sep 30 21:48:01 2013
 New Revision: 1527792
 
 URL: http://svn.apache.org/r1527792
 Log:
 CTAKES-248  - for named entities, since the annotation just has the
 begin and
 end offset, it is requested to have a way to get the original covered
 text
 (especially for disjoint spans) so it is possible to know which words in
 the
 covered text were actually used in the matching to the dictionary entry
 
 Modified:
ctakes/trunk/ctakes-type-
 system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSyst
 em.xml
 
 Modified: ctakes/trunk/ctakes-type-
 system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSyst
 em.xml
 URL: http://svn.apache.org/viewvc/ctakes/trunk/ctakes-type-
 system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSyst
 em.xml?rev=1527792r1=1527791r2=1527792view=diff
 ==
 
 Binary files - no diff available.



Re: ClearNLP POSTagger

2013-04-08 Thread Richard Eckart de Castilho
Hi,

did you train new models for the ClearNLP/OpenNLP tools? (Maybe I'd know if I 
had followed a past discussion on models more closely…)

Cheers,

-- Richard

Am 08.04.2013 um 18:15 schrieb Chen, Pei pei.c...@childrens.harvard.edu:

 Hi,
 While working on the Dependency Parser/SRL labeler,  we also have a POSTagger 
 from ClearNLP.  It is fairly simple and I have the code ready (also trained 
 on the same data as the dep parser - MiPaq/SHARP) to be checked in.  What do 
 folks think:
 We can include both Analysis Engines in the ctakes-pos-tagger project.  But 
 should we leave the current OpenNLP in the default pipeline or default to the 
 latest?
 
 The ClearNLP POS tagger shows more robust results on unknown words by 
 generalizing lexical features.  You can find the reference from this paper.
 Fast and Robust Part-of-Speech Tagging Using Dynamic Model Selection, Jinho 
 D. Choi, Martha Palmer, Proceedings of the 50th Annual Meeting of the 
 Association for Computational Linguistics (ACL'12), 363-367, Jeju, Korea, 
 2012. [1] It also uses AdaGrad for machine learning, which is a more advanced 
 learning algorithm than the maximum entropy used by OpenNLP.
 
 [1] http://aclweb.org/anthology-new/P/P12/P12-2071.pdf


-- 
--- 
Richard Eckart de Castilho
Technical Lead
Ubiquitous Knowledge Processing Lab (UKP-TUD) 
FB 20 Computer Science Department  
Technische Universität Darmstadt 
Hochschulstr. 10, D-64289 Darmstadt, Germany 
phone [+49] (0)6151 16-7477, fax -5455, room S2/02/B117
eck...@ukp.informatik.tu-darmstadt.de 
www.ukp.tu-darmstadt.de 
Web Research at TU Darmstadt (WeRC) www.werc.tu-darmstadt.de
---