CTAKES-384-20160129.patch applied.

> On Jan 29, 2016, at 4:34 AM, Peter Klügl <peter.klu...@averbis.com> wrote:
> 
> Hi,
> 
> the problems were caused by the svn client in my Eclipse. Sorry for the
> trouble, I should have looked more closely at the complete patch.
> 
> I attached a new patch created with command-line tools, which looks
> correct now.
> 
> Pei, can you apply the new patch?
> 
> Best,
> 
> Peter
> 
> On 28.01.2016 at 15:57, Peter Klügl wrote:
>> Thanks Pei.
>> 
>> I fear there was again a problem with the patch. All new files are
>> missing (and also the svn-ignore settings).
>> 
>> Can you take a look?
>> 
>> Best,
>> 
>> Peter
>> 
>> On 28.01.2016 at 14:43, Pei Chen wrote:
>>> patch applied.
>>> Thanks,
>>> Pei
>>> 
>>> On Thu, Jan 28, 2016 at 4:14 AM, Peter Klügl <peter.klu...@averbis.com> wrote:
>>>> Hi Pei,
>>>> 
>>>> can you commit the recent patch for us?
>>>> 
>>>> CTAKES-384-20160120.patch
>>>> 
>>>> Best,
>>>> 
>>>> Peter
>>>> 
>>>> On 20.01.2016 at 19:35, Pei Chen wrote:
>>>>> Hi,
>>>>> Sorry I was swamped recently.
>>>>> But yeah, we can even create an extended type system to store these items 
>>>>> temporarily and add them into the main/core type system afterwards.
>>>>> There was an existing item to upgrade UIMA, but agreed, it will require
>>>>> much more testing.  If it works, we can upgrade it in our sandbox area or
>>>>> create a branch if necessary.
>>>>> 
>>>>> —Pei
>>>>> 
>>>>>> On Jan 18, 2016, at 9:06 AM, Peter Klügl <peter.klu...@averbis.com> wrote:
>>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> a new patch is attached.
>>>>>> 
>>>>>> @Pei:
>>>>>> are there suitable annotation types in the cTAKES type system? Some
>>>>>> project in cTAKES uses something like OntologyMatch... I map it to
>>>>>> IdentifiedAnnotation right now, but there are many empty features...
>>>>>> 
>>>>>> @Azad:
>>>>>> I changed the rules a bit, especially the capitalization like I use it
>>>>>> in ruta normally. The wordlists are compiled to a trie by the maven
>>>>>> plugin. I also added the two regexes for url and email. I extended the
>>>>>> regex for the url. I also changed the evaluation order of some rules
>>>>>> (with @). Feel free to add simple examples to examples.csv for the unit
>>>>>> tests.
>>>>>> 
>>>>>> Let me know if you need more information about the changes.
>>>>>> 
>>>>>> Do you want help with the other rule sets? Or should we split them up?
>>>>>> 
>>>>>> Best,
>>>>>> 
>>>>>> Peter
>>>>>> 
>>>>>> On 18.01.2016 at 11:04, Peter Klügl wrote:
>>>>>>> Hi,
>>>>>>> 
>>>>>>> great. I will integrate them in the project and in the next patch.
>>>>>>> 
>>>>>>> Best,
>>>>>>> 
>>>>>>> Peter
>>>>>>> 
>>>>>>> On 18.01.2016 at 00:58, Azad Dehghan wrote:
>>>>>>>> Three NERs translated and uploaded.
>>>>>>>> 
>>>>>>>> PS. I will validate all NERs once we have them all completed.
>>>>>>>> 
>>>>>>>> Cheers,
>>>>>>>> Azad
>>>>>>>> 
>>>>>>>> On 24 November 2015 at 10:37, Azad Dehghan <azad.dehg...@gmail.com> wrote:
>>>>>>>> 
>>>>>>>>> This is on my todo list for Dec. as well. If there are any more
>>>>>>>>> volunteers for translating JAPE to RUTA, please get in touch.
>>>>>>>>> 
>>>>>>>>> Cheers,
>>>>>>>>> Azad
>>>>>>>>> 
>>>>>>>>> On 24 Nov 2015 09:55, "Peter Klügl" <peter.klu...@averbis.com> wrote:
>>>>>>>>>> Hi,
>>>>>>>>>> 
>>>>>>>>>> I just wanted to mention that I haven't forgotten about it.
>>>>>>>>>> Unfortunately, there is just no spare time right now. I hope I will
>>>>>>>>>> be able to provide the patches in December.
>>>>>>>>>> 
>>>>>>>>>> Best,
>>>>>>>>>> 
>>>>>>>>>> Peter
>>>>>>>>>> 
>>>>>>>>>> On 06.11.2015 at 16:40, Pei Chen wrote:
>>>>>>>>>>> Hi Peter,
>>>>>>>>>>> I think ctakes-examples is probably a good starting point, at least
>>>>>>>>>>> in terms of Maven modules, etc.  I think it would be good if we use
>>>>>>>>>>> the uimaFIT style as the primary approach to wiring components
>>>>>>>>>>> together and generate descriptors as a secondary option.
>>>>>>>>>>> Which components are actually required is probably best left up to
>>>>>>>>>>> whatever gives the best-performing c-deid.  The output is an
>>>>>>>>>>> interesting question: I'm not sure whether we should treat this as
>>>>>>>>>>> an independent preprocessing component or as part of a pipeline (in
>>>>>>>>>>> which case we may need to propose a change to the type system or
>>>>>>>>>>> perhaps an alternative JCas view.  You can probably open up that
>>>>>>>>>>> discussion to the dev group as you see fit.)
>>>>>>>>>>> 
>>>>>>>>>>> My 2 cents...
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Fri, Nov 6, 2015 at 3:38 AM, Peter Klügl <peter.klu...@averbis.com> wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>> 
>>>>>>>>>>>> Is there a cTAKES project that may serve as an example of how the
>>>>>>>>>>>> cTAKES community develops or what a project should look like?
>>>>>>>>>>>> I have learned that different people set up UIMA projects in quite
>>>>>>>>>>>> different ways, and I do not want to get inspired by "some sort of
>>>>>>>>>>>> outdated" approach in the cTAKES repo.
>>>>>>>>>>>> 
>>>>>>>>>>>> Are there restrictions or preferences about the preprocessing
>>>>>>>>>>>> components that should be used and the kind of "output" of the
>>>>>>>>>>>> project?
>>>>>>>>>>>> Components: On which components may the components rely:
>>>>>>>>>>>> tokenizer, ... parser, ... dict lookup?
>>>>>>>>>>>> "output": Should the project provide a pipeline or a single AE?
>>>>>>>>>>>> 
>>>>>>>>>>>> More comments below.
>>>>>>>>>>>> 
>>>>>>>>>>>> On 03.11.2015 at 16:54, Azad Dehghan wrote:
>>>>>>>>>>>>>> Who else plans to provide patches for it? Just to avoid duplicate
>>>>>>>>>>>>>> work and to coordinate the efforts ...
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> I would like to help with translating JAPE to RUTA.
>>>>>>>>>>>> You can already go ahead with the UIMA Ruta Workbench if you want,
>>>>>>>>>>>> or wait until I set up the project with Ruta integration.
>>>>>>>>>>>> 
>>>>>>>>>>>> If any questions arise, just ask :-)
>>>>>>>>>>>> 
>>>>>>>>>>>>>> Is there a development dataset which was utilized for the initial
>>>>>>>>>>>>>> development, and if yes, is it possible to contribute it too?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> The data set is unfortunately not publicly available; i2b2
>>>>>>>>>>>>> <https://www.i2b2.org/NLP/DataSets/Main.php> typically releases
>>>>>>>>>>>>> the data sets 12 months after a given challenge; this is done on
>>>>>>>>>>>>> an individual basis and involves a Data Use Agreement.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> However, I will be able to conduct and coordinate the validation.
>>>>>>>>>>>>> 
>>>>>>>>>>>> OK, I'll investigate whether we already have access to the dataset here.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>>> My first step would be:
>>>>>>>>>>>>>> - set up a maven project
>>>>>>>>>>>>>> - set up a development pipeline in a test (with cTAKES components
>>>>>>>>>>>>>> replacing the previous ANNIE preprocessing)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> But one item that we need to review is the 3rd party libs jars
>>>>>>>>>>>>>> that were included to ensure compatibility.  I’ll be sure to take
>>>>>>>>>>>>>> a look at that over the next few weeks.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> —Pei
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> @Pei - once the ANNIE components are replaced, there should not be
>>>>>>>>>>>>> a need to worry about the 3rd party libs.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Also, just a thought: we may want to create an independent
>>>>>>>>>>>>> component for the Two Pass recognition (TwoPass.java), as this
>>>>>>>>>>>>> method has proven useful for general NER on longitudinal data and
>>>>>>>>>>>>> is surely useful independent of the deid component.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Azad
>>>>>>>>>>>>> 
> 
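
On the Two Pass recognition that Azad mentions in the thread (TwoPass.java): a minimal Python sketch of what such an independent component could look like. It assumes that the first pass applies a context-based rule and that the second pass re-marks every string found by the first pass in all notes of the same patient. The rule, the function names, and the input layout are hypothetical illustrations and are not taken from the actual TwoPass.java.

```python
import re

# Hypothetical first-pass rule: a title followed by a capitalized surname.
TITLE_NAME = re.compile(r"\b(?:Dr|Mr|Mrs|Ms)\.?\s+([A-Z][a-z]+)")


def first_pass(text):
    """Return (start, end, surface) spans found by the context-based rule."""
    return [(m.start(1), m.end(1), m.group(1)) for m in TITLE_NAME.finditer(text)]


def two_pass(docs_by_patient):
    """Map each patient id to one sorted span list per note.

    docs_by_patient maps a patient id to that patient's list of note texts
    (an assumed layout for longitudinal data).
    """
    results = {}
    for patient, notes in docs_by_patient.items():
        # Pass 1: collect the surface strings the rule finds in any note.
        seen = set()
        for text in notes:
            seen.update(surface for _, _, surface in first_pass(text))

        # Pass 2: mark every whole-word occurrence of those strings in all
        # notes of the same patient, with or without the triggering context.
        per_note = []
        for text in notes:
            spans = set()
            for surface in seen:
                pattern = r"\b" + re.escape(surface) + r"\b"
                spans.update(
                    (m.start(), m.end(), surface) for m in re.finditer(pattern, text)
                )
            per_note.append(sorted(spans))
        results[patient] = per_note
    return results
```

With two notes for one patient, "Seen by Dr. Smith today." and "Smith recommends follow-up.", the first pass only finds the name in the first note, while the second pass also marks the bare "Smith" in the second note. Keeping the propagation step separate from the rules is what would let it be reused outside the deid component.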
