> Who else plans to provide patches for it? Just to avoid duplicate work
> and to coordinate the efforts ...

I would like to help with translating JAPE to RUTA.

> Is there a development dataset which was utilized for the initial
> development, and if yes, is it possible to contribute it too?

The data set is unfortunately not publicly available; i2b2 <https://www.i2b2.org/NLP/DataSets/Main.php> typically releases its data sets 12 months after a given challenge. This is done on an individual basis and involves a Data Use Agreement. However, I will be able to conduct and coordinate the validation.

> My first step would be:
> - set up a maven project
> - set up a development pipeline in a test (with cTAKES components
>   replacing the previous ANNIE preprocessing)
>
> But one item that we need to review is the 3rd party libs jars that
> were included to ensure compatibility. I’ll be sure to take a look at
> that over the next few weeks.
>
> -- Pei

@Pei - once the ANNIE components are replaced, there should not be a need to worry about the 3rd party libs.

Also, just a thought: we may want to create an independent component for the Two Pass recognition (TwoPass.java), as this method has proven useful for general NER on longitudinal data and is surely useful independent of the deid component.

Cheers,
Azad