Thanks so much!

Bonnie MacKellar

On Fri, Feb 22, 2019 at 7:03 AM Erik Fäßler <[email protected]> wrote:

> Hey,
>
> just wanted to say that I didn't get around to making the component
> available yet, will do first thing next week!
>
> Best,
>
> Erik
>
> > On 20. Feb 2019, at 19:47, Bonnie MacKellar <[email protected]> wrote:
> >
> > Hi,
> > Yes, we are using that format. I have a parser that I wrote, but it
> > isn't integrated into UIMA. It runs separately and loads the full
> > clinical trial data into a triplestore (Stardog). I would be
> > interested in your system since I am not really familiar with how to
> > write file readers in the UIMA framework. Perhaps I can merge my
> > parser into it and end up with just the right thing. If you can make
> > it available, I would definitely be interested. I will take a look
> > at the other links as well. Thanks!!
> >
> > Bonnie MacKellar
> >
> > On Wed, Feb 20, 2019 at 3:54 AM Erik Fäßler <[email protected]> wrote:
> >
> >> Dear Bonnie,
> >>
> >> are you talking about the clinical trial XML format used by
> >> ClinicalTrials.gov by any chance?
> >> If so, I did create a UIMA reader for these data. It's not perfect
> >> but perhaps enough for your purposes, and you might also want to
> >> enhance it. Please let me know if you would be interested in that.
> >> I did not get around to making it publicly available yet but could
> >> do so quickly.
> >>
> >> To answer the general question to the best of my knowledge:
> >> There is no such thing as a general XML reader built into the UIMA
> >> framework. For all non-trivial formats, a specific reader is
> >> necessary. This also holds true with regard to the employed type
> >> system.
> >> That being said, there are UIMA readers that try to serve as a
> >> general XML reading facility, e.g. the “XML Reader” from our lab
> >> (JULIELab,
> >> https://github.com/JULIELab/jcore-base/tree/master/jcore-xml-reader).
> >> However, in my experience XML inputs come in a lot of different
> >> forms that are often not suited to a generic approach, which is why
> >> I wrote quite a few UIMA readers for specific XML formats in the
> >> past.
> >>
> >> Hope that helps,
> >>
> >> Erik
> >>
> >>> On 20. Feb 2019, at 01:13, Bonnie MacKellar <[email protected]> wrote:
> >>>
> >>> This is probably a very naive question, but I can't seem to find
> >>> anything about this. I currently have a lot of XML files (clinical
> >>> trial descriptions). My current workflow is to run a preprocessor
> >>> that parses the XML and generates text files in a simple format. I
> >>> then run these files in a UIMA pipeline, using FileCollectionReader
> >>> to load the text files, RUTA to parse the simple format, the
> >>> Metamap annotator to do some UMLS annotations, and finally I have a
> >>> writer that generates RDF triples from the UIMA annotations and
> >>> loads the triples into a database. This has worked but is clunky,
> >>> especially the preprocessing. I feel like there has to be a better
> >>> way. Is there any support for reading XML files or do I need to
> >>> write my own CollectionReader? Are there any other tools within
> >>> UIMA for handling XML text?
> >>>
> >>> thanks,
> >>> Bonnie MacKellar
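
For anyone reading this thread with the same question: the XML-to-text step that a format-specific reader needs can be sketched with the JDK's own DOM parser. This is only a minimal sketch and not the reader Erik mentions. The class name TrialXmlText and the element names used in the example (clinical_study, brief_title) are illustrative assumptions. Inside a custom CollectionReader, the returned string would typically be passed to the CAS in getNext(), for instance via setDocumentText.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: pulls the text of selected elements out of an XML string.
public class TrialXmlText {

    /** Returns the trimmed text of every element named tagName, one per line. */
    public static String extract(String xml, String tagName) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Reject DOCTYPE declarations so untrusted files cannot pull in external entities.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        NodeList nodes = doc.getElementsByTagName(tagName);
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < nodes.getLength(); i++) {
            if (i > 0) {
                text.append('\n');
            }
            text.append(nodes.item(i).getTextContent().trim());
        }
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Element names here are assumptions chosen for illustration only.
        String xml = "<clinical_study><brief_title>Example trial</brief_title></clinical_study>";
        System.out.println(extract(xml, "brief_title"));
    }
}
```

Doing the extraction inside the reader would remove the separate preprocessing pass described in the original question, at the cost of tying the reader to one XML format.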
