Re: Using REST api with UIMA
I am unfamiliar with this project. I took a quick look and see that:

a) It is old: it has not been worked on for more than 3 years.
b) The serialization technique is to serialize using UIMA's CAS -> XCAS (an older form of XML serialization), followed by a transform from the XML to JSON.

I didn't look, but I'd guess that the JSON -> CAS path does a JSON -> XCAS conversion, followed by a deserialization of the XCAS format by UIMA. I think this would be relatively inefficient.

I would guess this is unlikely to be included in UIMA.

-Marshall

On 5/4/2017 10:43 AM, Luca Toldo wrote:
> I've found the following project
> https://github.com/windj007/uima-serialization/ and I am interested in your
> opinion about its level of "maturity" / likelihood of inclusion in the UIMA
> releases.
Re: Using REST api with UIMA
Thank you, Marshall, for your fast and authoritative reply.

Deserialization from JSON to CAS is really important, since it will "bridge" the UIMA community with the microservice REST world.

I've found the following project, https://github.com/windj007/uima-serialization/, and I am interested in your opinion about its level of "maturity" and the likelihood of its inclusion in the UIMA releases.

Thanks
Luca
Re: Using REST api with UIMA
The core UIMA support has not (yet) implemented a deserializer from this JSON format back into CAS form.

The main idea behind the CAS -> JSON conversion was to provide the CAS info to JSON consumers.

We have multiple other serializations (CAS -> XMI, etc.) that are designed for "transport" and include deserializers.

Of course, it is quite possible to create an addition to UIMA which can deserialize the JSON format(s); that just hasn't been done yet. Contributions welcome!

-Marshall
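As an illustration of the "transport" serializations mentioned in this reply, here is a minimal sketch of a CAS -> XMI -> CAS round trip. It assumes uimaj-core and uimaFIT are on the classpath (as in the original example) and that both sides use the same type system. The class name XmiRoundTrip and its helper methods are made up for this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

import org.apache.uima.cas.CAS;
import org.apache.uima.cas.impl.XmiCasDeserializer;
import org.apache.uima.cas.impl.XmiCasSerializer;
import org.apache.uima.fit.factory.JCasFactory;

// Hypothetical helper class, not part of UIMA.
public class XmiRoundTrip {

    // Serialize a CAS to XMI bytes, e.g. to use as the body of a REST response.
    static byte[] toXmi(CAS cas) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        XmiCasSerializer.serialize(cas, out);
        return out.toByteArray();
    }

    // Deserialize XMI bytes into a freshly created CAS.
    // Assumption: the receiving side can build a CAS with a compatible type system.
    static CAS fromXmi(byte[] xmi) throws Exception {
        CAS target = JCasFactory.createJCas().getCas();
        XmiCasDeserializer.deserialize(new ByteArrayInputStream(xmi), target);
        return target;
    }

    // Convenience wrapper: put text into a CAS, round-trip it, read the text back.
    static String roundTripText(String text) throws Exception {
        CAS source = JCasFactory.createJCas().getCas();
        source.setDocumentText(text);
        return fromXmi(toXmi(source)).getDocumentText();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTripText(
                "Lorem ipsum incididunt ut labore et dolore magna aliqua"));
    }
}
```

A REST service could return the XMI bytes with an XML content type where a JSON deserializer is not available. Whether that fits a given client is a separate design question.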
Re: Deserialization of CAS object back from JSON
I've been trying Roman's (windj007) https://github.com/windj007/uima-serialization/ and Jackson serialisation/deserialisation (http://www.baeldung.com/jackson-deserialization).
Re: Deserialization of CAS object back from JSON
Hi Luca,

On 04.05.2017, at 10:00, Luca Toldo wrote:
> Help would be appreciated how to convert back the serialized JSON into a CAS
> object.
> Unfortunately, there is no org.apache.uima.json.JsonCasDeSerialize...

That's unfortunately correct: the JSON support in UIMA presently only supports serialization, but not deserialization.

We have support for serializing to/from XML (XMI) and for various binary formats.

For sending data across a network connection, I would personally choose a compressed binary format (COMPRESSED_TSI).

I would suggest you have a look at CasIOUtils and SerialFormat present in recent UIMA releases.

Cheers,

-- Richard
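To make the CasIOUtils suggestion concrete, here is a rough sketch of saving a CAS in the COMPRESSED_TSI format named above and loading it back. It assumes a UIMA release that ships org.apache.uima.util.CasIOUtils and the SerialFormat.COMPRESSED_TSI constant; which releases do is worth checking against the Javadoc. It also assumes uimaFIT is available for JCasFactory. The class name BinaryRoundTrip and its methods are invented for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

import org.apache.uima.cas.CAS;
import org.apache.uima.cas.SerialFormat;
import org.apache.uima.fit.factory.JCasFactory;
import org.apache.uima.util.CasIOUtils;

// Hypothetical helper class, not part of UIMA.
public class BinaryRoundTrip {

    // Write the CAS to a byte array in the requested serial format.
    static byte[] save(CAS cas, SerialFormat format) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        CasIOUtils.save(cas, out, format);
        return out.toByteArray();
    }

    // Load serialized bytes into a fresh CAS; CasIOUtils detects the format itself.
    static CAS load(byte[] data) throws Exception {
        CAS target = JCasFactory.createJCas().getCas();
        CasIOUtils.load(new ByteArrayInputStream(data), target);
        // Re-fetch the default view after loading.
        return target.getView(CAS.NAME_DEFAULT_SOFA);
    }

    // Convenience wrapper: put text into a CAS, round-trip it, read the text back.
    static String roundTripText(String text, SerialFormat format) throws Exception {
        CAS source = JCasFactory.createJCas().getCas();
        source.setDocumentText(text);
        return load(save(source, format)).getDocumentText();
    }

    public static void main(String[] args) throws Exception {
        String note = "Lorem ipsum incididunt ut labore et dolore magna aliqua";
        System.out.println(roundTripText(note, SerialFormat.COMPRESSED_TSI));
    }
}
```

Because the format is a parameter, the same helper can be tried with SerialFormat.XMI when a text-based payload is easier to debug than a binary one.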
Using REST api with UIMA
The following Java code (inspired by http://stackoverflow.com/questions/40838999/getting-output-in-json-format-in-uima):

    import java.io.*;
    import org.apache.uima.fit.factory.JCasFactory;
    import org.apache.uima.jcas.JCas;
    import org.apache.uima.cas.CAS;
    import org.apache.uima.json.JsonCasSerializer;

    public class test {
        public static void main(String[] args) throws IOException {
            try {
                String note = "Lorem ipsum incididunt ut labore et dolore magna aliqua";
                // Create an empty JCas and set its document text
                JCas jcas = JCasFactory.createJCas();
                jcas.setDocumentText(note);
                // Serialize the CAS as pretty-printed JSON
                JsonCasSerializer jcs = new JsonCasSerializer();
                jcs.setPrettyPrint(true);
                StringWriter sw = new StringWriter();
                CAS cas = jcas.getCas();
                jcs.serialize(cas, sw);
                System.out.println(sw.toString());
            } catch (Exception ex) {
                // Report failures instead of silently swallowing them
                ex.printStackTrace();
            }
        }
    }

delivers properly formatted JSON CAS:

    {"_context" : {
        "_types" : {
          "DocumentAnnotation" : {"_id" : "uima.tcas.DocumentAnnotation",
            "_feature_types" : {"sofa" : "_ref" } },
          "Sofa" : {"_id" : "uima.cas.Sofa",
            "_feature_types" : {"sofaArray" : "_ref" } },
          "Annotation" : {"_id" : "uima.tcas.Annotation",
            "_feature_types" : {"sofa" : "_ref" },
            "_subtypes" : ["DocumentAnnotation" ] },
          "AnnotationBase" : {"_id" : "uima.cas.AnnotationBase",
            "_feature_types" : {"sofa" : "_ref" },
            "_subtypes" : ["Annotation" ] },
          "TOP" : {"_id" : "uima.cas.TOP",
            "_subtypes" : ["AnnotationBase", "Sofa" ] } } },
      "_views" : {
        "_InitialView" : {
          "DocumentAnnotation" : [
            {"sofa" : 1, "begin" : 0, "end" : 55, "language" : "x-unspecified" } ] } },
      "_referenced_fss" : {
        "1" : {"_type" : "Sofa", "sofaNum" : 1, "sofaID" : "_InitialView",
          "mimeType" : "text", "sofaString" : "Lorem ipsum incididunt ut labore et dolore magna aliqua" } } }

How can I deserialize that back into a CAS object?
Deserialization of CAS object back from JSON
Inspired by https://goo.gl/zg9Bs3, the following code:

    import java.io.*;
    import org.apache.uima.fit.factory.JCasFactory;
    import org.apache.uima.jcas.JCas;
    import org.apache.uima.cas.CAS;
    import org.apache.uima.json.JsonCasSerializer;

    public class test {
        public static void main(String[] args) throws IOException {
            try {
                String note = "Lorem ipsum incididunt ut labore et dolore magna aliqua";
                // Create an empty JCas and set its document text
                JCas jcas = JCasFactory.createJCas();
                jcas.setDocumentText(note);
                // Serialize the CAS as compact JSON
                JsonCasSerializer jcs = new JsonCasSerializer();
                StringWriter sw = new StringWriter();
                CAS cas = jcas.getCas();
                jcs.serialize(cas, sw);
                System.out.println(sw.toString());
            } catch (Exception ex) {
                // Report failures instead of silently swallowing them
                ex.printStackTrace();
            }
        }
    }

compiles fine with

    javac -cp .:uima-core.jar:uimafit-core-2.3.0.jar:uimaj-core-2.9.0.jar:uimaj-json.jar test.java

and, when executed, delivers a properly formatted JSON serialization:

    {"_context":{"_types":{"DocumentAnnotation":{"_id":"uima.tcas.DocumentAnnotation","_feature_types":{"sofa":"_ref"}},"Sofa":{"_id":"uima.cas.Sofa","_feature_types":{"sofaArray":"_ref"}},"Annotation":{"_id":"uima.tcas.Annotation","_feature_types":{"sofa":"_ref"},"_subtypes":["DocumentAnnotation"]},"AnnotationBase":{"_id":"uima.cas.AnnotationBase","_feature_types":{"sofa":"_ref"},"_subtypes":["Annotation"]},"TOP":{"_id":"uima.cas.TOP","_subtypes":["AnnotationBase","Sofa"]}}},"_views":{"_InitialView":{"DocumentAnnotation":[{"sofa":1,"begin":0,"end":55,"language":"x-unspecified"}]}},"_referenced_fss":{"1":{"_type":"Sofa","sofaNum":1,"sofaID":"_InitialView","mimeType":"text","sofaString":"Lorem ipsum incididunt ut labore et dolore magna aliqua"}}}

Help would be appreciated on how to convert the serialized JSON back into a CAS object. Unfortunately, there is no org.apache.uima.json.JsonCasDeSerialize...

Thanks.