Re: Using REST api with UIMA

2017-05-04 Thread Marshall Schor
I am unfamiliar with this project.

I took a quick look and see that:

a) it's old: it has not been worked on for more than 3 years

b) The serialization technique is to serialize using UIMA's CAS -> XCAS (an
older form of XML serialization), followed by a transform from the XML to JSON.
I didn't look, but I'd guess that the JSON -> CAS path does a JSON -> XCAS
transform, followed by a deserialization of the XCAS format by UIMA.

This would be relatively inefficient, I think. I would guess it is
unlikely to be included in UIMA.

-Marshall


On 5/4/2017 10:43 AM, Luca Toldo wrote:
> Thank you, Marshall, for your fast and authoritative reply.
>
> The deserialization from JSON to CAS is really important, since it will
> "bridge" the UIMA community with the microservice REST world.
>
> I've found the following project,
> https://github.com/windj007/uima-serialization/, and I am interested in your
> opinion about its level of "maturity" and the likelihood of its inclusion in
> the UIMA releases.
>
> Thanks
> Luca



Re: Using REST api with UIMA

2017-05-04 Thread Luca Toldo
Thank you, Marshall, for your fast and authoritative reply.

The deserialization from JSON to CAS is really important, since it will
"bridge" the UIMA community with the microservice REST world.

I've found the following project,
https://github.com/windj007/uima-serialization/, and I am interested in your
opinion about its level of "maturity" and the likelihood of its inclusion in
the UIMA releases.

Thanks
Luca


> On 04.05.2017 at 16:37, Marshall Schor wrote:
>
> Core UIMA has not (yet) implemented a deserializer that turns this JSON back
> into CAS form.
>
> The main idea behind CAS -> JSON conversion was to provide the CAS info to
> JSON consumers.
>
> We have multiple other serializations (CAS -> XMI, etc.) that are designed for
> "transport" and include deserializers.
>
> Of course, it is quite possible to create an addition to UIMA which can
> deserialize the JSON format(s) - that just hasn't yet been done. Contributions
> welcome!
> 
> -Marshall



Re: Using REST api with UIMA

2017-05-04 Thread Marshall Schor
Core UIMA has not (yet) implemented a deserializer that turns this JSON back
into CAS form.

The main idea behind CAS -> JSON conversion was to provide the CAS info to JSON
consumers.

We have multiple other serializations (CAS -> XMI, etc.) that are designed for
"transport" and include deserializers.

Of course, it is quite possible to create an addition to UIMA which can
deserialize the JSON format(s) - that just hasn't yet been done. Contributions
welcome!

-Marshall
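
For readers who need a round trip today, the XMI path mentioned above does come with a deserializer. What follows is only a minimal sketch, assuming uimaFIT's JCasFactory (as used in the original question) and uimaj-core are on the classpath. The class name XmiRoundTrip and the helpers toXmi/fromXmi are made up for illustration and are not part of UIMA.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

import org.apache.uima.cas.CAS;
import org.apache.uima.cas.impl.XmiCasDeserializer;
import org.apache.uima.cas.impl.XmiCasSerializer;
import org.apache.uima.fit.factory.JCasFactory;

// Hypothetical helper class: round-trips a CAS through XMI in memory.
public class XmiRoundTrip {

    // Serialize the CAS to XMI bytes (e.g. for use as an HTTP request body).
    static byte[] toXmi(CAS cas) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        XmiCasSerializer.serialize(cas, out);
        return out.toByteArray();
    }

    // Deserialize XMI bytes into a fresh CAS. Assumption: the receiving side
    // builds its CAS with the same type system as the sender.
    static CAS fromXmi(byte[] xmi) throws Exception {
        CAS target = JCasFactory.createJCas().getCas();
        XmiCasDeserializer.deserialize(new ByteArrayInputStream(xmi), target);
        return target;
    }

    public static void main(String[] args) throws Exception {
        CAS source = JCasFactory.createJCas().getCas();
        source.setDocumentText("Lorem ipsum incididunt ut labore et dolore magna aliqua");

        CAS restored = fromXmi(toXmi(source));
        System.out.println(restored.getDocumentText());
    }
}
```

XMI is XML rather than JSON, so this does not answer the JSON question itself. It only shows that a text-based transport with a matching deserializer is available.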





Re: Deserialization of CAS object back from JSON

2017-05-04 Thread Luca Toldo
I've been trying Roman's (windj007)
https://github.com/windj007/uima-serialization/
and Jackson serialisation/deserialisation
(http://www.baeldung.com/jackson-deserialization).


On 2017-05-04 12:15 (+0200), Richard Eckart de Castilho wrote:
> Hi Luca,
> 
> On 04.05.2017, at 10:00, Luca Toldo  wrote:
> > 
> > Help would be appreciated on how to convert the serialized JSON back into
> > a CAS object.
> > Unfortunately, there is no org.apache.uima.json.JsonCasDeSerialize...
> 
> That's unfortunately correct: the JSON support in UIMA presently only 
> supports serialization, but not deserialization.
> 
> We have support for serializing to/from XML (XMI) and for various binary 
> formats.
> 
> For sending data across a network connection, I would personally choose a 
> compressed binary format (COMPRESSED_TSI).
> 
> I would suggest you have a look at CasIOUtils and SerialFormat present in 
> recent UIMA releases.
> 
> Cheers,
> 
> -- Richard

Re: Deserialization of CAS object back from JSON

2017-05-04 Thread Richard Eckart de Castilho
Hi Luca,

On 04.05.2017, at 10:00, Luca Toldo  wrote:
> 
> Help would be appreciated on how to convert the serialized JSON back into a
> CAS object.
> Unfortunately, there is no org.apache.uima.json.JsonCasDeSerialize...

That's unfortunately correct: the JSON support in UIMA presently only supports 
serialization, but not deserialization.

We have support for serializing to/from XML (XMI) and for various binary 
formats.

For sending data across a network connection, I would personally choose a 
compressed binary format (COMPRESSED_TSI).

I would suggest you have a look at CasIOUtils and SerialFormat present in 
recent UIMA releases.

Cheers,

-- Richard
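
The CasIOUtils / SerialFormat suggestion above could look roughly like the following. This is a hedged sketch rather than a definitive recipe: it assumes UIMA 2.9.0 or later (where CasIOUtils is available) plus uimaFIT for creating the CAS, and the class name CasRoundTrip and the helpers toBytes/fromBytes are hypothetical names chosen for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

import org.apache.uima.cas.CAS;
import org.apache.uima.cas.SerialFormat;
import org.apache.uima.fit.factory.JCasFactory;
import org.apache.uima.util.CasIOUtils;

// Hypothetical helper class: round-trips a CAS through the compressed
// binary format that embeds type system and index information (TSI).
public class CasRoundTrip {

    // Serialize the CAS into a byte array, e.g. to send over a network connection.
    static byte[] toBytes(CAS cas) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        CasIOUtils.save(cas, out, SerialFormat.COMPRESSED_TSI);
        return out.toByteArray();
    }

    // Load the bytes into a fresh CAS; CasIOUtils.load detects the format itself.
    static CAS fromBytes(byte[] payload) throws Exception {
        CAS target = JCasFactory.createJCas().getCas();
        CasIOUtils.load(new ByteArrayInputStream(payload), target);
        return target;
    }

    public static void main(String[] args) throws Exception {
        CAS source = JCasFactory.createJCas().getCas();
        source.setDocumentText("Lorem ipsum incididunt ut labore et dolore magna aliqua");

        CAS restored = fromBytes(toBytes(source));
        System.out.println(restored.getDocumentText());
    }
}
```

Swapping SerialFormat.COMPRESSED_TSI for SerialFormat.XMI in toBytes should give a text payload instead of a binary one, at the cost of size.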

Using REST api with UIMA

2017-05-04 Thread Luca Toldo
The following Java code (inspired by
http://stackoverflow.com/questions/40838999/getting-output-in-json-format-in-uima)

import java.io.*;
import org.apache.uima.fit.factory.JCasFactory;
import org.apache.uima.jcas.JCas;
import org.apache.uima.cas.CAS;
import org.apache.uima.json.JsonCasSerializer;

public class test {
    public static void main(String[] args) throws IOException {
        try {
            String note = "Lorem ipsum incididunt ut labore et dolore magna aliqua";
            // Create an empty JCas and set its document text
            JCas jcas = JCasFactory.createJCas();
            jcas.setDocumentText(note);
            // Serialize the CAS to pretty-printed JSON
            JsonCasSerializer jcs = new JsonCasSerializer();
            jcs.setPrettyPrint(true);
            StringWriter sw = new StringWriter();
            CAS cas = jcas.getCas();
            jcs.serialize(cas, sw);
            System.out.println(sw.toString());
        } catch (Exception ex) {
            // Report failures instead of silently swallowing them
            ex.printStackTrace();
        }
    }
}


delivers properly formatted JSON CAS:

{"_context" : {
    "_types" : {
      "DocumentAnnotation" : {"_id" : "uima.tcas.DocumentAnnotation",
        "_feature_types" : {"sofa" : "_ref" } },
      "Sofa" : {"_id" : "uima.cas.Sofa",
        "_feature_types" : {"sofaArray" : "_ref" } },
      "Annotation" : {"_id" : "uima.tcas.Annotation",
        "_feature_types" : {"sofa" : "_ref" },
        "_subtypes" : ["DocumentAnnotation" ] },
      "AnnotationBase" : {"_id" : "uima.cas.AnnotationBase",
        "_feature_types" : {"sofa" : "_ref" },
        "_subtypes" : ["Annotation" ] },
      "TOP" : {"_id" : "uima.cas.TOP",
        "_subtypes" : ["AnnotationBase",  "Sofa" ] } } },
  "_views" : {
    "_InitialView" : {
      "DocumentAnnotation" : [
        {"sofa" : 1,  "begin" : 0,  "end" : 55,  "language" : "x-unspecified" } ] } },
  "_referenced_fss" : {
    "1" : {"_type" : "Sofa",  "sofaNum" : 1,  "sofaID" : "_InitialView",
      "mimeType" : "text",  "sofaString" : "Lorem ipsum incididunt ut labore et dolore magna aliqua" } } }

How can I deserialize that back into a CAS object?


Deserialization of CAS object back from JSON

2017-05-04 Thread Luca Toldo
Inspired by https://goo.gl/zg9Bs3, the following code

import java.io.*;
import org.apache.uima.fit.factory.JCasFactory;
import org.apache.uima.jcas.JCas;
import org.apache.uima.cas.CAS;
import org.apache.uima.json.JsonCasSerializer;

public class test {
    public static void main(String[] args) throws IOException {
        try {
            String note = "Lorem ipsum incididunt ut labore et dolore magna aliqua";
            // Create an empty JCas and set its document text
            JCas jcas = JCasFactory.createJCas();
            jcas.setDocumentText(note);
            // Serialize the CAS to (compact) JSON
            JsonCasSerializer jcs = new JsonCasSerializer();
            StringWriter sw = new StringWriter();
            CAS cas = jcas.getCas();
            jcs.serialize(cas, sw);
            System.out.println(sw.toString());
        } catch (Exception ex) {
            // Report failures instead of silently swallowing them
            ex.printStackTrace();
        }
    }
}

compiles fine with

javac -cp .:uima-core.jar:uimafit-core-2.3.0.jar:uimaj-core-2.9.0.jar:uimaj-json.jar test.java

and, when executed, delivers a properly formatted JSON serialization:

{"_context":{"_types":{"DocumentAnnotation":{"_id":"uima.tcas.DocumentAnnotation","_feature_types":{"sofa":"_ref"}},"Sofa":{"_id":"uima.cas.Sofa","_feature_types":{"sofaArray":"_ref"}},"Annotation":{"_id":"uima.tcas.Annotation","_feature_types":{"sofa":"_ref"},"_subtypes":["DocumentAnnotation"]},"AnnotationBase":{"_id":"uima.cas.AnnotationBase","_feature_types":{"sofa":"_ref"},"_subtypes":["Annotation"]},"TOP":{"_id":"uima.cas.TOP","_subtypes":["AnnotationBase","Sofa"]}}},"_views":{"_InitialView":{"DocumentAnnotation":[{"sofa":1,"begin":0,"end":55,"language":"x-unspecified"}]}},"_referenced_fss":{"1":{"_type":"Sofa","sofaNum":1,"sofaID":"_InitialView","mimeType":"text","sofaString":"Lorem ipsum incididunt ut labore et dolore magna aliqua"}}}

Help would be appreciated on how to convert the serialized JSON back into a CAS
object.
Unfortunately, there is no org.apache.uima.json.JsonCasDeSerialize...

Thanks.