[ 
https://issues.apache.org/jira/browse/UIMA-5041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin De Boe updated UIMA-5041:
----------------------------------
    Description: 
Our type system includes a type named "com.intersys.uima.annotation.iknow.TOP", 
which inherits directly from "uima.cas.TOP" and then has a number of subtypes 
specific to our AE. When serializing this through the JsonCasSerializer, it 
generates the shortname TOP twice:

{"_context": 
  {"_types": [
    ...
    "TOP": {"_id":"com.intersys.uima.annotation.iknow.TOP",
                 "_subtypes":["Entity","ProximityScore"]},
    "TOP": {"_id":"uima.cas.TOP",
                 "_subtypes":["TOP","AnnotationBase","ArrayBase","Sofa"]},
    ...]
  }
}

While we can work around this by renaming our top type, the documentation 
explicitly states this shouldn't pose a problem and shortnames would be 
de-duplicated automatically:
https://uima.apache.org/d/uimaj-2.8.1/references.html#ugr.ref.json.overview 
Section 9.2.2:
In the _types section, the key (e.g. "Sofa" or 
"A_Typical_User_or_built_in_Type") is the "short" name for the type used in the 
serialization. It is either just the last segment of the full type name (e.g. 
for the type x.y.z.TypeName, it's TypeName), or, if name would collide with 
another type name if just the last segment was used (example: 
some.package.cname.Foo, and some.other.package.cname.Foo), then the key is made 
up of the next-to-last segment, with an optional suffixed incrementing integer 
in case of collisions on that name, a colon (:) and then the last name.
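
For reference, here is a minimal sketch of how I read the rule quoted above. 
It is not the UIMA implementation: the class and method names are hypothetical, 
and the exact numbering of the integer suffix (1, 2, ... for the second and 
later collisions) is my assumption, since the documentation does not spell it 
out.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of UIMA: derives "_types" keys from full type names.
class ShortNameSketch {

    // Last segment of a dotted name, e.g. "x.y.z.TypeName" -> "TypeName".
    static String lastSegment(String fullName) {
        return fullName.substring(fullName.lastIndexOf('.') + 1);
    }

    // Next-to-last segment, e.g. "uima.cas.TOP" -> "cas"; empty if there is none.
    static String nextToLastSegment(String fullName) {
        int last = fullName.lastIndexOf('.');
        if (last < 0) {
            return "";
        }
        int prev = fullName.lastIndexOf('.', last - 1);
        return fullName.substring(prev + 1, last);
    }

    // Maps each full type name to a key that is unique within the result.
    static Map<String, String> shortNames(List<String> fullNames) {
        // Count how many types share each last segment.
        Map<String, Integer> lastCount = new HashMap<>();
        for (String name : fullNames) {
            lastCount.merge(lastSegment(name), 1, Integer::sum);
        }

        Map<String, String> result = new LinkedHashMap<>();
        Set<String> used = new HashSet<>();
        for (String name : fullNames) {
            String last = lastSegment(name);
            String key;
            if (lastCount.get(last) == 1) {
                // No collision: the last segment alone is the key.
                key = last;
            } else {
                // Collision: qualify with the next-to-last segment and a colon.
                String prefix = nextToLastSegment(name);
                key = prefix + ":" + last;
                // Assumed numbering: append 1, 2, ... until the key is unused.
                for (int i = 1; used.contains(key); i++) {
                    key = prefix + i + ":" + last;
                }
            }
            used.add(key);
            result.put(name, key);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> types = List.of(
                "uima.cas.TOP",
                "com.intersys.uima.annotation.iknow.TOP",
                "uima.cas.Sofa");
        shortNames(types).forEach((full, key) -> System.out.println(key + " <- " + full));
    }
}
```

Under this reading I would have expected keys along the lines of "cas:TOP" and 
"iknow:TOP" for the two types above, rather than "TOP" twice.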

I see there are unit tests checking for this, but maybe the rule is not applied 
here because uima.cas.TOP is sort of a special case? Or because neither 
uima.cas.TOP nor our custom TOP is actually used directly (only subtypes are)? 
So before I go ahead and change our root type name, I'd like to make sure this 
isn't something the framework should have taken care of itself.

  was:
Our type system includes a type named "com.intersys.uima.annotation.iknow.TOP", 
which inherits directly from "uima.cas.TOP" and then has a number of subtypes 
specific to our AE. When serializing this through the JsonCasSerializer, it 
generates the shortname TOP twice:

{"_context": 
  {"_types": [
    ...
    "TOP": {"_id":"com.intersys.uima.annotation.iknow.TOP",
                 "_subtypes":["Entity","ProximityScore"]},
    "TOP": {"_id":"uima.cas.TOP",
                 "_subtypes":["TOP","AnnotationBase","ArrayBase","Sofa"]},
    ...]
  }
}

While we can work around this by renaming our top type, the documentation 
explicitly states this shouldn't pose a problem and shortnames would be 
de-duplicated automatically:
https://uima.apache.org/d/uimaj-2.8.1/references.html#ugr.ref.json.overview 
Section 9.2.2:
In the _types section, the key (e.g. "Sofa" or 
"A_Typical_User_or_built_in_Type") is the "short" name for the type used in the 
serialization. It is either just the last segment of the full type name (e.g. 
for the type x.y.z.TypeName, it's TypeName), or, if name would collide with 
another type name if just the last segment was used (example: 
some.package.cname.Foo, and some.other.package.cname.Foo), then the key is made 
up of the next-to-last segment, with an optional suffixed incrementing integer 
in case of collisions on that name, a colon (:) and then the last name.

I see there are unit tests checking for this, but maybe it's because 
uima.cas.TOP is sort of a special case? Or because neither uima.cas.TOP nor our 
custom TOP is actually used directly (only subtypes are)? 
While I can definitely change our type system to use a different root type 
name, I'd like to make sure this isn't something the framework should have 
taken care of itself.


> JsonCasSerializer creates duplicate shortname
> ---------------------------------------------
>
>                 Key: UIMA-5041
>                 URL: https://issues.apache.org/jira/browse/UIMA-5041
>             Project: UIMA
>          Issue Type: Bug
>          Components: Core Java Framework
>    Affects Versions: 2.8.1SDK
>            Reporter: Benjamin De Boe
>            Priority: Minor
>
> Our type system includes a type named 
> "com.intersys.uima.annotation.iknow.TOP", which inherits directly from 
> "uima.cas.TOP" and then has a number of subtypes specific to our AE. When 
> serializing this through the JsonCasSerializer, it generates the shortname 
> TOP twice:
> {"_context": 
>   {"_types": [
>     ...
>     "TOP": {"_id":"com.intersys.uima.annotation.iknow.TOP",
>                  "_subtypes":["Entity","ProximityScore"]},
>     "TOP": {"_id":"uima.cas.TOP",
>                  "_subtypes":["TOP","AnnotationBase","ArrayBase","Sofa"]},
>     ...]
>   }
> }
> While we can work around this by renaming our top type, the documentation 
> explicitly states this shouldn't pose a problem and shortnames would be 
> de-duplicated automatically:
> https://uima.apache.org/d/uimaj-2.8.1/references.html#ugr.ref.json.overview 
> Section 9.2.2:
> In the _types section, the key (e.g. "Sofa" or 
> "A_Typical_User_or_built_in_Type") is the "short" name for the type used in 
> the serialization. It is either just the last segment of the full type name 
> (e.g. for the type x.y.z.TypeName, it's TypeName), or, if name would collide 
> with another type name if just the last segment was used (example: 
> some.package.cname.Foo, and some.other.package.cname.Foo), then the key is 
> made up of the next-to-last segment, with an optional suffixed incrementing 
> integer in case of collisions on that name, a colon (:) and then the last 
> name.
> I see there are unit tests checking for this, but maybe the rule is not 
> applied here because uima.cas.TOP is sort of a special case? Or because 
> neither uima.cas.TOP nor our custom TOP is actually used directly (only 
> subtypes are)? 
> So before I go ahead and change our root type name, I'd like to make sure 
> this isn't something the framework should have taken care of itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
