Hi Diman,
Indeed, the problem is XMI deserialization. It looks like ByteArray
features must be declared with
<multipleReferencesAllowed>true</multipleReferencesAllowed>
in the type system descriptor. At first glance this is missing from our
documentation; I will double-check that.
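For reference, a feature declaration carrying that flag might look roughly
like the sketch below. The feature name "hypernyms" is taken from your
earlier mail; the enclosing type description is omitted, and the empty
description element is just a placeholder.

```xml
<featureDescription>
  <name>hypernyms</name>
  <description/>
  <rangeTypeName>uima.cas.ByteArray</rangeTypeName>
  <!-- assumption: this is the only change needed for the XMI round trip -->
  <multipleReferencesAllowed>true</multipleReferencesAllowed>
</featureDescription>
```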
Eddie
On Fri, Sep 17, 2010 at 10:06 AM, Diman Karagiozov <[email protected]> wrote:
> Hi Eddie,
>
> thanks for the fast reply!
>
> What I've tried is:
>
> byte[] bytes = new byte[ ( ( ByteArrayFS ) fs ).size() ];
> for ( int i = 0; i < bytes.length; i++ ) {
>     bytes[i] = ( ( ByteArrayFS ) fs ).get( i );
> }
> // byte[] bytes = ( ( ByteArrayFS ) fs ).toArray();
>
> System.out.println( fs + "; " + bytes.length + "; " + Arrays.toString( bytes ) );
>
> This is the output on stdout:
> ByteArray
> Array length: 2
> Array elements: [16, 41]
> ; 2; [16, 41]
>
> Unfortunately, the bytes length is only 2...
>
> Is it possible that there is something wrong with the deserialization of the
> ByteArray feature? I've checked the UIMA source code, and it appears that
> ByteArray is handled differently from the other array features...
>
> greetings
> Diman
>
> On 09/17/2010 04:52 PM, Eddie Epstein wrote:
>>
>> Hi,
>>
>> The implementation of ByteArrayFS.toArray() doesn't look so good. As a
>> workaround, please try accessing the byte values one at a time using
>> the get(int i) method.
>>
>> By the way, an easier way to get the feature should be
>> Feature feature = anno.getType().getFeatureByBaseName("hypernyms");
>>
>> Eddie
>>
>> On Fri, Sep 17, 2010 at 2:49 AM, Diman Karagiozov<[email protected]>
>> wrote:
>>>
>>> Hi,
>>>
>>> I am developing an NLP tool based on the latest UIMA version (2.3.0
>>> incubating).
>>> Basically, the tool uses an aggregate engine with several primitive
>>> engines.
>>>
>>> Currently, I am having a problem with the handling of the ByteArray value
>>> of one of the annotation features.
>>> Here is how I am setting the value of the feature:
>>>
>>> final ByteArray ba = new ByteArray( cas, wordHypernymsAsBytes.length );
>>> ba.copyFromArray( wordHypernymsAsBytes, 0, 0, wordHypernymsAsBytes.length );
>>> token.setHypernyms( ba );
>>>
>>> And here is how I am reading the ByteArray features (there will be more,
>>> so I need a generic way of reading them):
>>> List< Feature> annoFeatures = anno.getCAS().getJCas().getTypeSystem().getType( anno.getType().getName() ).getFeatures();
>>> Iterator< Feature> featIter = annoFeatures.iterator();
>>> ....
>>> while ( featIter.hasNext() ) {
>>>
>>>     Feature feature = featIter.next();
>>>     ...
>>>     if ( feature.getRange().getName().equalsIgnoreCase( "uima.cas.ByteArray" ) ) {
>>>
>>>         FeatureStructure fs = anno.getFeatureValue( feature );
>>>
>>>         if ( fs != null ) {
>>>             byte[] bytes = ( ( ByteArrayFS ) fs ).toArray();
>>>
>>> ----> here is the problem actually. The bytes array contains a single byte
>>> (length of the array is 1). The "setter" above is setting something much
>>> longer.
>>>
>>>         }
>>>     }
>>> }
>>>
>>> My question is: how should I read and write ByteArray features?
>>>
>>> Attached you can find the serialized CAS file as well...
>>>
>>> greetings
>>> Diman
>>>
>
>