The byte array that the TokenNameFinderCrossValidator constructor expects
is the feature generator configuration as XML, for example (borrowed from
[1]):
<generators>
  <cache>
    <generators>
      <window prevLength = "2" nextLength = "2">
        <tokenclass/>
      </window>
      <window prevLength = "2" nextLength = "2">
        <token/>
      </window>
      <definition/>
      <prevmap/>
      <bigram/>
      <sentence begin="true" end="false"/>
      <window prevLength = "2" nextLength = "2">
        <brownclustertoken dict="brownCluster" />
      </window>
      <brownclustertokenclass dict="brownCluster" />
      <brownclusterbigram dict="brownCluster" />
      <wordcluster dict="word2vec.cluster" />
      <wordcluster dict="clark.cluster" />
    </generators>
  </cache>
</generators>
As an example, loadDefaultFeatureGeneratorBytes() in TokenNameFinderFactory
shows how the default feature generator descriptor is read from XML into a
byte array when no feature generators are provided.
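
If it helps, here is a minimal sketch of getting such a descriptor into a
byte array using only the JDK. The class name FeatureGeneratorBytes, the
helper methods, and the trimmed-down descriptor are all made up for
illustration and are not part of OpenNLP. I am assuming the resulting bytes
are what you would hand to the cross validator as its feature generator
argument; please check the constructor signature in your OpenNLP version.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class FeatureGeneratorBytes {

    // Hypothetical, shortened descriptor used only for this illustration.
    public static final String DESCRIPTOR_XML =
          "<generators>\n"
        + "  <cache>\n"
        + "    <generators>\n"
        + "      <window prevLength=\"2\" nextLength=\"2\">\n"
        + "        <tokenclass/>\n"
        + "      </window>\n"
        + "      <definition/>\n"
        + "      <prevmap/>\n"
        + "      <bigram/>\n"
        + "    </generators>\n"
        + "  </cache>\n"
        + "</generators>\n";

    // Reads a stream fully into a byte array. In practice the stream could
    // come from a file or a classpath resource holding the descriptor.
    public static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
        }
        return out.toByteArray();
    }

    // Sanity check: parses the bytes as XML and returns the root element
    // name, so a malformed descriptor fails here rather than during training.
    public static String rootElementName(byte[] xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml));
        return doc.getDocumentElement().getNodeName();
    }

    public static void main(String[] args) throws Exception {
        InputStream in = new ByteArrayInputStream(
            DESCRIPTOR_XML.getBytes(StandardCharsets.UTF_8));
        byte[] featureGeneratorBytes = readAll(in);

        System.out.println("root element: " + rootElementName(featureGeneratorBytes));
        System.out.println("descriptor size in bytes: " + featureGeneratorBytes.length);

        // Assumption: featureGeneratorBytes would now be passed to the
        // TokenNameFinderCrossValidator constructor in place of a
        // CachedFeatureGenerator object.
    }
}
```

The point is that the generators are described in XML rather than built as
Java objects, so nothing needs to be serializable.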
Jeff
[1]
https://opennlp.apache.org/documentation/1.7.0/manual/opennlp.html#tools.namefind.training.featuregen
On Fri, Apr 21, 2017 at 9:17 AM, Saurabh Jain <[email protected]>
wrote:
> Hi All
>
> I have defined feature generator for OpenNLP name finder in java source
> code as an object of *CachedFeatureGenerator *. I have to cross validate
> NameFinder and whatever api I am able to find in code accepts feature
> generators as byte array. Problem is *CachedFeatureGenerator *is not
> serializable (as far as I came to know). Is there any api in OpenNLP
> NameFinder for cross validation which accept *CachedFeatureGenerator *as
> feature generator or is there any other way ?
>
> --
> *Thanks & Regards*
>
>
> *Saurabh Jain *
> *AI Developer*
>
> *Active Intelligence *
>
> *"*
> *To do a thing yesterday was the best time . Second best time is today .” *
>