Ok, thanks, I may look into that object.  Glad to know there is an elegant
way to do it.


On Wed, Nov 27, 2013 at 12:33 AM, Jörn Kottmann <[email protected]> wrote:

> The NameSampleTypeFilter can be used in the training data stream to filter
> out NameSample objects which don't have a certain type.
>
> Jörn
>
>
> On 11/26/2013 11:43 AM, Jörn Kottmann wrote:
>
>> Hello,
>>
>> the command line trainer util has an option to only use a specified set
>> of types.
>>
>> I am not sure if we ever made this available as part of the API, but it
>> should be really easy to do.
>>
>> Jörn
>>
>>
>> On 11/21/2013 08:43 PM, Walrus theCat wrote:
>>
>>> Hi,
>>>
>>> I'm using the training API, and I want to create a bunch of different
>>> models.  My training data has various entities in it. Unsurprisingly (at
>>> least to the people on this list), when I train a model on my training
>>> data, passing it a name for the entity I'm trying to train, it creates a
>>> model that can detect all the entities in the input data.  This is the
>>> line of code I'm using to do the training, pardon my Scala:
>>>
>>>        NameFinderME.train("en", entityName, sampleStream,
>>>              TrainingParameters.defaultParams(),
>>>              null: Array[Byte], Collections.emptyMap[String, Object]());
>>>
>>> The docs say this is how it will behave:
>>>
>>> "A training file can contain multiple types. If the training file
>>> contains multiple types the created model will also be able to detect
>>> these multiple types. For now its recommended to only train single type
>>> models, since multi type support is still experimental."
>>>
>>> What I was hoping would happen is that the trainer would just ignore the
>>> other entities not matching entityName, and just train the model for
>>> entityName.  This seems like useful functionality, as the user could just
>>> do multiple passes over the training data, training for different entities.
>>>
>>> I guess my question is, can OpenNLP already do what I'm trying to do?
>>> Would it be easier to script new data for each model I want to train
>>> (ugh) or modify OpenNLP to be able to do this?
>>>
>>> Cheers
>>>
>>>
>>
>
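
For anyone finding this thread later, the NameSampleTypeFilter suggestion
above could be sketched roughly as follows. This is only a sketch, not a
tested recipe: it assumes the same 1.5.x-era train(...) signature used in the
original question, and the names SingleTypeTraining, onlyType and
trainSingleType are made up for illustration. As far as I can tell, the
filter keeps every sentence but drops the name annotations whose type is not
in the given set.

```scala
import java.util.Collections

import opennlp.tools.namefind.{NameFinderME, NameSample, NameSampleTypeFilter, TokenNameFinderModel}
import opennlp.tools.util.{ObjectStream, TrainingParameters}

// Hypothetical helper object; only the OpenNLP classes are real API.
object SingleTypeTraining {

  // Wrap the sample stream so that only names of type `entityName` remain.
  def onlyType(entityName: String,
               samples: ObjectStream[NameSample]): ObjectStream[NameSample] =
    new NameSampleTypeFilter(Array(entityName), samples)

  // Same train(...) call as in the question, but fed the filtered stream.
  def trainSingleType(entityName: String,
                      samples: ObjectStream[NameSample]): TokenNameFinderModel =
    NameFinderME.train("en", entityName, onlyType(entityName, samples),
      TrainingParameters.defaultParams(),
      null: Array[Byte], Collections.emptyMap[String, Object]())
}
```

Training consumes the stream, so each pass for a different entity would need
a fresh (or reset) sample stream.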
