Hi Yakov, 

Yes, you can use the POS tagger to tag with whatever categories you choose.

If we take your original example, "The quick brown fox_animal jumps_action
over the lazy dog_animal", make sure you tag all tokens, e.g. "The_NA
quick_NA brown_NA fox_animal jumps_action over_NA the_NA lazy_NA
dog_animal", where NA just means not-applicable. You choose your own
categories. Afterwards you can extract only those words and categories
you are interested in.
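
As a rough sketch of that last extraction step: the snippet below is plain
Java and does not use the OpenNLP API. The class and method names
(TagFilter, extractTagged) are made up for illustration, and it assumes the
tagger output has already been written back in the word_tag format shown
above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of OpenNLP: keeps only tokens whose tag is not "NA".
class TagFilter {

    // Splits each "word_tag" token on its last underscore and drops NA-tagged tokens.
    static List<String[]> extractTagged(String taggedSentence) {
        List<String[]> result = new ArrayList<>();
        for (String token : taggedSentence.trim().split("\\s+")) {
            int sep = token.lastIndexOf('_');
            if (sep <= 0 || sep == token.length() - 1) {
                throw new IllegalArgumentException("Token has no tag: " + token);
            }
            String word = token.substring(0, sep);
            String tag = token.substring(sep + 1);
            if (!"NA".equals(tag)) {
                result.add(new String[] {word, tag});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String sentence = "The_NA quick_NA brown_NA fox_animal jumps_action "
                + "over_NA the_NA lazy_NA dog_animal";
        // Prints one "word -> tag" line per token of interest.
        for (String[] pair : extractTagged(sentence)) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
    }
}
```

Splitting on the last underscore keeps words that themselves contain an
underscore intact, and an untagged token raises an error instead of being
silently skipped.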

Then you'll need to tag some data to train on, about 15 000 examples or
more. In addition, you can supply a POS tagging dictionary, which helps
narrow the search space of possible tags for a token.
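
To illustrate what "narrowing the search space" means here, below is a toy
tag dictionary in plain Java. It is only a sketch of the idea, not OpenNLP's
implementation; the class ToyTagDictionary and its methods are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Illustrative only: a toy tag dictionary, not OpenNLP's implementation.
class ToyTagDictionary {
    private final Set<String> allTags;
    private final Map<String, Set<String>> entries = new HashMap<>();

    ToyTagDictionary(String... allTags) {
        this.allTags = new LinkedHashSet<>(Arrays.asList(allTags));
    }

    // Registers the tags that are allowed for a given token.
    void put(String token, String... tags) {
        entries.put(token, new LinkedHashSet<>(Arrays.asList(tags)));
    }

    // Known tokens are limited to their listed tags; unknown tokens may take any tag.
    Set<String> candidateTags(String token) {
        return entries.getOrDefault(token, allTags);
    }

    public static void main(String[] args) {
        ToyTagDictionary dict = new ToyTagDictionary("NA", "animal", "action");
        dict.put("fox", "animal");
        dict.put("jumps", "action", "NA");
        System.out.println(dict.candidateTags("fox"));   // only the listed tag
        System.out.println(dict.candidateTags("lazy"));  // unknown token: every tag
    }
}
```

A tagger that consults such a dictionary only has to score the listed tags
for a known token, instead of every tag in the tag set.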

You can have the same tags across languages but each language should have its 
own training data and dictionary.

However, I am not sure how successful this approach will be when you only
need partial annotation.

What do you want to use it for? Maybe there are better options...

Svetoslav
________________________________________
From: Yakov Keranchuk <[email protected]>
Sent: 29 August 2013 12:44
To: [email protected]
Subject: Re: category tagging

So I found a simple example in the sources:

WordTagSampleStreamTest.java parses the string "This_x1 is_x2 a_x3 test_x4
sentence_x5 ._x6" using POSSample.

As I understand it, the normal approach has a few steps for each language:
1. collect data for the model
2. create a POS dictionary like this:
<dictionary>
<entry tags="x1">
<token>This</token>
</entry>
<entry tags="x2">
<token>is</token>
</entry>
<entry tags="x3">
<token>a</token>
</entry>
...

3. train the model with this dictionary

Is this the right approach? Is the POS tagger appropriate for this task?

Thanks in advance,
Yakov

On Tue, Aug 27, 2013 at 6:31 PM, Yakov Keranchuk
<[email protected]> wrote:

> Hi
>
> Is it possible to tag tokens according to our own rules?
> Example: The quick brown fox_animal jumps_action over the lazy dog_animal
>
> Do we need to create a custom dictionary for the POS tagger?
> If so, can there be only one dictionary for a few languages?
>
> Best regards,
> Yakov
>
