Hi James,
Now, this is truly bizarre!!! I am 200% sure that a couple of weeks ago,
after your response about using the command-line tool to produce my
dictionary, all my dictionary problems disappeared!!! I specifically
remember not being able to find multi-word entities with the old
dictionary, which encoded 2 tokens as one. But now I have produced a
correct dictionary and I am indeed passing it tokenized sentences, but
with no success!!! It only finds single-word entities!!! I do understand
this is probably not an OpenNLP issue, simply because I do remember
sorting it!!! In any case, thanks for confirming...
I have debugged every single line of my code related to the dictionary and
nothing! This is TRULY BEYOND MY UNDERSTANDING!!! The funny thing is I
only re-ran the dictionary name finder in order to check if I could use
it with the evaluator!!! Unbelievable coincidence, don't you think?
Jim
P.S.: Can I see the test sources you mentioned without checking out the
repo? Maybe online somewhere?
On 14/03/12 02:47, James Kosin wrote:
Jim,
Check to be sure you are still running the text through the tokenizer
before using the Dictionary Name Finder. Just a thought.
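To show why the tokenizing step matters, here is a minimal sketch in plain Java. It is not OpenNLP code and not how the Dictionary Name Finder is implemented; the class MultiTokenLookup and the method find are made-up names for illustration only. It assumes a dictionary entry is a sequence of tokens and that a match means the same sequence appears in the tokenized sentence.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: not part of OpenNLP.
public class MultiTokenLookup {

    // Returns {start, end} token index pairs (end exclusive) for the
    // longest dictionary entry that starts at each position.
    static List<int[]> find(Set<List<String>> entries, String[] tokens, int maxLen) {
        List<int[]> spans = new ArrayList<>();
        List<String> tokenList = Arrays.asList(tokens);
        int i = 0;
        while (i < tokens.length) {
            int matchedEnd = -1;
            // Try the longest candidate first so "New York" wins over "New".
            for (int len = Math.min(maxLen, tokens.length - i); len >= 1; len--) {
                if (entries.contains(tokenList.subList(i, i + len))) {
                    matchedEnd = i + len;
                    break;
                }
            }
            if (matchedEnd > 0) {
                spans.add(new int[] {i, matchedEnd});
                i = matchedEnd;   // continue after the matched entity
            } else {
                i++;
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        // Entry encoded as two separate tokens.
        Set<List<String>> good = new HashSet<>();
        good.add(Arrays.asList("New", "York"));
        good.add(Arrays.asList("London"));

        // Same entry, but both words squeezed into a single token.
        Set<List<String>> bad = new HashSet<>();
        bad.add(Arrays.asList("New York"));
        bad.add(Arrays.asList("London"));

        String[] tokens = {"She", "left", "New", "York", "for", "London"};
        System.out.println(find(good, tokens, 2).size()); // prints 2
        System.out.println(find(bad, tokens, 2).size());  // prints 1
    }
}
```

With the two-token entry both names are found. With the single-token entry only "London" matches, because no single token in the sentence equals "New York". The same thing happens in reverse if the dictionary is right but the sentence goes in untokenized or split differently.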
James
On 3/13/2012 7:09 PM, Jim - FooBar(); wrote:
Could you please confirm that the dictionary name-finder can identify
multi-word entities after the recent case-sensitivity changes that
were made? All of a sudden my dictionary stopped recognising
multi-word entities even though they are encoded correctly as 2 tokens in
the .xml dictionary file. I thought I had this sorted: my old problem
had to do with how I was creating the dictionary, but now I'm using
the cmd-line tool...
Jim
P.S.: Case sensitivity is indeed done correctly now!