Patrick, Ted,
I added "use locale;" at line 83, but it did not improve my results:
words containing the character l·l (like intel·ligència) are still
not included in the results list.
It is worth noting that I already add as tokens all the accents,
diaereses and apostrophes that are used in the Catalan corpus.
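
To make the l·l problem concrete, here is a minimal sketch of a token
pattern that keeps the middle dot (U+00B7) inside a word. It is written
in Python purely for illustration and is not the NSP tokenizer; the
names TOKEN and tokenize are hypothetical, and it assumes the corpus is
decoded as Unicode text rather than read as raw bytes.

```python
import re
import unicodedata

# Hypothetical token pattern (an assumption, not taken from NSP):
# a run of letters, optionally continued by a middle dot (U+00B7)
# or an apostrophe followed by more letters.
TOKEN = re.compile(r"[^\W\d_]+(?:[·'’][^\W\d_]+)*")

def tokenize(text):
    # Normalise to NFC so accented letters are single code points;
    # otherwise a combining accent would split the word.
    text = unicodedata.normalize("NFC", text)
    return TOKEN.findall(text)

if __name__ == "__main__":
    print(tokenize("La intel·ligència col·lectiva de l'equip"))
```

The point is only that the middle dot has to appear explicitly in the
token definition, because it is punctuation and is not matched by a
plain word-character class, whatever the locale.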
Ted,
I have two suggestions to improve the new version.
1. I have problems extracting bigrams using Fisher's exact test
(left-sided) and Fisher's exact test (right-sided). Could you fix
these two measures?
The error message:
Can't locate Text/NSP/Measures/2D/left.pm in @INC (@INC contains:
On Thu, Feb 14, 2008 at 03:51:40PM -, Ted Pedersen wrote:
1) Incorporate use locale throughout package (suggested by Patrick
Drouin long ago). This will make for more convenient handling of
non-English text.
Wrong idea, wrong solution.
To make handling of non-Latin1 text more convenient,
Richard Jelinek wrote:
This advantage is illusory, unfortunately. Illusory in the sense
that the problems it seems to solve rely on a well set up
environment on the OS side, which isn't always the case. Moreover,
Well, an improperly set up system locale is bound to give you all
kinds