Re: StopWords problem

Doron Cohen Wed, 26 Dec 2007 13:42:22 -0800

On Dec 26, 2007 10:33 PM, Liaqat Ali <[EMAIL PROTECTED]> wrote:

> Using javac -encoding UTF-8 still raises the following error.
>
> urduIndexer.java : illegal character: \65279
> ?
> ^
> 1 error
>
> What am I doing wrong?
>


If you have the stop-words in a file, say one word per line,
they can be read like this:

    BufferedReader r = new BufferedReader(
        new InputStreamReader(new FileInputStream("Urdu.txt"), "UTF8"));
    String word = r.readLine();    // loop this line, you get the picture

(Make sure to specify the encoding "UTF8" when saving the file from Notepad.)
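
For what it's worth, the \65279 in the javac error is U+FEFF, the byte order
mark that Notepad writes at the start of a file saved as UTF-8. Java's UTF-8
decoder does not strip it, so the first word read from such a file can carry
it too. Below is a minimal sketch of the loop hinted at above, assuming one
stop-word per line. The class name StopWordLoader and the method load are
made up for illustration and are not part of Lucene.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not a Lucene class.
final class StopWordLoader {

    // Reads one stop-word per line from a UTF-8 file, skipping blank lines.
    static Set<String> load(String path) throws IOException {
        Set<String> words = new LinkedHashSet<>();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(new FileInputStream(path), StandardCharsets.UTF_8))) {
            String line;
            boolean first = true;
            while ((line = r.readLine()) != null) {
                // Drop a leading byte order mark (U+FEFF) if the editor wrote one.
                if (first && line.startsWith("\uFEFF")) {
                    line = line.substring(1);
                }
                first = false;
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words;
    }
}
```

Calling StopWordLoader.load("Urdu.txt") would then give a set that can be
converted to whatever form your analyzer expects; which constructor takes it
depends on the Lucene version, so check the API you have. This also keeps the
Urdu text out of the .java source, so the source file can stay plain ASCII.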

Regards,
Doron
