On Thursday 20 April 2006 12:24, you wrote:
> will this work using unicode on a bengali e-book
>
> ..peekay
Dear Peekay

Good to hear from you after a long time.

No, I don't think it will work like that. The script works only on plain
English (ASCII) text.

In fact, after you sent me this mail, I tried it on a Unicode Bangla document, 
b-odt, that I am now writing with OOo 2.0. I saved it as a text file, 
b-oo.txt, and ran the script on it. It always gives an odd result: '1'. 

I think the problem is in how the non-alphabetic characters are handled. In 
the script, 'tr' changes every character that is not an English letter or an 
apostrophe into a newline, turning the whole text into one long list of words. 

I tried the first step of the script on the file:
tr -cs A-Za-z\' '\n' <b-oo.txt > b-oo-list.txt

When I opened this list with 'less' or 'cat', it showed nothing, and the 
octal dump ('od') gave only:
0000000 000012
0000001
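
The lone newline can probably be reproduced without the file. Below is a
minimal sketch, assuming b-oo.txt was saved as UTF-8 (I have not checked which
encoding OOo used). Every byte of a UTF-8 Bengali character lies outside
A-Za-z', so 'tr -c' turns them all into newlines and '-s' squeezes the run
into one. The function names and the sample words are made up for
illustration.

```shell
# First step of the script, wrapped in a function (the name is illustrative).
# LC_ALL=C makes tr treat its input as plain bytes.
split_ascii_words() {
    LC_ALL=C tr -cs "A-Za-z'" '\n'
}

# Two Bengali words as UTF-8 bytes in octal escapes, so the example does not
# depend on the terminal's encoding (sample text, not taken from b-oo.txt).
bangla_sample() {
    printf '\340\246\254\340\246\276\340\246\202\340\246\262\340\246\276 \340\246\254\340\246\207\n'
}

# English text is split into one word per line.
printf 'hello world\n' | split_ascii_words

# Bengali text collapses: count the bytes that survive the same step.
bangla_sample | split_ascii_words | wc -c
```

The first command should print "hello" and "world" on separate lines, while
the second should report a single byte, the same newline that 'od' shows.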

So 'tr' in the first step is actually passing nothing but a single newline to 
the later steps. 

This is beyond my very scanty knowledge of text and its formats. I am sending 
this both to you and to our learned friends on the LUG, in case someone can 
help. If anybody needs them, I can send the odt file and the text file.

Thank you for an interesting problem.

dipankar das
