stead of:
$s = encode("utf8", $s) if Encode::is_utf8($s);
which seems superior to avoid double utf8 encodings,
should the utf8-flag be lost. And it's faster.
Or even simply: Encode::_utf8_off($s)
The problem is that I'm usually wrong. Am I this time?
Am I missing
cter (not byte) (i.e. one large number for the smiling
face character).
Or do you want one number for each utf8-encoded byte (i.e. three
hex pairs for the smiling face character)?
perl -le '$s="Smile \x{263A}"; print unpack("H*",$s)'
Perl Version : 5.6.1
Apparently, I
-CSD -ne 'print if /^[\p{Hiragana}\p{Katakana}\p{Kanji}]+$/' f > f-clean.tok
but replacing the [...] class with a group (?:...) does work:
perl -CSD -ne 'print if /^(?:\p{Hiragana}|\p{Katakana}|\p{Kanji})+$/' f > f-clean.tok
--
Paul Bijnens, Xplanation
alone.
Why do I have to force the utf8 flag using decode("utf8", ...)?
One of my guesses is that the problem lies in XS-processing of strings
where the utf8 flag is not set correctly. True?
Why does nobody else complain then?
Is my setup wrong? (Tried this on different instal
Nick Ing-Simmons wrote:
Paul Bijnens <[EMAIL PROTECTED]> writes:
Can anyone explain what I'm doing wrong?
I was about to contact the author of HTML::Entities, when
I noticed HTML::Parser 3.45 was released on 6 Jan 2005.
Installed it -- and guess what? Now it works as expected!
I gue
is just the "no-break
space".
What exactly is your problem with that character?
--
Paul Bijnens, xplanation Technology Services        Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xpl
; would raise an error in
"UTF-8".
For input, both get the correct characters, assuming the input
bytestream was indeed correct. Or am I missing something?
on of iconv, I ran perl on Windows, so, perhaps there is
a problem only with the Windows port? Otherwise:
1) Please be aware of this error
2) Any suggestions (other than pre-translating via "iconv" ;-)