Dan Muey wrote:
Thank you Michael, I'll have a look at Juerd's page as it's the only thing I
haven't yet :)
On Jul 29, 2010, at 6:17 PM, Michael Ludwig wrote:
Hi Dan,
[Silence “Wide character” warning globally one time]
Dan Muey wrote on 29.07.2010 at 16:59 (-0500):
I've a situation
You need to have a 'use utf8;' statement at the beginning of your
program to tell Perl that it is encoded in utf8.
I tested it with that, and it works.
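A minimal sketch of what that advice looks like in practice, assuming the script file itself is saved as UTF-8. The sample string and the binmode call on STDOUT are illustrative additions, not taken from the original thread:

```perl
use strict;
use warnings;
use utf8;                               # the source code of this file is UTF-8

# Encode characters as UTF-8 on output; printing non-Latin-1 characters
# without such a layer triggers the "Wide character" warning.
binmode(STDOUT, ':encoding(UTF-8)');

my $word = "café";                      # illustrative literal, 4 characters
print length($word), "\n";              # counts characters, not bytes
print "$word\n";
```

Without 'use utf8;' Perl would treat the literal as a sequence of bytes, so length() would report the byte count instead.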
Pierre Nugues wrote:
Dear All,
I wrote a simple tokenizer for texts containing Latin9 characters. It does not
behave as expected with the
Pierre Nugues wrote:
Dear Michael,
Pierre Nugues wrote on 06.09.2010 at 11:09 (+0200):
I wrote a simple tokenizer for texts containing Latin9 characters.
You probably mean non-ASCII characters. Latin9 alias ISO-8859-15 is
the encoding. It's worth while making a distinction here.
I meant
Jonathan Pool wrote:
Let's say the character NO-BREAK SPACE (U+00A0) appears in a UTF8-encoded text
file (so it appears there as C2A0), and I want to match strings that contain
this character.
I write a script (itself encoded with UTF8) in Perl 5.10.0 (on OS X 10.6.5)
with:
use encoding
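For reference, one common way to approach this kind of match is to decode the UTF-8 bytes into characters first and then match on the code point. This is only a sketch under that assumption, not the script from the message; the sample data is made up for illustration:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Illustrative input: "foo", NO-BREAK SPACE as the UTF-8 bytes C2 A0, "bar".
my $bytes = "foo\xC2\xA0bar";

# Decode the bytes into a character string; U+00A0 is now one character.
my $text = decode("UTF-8", $bytes);

printf "bytes: %d, characters: %d\n", length($bytes), length($text);
print "contains NO-BREAK SPACE\n" if $text =~ /\x{A0}/;
```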
karl williamson wrote:
Jonathan Pool wrote:
Let's say the character NO-BREAK SPACE (U+00A0) appears in a
UTF8-encoded text file (so it appears there as C2A0), and I want to
match strings that contain this character.
I write a script (itself encoded with UTF8) in Perl 5.10.0 (on OS X
10.6.5
best to
document it (or them). Could you advise me on this?
On 30 Nov 2010, at 10:25, karl williamson wrote:
Jonathan Pool wrote:
Let's say the character NO-BREAK SPACE (U+00A0) appears in a UTF8-encoded text
file (so it appears there as C2A0), and I want to match strings that contain
On 06/27/2011 08:26 AM, BobH wrote:
A project I'm working on needs to build a list of all Unicode characters
that have canonical decompositions. The most efficient ways I can think
of to get such a list are from unicore/Decomposition.pl or by scanning
unicore/UnicodeData.txt. However:
Re
On 06/29/2011 09:06 AM, BobH wrote:
Karl Williamson wrote:
If I did this, I would be tempted to have it return an inversion
list, instead of an array of every code point that matches the
property. ...
My question to you is would that be acceptable to you, do you think?
I hate to return
On 07/01/2011 10:40 AM, BobH wrote:
Karl Williamson wrote:
I'm trying to think of a good name. Best so far is
UCD::get_prop_invlist()
Hm, get normally isn't needed.
How about something simpler such as UCD::charlist()
Bob
I think not having prop in the name is potentially misleading
On 07/01/2011 11:49 AM, Karl Williamson wrote:
On 07/01/2011 10:40 AM, BobH wrote:
Karl Williamson wrote:
I'm trying to think of a good name. Best so far is
UCD::get_prop_invlist()
Hm, get normally isn't needed.
How about something simpler such as UCD::charlist()
Bob
I think
On 07/07/2011 01:17 AM, Dave Saunders wrote:
Dear Encode Developers,
I am migrating a perl application from Solaris 2.10 to Linux Fedora Core
14 (2.6.35.13-92.fc14.x86_64), which is running perl 5.12.3. The app
uses SDBM and I'm encountering a problem which looks related to the
Encode module
Some applications are finding it necessary to read in the Unicode files
that mktables generates. For example, grepping through CPAN indicates
that Text::Unicode::Equivalents reads Decomposition.pl. This, and most
of the other generated files are marked for internal use only, because
we wish
Here's a new version of the API for comment, with the addition of 2
extra functions:
prop_invlist()
prop_invlist returns an inversion list (described below)
that defines all the code points for the Unicode property
given by the input parameter string:
use
Perl 5.15.5, now available, has additions to Unicode::UCD in it to allow
unfettered programmatic access to the Unicode character data base. The
API is quite similar to what was sent out for comment on this list
several months ago; several changes were required as a result of lessons
learned
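A short sketch of how an inversion list from Unicode::UCD can be used, assuming a perl recent enough to ship prop_invlist() (5.15.5 or later). The helper in_invlist() is a hypothetical name introduced here for illustration and is not part of the module:

```perl
use strict;
use warnings;
use Unicode::UCD qw(prop_invlist);

# Inversion list for the Dash property: elements at even indexes start
# a range of matching code points, elements at odd indexes start a range
# of non-matching code points.
my @invlist = prop_invlist("Dash");

# Hypothetical helper: count the range boundaries at or below $cp.
# An odd count means $cp falls inside a matching range.
sub in_invlist {
    my ($cp, $list) = @_;
    my $i = 0;
    $i++ while $i < @$list && $list->[$i] <= $cp;
    return $i % 2;
}

printf "U+002D: %d\n", in_invlist(0x2D, \@invlist);   # HYPHEN-MINUS
printf "U+0041: %d\n", in_invlist(0x41, \@invlist);   # LATIN CAPITAL LETTER A
```

A linear scan keeps the sketch short; for large lists a binary search over the same array would be the usual choice.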
On 12/29/2011 10:48 AM, FORREST COPLEY wrote:
Is it possible to write a perl script to print a completely custom
character on a console text terminal?
Say a D rotated 90 degrees or something.
or an A with the innards filled in.
--
The Unicode Consortium is seeking feedback on options for fixing
anomalies involving a few characters with the Sc and Scx properties.
For details and to comment, see
http://www.unicode.org/review/pri277/
On 07/23/2015 11:13 AM, Bright Dadson wrote:
Hi Guys,
I am trying to create an embedded Perl application which exposes
WWW::Mechanize into my Cython extension project.
I compile and link my extension using:
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall
-Wstrict-prototypes -g
On 05/09/2016 08:53 AM, Daniel Dehennin wrote:
Hello,
I tried to make my Perl5 code unicode compliant after reading a post on
stackoverflow[1].
As suggested in the post:
“always run incoming stuff through NFD and outbound stuff from NFC.”
I had a hard time finding why my Test::More was
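The quoted advice can be sketched with Unicode::Normalize as follows. This is an illustration of the NFD-in, NFC-out idea only, with a made-up sample string; it is not the code from the message:

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD NFC);

my $incoming = "\x{E9}";         # LATIN SMALL LETTER E WITH ACUTE, precomposed
my $working  = NFD($incoming);   # decomposed: "e" followed by U+0301
my $outgoing = NFC($working);    # recomposed for output

printf "incoming=%d working=%d outgoing=%d\n",
    length($incoming), length($working), length($outgoing);
```

Strings compared in tests need to be in the same normalization form, since the decomposed and precomposed forms are different character sequences.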
On 05/11/2016 02:04 AM, Daniel Dehennin wrote:
Karl Williamson <pub...@khwilliamson.com> writes:
On 05/09/2016 08:53 AM, Daniel Dehennin wrote:
Hello,
I tried to make my Perl5 code unicode compliant after reading a post on
stackoverflow[1].
As suggested in the post:
“alwa
On 05/05/2016 08:37 AM, Pali Rohár wrote:
Hi!
I thought that I understood UTF-8 encoding/decoding done in perl until I
looked into source code of Encode package... (exactly sub encode_utf8)
Before... I only read description of Encode package (not source code):
On 07/09/2016 05:12 PM, p...@cpan.org wrote:
Hi! As we know utf8::encode() does not provide correct UTF-8 encoding
and Encode::encode("UTF-8", ...) should be used instead. Also opening
file should be done by :encoding(UTF-8) layer instead of :utf8.
But UTF-8 strict implementation in Encode module
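For readers unfamiliar with the two recommended forms, here is a minimal sketch. The sample character and the in-memory filehandle are choices made for this illustration only:

```perl
use strict;
use warnings;
use Encode qw(encode);

my $chars = "\x{263A}";                  # WHITE SMILING FACE, one character

# Strict UTF-8 encoding of a string through the Encode module.
my $bytes = encode("UTF-8", $chars);

# The same encoding applied as an I/O layer; an in-memory file is used
# here so the sketch does not touch the disk.
open(my $fh, '>:encoding(UTF-8)', \my $buffer) or die "open failed: $!";
print {$fh} $chars;
close($fh) or die "close failed: $!";

printf "encode(): %d bytes, layer: %d bytes\n",
    length($bytes), length($buffer);
```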
On 08/12/2016 09:31 AM, p...@cpan.org wrote:
On Thursday 11 August 2016 17:41:23 Karl Williamson wrote:
On 07/09/2016 05:12 PM, p...@cpan.org wrote:
Hi! As we know utf8::encode() does not provide correct UTF-8 encoding
and Encode::encode("UTF-8", ...) should be used instead. Also op
On 08/20/2016 08:33 PM, Aristotle Pagaltzis wrote:
* Karl Williamson <pub...@khwilliamson.com> [2016-08-21 03:12]:
That should be done anyway to make sure we've got less buggy Unicode
handling code available to older modules.
I think you meant “available to older perls”?
Yes, thanks
of these that are
missing or buggy in previous perls can and will be dealt with by the
Devel::PPPort mechanism.
On 08/19/2016 02:42 AM, p...@cpan.org wrote:
On Thursday 18 August 2016 23:06:27 Karl Williamson wrote:
On 08/12/2016 09:31 AM, p...@cpan.org wrote:
On Thursday 11 August 2016 17:41:23 Karl Williamson wrote
On 08/22/2016 07:05 AM, p...@cpan.org wrote:
On Sunday 21 August 2016 08:49:08 Karl Williamson wrote:
On 08/21/2016 02:34 AM, p...@cpan.org wrote:
On Sunday 21 August 2016 03:10:40 Karl Williamson wrote:
Top posting.
Attached is my alternative patch. It effectively uses a different
On 08/22/2016 02:47 PM, p...@cpan.org wrote:
> And I think you misunderstand when is_utf8_char_slow() is called. It is
> called only when the next byte in the input indicates that the only
> legal UTF-8 that might follow would be for a code point that is at least
> U+20, almost twice as
On 08/22/2016 03:19 PM, Karl Williamson wrote:
On 08/22/2016 02:47 PM, p...@cpan.org wrote:
> And I think you misunderstand when is_utf8_char_slow() is called.
It is
> called only when the next byte in the input indicates that the only
> legal UTF-8 that might follow would be for a c
On 8/2/23 21:42, Marc Lehmann wrote:
Hi!
Both Encode and Encode::Byte export a symbol called "cp1252_encoding",
which can cause linker errors.
It would be great if that could be changed by e.g. prepending some unique
prefix to exported symbols (such as encode_ and encodebyte_ or somesuch),