On 8/2/23 21:42, Marc Lehmann wrote:
Hi!
Both Encode and Encode::Byte export a symbol called "cp1252_encoding",
which can cause linker errors.
It would be great if that could be changed by e.g. prepending some unique
prefix to exported symbols (such as encode_ and encodebyte_ or somesuch),
whic
On 09/25/2016 04:06 AM, p...@cpan.org wrote:
On Thursday 01 September 2016 09:30:08 p...@cpan.org wrote:
On Wednesday 31 August 2016 21:27:37 Karl Williamson wrote:
We may change Encode in blead too, since it already differs from
cpan. I'll have to get Sawyer's opinion on that. Bu
On 08/31/2016 03:43 PM, p...@cpan.org wrote:
On Monday 29 August 2016 17:00:00 Karl Williamson wrote:
If you'd be willing to test this out, especially the performance
parts, that would be great!
[snip]
There are 2 experimental performance commits. If you want to see if
they actually im
On 08/25/2016 01:48 AM, p...@cpan.org wrote:
Anyway, if you need some help with Encode module or something different,
let me know. As I want to have UTF-8 support in Encode correctly
working...
I now have a branch with my proposed changes at:
http://perl5.git.perl.org/perl.git/shortlog/refs/hea
On 08/22/2016 02:47 PM, p...@cpan.org wrote:
snip
I added some tests for overlong sequences. Only for ASCII platforms, tests for
EBCDIC are missing (sorry, I do not have access to any EBCDIC platform for testing).
It's fine to skip those tests on EBCDIC.
> > Anyway, how it behave on EBCDI
On 08/22/2016 03:19 PM, Karl Williamson wrote:
On 08/22/2016 02:47 PM, p...@cpan.org wrote:
> And I think you misunderstand when is_utf8_char_slow() is called. It is
> called only when the next byte in the input indicates that the only
> legal UTF-8 that might follow would be for a c
On 08/22/2016 02:47 PM, p...@cpan.org wrote:
> And I think you misunderstand when is_utf8_char_slow() is called. It is
> called only when the next byte in the input indicates that the only
> legal UTF-8 that might follow would be for a code point that is at least
> U+20, almost twice as high
On 08/22/2016 07:05 AM, p...@cpan.org wrote:
On Sunday 21 August 2016 08:49:08 Karl Williamson wrote:
On 08/21/2016 02:34 AM, p...@cpan.org wrote:
On Sunday 21 August 2016 03:10:40 Karl Williamson wrote:
Top posting.
Attached is my alternative patch. It effectively uses a different
On 08/21/2016 02:34 AM, p...@cpan.org wrote:
On Sunday 21 August 2016 03:10:40 Karl Williamson wrote:
Top posting.
Attached is my alternative patch. It effectively uses a different
algorithm to avoid decoding the input into code points, and to copy
all spans of valid input at once, instead of
On 08/20/2016 08:33 PM, Aristotle Pagaltzis wrote:
* Karl Williamson [2016-08-21 03:12]:
That should be done anyway to make sure we've got less buggy Unicode
handling code available to older modules.
I think you meant “available to older perls”?
Yes, thanks
missing or buggy in previous perls can and will be dealt with by the
Devel::PPPort mechanism.
On 08/19/2016 02:42 AM, p...@cpan.org wrote:
On Thursday 18 August 2016 23:06:27 Karl Williamson wrote:
On 08/12/2016 09:31 AM, p...@cpan.org wrote:
On Thursday 11 August 2016 17:41:23 Karl Williamson wrote
On 08/12/2016 09:31 AM, p...@cpan.org wrote:
On Thursday 11 August 2016 17:41:23 Karl Williamson wrote:
On 07/09/2016 05:12 PM, p...@cpan.org wrote:
Hi! As we know utf8::encode() does not provide correct UTF-8 encoding
and Encode::encode("UTF-8", ...) should be used instead. Also op
On 07/09/2016 05:12 PM, p...@cpan.org wrote:
Hi! As we know utf8::encode() does not provide correct UTF-8 encoding
and Encode::encode("UTF-8", ...) should be used instead. Also, opening a
file should be done with the :encoding(UTF-8) layer instead of :utf8.
But UTF-8 strict implementation in Encode module i
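
The contrast drawn in the message above can be sketched as follows. This is only an illustration: the sample string and variable names are assumptions, and an in-memory filehandle stands in for a real file. For well-formed text both encoders produce the same bytes; the difference the message refers to concerns how strictly invalid input is treated.

```perl
use strict;
use warnings;
use Encode ();

# Character string with two non-ASCII code points (sample data, assumed).
my $text = "caf\x{e9} \x{263a}";

# Strict UTF-8 encoding through the Encode module.
my $strict = Encode::encode("UTF-8", $text);

# utf8::encode() converts in place, so work on a copy.
my $lax = $text;
utf8::encode($lax);

# For well-formed text like this the resulting bytes are identical.
print $strict eq $lax ? "same bytes\n" : "different bytes\n";

# Decode on input with the :encoding(UTF-8) layer rather than :utf8.
# An in-memory filehandle is used here in place of a real file.
open(my $in, "<:encoding(UTF-8)", \$strict) or die "open: $!";
my $back = do { local $/; <$in> };
close($in);

print $back eq $text ? "round trip ok\n" : "round trip failed\n";
```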
On 05/11/2016 02:04 AM, Daniel Dehennin wrote:
Karl Williamson writes:
On 05/09/2016 08:53 AM, Daniel Dehennin wrote:
Hello,
I tried to make my Perl5 code unicode compliant after reading a post on
stackoverflow[1].
As suggested in the post:
“always run incoming stuff through NFD and
On 05/09/2016 08:53 AM, Daniel Dehennin wrote:
Hello,
I tried to make my Perl5 code unicode compliant after reading a post on
stackoverflow[1].
As suggested in the post:
“always run incoming stuff through NFD and outbound stuff from NFC.”
I had a hard time finding why my Test::More was f
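
As a general illustration of the NFD/NFC advice quoted above (not a reconstruction of the poster's actual test), the sketch below shows why two visually identical strings can compare unequal until both are normalized to the same form. It uses the core Unicode::Normalize module; the sample strings are assumptions.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD NFC);

my $composed   = "\x{e9}";      # e-acute as a single code point
my $decomposed = "e\x{301}";    # "e" followed by COMBINING ACUTE ACCENT

# The raw strings differ even though they render the same.
my $raw_equal = $composed eq $decomposed;

# Normalizing both sides to one form makes the comparison meaningful.
my $nfd_equal = NFD($composed) eq NFD($decomposed);

# NFC recombines the pair into a single code point.
my $nfc_len = length(NFC($decomposed));

print "raw equal: ", ($raw_equal ? "yes" : "no"), "\n";
print "NFD equal: ", ($nfd_equal ? "yes" : "no"), "\n";
print "NFC length: $nfc_len\n";
```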
On 05/05/2016 08:37 AM, Pali Rohár wrote:
Hi!
I thought that I understood UTF-8 encoding/decoding done in perl until I
looked into the source code of the Encode package... (exactly sub encode_utf8)
Before... I only read description of Encode package (not source code):
https://metacpan.org/pod/Encode#UTF
On 07/23/2015 11:13 AM, Bright Dadson wrote:
Hi Guys,
I am trying to create an embedded Perl application which exposes
WWW::Mechanize to my Cython extension project.
I compile and link my extension using:
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall
-Wstrict-prototypes -g -fstack-
The Unicode Consortium is seeking feedback on options for fixing
anomalies involving a few characters with the Sc and Scx properties.
For details and to comment, see
http://www.unicode.org/review/pri277/
On 12/29/2011 10:48 AM, FORREST COPLEY wrote:
Is it possible to write a perl script to print a completely custom
character on a console text terminal?
Say a D rotated 90 degrees or something.
or an A with the innards filled in.
--
http://search.cpan.org/~bdfoy/Unicode-Tussle-1.03/lib/Unicode/
Perl 5.15.5, now available, has additions to Unicode::UCD in it to allow
unfettered programmatic access to the Unicode character data base. The
API is quite similar to what was sent out for comment on this list
several months ago; several changes were required as a result of lessons
learned du
Here's a new version of the API for comment, with the addition of 2
extra functions:
prop_invlist()
"prop_invlist" returns an inversion list (described below)
that defines all the code points for the Unicode property
given by the input parameter string:
use Uni
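
For readers unfamiliar with the term, here is a minimal sketch of how an inversion list can be queried. It does not call prop_invlist(); the list is written by hand and the helper name in_invlist is hypothetical. The assumption is the usual convention: a sorted array of code points in which each element at an even index starts a range inside the set and each element at an odd index starts a range outside it.

```perl
use strict;
use warnings;

# Hand-written inversion list for ASCII letters (illustrative data):
# A-Z is [0x41, 0x5B) and a-z is [0x61, 0x7B).
my @invlist = (0x41, 0x5B, 0x61, 0x7B);

# Hypothetical helper: true if $cp lies in a range that begins at an
# even index of the inversion list.
sub in_invlist {
    my ($list, $cp) = @_;
    my ($lo, $hi) = (0, scalar @$list);
    # Binary search for the number of elements that are <= $cp.
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        if ($list->[$mid] <= $cp) { $lo = $mid + 1 }
        else                      { $hi = $mid }
    }
    # An odd count means the last boundary passed opened a range.
    return $lo % 2 == 1;
}

for my $char ("Q", "q", "[", " ") {
    printf "U+%04X %s\n", ord($char),
        in_invlist(\@invlist, ord $char) ? "in set" : "not in set";
}
```

The appeal of this representation is that a property covering thousands of code points is stored as a short list of range boundaries, and a membership test costs a binary search.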
Some applications are finding it necessary to read in the Unicode files
that mktables generates. For example, grepping through CPAN indicates
that Text::Unicode::Equivalents reads Decomposition.pl. This, and most
of the other generated files are marked for internal use only, because
we wish t
On 07/07/2011 01:17 AM, Dave Saunders wrote:
Dear Encode Developers,
I am migrating a perl application from Solaris 2.10 to Linux Fedora Core
14 (2.6.35.13-92.fc14.x86_64), which is running perl 5.12.3. The app
uses SDBM and I'm encountering a problem which looks related to the
Encode module (wh
On 07/01/2011 11:49 AM, Karl Williamson wrote:
On 07/01/2011 10:40 AM, BobH wrote:
Karl Williamson wrote:
I'm trying to think of a good name. Best so far is
UCD::get_prop_invlist()
Hm, "get" normally isn't needed.
How about something simpler such as UCD::charlist()
On 07/01/2011 10:40 AM, BobH wrote:
Karl Williamson wrote:
I'm trying to think of a good name. Best so far is
UCD::get_prop_invlist()
Hm, "get" normally isn't needed.
How about something simpler such as UCD::charlist()
Bob
I think not having prop in the name is po
On 06/29/2011 09:06 AM, BobH wrote:
Karl Williamson wrote:
If I did this, I would be tempted to have it return an inversion
list, instead of an array of every code point that matches the
property. ...
My question to you is would that be acceptable to you, do you think?
I hate to return an
On 06/27/2011 08:04 PM, BobH wrote:
Karl Williamson wrote:
> I'm presuming you need this not for a one-time only thing, but to be
> able to run this program over and over.
Yes -- this is for a module that will be usable in a number of
situations. See
http://search.cpan.org/~bha
On 06/27/2011 08:26 AM, BobH wrote:
A project I'm working on needs to build a list of all Unicode characters
that have canonical decompositions. The most efficient ways I can think
of to get such a list are from unicore/Decomposition.pl or by scanning
unicore/UnicodeData.txt. However:
Re unicore
"<:encoding(utf8)".)
So, I'm confused as to whether this is 1 bug or more than 1, and how best to
document it (or them). Could you advise me on this?
On 30 Nov 2010, at 10:25, karl williamson wrote:
Jonathan Pool wrote:
Let's say the character NO-BREAK SPACE (U+00A0) appea
karl williamson wrote:
Jonathan Pool wrote:
Let's say the character NO-BREAK SPACE (U+00A0) appears in a
UTF8-encoded text file (so it appears there as C2A0), and I want to
match strings that contain this character.
I write a script (itself encoded with UTF8) in Perl 5.10.0 (on OS X
1
Jonathan Pool wrote:
Let's say the character NO-BREAK SPACE (U+00A0) appears in a UTF8-encoded text
file (so it appears there as C2A0), and I want to match strings that contain
this character.
I write a script (itself encoded with UTF8) in Perl 5.10.0 (on OS X 10.6.5)
with:
use encoding 'utf
Pierre Nugues wrote:
Dear Michael,
Pierre Nugues schrieb am 06.09.2010 um 11:09 (+0200):
I wrote a simple tokenizer for texts containing Latin9 characters.
You probably mean "non-ASCII characters". Latin9 alias ISO-8859-15 is
the encoding. It's worthwhile making a distinction here.
I meant
You need to have a 'use utf8;' statement at the beginning of your
program to tell Perl that it is encoded in utf8.
I tested it with that, and it works.
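
A small sketch of the effect described above, assuming the script file itself is saved as UTF-8; the sample word is an assumption and not taken from the original program. With 'use utf8;' the literal is read as characters, so length() counts letters and \w matches the accented ones.

```perl
use strict;
use warnings;
use utf8;    # tells perl that this source file is encoded in UTF-8

# Sample word containing two non-ASCII letters (assumed example).
my $word = "smörgåsbord";

# With "use utf8" this counts characters (11), not UTF-8 bytes.
my $chars = length($word);

# The whole word consists of word characters once it is decoded.
my $is_word = $word =~ /^\w+$/ ? 1 : 0;

print "characters: $chars\n";
print "all word characters: ", ($is_word ? "yes" : "no"), "\n";
```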
Pierre Nugues wrote:
Dear All,
I wrote a simple tokenizer for texts containing Latin9 characters. It does not
behave as expected with the Sw
Dan Muey wrote:
Thank you Michael, I'll have a look at Juerd's page as it's the only thing I
haven't yet :)
On Jul 29, 2010, at 6:17 PM, Michael Ludwig wrote:
Hi Dan,
[Silence “Wide character” warning globally one time]
Dan Muey schrieb am 29.07.2010 um 16:59 (-0500):
I've a situation where