[perl #51718] Consolidate test infrastructure in t/codingstd and t/distro

2008-03-14 Thread via RT
# New Ticket Created by  Mark Glines 
# Please include the string:  [perl #51718]
# in the subject line of all future correspondence about this issue. 
# URL: http://rt.perl.org/rt3/Ticket/Display.html?id=51718 


There's a lot of common code in the tests in these directories, for
looping across a bunch of files, doing tests, and collating results.
Also, there's no consistency in the output of these tests at all.
(Some of them have a count of total errors afterwards, most don't.
Some of them list line numbers in parens, most don't.)

I've got a start at consolidating this into a helper module; I've
attached my current diff.  I'm posting it to RT in case someone wants
to take it and run with it, or in case someone can think of a better
way.

Mark
Index: lib/Parrot/Test/Util/Runloop.pm
===
--- lib/Parrot/Test/Util/Runloop.pm	(revision 0)
+++ lib/Parrot/Test/Util/Runloop.pm	(revision 0)
@@ -0,0 +1,127 @@
+# Copyright (C) 2008, The Perl Foundation.
+# $Id$
+
+=head1 NAME
+
+Parrot::Test::Util::Runloop - consolidated test for lots of files
+
+=head1 SYNOPSIS
+
+    use Parrot::Test::Util::Runloop;
+
+    Parrot::Test::Util::Runloop->testloop(
+        name     => 'No trailing spaces or tabs',
+        files    => [ $DIST->get_c_language_files() ],
+        skips    => { 'lib/Parrot/Test/Util/Runloop.pm' => 'devel' },
+        per_line => sub { $_[0] !~ /.?[ \t]+$/ } );
+
+
+=head1 DESCRIPTION
+
+This module provides a basic runloop for test scripts which perform the same
+test, over and over, on lots of files.  It is intended to consolidate some code
+to handle loops, skips etc, replicated many times in the t/distro/ and
+t/codingstd/ test directories.
+
+You can specify a callback routine to get called back once per line (with the
+per_line attribute), or once per file (with the per_file attribute).  The
+per_line callback gets passed the line as a text string.  The per_file callback
+gets passed the whole file as a text string.  If the callback function returns
+a true value, the test passed; otherwise the test failed.  Failures are tallied,
+and later reported to the test harness once, as a single test.  On failure,
+some informational diagnostics are also generated, showing the user which
+file(s) and which line(s) (if applicable) had the failure.
+
+
+=head1 AUTHOR
+
+Written by Mark Glines, based on an idea (and lots of enthusiasm) from
+Jerry Gay and Will Coleda.
+
+
+=cut
+
+package Parrot::Test::Util::Runloop;
+
+use strict;
+use warnings;
+
+use Carp;
+use Test::More;
+use IO::File;
+
+sub testloop {
+    my ($self, %args) = @_;
+    # sanity
+    my $usage = "Usage: Parrot::Test::Util::Runloop->testloop(\n"
+               ."    name => 'foo',\n"
+               ."    files => [ ... ],\n"
+               ."    per_line => sub { ... });\n";
+    croak $usage unless exists $args{name};
+    croak $usage unless exists $args{files};
+    croak "'files' is not an array reference!" unless ref($args{files}) eq 'ARRAY';
+    croak "no per_file or per_line test callback was provided!"
+        unless exists($args{per_file}) || exists($args{per_line});
+
+    my @failures;
+    my $failed_files = 0;
+
+    foreach my $path (sort @{$args{files}}) {
+        $path = $path->path if ref $path;
+        next if exists($args{skips}) && exists($args{skips}{$path});
+
+        my $file = IO::File->new($path)
+            or die "Cannot open '$path' for reading: $!\n";
+
+        my @lines = $file->getlines();
+        my $error_line = $path;
+        my $have_errors = 0;
+
+        if(exists($args{per_file})) {
+            my $cb = $args{per_file};
+            my $buf = join('', @lines);
+            # do the per-file test
+            unless($cb->($buf)) {
+                push(@failures, $error_line);
+            }
+        }
+
+        if(exists($args{per_line})) {
+            my $cb = $args{per_line};
+
+            # do the test, once for each line
+            foreach my $n (0 .. $#lines) {
+                my $line = $lines[$n];
+                unless($cb->($line)) {
+                    $error_line .= "," if $have_errors;
+                    $error_line .= " " . ($n+1);
+                    $have_errors = 1;
+                }
+            }
+
+            push(@failures, $error_line) if $have_errors;
+            $failed_files++ if $have_errors;
+        }
+    }
+    local $Test::Builder::Level = $Test::Builder::Level + 1;
+    ok(!scalar @failures, $args{name});
+    if(scalar @failures) {
+        diag($args{diag_prefix} . " in the following files:")
+            if exists $args{diag_prefix};
+        foreach my $failure (@failures) {
+            diag($failure);
+        }
+        my $failures = scalar @failures;
+        my $total_files = scalar @{$args{files}};
+        diag("That's $failures failed files out of $total_files files total.");
+    }
+}
+
+1;
+
+# Local Variables:
+#   mode: cperl
+#   cperl-indent-level: 4
+#   fill-column: 100
+# End:
+# vim: expandtab 

Re: Character sets PDD ready for review

2008-03-14 Thread Will Coleda
On Thu, Mar 13, 2008 at 5:46 AM, Simon Cozens [EMAIL PROTECTED] wrote:
> Simon Cozens wrote:
> > I think I've finished doing what I can with
> > docs/pdds/draft/pdd28_character_sets.pod for the time being.
> > Please have a look at it, and let me know if there's anything wrong,
> > anything unclear, anything missing or anything objectionable about it
>
> Warnock Warnock Warnock. Can I get a witness, even if it's "Looks good
> but I don't understand it" or "Good luck, pal, but who do you think's
> going to implement it?"?
>
> --
> Twofish Pokemon seems an evil concept. Kid hunts animals, and takes
> them from the wild into captivity, where he trains them to fight, and
> then fights them to the death against other people's pokemon. Doesn't
> this remind you of say, cock fighting?



I am still trying to digest it; here are some questions on top of James's.

- Which language targeting parrot requires graphemes? You say, "A
grapheme is our concept.", but then say, "Parrot must support
languages which manipulate strings grapheme-by-grapheme" ... but if
it's our own concept, surely there aren't any languages that can be
forcing us to require it.
- Can we get some discussion of the scope of the grapheme table
entries? Is this for a single running instance of parrot? How can
multiple running copies of parrot share strings if they have different
grapheme table entries? How does this impact bytecode generation?
freeze/thaw? What happens when someone constructs a string that blows
the table size?
- Instead of saying "This PDD assumes for the moment that the current
string functions will on the whole be maintained", I would much rather
see the current API included in the document and reviewed as part
of the design. (Or point to another PDD that contains this API.)
- In the same vein, I would also be curious to see a gap analysis (not
as part of this document); what is the scope of change to meet the
goals in the PDD?

I may have more questions in the coming days.

Thanks for tackling this, Simon.
-- 
Will "Coke" Coleda


[perl #51732] Parrot crash in SVN tip

2008-03-14 Thread Ted Neward
# New Ticket Created by  Ted Neward 
# Please include the string:  [perl #51732]
# in the subject line of all future correspondence about this issue. 
# URL: http://rt.perl.org/rt3/Ticket/Display.html?id=51732 


Visual Studio 2008, ActiveState Perl running configure.pl

..\..\parrot.exe ..\..\compilers\tge\tgc.pir --output=POST\Grammar_gen.pir POST\Grammar.tg
Parrot VM: PANIC: Null vtable used; did you add a new PMC?!
C file src\pmc.c, line 186
Parrot file (not available), line (not available)

We highly suggest you notify the Parrot team if you have not been working on
Parrot.  Use parrotbug (located in parrot's root directory) or send an
e-mail to [EMAIL PROTECTED]
Include the entire text of this error message and the text of the script that
generated the error.  If you've made any modifications to Parrot, please
describe them as well.

Version     : 0.5.3-devel
Configured  : Thu Mar 13 18:40:55 2008 GMT
Architecture: i386-MSWin32
JIT Capable : Yes
Interp Flags: 0
Exceptions  : (missing from core)

Dumping Core...
Sorry, coredump is not yet implemented for this platform.

NMAKE : fatal error U1077: '..\..\parrot.exe' : return code '0x1'
Stop.
NMAKE : fatal error U1077: 'c:\prg\ActivePerl\bin\perl.exe' : return code '0x2'
Stop.

Ted Neward
Java, .NET, XML Services
Consulting, Teaching, Speaking, Writing
http://www.tedneward.com

 


 


Release of Parrot 0.6.0 is coming up

2008-03-14 Thread Bernhard Schmalhofer

Hi,

another milestone release of Parrot is coming up this Tuesday, March 18th.
I'm proud to host the release procedures this time.

All contributors are encouraged to look at their contributions from the last
month and to add their bits to NEWS and CREDITS. Also updates to PLATFORMS and
to LANGUAGES_STATUS are very much appreciated.

Please put experiments and your nifty new features on hold until Tuesday.
Let's concentrate on testing and bugfixing. 'make fulltest' rules.

---
Barney, Bernhard Schmalhofer




[perl #51732] Parrot crash in SVN tip

2008-03-14 Thread James Keenan via RT
On Fri Mar 14 06:31:08 2008, [EMAIL PROTECTED] wrote:
> Visual Studio 2008, ActiveState Perl running configure.pl
>
> ..\..\parrot.exe ..\..\compilers\tge\tgc.pir --output=POST\Grammar_gen.pir POST\Grammar.tg
> Parrot VM: PANIC: Null vtable used; did you add a new PMC?!
> C file src\pmc.c, line 186
> Parrot file (not available), line (not available)


I'm a bit confused about your report because it appears to report an
error while running 'nmake' rather than in running 'perl Configure.pl'.

Would it be possible to repost the complete output of 'nmake' (and, if
you get that far, 'nmake test') *as an attachment* to a new post?

Thank you very much.
kid51


Re: [perl #51732] Parrot crash in SVN tip

2008-03-14 Thread chromatic
On Friday 14 March 2008 15:15:07 James Keenan via RT wrote:

> On Fri Mar 14 06:31:08 2008, [EMAIL PROTECTED] wrote:
> > ..\..\parrot.exe ..\..\compilers\tge\tgc.pir --output=POST\Grammar_gen.pir POST\Grammar.tg
> > Parrot VM: PANIC: Null vtable used; did you add a new PMC?!
> > C file src\pmc.c, line 186
> > Parrot file (not available), line (not available)
>
> I'm a bit confused about your report because it appears to report an
> error while running 'nmake' rather than in running 'perl Configure.pl'.
>
> Would it be possible to repost the complete output of 'nmake' (and, if
> you get that far, 'nmake test') *as an attachment* to a new post?

nmake test won't run, as PCT has failed to build.

I suspect that 'nmake realclean' and then a reconfigure and rebuild will have 
more success.

-- c


Re: Character sets PDD ready for review

2008-03-14 Thread Gianni Ceccarelli
(Here follows various comments and opinions on PDD28 draft, written
while reading it)

As has been pointed out, the expression «A grapheme is our concept» is
not really clear. I think «The term "grapheme" in this document
defines a concept local to Parrot» or some such would be clearer.

I'm not sure that UTF-16 can be called a fixed-width encoding (what
with surrogate pairs and all that...)
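
To make the surrogate-pair point concrete, here is a minimal sketch in C of decoding one code point from UTF-16. It assumes the input is an array of 16-bit code units in native byte order; the name utf16_decode is made up for this illustration and is not part of Parrot or the PDD. It shows that a code point takes either one or two code units, which is why UTF-16 is a variable-width encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a Parrot API): decode one code point from
 * the start of a UTF-16 code-unit array.
 * Returns the number of code units consumed (1 or 2), or 0 if the
 * input is empty, truncated, or contains an unpaired surrogate. */
static size_t utf16_decode(const uint16_t *s, size_t len, uint32_t *cp)
{
    uint16_t hi, lo;

    assert(s != NULL && cp != NULL);
    if (len == 0)
        return 0;

    hi = s[0];
    if (hi < 0xD800 || hi > 0xDFFF) {
        *cp = hi;                 /* BMP character: a single code unit */
        return 1;
    }
    if (hi >= 0xDC00 || len < 2)
        return 0;                 /* lone low surrogate, or truncated pair */

    lo = s[1];
    if (lo < 0xDC00 || lo > 0xDFFF)
        return 0;                 /* high surrogate without a low surrogate */

    /* Combine the two 10-bit halves and add the supplementary-plane offset. */
    *cp = 0x10000 + (((uint32_t)(hi - 0xD800) << 10) | (uint32_t)(lo - 0xDC00));
    return 2;
}
```

For example, U+0041 decodes from the single unit 0x0041, while U+1D11E needs the pair 0xD834 0xDD1E, so indexing by code unit does not index by character.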

«we don’t standardize on Unicode internally»: the intent is clear, but
the expression feels ambiguous to me. Do you mean "we don't fixate on
a UTF-*", "we don't use Unicode-specified semantics and tables", or
what? (I think the text is simply referring to encodings for internal
representations)

«Parrot_Rune»: whoever came up with this short-form for grapheme can
collect a beer from me at the next YAPC::Europe. Brilliant!

«out-of-band» usually does not mean using special values in the same
stream as normal values... again, the intent is clear enough, but the
terminology is misleading.

«0x0438 0x00030F» is not a byte-stream, it's an int-stream.

«need to take the overload of peeking» s/overload/overhead/ ?

Stupid serialization of Parrot_Rune arrays is not portable between
Parrot runs, right? That is, Parrot_Rune(-1) can refer to different
graphemes from one run to the next. Better bang it into the heads of
everyone from the earliest possible moment...

I've always defined an encoding as a function from streams of
characters to strings of bytes (and back, for decoding). Why not
include a similar definition at the beginning of the IMPLEMENTATION
section?

«encoding_get_codepoint» may return something which is not, strictly
speaking, what Unicode calls a codepoint. Ok, calling it "runepoint"
might be seen as a pun, but confusion is (sadly) the norm when dealing
with text nowadays, and overloading such a badly-understood term may
not help clear the issue...

Warnings to add to the checklist:

- arithmetical comparison of string data elements is a red flag
- string sorting is ill-defined generally, but it's well-defined
  inside a locale (that is, it's dependent on the language of the
  user, which may or may not have any relation with the language of
  the data, which in turn may or may not have any relation with the
  script of a character)
- tr/// or similar simple-minded table-based transformations are a red
  flag
- the Parrot_Rune value-space is not connected (that is, given that $a
  and $b are valid Parrot_Rune values, there may be a value $c ($a <
  $c < $b) that is not a valid Parrot_Rune), so don't use Parrot_Rune
  in for-loops
- string element count (length) and string display width are quite
  unrelated (Han characters are wider than Latin characters almost
  always, for example)

Hope this helps, and is not too jumbled (I tend to brain-dump)

-- 
Dakkar - Mobilis in mobile
GPG public key fingerprint = A071 E618 DD2C 5901 9574
 6FE2 40EA 9883 7519 3F88
key id = 0x75193F88

To save a single life is better than to build a seven story pagoda.


Parrot_readbc and fread

2008-03-14 Thread Bob Rogers
   Parrot_readbc declares read_result as INTVAL, but assigns to it the
result of fread, which (on my system) is declared to return a size_t.
But later there is a check for a negative result, which makes no sense.
Are there systems on which fread returns a signed value, or should the
declaration of read_result be changed?  TIA,

-- Bob "not a C hacker" Rogers
   http://rgrjr.dyndns.org/


Diffs between last version checked in and current workfile(s):

Index: src/embed.c
===
--- src/embed.c (revision 26369)
+++ src/embed.c (working copy)
@@ -401,7 +401,7 @@
 if (io) {
 size_t chunk_size;
 char *cursor;
-INTVAL read_result;
+size_t read_result;
 INTVAL wanted;
 
 chunk_size = program_size > 0 ? program_size : 1024;

End of diffs.


[perl #51002] [BUG] t/src/io.t failures

2008-03-14 Thread James Keenan via RT
Attached is where things stand on Linux as of r26369.
t/src/io..
1..20
ok 1 - hello world
ok 2 - write
ok 3 - file content
ok 4 - read
ok 5 - append
ok 6 - file content
ok 7 - readline
ok 8 - PIO_parse_open_flags
ok 9 - PIO_open
ok 10 - PIO_read
ok 11 - PIO_read larger file
ok 12 - PIO_read larger chunk when the buffer is not-empty
ok 13 - PIO_tell: read larger chunk when the buffer is not-empty
ok 14 - PIO_write
ok 15 - PIO_close
ok 16 - PIO_make_offset # TODO Symbols not exported; see RT #43056
ok 17 - PIO_seek # TODO Symbols not exported; see RT #43056
ok 18 - PIO_fdopen
# 'cc  -L/usr/local/lib -Wl,-E  t/src/io_19.o src/parrot_config.o -o 
t/src/io_19 -Wl,-rpath=/home/jimk/work/parrot/blib/lib -Lblib/lib -lparrot  
-lnsl -ldl -lm -lcrypt -lutil -lpthread -lrt -lcrypto' failed with exit code 1
# Failed to build 't/src/io_19': t/src/io_19.o: In function `the_test':
# t/src/io_19.c:28: undefined reference to `pio_stdio_layer'
# collect2: ld returned 1 exit status
not ok 19 - stdio-layer # TODO Symbols not exported; see RT #43056

#   Failed (TODO) test 'stdio-layer'
#   at t/src/io.t line 631.
ok 20 - peek
ok
All tests successful.

Test Summary Report
---
t/src/io.t (Wstat: 0 Tests: 20 Failed: 0)
  TODO passed:   16-17
Files=1, Tests=20, 20 wallclock secs ( 0.00 usr  0.00 sys +  2.12 cusr  0.67 
csys =  2.79 CPU)
Result: PASS


Re: Parrot_readbc and fread

2008-03-14 Thread chromatic
On Friday 14 March 2008 15:57:44 Bob Rogers wrote:

>    Parrot_readbc declares read_result as INTVAL, but assigns to it the
> result of fread, which (on my system) is declared to return a size_t.
> But later there is a check for a negative result, which makes no sense.
> Are there systems on which fread returns a signed value, or should the
> declaration of read_result be changed?  TIA,

My manpage suggests that fread returns size_t, and that's part of C89 and 
POSIX.1-2000, so go ahead and apply.

-- c


[perl #51750] [BUG]: languages/perl6/src/pmc/perl6str.pmc fails to cast correctly

2008-03-14 Thread via RT
# New Ticket Created by  James Keenan 
# Please include the string:  [perl #51750]
# in the subject line of all future correspondence about this issue. 
# URL: http://rt.perl.org/rt3/Ticket/Display.html?id=51750 


Observed on i386-Linux at r 26369:

$ prove -v t/codingstd/check_isxxx.t
t/codingstd/check_isxxx..
1..1
not ok 1 - isxxx() functions cast correctly

#   Failed test 'isxxx() functions cast correctly'
#   at t/codingstd/check_isxxx.t line 82.
# isxxx() function not cast to unsigned char 1 files:
# /home/jimk/work/parrot/languages/perl6/src/pmc/perl6str.pmc (171,  
172, 192, 194, 209, 219, 234, 258, 260, 281, 291)
# Looks like you failed 1 test of 1.
  Dubious, test returned 1 (wstat 256, 0x100)
  Failed 1/1 subtests

Test Summary Report
---
t/codingstd/check_isxxx.t (Wstat: 256 Tests: 1 Failed: 1)
   Failed test:  1
   Non-zero exit status: 1
Files=1, Tests=1,  1 wallclock secs ( 0.00 usr  0.00 sys +  0.80  
cusr  0.04 csys =  0.84 CPU)
Result: FAIL
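
For context on what this coding-standard check is about: in C, passing a negative value other than EOF to a <ctype.h> function such as isspace() is undefined behavior, and a plain char can be negative on platforms where char is signed. The sketch below shows the cast the test name refers to. The helper count_leading_space is a made-up example, not code from perl6str.pmc.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not from perl6str.pmc): count leading
 * whitespace characters in a NUL-terminated string. */
static size_t count_leading_space(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    /* Cast to unsigned char first: a plain char holding a byte >= 0x80
     * may be negative, which isspace() is not required to handle. */
    while (s[n] != '\0' && isspace((unsigned char)s[n]))
        n++;
    return n;
}
```

The same cast applies to the other isxxx() functions (isdigit, isalpha, and so on) whenever the argument comes from a char.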



Re: Parrot_readbc and fread

2008-03-14 Thread Bob Rogers
   From: chromatic [EMAIL PROTECTED]
   Date: Fri, 14 Mar 2008 16:06:32 -0700

   My manpage suggests that fread returns size_t, and that's part of C89 and 
   POSIX.1-2000, so go ahead and apply.

   -- c

Thanks; done as r26370.

-- Bob


[perl #51718] Consolidate test infrastructure in t/codingstd and t/distro

2008-03-14 Thread James Keenan via RT
Mark:

This looks good to me.  However, one of the tests being revised is
currently failing, probably for different reasons.  See: 
http://rt.perl.org/rt3/Ticket/Display.html?id=51750.  I recommend
holding off applying it to trunk until after that bug is resolved --
which I suspect means holding off until after this coming Tuesday's
release as well.

kid51


Re: [perl #51718] Consolidate test infrastructure in t/codingstd and t/distro

2008-03-14 Thread chromatic
On Friday 14 March 2008 16:19:49 James Keenan via RT wrote:

> This looks good to me.  However, one of the tests being revised is
> currently failing, probably for different reasons.  See:
> http://rt.perl.org/rt3/Ticket/Display.html?id=51750.  I recommend
> holding off applying it to trunk until after that bug is resolved --
> which I suspect means holding off until after this coming Tuesday's
> release as well.

It's just a coding standards test, so there's little risk.

-- c


Re: [perl #51718] Consolidate test infrastructure in t/codingstd and t/distro

2008-03-14 Thread Mark Glines
On Fri, 14 Mar 2008 16:19:48 -0700
James Keenan via RT [EMAIL PROTECTED] wrote:

> Mark:
>
> This looks good to me.  However, one of the tests being revised is
> currently failing, probably for different reasons.  See:
> http://rt.perl.org/rt3/Ticket/Display.html?id=51750.

Thanks for the code review!

Yep, that's a preexisting failure.  The recent changes to that file are
the reason I started looking at the codingstd tests to begin with.  On
Wednesday morning, it was failing 3 of the codingstd tests, one of
which (trailing_spaces.t) didn't even give line numbers.


> I recommend holding off applying it to trunk until after that bug is
> resolved -- which I suspect means holding off until after this coming
> Tuesday's release as well.

I'm going to hold off until after Tuesday's release, in any case, just
to make sure I don't break anything else.

Mark


Re: [perl #51750] [BUG]: languages/perl6/src/pmc/perl6str.pmc fails to cast correctly

2008-03-14 Thread chromatic
On Friday 14 March 2008 16:10:46 James Keenan wrote:

> Observed on i386-Linux at r 26369:
>
> $ prove -v t/codingstd/check_isxxx.t
> t/codingstd/check_isxxx..
> 1..1
> not ok 1 - isxxx() functions cast correctly
>
> #   Failed test 'isxxx() functions cast correctly'
> #   at t/codingstd/check_isxxx.t line 82.
> # isxxx() function not cast to unsigned char 1 files:
> # /home/jimk/work/parrot/languages/perl6/src/pmc/perl6str.pmc (171,
> 172, 192, 194, 209, 219, 234, 258, 260, 281, 291)
> # Looks like you failed 1 test of 1.
>   Dubious, test returned 1 (wstat 256, 0x100)
>   Failed 1/1 subtests
>
> Test Summary Report
> ---
> t/codingstd/check_isxxx.t (Wstat: 256 Tests: 1 Failed: 1)
>    Failed test:  1
>    Non-zero exit status: 1
> Files=1, Tests=1,  1 wallclock secs ( 0.00 usr  0.00 sys +  0.80
> cusr  0.04 csys =  0.84 CPU)
> Result: FAIL

Fixed in r26371 and r26732.

-- c


Re: Character sets PDD ready for review

2008-03-14 Thread Leopold Toetsch
On Saturday, 8 March 2008 13:59, Simon Cozens wrote:
> Hi folks,
>   I think I've finished doing what I can with
> docs/pdds/draft/pdd28_character_sets.pod for the time being.
>   Please have a look at it, and let me know if there's anything wrong,
> anything unclear, anything missing or anything objectionable about it.
> Character set and encoding support is an absolute nightmare to get
> right, but I feel the stuff in this PDD gives us a good basis to work from.
>   If there's no major problems with it, I'll pass it on to Allison for
> editing.

1) The Parrot internal character type

«Strings in Parrot's native string format will probably be an array of 
Parrot_Runes.»

or iso-8859-1 or UCS-2.

Why: 

iso-8859-1 is a 1-byte charset/encoding whose 256 chars match the
Unicode codepoints U+0000 - U+00FF. CPAN's BIO::folks and a lot more will
like to have the speed and memory improvements of a 1-byte encoding.
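
As a small illustration of that correspondence, here is a sketch in C that converts ISO-8859-1 bytes to UTF-8. Because each Latin-1 byte value is itself the Unicode code point, no lookup table is needed. The function name latin1_to_utf8 and the requirement that the caller supplies an output buffer of at least 2*len bytes are assumptions of this example, not anything defined by Parrot or the PDD.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a Parrot API): convert `len` ISO-8859-1
 * bytes to UTF-8. `out` must have room for at least 2 * len bytes.
 * Returns the number of bytes written to `out`. */
static size_t latin1_to_utf8(const unsigned char *in, size_t len,
                             unsigned char *out)
{
    size_t i, o = 0;

    assert(in != NULL && out != NULL);
    for (i = 0; i < len; i++) {
        unsigned char b = in[i];  /* byte value == Unicode code point */

        if (b < 0x80) {
            out[o++] = b;         /* U+0000..U+007F: one UTF-8 byte */
        } else {
            /* U+0080..U+00FF: two UTF-8 bytes, 110xxxxx 10xxxxxx */
            out[o++] = (unsigned char)(0xC0 | (b >> 6));
            out[o++] = (unsigned char)(0x80 | (b & 0x3F));
        }
    }
    return o;
}
```

The one-byte form gives constant-time indexing and half the memory of a 16-bit representation, which is the efficiency argument made above.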

UCS-2 is a fixed-width 16-bit charset, which includes the Basic Multilingual
Plane [1] of Unicode. It is sufficient to represent some very high
percentage of used codepoints. When Wikipedia [2] states ...

<cite>
UCS-2 (2-byte Universal Character Set) is an obsolete character encoding which
is a predecessor to UTF-16.
</cite>

..., it's already mixing the concepts of charset and encoding. Anyway, for
efficiency reasons, I'd like to see this as an alternative.

2) the concept of Parrot_Rune or

<cite>
Unicode codepoint where values >= 0x8000 are
   understood to be entries into the global Parrot_grapheme_table array.
</cite>

seems to imply that we are going to start:

a) rewriting / improving the ICU library we use now
b) inventing a new standard
c) doing a lot of future hiring work to keep in sync with the Unicode folks ;-)

Basically I have some concerns about who will implement and maintain it.

I wrote the one and only (AFAIK) test showing the ugliness of decomposed
Unicode [4] codepoints and I'd be glad if there were a better solution.

OTOH I don't know the impact of not having it. Eastern European or other
possibly affected folks should speak up now.

> Simon

leo's 2¢

[1] http://en.wikipedia.org/wiki/Basic_Multilingual_Plane
[2] http://en.wikipedia.org/wiki/UTF-16
[3] [EMAIL PROTECTED]:~/svn/parrot/leo find t -name '*.t' | xargs grep -w 
compose
t/op/string_cs.t:compose S1, S1
t/pmc/object-mro.t:# ... now some tests which fail to compose the class
[4] [EMAIL PROTECTED]:~/svn/parrot/leo ./parrot t/op/string_cs_46.pasm
___ǰ___
7 8 8 7


[perl #51750] [BUG]: languages/perl6/src/pmc/perl6str.pmc fails to cast correctly

2008-03-14 Thread James Keenan via RT
Fixes confirmed.  c++ for the quick work.


Re: Character sets PDD ready for review

2008-03-14 Thread Mark J. Reed
As a ref point, AppleScript 2.0 (not that I know if anyone wants to
port that to Parrot) characters are defined as Unicode grapheme
clusters, e.g. the base grapheme and its diacriticals... Is that
similar to the concept of a Parrot_Rune?


-- 
Mark J. Reed [EMAIL PROTECTED]


Re: [svn:parrot] r26370 - trunk/src

2008-03-14 Thread chromatic
On Friday 14 March 2008 16:11:54 [EMAIL PROTECTED] wrote:

> Modified:
>    trunk/src/embed.c
>
> Log:
> * src/embed.c:
>    + (Parrot_readbc):  fread returns size_t, so change the declaration
>      of read_result to match.

As size_t is unsigned, the conditional on line 431 can never be true:

    if (read_result < 0) {
        PIO_eprintf(interp,
                "Parrot VM: Problem reading packfile from PIO.\n");
        return NULL;
    }

My man page suggests that fread returns either a short item count or zero on 
error, never a negative value, so this conditional was probably never 
correct.

-- c
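
One way the check could be written instead, as a rough sketch in C rather than the actual code in src/embed.c: since fread reports failure through a short count, the caller compares the count with the number of items requested and then uses ferror() to tell an I/O error from end of file. The helper name read_chunk and its err out-parameter are invented for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not the code in src/embed.c): read up to
 * `wanted` bytes from `fp` into `buf`.
 * Returns the number of bytes actually read. Sets *err to 1 if the
 * short count was caused by an I/O error, and to 0 otherwise
 * (a full read, or a short read because end of file was reached). */
static size_t read_chunk(FILE *fp, char *buf, size_t wanted, int *err)
{
    size_t got;

    assert(fp != NULL && buf != NULL && err != NULL);
    got = fread(buf, 1, wanted, fp);

    /* fread never returns a negative value; a count below `wanted`
     * means either EOF or an error, and ferror() distinguishes them. */
    *err = (got < wanted && ferror(fp)) ? 1 : 0;
    return got;
}
```

A caller would treat a nonzero *err as the "Problem reading packfile" case and a short count with *err == 0 as normal end of input.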