[perl.git] branch blead, updated. v5.19.9-213-g47a9c25

Karl Williamson Tue, 18 Mar 2014 11:52:43 -0700

In perl.git, the branch blead has been updated

<http://perl5.git.perl.org/perl.git/commitdiff/47a9c2580a689eb6252f710c8ff39c9c9fb2c7cb?hp=abda4963ba7186d442f9f005c2a04308f0e13418>


- Log -----------------------------------------------------------------
commit 47a9c2580a689eb6252f710c8ff39c9c9fb2c7cb
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 18 12:45:49 2014 -0600

    Porting/todo.pod: Use F<> instead of C<>
    
    A file path is supposed to be enclosed in F<>

M       Porting/todo.pod

commit 3b4a0b12402af05544d538a5e3f23b6e83cf565f
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 18 12:17:16 2014 -0600

    mktables: In-line defns for tables up to 3 ranges
    
    eb0925341cc65ce6ce57503ec0ab97cdad39dc98 caused the definitions for
    about 45% of the Unicode tables to be placed in-line in Heavy.pl instead
    of them having to be read-in from disk.  This new commit extends that so
    that about 55% are in-lined, by in-lining tables which consist of up to
    3 ranges.
    
    This is a no-brainer to do, as the memory usage does not increase by
    doing it, and disk accesses go down.  I used the delta in the disk size
    of Heavy.pl as a proxy for the delta in the memory size that it uses,
    as what this commit does is to change various scalar strings in it.
    Doing this measurement indicates that this commit results in a slightly
    smaller Heavy.pl than what was there before eb092534.  The amounts will
    vary between Unicode releases.  I also checked for Unicode beta 7.0, and
    the sizes are again comparable, with a slightly larger Heavy.pl for the
    3-range version there.
    
    For 4-, 5-, ... range tables, doing this results in slowly increasing
    Heavy.pl size (and hence more and more memory use), and that is
    something we may wish to look at in the future, trading memory for fewer
    files and less disk start-up cost.  But for the imminent v5.20, doing it
    for 3-range tables doesn't cost us anything, and gains us fewer disk
    files and accesses.

M       lib/unicore/mktables
M       pod/perldelta.pod

commit cf13ddc5efb78ce6d588441978947303e5baf1d8
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 18 12:10:46 2014 -0600

    mktables: Remove obsolete sort constraint
    
    Zero-length tables are no longer expressed in terms of the Perl 'All'
    property, so the sort no longer has to make sure the latter is processed
    before the former.

M       lib/unicore/mktables

commit 1b6b3fa9c9eacac5564167b012ee3c0bd215d89f
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 18 12:09:25 2014 -0600

    mktables: Add comments, reorder a gen'd file
    
    This adds some clarifying comments, and reorders Heavy.pl so the array
    referred to by two hashes occurs before both hashes in the output.

M       lib/unicore/mktables

commit d0d250ca67bc95d33e27befe918b74aee7e11c8b
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 18 12:07:55 2014 -0600

    mktables: White-space only
    
    This indents code to conform to the new block created in the previous
    commit

M       lib/unicore/mktables

commit 3854b4b8c7222f0112e93a0e5dd89abcb5f7518e
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 18 11:43:59 2014 -0600

    utf8_heavy.pl: Change data structure for in-lined definitions
    
    This commit puts the in-lined definitions introduced by
    eb0925341cc65ce6ce57503ec0ab97cdad39dc98 into a separate array,
    where each element is a unique definition.  This can result in slightly
    smaller current memory usage in utf8_heavy.pl, as the strings giving the
    file names now only appear once, and what replaces them in the hash
    values are strings of digits indices, which are shorter.
    
    But doing this allows us in a later commit to increase the number of
    ranges in the tables that get in-lined, without increasing memory usage.
    
    This commit also changes the code that generates the definitions to be
    more general, so that it becomes trivial to change things to generate
    in-line definitions for tables with more than one range.

M       lib/Unicode/UCD.t
M       lib/unicore/mktables
M       lib/utf8_heavy.pl

commit b72d82dc1770487df30e3939e26d423334f3f6aa
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 18 10:39:53 2014 -0600

    mktables: Fix overlooked in-line table defns code
    
    Commit eb0925341cc65ce6ce57503ec0ab97cdad39dc98 introduced the idea of a
    pseudo-directory as a way to store table definitions in-line in
    Heavy.pl, but conform to the expectations of the code in regard to
    objects being files within directories.  This kept the needed changes to
    a minimum.  The code changed by the current commit  was overlooked then
    as something that also needed to change, because there are no current
    instances of it needing to.  But this could change with future Unicode
    versions, or as in the next few commits, in extending the in-line
    definitions.

M       lib/unicore/mktables

commit f0aff2daa44923f600f6e1f2429a2add2214a9d5
Author: Matthew Horsfall (via RT) <[email protected]>
Date:   Thu Mar 13 05:39:48 2014 -0700

    Add support for test.valgrind parallel testing
    
    # New Ticket Created by  Matthew Horsfall
    # Please include the string:  [perl #121431]
    # in the subject line of all future correspondence about this issue.
    # <URL: https://rt.perl.org/Ticket/Display.html?id=121431 >
    
    This is a bug report for perl from [email protected],
    generated with the help of perlbug 1.39 running under perl 5.14.2.
    
    -----------------------------------------------------------------
    [Please describe your issue here]
    
    The included patch allows test.valgrind to run tests in parallel.
    
    Valgrind output for each test will be printed out after the test
    completes, with the name of the test prefixing every line.
    
    Example usage might be:
    
      TEST_JOBS=8 make test.valgrind VALGRIND='valgrind -q' 2>&1 | tee out.txt
    
    -q is needed to ensure only *errors* are captured, otherwise the output will
    be much louder than it already is. (Perhaps this should be the default 
mode?)
    
    [Please do not change anything below this line]
    -----------------------------------------------------------------

M       Makefile.SH
M       t/TEST
M       t/harness
-----------------------------------------------------------------------

Summary of changes:
 Makefile.SH          |  2 +-
 Porting/todo.pod     |  2 +-
 lib/Unicode/UCD.t    |  7 ++--
 lib/unicore/mktables | 97 +++++++++++++++++++++++++++++++++++-----------------
 lib/utf8_heavy.pl    |  2 +-
 pod/perldelta.pod    |  2 +-
 t/TEST               | 11 +++---
 t/harness            | 20 +++++++++++
 8 files changed, 99 insertions(+), 44 deletions(-)

diff --git a/Makefile.SH b/Makefile.SH
index 6e9df1a..3b5d023 100755
--- a/Makefile.SH
+++ b/Makefile.SH
@@ -1510,7 +1510,7 @@ test.valgrind check.valgrind:     test_prep
        @grep "^usemymalloc='n'" config.sh >/dev/null || exit 1
        @echo "And of course you have to have valgrind..."
        $(VALGRIND) $(VG_TEST) || exit 1
-       PERL_VALGRIND=1 VALGRIND='$(VALGRIND)' $(RUN_TESTS) choose
+       PERL_VALGRIND=1 VALGRIND='$(VALGRIND)' TESTFILE=harness $(RUN_TESTS) 
choose
 !NO!SUBS!
        ;;
 esac
diff --git a/Porting/todo.pod b/Porting/todo.pod
index e890236..5d8f68c 100644
--- a/Porting/todo.pod
+++ b/Porting/todo.pod
@@ -856,7 +856,7 @@ if available-- but B<only> if available, all platforms will 
B<not>
 have catgets().
 
 For the really pure at heart, consider extending this item to cover
-also the warning messages (see L<warnings>, C<regen/warnings.pl>).
+also the warning messages (see L<warnings>, F<regen/warnings.pl>).
 
 =head1 Tasks that need a knowledge of the interpreter
 
diff --git a/lib/Unicode/UCD.t b/lib/Unicode/UCD.t
index c6b50fd..2abb05a 100644
--- a/lib/Unicode/UCD.t
+++ b/lib/Unicode/UCD.t
@@ -1057,9 +1057,10 @@ foreach my $set_of_tables (\%utf8::stricter_to_file_of, 
\%utf8::loose_to_file_of
 
         # If the file's directory is '#', it is a special case where the
         # contents are in-lined with semi-colons meaning new-lines, instead of
-        # it being an actual file to read.
+        # it being an actual file to read.  The file is an index in to the
+        # array of the definitions
         if ($file =~ s!^#/!!) {
-            $official = $file =~ s/;/\n/gr;
+            $official = $utf8::inline_definitions[$file];
         }
         else {
             $official = do "unicore/lib/$file.pl";
@@ -1493,7 +1494,7 @@ foreach my $prop (sort(keys %props), sort keys 
%legacy_props) {
             # special case where the contents are in-lined with semi-colons
             # meaning new-lines, instead of it being an actual file to read.
             if ($base_file =~ s!^#/!!) {
-                $official = $base_file =~ s/;/\n/gr;
+                $official = $utf8::inline_definitions[$base_file];
             }
             else {
                 $official = do "unicore/$base_file.pl";
diff --git a/lib/unicore/mktables b/lib/unicore/mktables
index d92c69d..cde1922 100644
--- a/lib/unicore/mktables
+++ b/lib/unicore/mktables
@@ -1365,12 +1365,18 @@ my $EXTRACTED_DIR = (-d 'extracted') ? 'extracted' : "";
 my $EXTRACTED = ($EXTRACTED_DIR) ? "$EXTRACTED_DIR/" : "";
 my $AUXILIARY = 'auxiliary';
 
-# Hashes that will eventually go into Heavy.pl for the use of utf8_heavy.pl
-# and into UCD.pl for the use of UCD.pm
+# Hashes and arrays that will eventually go into Heavy.pl for the use of
+# utf8_heavy.pl and into UCD.pl for the use of UCD.pm
 my %loose_to_file_of;       # loosely maps table names to their respective
                             # files
 my %stricter_to_file_of;    # same; but for stricter mapping.
 my %loose_property_to_file_of; # Maps a loose property name to its map file
+my @inline_definitions = "V0"; # Each element gives a definition of a unique
+                            # inversion list.  When a definition is inlined,
+                            # its value in the hash it's in (one of the two
+                            # defined just above) will include an index into
+                            # this array.  The 0th element is initialized to
+                            # the definition for a zero length invwersion list
 my %file_to_swash_name;     # Maps the file name to its corresponding key name
                             # in the hash %utf8::SwashInfo
 my %nv_floating_to_rational; # maps numeric values floating point numbers to
@@ -1851,7 +1857,8 @@ package main;
 { # Closure
 
     # This program uses the inside-out method for objects, as recommended in
-    # "Perl Best Practices".  This closure aids in generating those.  There
+    # "Perl Best Practices".  (This is the best solution still, since this has
+    # to run under miniperl.)  This closure aids in generating those.  There
     # are two routines.  setup_package() is called once per package to set
     # things up, and then set_access() is called for each hash representing a
     # field in the object.  These routines arrange for the object to be
@@ -16306,6 +16313,9 @@ sub make_Heavy () {
     my $stricter_to_file_of = simple_dumper(\%stricter_to_file_of, ' ' x 4);
     chomp $stricter_to_file_of;
 
+    my $inline_definitions = simple_dumper(\@inline_definitions, " " x 4);
+    chomp $inline_definitions;
+
     my $loose_to_file_of = simple_dumper(\%loose_to_file_of, ' ' x 4);
     chomp $loose_to_file_of;
 
@@ -16321,7 +16331,13 @@ sub make_Heavy () {
     # for the alternate table figured out at that time.
     foreach my $cased (keys %caseless_equivalent_to) {
         my @path = $caseless_equivalent_to{$cased}->file_path;
-        my $path = join '/', @path[1, -1];
+        my $path;
+        if ($path[0] eq "#") {  # Pseudo-directory '#'
+            $path = join '/', @path;
+        }
+        else {  # Gets rid of lib/
+            $path = join '/', @path[1, -1];
+        }
         $caseless_equivalent_to{$cased} = $path;
     }
     my $caseless_equivalent_to
@@ -16347,12 +16363,22 @@ $INTERNAL_ONLY_HEADER
 $loose_property_name_of
 );
 
-# Maps property, table to file for those using stricter matching
+# Gives the definitions (in the form of inversion lists) for those properties
+# whose definitions aren't kept in files
+\@utf8::inline_definitions = (
+$inline_definitions
+);
+
+# Maps property, table to file for those using stricter matching.  For paths
+# whose directory is '#', the file is in the form of a numeric index into
+# \@inline_definitions
 \%utf8::stricter_to_file_of = (
 $stricter_to_file_of
 );
 
-# Maps property, table to file for those using loose matching
+# Maps property, table to file for those using loose matching.  For paths
+# whose directory is '#', the file is in the form of a numeric index into
+# \@inline_definitions
 \%utf8::loose_to_file_of = (
 $loose_to_file_of
 );
@@ -16902,17 +16928,15 @@ sub write_all_tables() {
     # (sort so that if there is an immutable file name, it has precedence, so
     # some other property can't come in and take over its file name.  (We
     # don't care if both defined, as they had better be different anyway.)
-    # The property named 'Perl' needs to be first (it doesn't have any
-    # immutable file name) because empty properties are defined in terms of
-    # it's table named 'All'.)   We also sort by the property's name.  This is
-    # just for repeatability of the outputs between runs of this program, but
-    # does not affect correctness.
+    # We also sort by the property's name.  This is just for repeatability of
+    # the outputs between runs of this program, but does not affect
+    # correctness.
     PROPERTY:
-    foreach my $property ($perl,
-                          sort { return -1 if defined $a->file;
+    foreach my $property (sort { return -1 if defined $a->file;
                                  return 1 if defined $b->file;
                                  return $a->name cmp $b->name;
-                                } grep { $_ != $perl } property_ref('*'))
+                                }
+                                  property_ref('*'))
     {
         my $type = $property->type;
 
@@ -17322,31 +17346,42 @@ sub write_all_tables() {
         # '#' is used to signal this.  This significantly cuts down the number
         # of files written at little extra cost to the hashes in Heavy.pl.
         # And it means, no run-time files to read to get the definitions.
-        # But short deprecated tables are written anyway, because e.g.,
-        # Gc=Surrogate is the same exact code points as LB=Surrogate, and only
-        # the latter generates a deprecated warning, and so we want to have a
-        # way to distinguish the two.
         if (! $is_property
-            && $table->status ne $DEPRECATED
             && ! $annotate  # For annotation, we want to explicitly show
                             # everything, so keep in files
-            && $table->ranges <= 1)
+            && $table->ranges <= 3)
         {
             my @ranges = $table->ranges;
             my $count = @ranges;
-            if ($count == 0) {
-                $filename = "V0";
+            if ($count == 0) {  # 0th index reserved for 0-length lists
+                $filename = 0;
             }
-            else {
-                my $end = $ranges[0]->end;
-                if ($end < $MAX_WORKING_CODEPOINT) {
-                    $count++;
-                    $end = ";" . ($end + 1);
-                }
-                else {  # Extends to infinity, hence no 'end'
-                    $end = "";
+            elsif ($table->leader != $table) {
+
+                # Here, is a table that is equivalent to another; code
+                # in register_file_for_name() causes its leader's definition
+                # to be used
+
+                next;
+            }
+            else {  # No equivalent table so far.
+
+                # Build up its definition range-by-range.
+                my $definition = "";
+                while (defined (my $range = shift @ranges)) {
+                    my $end = $range->end;
+                    if ($end < $MAX_WORKING_CODEPOINT) {
+                        $count++;
+                        $end = "\n" . ($end + 1);
+                    }
+                    else {  # Extends to infinity, hence no 'end'
+                        $end = "";
+                    }
+                    $definition .= "\n" . $range->start . $end;
                 }
-                $filename = "V$count;" . $ranges[0]->start . $end;
+                $definition = "V$count" . $definition;
+                $filename = @inline_definitions;
+                push @inline_definitions, $definition;
             }
             @directory = "#";
             register_file_for_name($table, \@directory, $filename);
diff --git a/lib/utf8_heavy.pl b/lib/utf8_heavy.pl
index d8f1de3..0d2888f 100644
--- a/lib/utf8_heavy.pl
+++ b/lib/utf8_heavy.pl
@@ -514,7 +514,7 @@ sub _loose_name ($) {
                 # new-lines.  Since it is in-line there is no advantage to
                 # caching the result
                 if ($file =~ s!^#/!!) {
-                    $list = $file =~ s/;/\n/gr;
+                    $list = $utf8::inline_definitions[$file];
                 }
                 else {
                     # Here, we have an actual file to read in and load, but it
diff --git a/pod/perldelta.pod b/pod/perldelta.pod
index ba59a42..548cb32 100644
--- a/pod/perldelta.pod
+++ b/pod/perldelta.pod
@@ -135,7 +135,7 @@ of the implementation, including subroutine invocation and 
scope exit.
 =item *
 
 Perl now does less disk I/O when dealing with Unicode properties that cover
-only a single range of consecutive code points.
+up to three ranges of consecutive code points.
 
 =back
 
diff --git a/t/TEST b/t/TEST
index 96eb6a4..356bdc2 100755
--- a/t/TEST
+++ b/t/TEST
@@ -284,16 +284,15 @@ sub _cmd {
         if ($ENV{PERL_VALGRIND}) {
             my $perl_supp = $options->{return_dir} ? 
"$options->{return_dir}/perl.supp" : "perl.supp";
             my $valgrind_exe = $ENV{VALGRIND} // 'valgrind';
+            if ($options->{run_dir}) {
+                $Valgrind_Log = "$options->{run_dir}/$Valgrind_Log";
+            }
             my $vg_opts = $ENV{VG_OPTS}
-              // '--log-fd=3 '
+              //   "--log-file=$Valgrind_Log "
                  . "--suppressions=$perl_supp --leak-check=yes "
                  . "--leak-resolution=high --show-reachable=yes "
-                  . "--num-callers=50 --track-origins=yes";
+                 . "--num-callers=50 --track-origins=yes";
             $perl = "$valgrind_exe $vg_opts $perl";
-            $redir = "3>$Valgrind_Log";
-            if ($options->{run_dir}) {
-                $Valgrind_Log = "$options->{run_dir}/$Valgrind_Log";
-            }
         }
 
         my $args = "$options->{testswitch} $options->{switch} 
$options->{utf8}";
diff --git a/t/harness b/t/harness
index 1ed70cb..845b270 100644
--- a/t/harness
+++ b/t/harness
@@ -16,6 +16,7 @@ use Config;
 
 $::do_nothing = $::do_nothing = 1;
 require './TEST';
+our $Valgrind_Log;
 
 my $Verbose = 0;
 $Verbose++ while @ARGV && $ARGV[0] eq '-v' && shift;
@@ -224,10 +225,29 @@ my $h = TAP::Harness->new({
            $options = $options{$test} = _scan_test($test, $type);
        }
 
+       (local $Valgrind_Log = "$test.valgrind-current") =~ s/^.*\///;
+
        return [ split ' ', _cmd($options, $type) ];
     },
 });
 
+# Print valgrind output after test completes
+if ($ENV{PERL_VALGRIND}) {
+    $h->callback(
+                after_test => sub {
+                    my ($job) = @_;
+                    my $test = $job->[0];
+                    my $vfile = "$test.valgrind-current";
+                    $vfile =~ s/^.*\///;
+
+                    open(my $voutput, '<', $vfile) or return;
+                    print "$test: Valgrind output:\n";
+                    print "$test: $_" for <$voutput>;
+                    close($voutput);
+                }
+                );
+}
+
 if ($state) {
     $h->callback(
                 after_test => sub {

--
Perl5 Master Repository

[perl.git] branch blead, updated. v5.19.9-213-g47a9c25

Reply via email to