Re: apr_fnmatch deltas

2011-05-03 Thread Jeff Trawick
On Tue, May 3, 2011 at 1:20 AM, William A. Rowe Jr. wr...@rowe-clan.net wrote:
 On 5/2/2011 6:24 PM, Jeff Trawick wrote:
 BTW, checked performance against previous version?

 For 1:1 testing of the patterns that exist in test/testfnmatch.c,
 100,000 iterations here on my box, 8626494 usec for the new vs.
 3674210 usec for the previous.

that proportion is in the ballpark of what I got with a different set
of testcases

but they were contrived, the before/after builds weren't optimized, etc.

maybe I'll have time to play more after 1.4.3



 This seems consistent with the retests of null, /, \/ in pattern
 which are repeated all to frequently; this should be able to be
 simplified with the use of a couple cautiously placed gotos and
 some code coverage analysis (necessary to ensure we hit the entire
 pattern space in our test patterns).  Also the fact that there is
 so little recursion in the test cases, and so few calls to match
 very common *.txt style patterns, which the new code should beat
 the old hands down as trailing length is worked out, and we are
 not redundantly testing failed pattern spaces.

 The code is designed to move fairly painlessly to MBCS encodings.
 This incurs a design penalty in speed, although not nearly as bad
 as it will be once mbr*() calls are used in place of bytewise
 comparisons to advance characters, particularly for case insensitive
 comparison.

 Two things would radically affect the speed, which I am unsure
 offhand what msvc did 10 yrs ago, inline expansion of **pattern
 to *pattern (it is an alias reference which should not be double
 de-referenced), and aliasing all of the setup (slash in fnmatch_ch
 is slash in fnmatch).  If the compiler is smart, there should be
 no penalty for inlining the code as a function; if it is not, we
 may want to consider duplicating [at least some of] the code (it
 is invoked in two places, one for counting the trailing pattern
 following a wildcard, one for performing the match).

 I'm rather more curious how it compares to real world BSD and Linux
 implementations today using SBCS C locale... as committed, as well
 as integrating fnmatch_ch() into the mainline code.  I will pull out
 the test patterns and test them against the BSD port (posted in
 p.a.o/~wrowe/ space) when some free time returns, I have some work-
 work I have to complete first.

 If anyone wants to spend cycles, the best place to start is to make
 a test suite list that corresponds to the real world, and clocks it
 in cpu spin instead of world clock time.  This also needs various
 flavors of range tests and false ranges (for which the old code
 disagreed with the modern fnmatch implementations).

 In any case, this closes a real world flaw, so I'd have no issue
 going with it for now as a correctness release once reviewers are
 satisfied, with more optimization work in the near term for the next
 release.

yeah


Re: apr 1.4.3

2011-05-03 Thread Jeff Trawick
On Mon, May 2, 2011 at 12:45 PM, William A. Rowe Jr.
wr...@rowe-clan.net wrote:
 On 5/2/2011 8:11 AM, Jeff Trawick wrote:
 Anticipated timeline for release:

 * Jeff to review/test wrowe's fnmatch rewrite today (Monday)
 * someone else do the same???

 Stefan was interested in it.  I'm building upon a list of patterns in
 test/testfnmatch.c designed to tickle bugs, not looking at the code
 again.  I'll commit Bert's root dir and my url path 'case-ignore'
 patch tonight, and spend just 20 min looking at the apr_file_stat's
 behavior w.r.t. new win7 softlinks with and without APR_FSTAT_LINK,
 just to make sure we are on the right page.

 I like the single announce idea.

 FYI there are two remaining warnings in apr_fnmatch.c, one is the
 assignment within conditional that is trivial to fix (it was paren
 delimited so the compiler shouldn't be complaining anyways), but
 the other is a fixed loop with no useful break expression, e.g.
 for (matchptr = pattern, matchlen = 0; 1; ++matchlen) termination
 conditions all had specific additional processing.  If there was
 a worthwhile alternative to avoid the warning, I didn't see it.
 If there are any other warnings at -Wall I don't use the same
 compilers as you had; please share.

You fixed one of those in the meantime, and I'm not seeing anything
here with my particular flavor of gcc or Sun Studio; clang is happy
too.

I'm planning to roll 1.4.3 in an hour or so; if the code is ready to
ship, great; if having a tarball to decide on brings others out of the
woodwork, great :)


Re: svn commit: r1099173 - /apr/apr/branches/1.4.x/file_io/win32/filepath.c

2011-05-03 Thread William A. Rowe Jr.
On 5/3/2011 1:22 PM, wr...@apache.org wrote:
 Remove assumption that drive letters are always uppercase.
 Patch by: Bert Huijben bert {at} qqmail.nl
 Backports: r960665

We needed this on 1.4, I'll add to 1.5 later tonight, and I'll refactor
once again for the //Machine/Share/ syntax later (since we believe it
can also fall out of case-synchronization).


[Vote] Release apr-1.4.3

2011-05-03 Thread Jeff Trawick
Tarballs/zipballs are at http://apr.apache.org/dev/dist/.

+/-1
[  ]  Release apr 1.4.3 as GA


Re: apr_fnmatch deltas

2011-05-03 Thread William A. Rowe Jr.
On 5/3/2011 12:12 PM, Jeff Trawick wrote:
 On Tue, May 3, 2011 at 1:20 AM, William A. Rowe Jr. wr...@rowe-clan.net 
 wrote:
 On 5/2/2011 6:24 PM, Jeff Trawick wrote:
 BTW, checked performance against previous version?

 For 1:1 testing of the patterns that exist in test/testfnmatch.c,
 100,000 iterations here on my box, 8626494 usec for the new vs.
 3674210 usec for the previous.
 
 that proportion is in the ballpark of what I got with a different set
 of testcases
 
 but they were contrived, the before/after builds weren't optimized, etc.

I was at Visual Studio 6, /O2 level, clean stack handling (not optimized).
I'll create a nice test utility timing fnmatch against old apr_fnmatch or
system fnmatch, for people to experiment with, once I'm caught up a bit.
It's amusing that your unoptimized build is near identical, I'm looking
forward to measuring against a modern compiler :)

 This seems consistent with the retests of null, /, \/ in pattern
 which are repeated all to frequently; this should be able to be
 simplified with the use of a couple cautiously placed gotos and
 some code coverage analysis (necessary to ensure we hit the entire
 pattern space in our test patterns).  Also the fact that there is
 so little recursion in the test cases, and so few calls to match
 very common *.txt style patterns, which the new code should beat
 the old hands down as trailing length is worked out, and we are
 not redundantly testing failed pattern spaces.

My test code allows pattern string flag... I'm thinking the util
should just consume @patternfile @stringfile n-n range to allow
the user to abuse and clock some real patterns (find /  biglist).

 Two things would radically affect the speed, which I am unsure
 offhand what msvc did 10 yrs ago, inline expansion of **pattern
 to *pattern (it is an alias reference which should not be double
 de-referenced), and aliasing all of the setup (slash in fnmatch_ch
 is slash in fnmatch).  If the compiler is smart, there should be
 no penalty for inlining the code as a function; if it is not, we
 may want to consider duplicating [at least some of] the code (it
 is invoked in two places, one for counting the trailing pattern
 following a wildcard, one for performing the match).

I detest the idea of duplicating, we open potential de-synchronization
of the pattern walking logic from the pattern matching logic once we've
duplicated the code :(  I expect we will need to do this based on your
comments above.  We might also be able to collapse one level of looping
(and might not) based on judicious use of goto.  The depth of this logic
is an illustration of why gotos considered harmful can sometimes be
a destructive philosophy :)



Re: [Vote] Release apr-1.4.3

2011-05-03 Thread Rainer Jung

On 03.05.2011 21:22, Jeff Trawick wrote:

Tarballs/zipballs are at http://apr.apache.org/dev/dist/.

+/-1
[  ]  Release apr 1.4.3 as GA


config.guess and config.sub are older in the release artefacts than in 
svn. The buildconf in 1.4.x still has --force for libtoolize, which 
replaces the apr config.(guess|sub) with the installed ones from 
libtoolize. In trunk we removed the force and changed the buildconf 
script to remove all temporary m4 files before calling libtoolize, so 
that force (i.e. overwrite existing files) is no longer necessary.


If buildconf from trunk works for 1.4.x we might copy it from there for 
the next release. Otherwise config.(guess|sub) will always be outdated 
if the rolling system is not up to date.


Regards,

Rainer


Re: [Vote] Release apr-1.4.3

2011-05-03 Thread Jeff Trawick
On Tue, May 3, 2011 at 5:28 PM, Rainer Jung rainer.j...@kippdata.de wrote:
 On 03.05.2011 21:22, Jeff Trawick wrote:

 Tarballs/zipballs are at http://apr.apache.org/dev/dist/.

 +/-1
 [  ]  Release apr 1.4.3 as GA

 config.guess and config.sub are older in the release artefacts than in svn.

Ouch.  At least they are newer than with 1.4.2.

 The buildconf in 1.4.x still has --force for libtoolize, which replaces
 the apr config.(guess|sub) with the installed ones from libtoolize. In trunk
 we removed the force and changed the buildconf script to remove all
 temporary m4 files before calling libtoolize, so that force (i.e. overwrite
 existing files) is no longer necessary.

 If buildconf from trunk works for 1.4.x we might copy it from there for the
 next release. Otherwise config.(guess|sub) will always be outdated if the
 rolling system is not up to date.

+1


Re: [Vote] Release apr-1.4.3

2011-05-03 Thread Rainer Jung

On 03.05.2011 21:22, Jeff Trawick wrote:

Tarballs/zipballs are at http://apr.apache.org/dev/dist/.

+/-1
[+1]  Release apr 1.4.3 as GA


- svn compared with gz, bz2 and zip only expected differences
  (except config.(guess|sub), which is not a blocker)

- files signed, checksums correct

- I built and made check on the following platforms:
  - Solaris 8 Sparc, gcc 4.1.2
  - Solaris 10 Sparc, gcc 4.1.2 and 4.5.3
  - SuSE Linux Enterprise 10 32 and 64 Bit
  - RedHat Enterprise Linux 5, 64 Bit

- all builds suceeded

- all make check ran fine
  (except for binding to ::1 on Solaris with only IPv4 active,
   not a regression).

Regards,

Rainer


Re: [Vote] Release apr-1.4.3

2011-05-03 Thread Jeff Trawick
On Tue, May 3, 2011 at 3:22 PM, Jeff Trawick traw...@gmail.com wrote:
 Tarballs/zipballs are at http://apr.apache.org/dev/dist/.

 +/-1
 [  ]  Release apr 1.4.3 as GA

this combo looks good for me on Solaris 10 with Sun Studio at -O3:

apr-1.4.3
apr-util-1.3.11
httpd 2.2.x-latest
make check for apr and apr-util
httpd test framework (t/modules/filter.t is failing; maybe I can find out why)

more testing tomorrow hopefully