Nice catch you two!!! Happy New Year -pd
> On 01 Jan 2016, at 22:06 , Simon Urbanek <simon.urba...@r-project.org> wrote:
>
> Ok, found the problem - on platforms that support it TRE uses wint_t (from
> wchar.h) as its type for characters (tre_cint_t) which on AIX is *signed*
> int. TRE liberally uses conversions between int and tre_cint_t, apparently
> assuming that the latter is unsigned so conversions back to int are suitable
> for comparisons etc. On other platforms wint_t is unsigned so it works.
> Manually defining tre_cint_t to unsigned int fixes the issue.
>
> Cheers,
> Simon
>
>
> On Jan 1, 2016, at 12:20 PM, Simon Urbanek <simon.urba...@r-project.org> wrote:
>
>> Michael,
>>
>> thanks, I'll have a look once my PDP VMs are up again (later today). This
>> may be a signedness issue although it's unclear why other platforms wouldn't
>> be affected.
>>
>> Cheers,
>> Simon
>>
>>
>> On Dec 31, 2015, at 10:14 AM, Michael Felt <aixto...@gmail.com> wrote:
>>
>>> On 2015-12-30 09:58, Michael Felt wrote:
>>>> On 2015-12-29 11:02, Michael Felt wrote:
>>>>> This seems to be a problem that goes back a long time - and I hope
>>>>> someone who understands what tre is supposed to be doing will look at
>>>>> this.
>>>>>
>>>>> A short history of other people who have reported on this on different
>>>>> versions of AIX. I shall only add that I get the same results on AIX 5.3
>>>>> TL7, AIX 6.1 TL9 and AIX 7.1 TL3.
>>>>>
>>>>> Basically, with settings that work for AIX and 32-bit - the only changes
>>>>> being
>>>>> -maix32 becomes -maix64
>>>>> and
>>>>> export OBJECT_MODE=32 becomes export OBJECT_MODE=64
>>>>>
>>>>> Then to shorten the 'make' bla bla, first run just make, then
>>>>>
>>>>> cd src/library/tools
>>>>> make -s sysdata
>>>>>
>>>>> http://article.gmane.org/gmane.comp.lang.r.devel/38817/match=package+tools+malformed
>>>>>
>>>>> http://article.gmane.org/gmane.comp.lang.r.devel/36886/match=package+tools+malformed
>>>>>
>>>>> http://article.gmane.org/gmane.comp.lang.r.devel/23372/match=package+tools+malformed
>>>>> Date: 2010-01-25 06:55:41 GMT (5 years, 48 weeks, 1 day, 20 hours and 30
>>>>> minutes ago)
>>>>>
>>>>> To that, to get debug data, I have
>>>>>
>>>>> * added -DTRE_DEBUG to src/extra/tre/Makefile # ALL_CFLAGS =
>>>>> $(ALL_CFLAGS_LO) -DTRE_DEBUG
>>>>> * rm src/extra/tre/tre-match-parallel.o
>>>>> * find . -name \*.so -exec rm {} \;
>>>>> * make
>>>>> * cd src/library/tools
>>>>> * make -s sysdata
>>>>>
>>>>> Attached are the two script files of the screen output. The 32-bit one is
>>>>> more verbose - and magically contains lines such as:
>>>>> found match 3037fd14 (while "found" does not occur in the 64-bit output)
>>>>>
>>>>> root@x069:[/data/prj/cran/64/R-aix-3.2.3/src/library/tools]wc
>>>>> /tmp/sysdata.??.*
>>>>> 4730 14123 139916 /tmp/sysdata.32.text
>>>>> 1312 3688 40528 /tmp/sysdata.64.text
>>>>> 6042 17811 180444 total
>>>>>
>>>>> root@x069:[/data/prj/cran/64/R-aix-3.2.3/src/library/tools]grep -c found
>>>>> /tmp/sysdata.??.*
>>>>> /tmp/sysdata.32.text:19
>>>>> /tmp/sysdata.64.text:0
>>>>>
>>>>>
>>>>> Hope this brings us (or me), closer to a resolution to an old concern.
>>>>>
>>>>> And, best wishes for the new year!
>>>>>
>>>>> Michael
>>>>>
>>>>>
>>>> Still hoping for someone's curiosity/willingness.
>>>>
>>>> The differences show up in the first comparison that is made (of the
>>>> string "3.2.3" it seems) - 32-bit is on the left, 64-bit on the right.
>>>>
>>>> Script command is started on Tue Dec 29 08:39:16 UTC 2015.
>>>> | Script command is started on Tue Dec 29 08:39:56 UTC 2015.
>>>> root@x069:[/data/prj/cran/32/R-aix-3.2.3/src/library/tools]make -s sysdata
>>>> | root@x069:[/data/prj/cran/64/R-aix-3.2.3/src/library/tools]make -s
>>>> sysdata
>>>> installing 'sysdata.rda'
>>>> | installing 'sysdata.rda'
>>>> tre_tnfa_run_parallel, input type 1
>>>> | tre_tnfa_run_parallel, input type 1
>>>> length: -1
>>>> | length: -1
>>>> pos:chr/code | states and tags
>>>> | pos:chr/code | states and tags
>>>> -------------+------------------------------------------------
>>>> | -------------+------------------------------------------------
>>>> init > 30380200 3038014c 30380098
>>>> | init > 110cc3040 110cc2f28 110cc2e10
>>>> match end offset = -1
>>>> | match end offset = -1
>>>> tre_tnfa_run_parallel, input type 1
>>>> | tre_tnfa_run_parallel, input type 1
>>>> length: -1
>>>> | length: -1
>>>> pos:chr/code | states and tags
>>>> | pos:chr/code | states and tags
>>>> -------------+------------------------------------------------
>>>> | -------------+------------------------------------------------
>>>> init > 3037fb88
>>>> | init > 110cc3310
>>>> 0: 3/00051 | 3037fb88/0:0
>>>> | 0: 3/00051 | 110cc3310/0:0
>>>> 1: ./00046 | 3037fb88/0:0
>>>> | 1: ./00046 | 110cc3310/0:0
>>>> init > 3037fb88
>>>> | init > 110cc3310
>>>> 1: ./00046 | 3037fb88/0:1
>>>> | 1: ./00046 | 110cc3310/0:1
>>>> 2: 2/00050 | 3037fb88/0:1
>>>> | 2: 2/00050 | 110cc3310/0:1
>>>> assertion failed
>>>> | assertion failed
>>>> init > 3037fb88
>>>> | init > 110cc3310
>>>> 2: 2/00050 | 3037fc18/0:1 3037fb88/0:2
>>>> | 2: 2/00050 | 110cc33f0/0:1 110cc3310/0:2
>>>> 3: ./00046 | 3037fc18/0:1 3037fb88/0:2
>>>> | 3: ./00046 | 110cc33f0/0:1 110cc3310/0:2
>>>> assertion failed *** DIFFERENCE ***
>>>> | init > 110cc3310
>>>> init > 3037fb88
>>>> | 3: ./00046 | 110cc3310/0:3
>>>> 3: ./00046 | 3037fc18/0:1 3037fb88/0:3
>>>> | 4: 3/00051 | 110cc3310/0:3
>>>> 4: 3/00051 | 3037fc18/0:1 3037fb88/0:3
>>>> | assertion failed
>>>> assertion failed
>>>> | init > 110cc3310
>>>> init > 3037fb88
>>>> | 4: 3/00051 | 110cc33f0/0:3 110cc3310/0:4
>>>> 4: 3/00051 | 3037fc18/0:3 3037fb88/0:4
>>>> | 5: /00000 | 110cc33f0/0:3 110cc3310/0:4
>>>> 5: /00000 | 3037fc18/0:3 3037fb88/0:4 | init > 110cc3310
>>>> found match 3037fd14 *** DIFFERENCE *** | match end
>>>> offset = -1
>>>> match end offset = 5 *** DIFFERENCE *** |
>>>> tre_tnfa_run_parallel, input type 1
>>>> tre_tnfa_run_parallel, input type 1
>>>> | length: -1
>>>> length: -1
>>>> | pos:chr/code | states and tags
>>>> pos:chr/code | states and tags
>>>> | -------------+------------------------------------------------
>>>> -------------+------------------------------------------------
>>>> | init > 110cc4780 110cc4668 110cc4550
>>>> init > 303811c0 3038110c 30381058
>>>> | match end offset = -1
>>>> match end offset = -1
>>>> | tre_tnfa_run_parallel, input type 1
>>>> tre_tnfa_run_parallel, input type 1
>>>> | length: -1
>>>> length: -1
>>>> | pos:chr/code | states and tags
>>>> pos:chr/code | states and tags
>>>> | -------------+------------------------------------------------
>>>> -------------+------------------------------------------------
>>>> | init > 110cc5700 110cc55e8 110cc54d0
>>>>
>>> One day further - looks like tre_compile (or just before, after all).
>>>
>>> With TRE_DEBUG switched on in tre-compile.c and tre-ast.c I see (snip)
>>>
>>> --- /tmp/x.32 2015-12-31 15:09:44.000000000 +0000
>>> +++ /tmp/x.64 2015-12-31 15:09:30.000000000 +0000
>>> @@ -1,5 +1,5 @@
>>> - Script command is started on Thu Dec 31 15:04:39 2015.
>>> - root@x069:[/data/prj/cran/32/R-aix-3.2.3/src/library/tools]make sysdata
>>> + Script command is started on Thu Dec 31 15:08:43 2015.
>>> + root@x069:[/data/prj/cran/64/R-aix-3.2.3/src/library/tools]make sysdata
>>> installing 'sysdata.rda'
>>> echo
>>> "tools:::sysdata2LazyLoadDB(\"/data/prj/cran/R-3.2.3/src/library/tools/R/sysdata.rda\",\"../../../library/tools/R\")"
>>> | \
>>> R_DEFAULT_PACKAGES=NULL LC_ALL=C ../../../bin/R --vanilla --slave
>>> @@ -167,7 +167,7 @@
>>> initial: 1/1,0, assert 0
>>> initial: 0/0, assert 0
>>> initial: 0/0, assert 0
>>> - final state 30370718
>>> + final state 110cba530
>>> tre_compile: parsing '(^|[^%])(%%)*%V'
>>> AST:
>>> catenation, sub 0, 0 tags
>>> @@ -177,7 +177,7 @@
>>> assertions: bol
>>> union, sub -1, 0 tags
>>> literal (, $) (0, 36), pos 0, sub -1, 0 tags
>>> - literal (&, M-^?) (38, 65535), pos 0, sub -1, 0 tags
>>> + literal (&, M-^?) (38, -1), pos 0, sub -1, 0 tags
>>> iteration {0, -1}, sub -1, 0 tags, greedy
>>> catenation, sub 2, 0 tags
>>> literal (%, %) (37, 37), pos 1, sub -1, 0 tags
>>> @@ -197,7 +197,7 @@
>>> Union
>>> Literal 0-36
>>> After union left
>>> - Literal 38-65535
>>> + Literal 38--1
>>> After union right
>>> After union right
>>> num_tags += 2
>>> @@ -231,7 +231,7 @@
>>> assertions: bol
>>> union, sub -1, 0 tags
>>> literal (, $) (0, 36), pos 0, sub -1, 0 tags
>>> - literal (&, M-^?) (38, 65535), pos 0, sub -1, 0 tags
>>> + literal (&, M-^?) (38, -1), pos 0, sub -1, 0 tags
>>> iteration {0, -1}, sub -1, 2 tags, greedy
>>> catenation, sub 2, 1 tags
>>> literal (%, %) (37, 37), pos 1, sub -1, 1 tags
>>> @@ -255,7 +255,7 @@
>>> Union
>>> Literal 0-36
>>> After union left
>>> - Literal 38-65535
>>> + Literal 38--1
>>> After union right
>>> After union right
>>> tre_add_tag_right: tag 3
>>> @@ -342,7 +342,7 @@
>>> catenation, sub -1, 0 tags
>>> union, sub -1, 0 tags
>>> literal (, $) (0, 36), pos 0, sub -1, 0 tags
>>> - literal (&, M-^?) (38, 65535), pos 0, sub -1, 0 tags
>>> + literal (&, M-^?) (38, -1), pos 0, sub -1, 0 tags
>>> tag 4
>>>
>>> It seems in 32-bit mode -1 is unsigned (65535) but -1 == -1 in 64-bit mode.
>>>
>>> I suspect I will "find it" - but a proposed change is appreciated.
>>>
>>> Happy New Year,
>>> Michael
>>>
>>> ______________________________________________
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd....@cbs.dk Priv: pda...@gmail.com