[PATCH] dfa: remove unused the member of structure

2020-09-25 Thread Norihiro Tanaka
Hi, The member 'first_end' in struct dfa is now unused. It should be removed. Thanks, Norihiro From ce3f6337b651128d405137a58656e623579cf17d Mon Sep 17 00:00:00 2001 From: Norihiro Tanaka Date: Sat, 26 Sep 2020 09:50:01 +0900 Subject: [PATCH] dfa: remove unused the member of structure * lib

Re: [PATCH 1/3] dfa: fix dfa-heap-overrun failure

2020-09-15 Thread Norihiro Tanaka
ntended to reference? Sorry, it points to da0e8454a8e68035ef4b87dbb9097f85df6ece27. Thanks, Norihiro From 9c5a68b9a3f84f2ec77fddfbbc07c617588d1e8a Mon Sep 17 00:00:00 2001 From: Norihiro Tanaka Date: Mon, 14 Sep 2020 22:21:05 +0900 Subject: [PATCH] dfa: fix failure in removal of epsilon clos

Re: [PATCH 1/3] dfa: fix dfa-heap-overrun failure

2020-09-14 Thread Norihiro Tanaka
On Mon, 14 Sep 2020 00:28:32 -0700 Paul Eggert wrote: > On 9/14/20 12:13 AM, Norihiro Tanaka wrote: > > > when (i >= d->follows[i].elems[j].index), it seems that > > map[d->follows[i].elems[j].index] has been already set a value more than 0. > > >

Re: [PATCH 1/3] dfa: fix dfa-heap-overrun failure

2020-09-14 Thread Norihiro Tanaka
On Sun, 13 Sep 2020 18:41:49 -0700 Paul Eggert wrote: > * lib/dfa.c (reorder_tokens): When setting > map[d->follows[i].elems[j].index], instead of incorrectly assuming > that (i < d->follows[i].elems[j].index), use two loops, one to set > the map array and the other to use it. The incorrect
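
The two-loop idea in the commit message above can be sketched as follows. This is a simplified, hypothetical illustration and not the actual `reorder_tokens` code in lib/dfa.c: the function names `build_map` and `apply_map`, the `order` array, and the flat `refs` array are all assumptions made for the example. The point is only that the map is filled completely in one loop before any entry is read in a second loop, so a reference to a lower index is handled the same way as a reference to a higher one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical renumbering: order[k] is the old index that becomes
   new index k.  Loop 1 fills map[] for every old index.  */
static void
build_map (const size_t *order, size_t n, size_t *map)
{
  for (size_t k = 0; k < n; k++)
    map[order[k]] = k;
}

/* Loop 2 rewrites each reference through the finished map.  Because
   map[] is already complete, it does not matter whether a reference
   points forward or backward.  */
static void
apply_map (size_t *refs, size_t nrefs, const size_t *map, size_t n)
{
  for (size_t r = 0; r < nrefs; r++)
    {
      assert (refs[r] < n);
      refs[r] = map[refs[r]];
    }
}
```

A single loop that both assigns and reads map entries is only correct if every entry is assigned before it is read, which is the ordering assumption the commit message describes as incorrect.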

Re: bug#41558: Regexp Bug

2020-05-28 Thread Norihiro Tanaka
On Tue, 26 May 2020 21:14:12 -0700 "anton.paras" wrote: > I posted to Stack Exchange, and they recommended that I file a bug. I'd > rather not copy+paste it all, so here's the link: > > > >

Re: bug#40634: Massive pattern list handling with -E format seems very slow since 2.28.

2020-04-18 Thread Norihiro Tanaka
On Sun, 19 Apr 2020 07:41:49 +0900 Norihiro Tanaka wrote: > > On Sat, 18 Apr 2020 00:22:26 +0900 > Norihiro Tanaka wrote: > > > > > On Fri, 17 Apr 2020 10:24:42 +0900 > > Norihiro Tanaka wrote: > > > > > > > > On Fri, 1

Re: bug#40634: Massive pattern list handling with -E format seems very slow since 2.28.

2020-04-18 Thread Norihiro Tanaka
On Sat, 18 Apr 2020 00:22:26 +0900 Norihiro Tanaka wrote: > > On Fri, 17 Apr 2020 10:24:42 +0900 > Norihiro Tanaka wrote: > > > > > On Fri, 17 Apr 2020 09:35:36 +0900 > > Norihiro Tanaka wrote: > > > > > > > > On Th

Re: bug#40634: Massive pattern list handling with -E format seems very slow since 2.28.

2020-04-17 Thread Norihiro Tanaka
On Fri, 17 Apr 2020 10:24:42 +0900 Norihiro Tanaka wrote: > > On Fri, 17 Apr 2020 09:35:36 +0900 > Norihiro Tanaka wrote: > > > > > On Thu, 16 Apr 2020 16:00:29 -0700 > > Paul Eggert wrote: > > > > > On 4/16/20 3:53 PM, Norihiro Tanaka wrot

Re: bug#40634: Massive pattern list handling with -E format seems very slow since 2.28.

2020-04-16 Thread Norihiro Tanaka
On Fri, 17 Apr 2020 09:35:36 +0900 Norihiro Tanaka wrote: > > On Thu, 16 Apr 2020 16:00:29 -0700 > Paul Eggert wrote: > > > On 4/16/20 3:53 PM, Norihiro Tanaka wrote: > > > > > I have had no idea to solve the problem yet. If we revert it, bug#33357 >

Re: dfa MT-safe?

2019-12-15 Thread Norihiro Tanaka
On Sun, 15 Dec 2019 05:43:52 -0700 arn...@skeeve.com wrote: > Hi. > > Bruno Haible wrote: > > > > In any case, gawk's use of it is (and will remain) single-threaded. > > > It'd be nice if your fix did not pull in more libraries, like libpthread > > > or whatever, since that would

Re: bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher

2019-03-22 Thread Norihiro Tanaka
On Sat, 23 Mar 2019 08:06:35 +0900 Norihiro Tanaka wrote: > A kwset matcher is not built in a grep matcher after token re-order is > introduced in commit 5c7a0371823876cca7a1347fa09ca26bbbff0c98 in dfa. > It caused performance degradation in some typical cases. This bug is > introd

[PATCH] grep: a kwset matcher not work in a grep matcher

2019-03-22 Thread Norihiro Tanaka
rom fca6a4c3b9e0757637b7a2009ca8b9070a6874f5 Mon Sep 17 00:00:00 2001 From: Norihiro Tanaka Date: Sat, 23 Mar 2019 07:18:37 +0900 Subject: [PATCH] dfa: separate parse and compile phase DFAMUST() must be called after parse and before tokens re-order which is introduced in commit 5c7a0371823876cca7a1347fa09ca26bbbff0

[PATCH 1/3] dfa: count the number of states reset

2018-11-09 Thread Norihiro Tanaka
From 09a76ca1ee331a566cb1097f4b3dd2b8c4b13639 Mon Sep 17 00:00:00 2001 From: Norihiro Tanaka Date: Sun, 4 Nov 2018 15:10:51 +0900 Subject: [PATCH 1/3] dfa: count the number of states reset Count the number of states reset. It may be used to determine whether dfa is fast enough or not. * lib/dfa.c

[PATCH 1/6] dfa: remove unneeded code

2018-10-22 Thread Norihiro Tanaka
env LC_ALL=C src/grep -vf in in real 39.20 user 20.35 sys 18.78 (After) $ time -p env LC_ALL=C src/grep -vf in in real 6.87 user 6.38 sys 0.48 Thanks, Norihiro From 65f156cd0e605c11a40877d8c070a185def699e5 Mon Sep 17 00:00:00 2001 From: Norihiro Tanaka Date: Mon, 22 Oct 2018 23:22:40 +0900 Subj

Re: dfa is slower than regex in some cases

2017-12-02 Thread Norihiro Tanaka
On Sat, 2 Dec 2017 21:48:34 -0800 Paul Eggert <egg...@cs.ucla.edu> wrote: > Norihiro Tanaka wrote: > > > - if (dfa_supported (d)) > > -{ > > - dfaoptimize (d); > > - dfaanalyze (d, searchflag); > > -} > > - else

Re: bug#25479: memory leaks in dfa

2017-01-18 Thread Norihiro Tanaka
(main.c:459) > > There may be other paths as well. > > Can y'all track this down and fix? > > Thanks, > > Arnold Thanks for the report. It is caused by temporarily allocated memory not freed. From 3479bce8542f75c11e6b0b9907e22b26d91865ca Mon Sep 17 00:00:00 2001 From: Norihiro Ta

Re: bug#25390: Segfault with sed 4.3

2017-01-09 Thread Norihiro Tanaka
On Sun, 8 Jan 2017 22:12:02 -0800 Paul Eggert <egg...@cs.ucla.edu> wrote: > Norihiro Tanaka wrote: > > I wrote two additional patches for dfa. First, derive number of > > allocation from not argument but number of state in transition table > > allocation.

Re: bug#25390: Segfault with sed 4.3

2017-01-09 Thread Norihiro Tanaka
On Mon, 09 Jan 2017 23:04:05 +0900 Norihiro Tanaka <nori...@kcn.ne.jp> wrote: > Thanks, I updated the test. The new test uses valgrind. Sorry, I adjusted commit log, New patch does not change testsuite/Makefile.tests. From 67a17d942beb942acd2b2e95eba2cc3d43a5e883 Mon Sep 17 00:00:00

Re: bug#25390: Segfault with sed 4.3

2017-01-08 Thread Norihiro Tanaka
have to be separated. I also wrote a simple test, but the issue is not always triggered, as it depends on the state of memory. Should we rely on valgrind to complete the test? From 2a4a8b2b75d08c05803ea0580de152058852ac04 Mon Sep 17 00:00:00 2001 From: Norihiro Tanaka <nori...@kcn.ne.jp> Date:

Re: bug#22357: grep -f not only huge memory usage, but also huge time cost

2016-12-20 Thread Norihiro Tanaka
On Mon, 19 Dec 2016 15:38:12 -0800 Paul Eggert wrote: > but the old 'replace' called 'delete' up to N times, Yes, but constraint == 0 does not happen mostly, so in delete() in "while" does not pass normally. > Anyway, I verified that the change improved performance on the

Re: bug#22357: grep -f not only huge memory usage, but also huge time cost

2016-12-19 Thread Norihiro Tanaka
On Sun, 18 Dec 2016 23:48:10 -0800 Paul Eggert wrote: > >> 'delete' is > >> O(N); 'replace' calls 'delete' in a loop and is therefore O(N**2). > >> 'epsclosure' calls 'replace' in a loop and so I suppose it is O(N**3). > >> I haven't looked into how likely the worst-case

Re: bug#22357: grep -f not only huge memory usage, but also huge time cost

2016-12-17 Thread Norihiro Tanaka
On Wed, 14 Dec 2016 17:19:27 -0800 Paul Eggert wrote: > I was referring to code with his proposed patch installed. 'delete' is > O(N); 'replace' calls 'delete' in a loop and is therefore O(N**2). > 'epsclosure' calls 'replace' in a loop and so I suppose it is O(N**3). > I

Re: bug#22357: grep -f not only huge memory usage, but also huge time cost

2016-12-12 Thread Norihiro Tanaka
On Sun, 11 Dec 2016 18:55:32 +0100 Bruno Haible <br...@clisp.org> wrote: > Norihiro Tanaka wrote: > > dfa matcher is not always slower than kws matcher. ... > > It's a trade-off. Can you have any idea to select the better > > matcher for both two cases? > >

Re: bug#22357: grep -f not only huge memory usage, but also huge time cost

2016-12-11 Thread Norihiro Tanaka
On Sun, 11 Dec 2016 05:28:56 -0600 Trevor Cordes wrote: > On my box the above runs for >2m (never completes before I ^C) on the > version **AFTER** the commits (v2.22). On the test build just *BEFORE* > the commits (2.21.73-8058), it runs in <2s. So for me, I had a

Re: bug#22357: grep -f not only huge memory usage, but also huge time cost

2016-12-10 Thread Norihiro Tanaka
/dev/null Thanks, Norihiro From 19502d13120d612fc89b922c9b28cc3030ea0674 Mon Sep 17 00:00:00 2001 From: Norihiro Tanaka <nori...@kcn.ne.jp> Date: Sun, 11 Dec 2016 09:35:50 +0900 Subject: [PATCH] dfa: performance improvement for removal of epsilon closure * lib/dfa.c (delete): Use binary search to find deleted index
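
The change named in the commit message above (finding the element to delete by binary search) can be sketched as follows. This is a minimal sketch under stated assumptions, not the code from lib/dfa.c: the `struct pos` layout, the ascending sort order, and the name `delete_sorted` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical element type: a position index plus a constraint.  */
struct pos
{
  size_t index;
  unsigned int constraint;
};

/* Remove the element whose index equals DEL from ELEMS, which is
   assumed to be sorted by ascending index.  Return the removed
   element's constraint, or 0 if DEL is not present.  */
static unsigned int
delete_sorted (size_t del, struct pos *elems, size_t *nelem)
{
  assert (elems && nelem);

  /* Binary search for the first element with index >= DEL.  */
  size_t lo = 0, hi = *nelem;
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (elems[mid].index < del)
        lo = mid + 1;
      else
        hi = mid;
    }

  if (lo == *nelem || elems[lo].index != del)
    return 0;

  unsigned int c = elems[lo].constraint;
  /* Close the gap left by the removed element.  */
  memmove (&elems[lo], &elems[lo + 1], (*nelem - lo - 1) * sizeof *elems);
  --*nelem;
  return c;
}
```

In this sketch the lookup takes O(log N) comparisons, while closing the gap with `memmove` is still linear in the number of elements after the deleted one.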

Re: bug#24975: Matching issues with characters whose encoding ends in some other character

2016-11-28 Thread Norihiro Tanaka
haracters. Thanks, Norihiro From 67484a67d7d310d76a2eb80b68a8ec8eb5c6a7fc Mon Sep 17 00:00:00 2001 From: Norihiro Tanaka <nori...@kcn.ne.jp> Date: Mon, 28 Nov 2016 22:26:07 +0900 Subject: [PATCH] dfa: avoid match middle in multibyte character * lib/dfa.c (transit_state): If fails in matchin

Re: [PATCH] dfa: addition of new state on demand

2016-11-26 Thread Norihiro Tanaka
Thanks for the review. I confirmed your changes, and I found no problem. > CC'ing this to grep-devel since that's where the hardy band of dfa > consumers hang out. I will also do it in next post for dfa.

Re: [PATCH] dfa: addition of new state on demand

2016-11-21 Thread Norihiro Tanaka
On Mon, 17 Oct 2016 22:00:33 +0900 Norihiro Tanaka <nori...@kcn.ne.jp> wrote: > > On Mon, 17 Oct 2016 11:45:43 +0900 > Norihiro Tanaka <nori...@kcn.ne.jp> wrote: > > > When dfa builds a state, generates all next states. However, I believe > > most of th

Re: [PATCH] dfa: addition of new state on demand

2016-10-17 Thread Norihiro Tanaka
On Mon, 17 Oct 2016 11:45:43 +0900 Norihiro Tanaka <nori...@kcn.ne.jp> wrote: > When dfa builds a state, generates all next states. However, I believe > most of them are not used. > > This patch changes as that when dfa builds a state, generates a next > state including

[PATCH] dfa: addition of new state on demand

2016-10-16 Thread Norihiro Tanaka
/ld --with-system-zlib --enable-__cxa_atexit Thread model: posix gcc version 4.4.7 (GCC) $ uname -a Linux rhel6 2.6.32-642.el6.x86_64 #1 SMP Wed Apr 13 00:51:26 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux From 2d33060d77713bfefc3d82c031a7436dc205654b Mon Sep 17 00:00:00 2001 From: Norihiro Tanaka

Re: [PATCH] dfa: save memory for states

2016-10-10 Thread Norihiro Tanaka
Hi Bruno, On Tue, 11 Oct 2016 08:25:39 +0900 Norihiro Tanaka <nori...@kcn.ne.jp> wrote: > Intel(R) Core(TM)2 Duo CPU E7500 @ 2.93GHz > gcc version 4.4.7 (GCC) > Linux rhel6 2.6.32-642.el6.x86_64 #1 SMP Wed Apr 13 00:51:26 EDT 2016 x86_64 > x86_64 x86_64 GNU/Linux >

Re: [PATCH] dfa: save memory for states

2016-10-10 Thread Norihiro Tanaka
On Mon, 10 Oct 2016 08:32:47 -0700 Paul Eggert wrote: > Thanks for that performance improvement. I installed it into gnulib and grep. Hi Paul, Thanks for installing it to both. Norihiro

Re: [PATCH] dfa: save memory for states

2016-10-10 Thread Norihiro Tanaka
On Mon, 10 Oct 2016 17:41:03 +0200 Bruno Haible wrote: > Hi Norihiro, > > I'm not maintainer of the 'dfa' module, but nevertheless I'd like to know > what is going on more precisely (because clearing a cache more often > deteriorates running times than improve running times). 3

[PATCH] dfa: save memory for states

2016-10-10 Thread Norihiro Tanaka
ate from caches of states. Thanks. Norihiro From 9f7d39e943187196b68a42d2880dfea9ad6c1d94 Mon Sep 17 00:00:00 2001 From: Norihiro Tanaka <nori...@kcn.ne.jp> Date: Mon, 10 Oct 2016 23:08:29 +0900 Subject: [PATCH] dfa: save memory for states * src/dfa (dfaexec_main): Beginning of dfa execution

[PATCH] chdir-long: avoid -Werror=unused-variable with -DNDEBUG

2014-12-20 Thread Norihiro Tanaka
directory '/usr/src/grep-2.21' Makefile:1195: recipe for target 'all' failed make: *** [all] Error 2 From 998546559e67e8922f70c2ec4be240912a0a37aa Mon Sep 17 00:00:00 2001 From: Norihiro Tanaka <nori...@kcn.ne.jp> Date: Sat, 20 Dec 2014 14:01:50 +0900 Subject: [PATCH] chdir-long: avoid -Werror

Re: [PATCH] chdir-long: avoid -Werror=unused-variable with -DNDEBUG

2014-12-20 Thread Norihiro Tanaka
On Sat, 20 Dec 2014 13:03:23 -0800 Paul Eggert <egg...@cs.ucla.edu> wrote: Norihiro Tanaka wrote: +#if NDEBUG + close (cdb->fd); +#else bool close_fail = close (cdb->fd); assert (! close_fail); +#endif That sort of thing looks like it'd be reasonably annoying
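
For context, the warning arises because `assert` expands to nothing when `NDEBUG` is defined, leaving `close_fail` assigned but never read. One common way to avoid both the warning and the `#if NDEBUG` guard quoted above is to keep the variable and mark it as used. The following is a minimal sketch of that general idiom only; it is not the patch attached in the thread, and `close_checked` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

/* Close FD and, in debug builds, assert that the close succeeded.  */
static void
close_checked (int fd)
{
  bool close_fail = close (fd) != 0;
  assert (! close_fail);
  /* With -DNDEBUG the assert above disappears; this cast keeps the
     variable "used" so -Werror=unused-variable does not fire.  */
  (void) close_fail;
}
```

Note that with `NDEBUG` defined, a failed `close` is silently ignored in this sketch.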

Re: [PATCH] chdir-long: avoid -Werror=unused-variable with -DNDEBUG

2014-12-20 Thread Norihiro Tanaka
On Sat, 20 Dec 2014 13:03:23 -0800 Paul Eggert <egg...@cs.ucla.edu> wrote: That sort of thing looks like it'd be reasonably annoying in the long run. How about the attached patch instead? I confirmed that it had already been committed. Thanks.

Re: [PATCH] chdir-long: avoid -Werror=unused-variable with -DNDEBUG

2014-12-20 Thread Norihiro Tanaka
On Sat, 20 Dec 2014 18:20:06 -0800 Paul Eggert <egg...@cs.ucla.edu> wrote: I'd rather not, as the idea is not gnulib-specific. Thanks, I understood.