Re: Question about substring match (*_sub)

Aleksandar Lazic Sat, 23 Jan 2021 03:31:43 -0800

On 23.01.21 07:36, Илья Шипицин wrote:

The following usually works for performance profiling:

1) set up a test stand (similar to what you use in production)

2) use valgrind + callgrind to collect traces

3) apply a workload

4) aggregate the results using kcachegrind

Most probably you were already going to do something very similar :)


Thanks for the tips ;-)

The issue here is that several parameters matter for sub-string
matching: the pattern, the pattern length, the text, the text length
and the alphabet.
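
To make those parameters concrete, here is a minimal sketch of how they could be turned into reproducible test inputs. The function `make_case` and its arguments are hypothetical names picked for illustration; nothing here comes from HAProxy or the Smart Tool. As a simplifying assumption, the last character of the alphabet is reserved as a sentinel that never appears in the text, so a guaranteed miss can be built.

```python
import random
import string


def make_case(alphabet, text_len, pattern_len, hit=True, seed=0):
    """Build one (text, pattern) pair for a substring-match test.

    alphabet:    characters to draw from (the last one is the sentinel)
    text_len:    length of the generated text
    pattern_len: length of the generated pattern
    hit:         True if the pattern must occur in the text
    """
    if len(alphabet) < 2 or not 0 < pattern_len <= text_len:
        raise ValueError("need len(alphabet) >= 2 and 0 < pattern_len <= text_len")
    rng = random.Random(seed)  # fixed seed keeps runs comparable
    body, sentinel = alphabet[:-1], alphabet[-1]
    # The text never contains the sentinel, so a miss can be forced reliably.
    text = "".join(rng.choice(body) for _ in range(text_len))
    start = rng.randrange(text_len - pattern_len + 1)
    pattern = text[start:start + pattern_len]
    if not hit:
        # Replacing the last character with the sentinel makes a match impossible.
        pattern = pattern[:-1] + sentinel
    return text, pattern


# Small alphabet, short text, short pattern, guaranteed hit.
ip_text, ip_pat = make_case(string.digits + ".:", 40, 11)
# Larger alphabet, longer text, guaranteed miss.
url_text, url_pat = make_case(string.ascii_lowercase + "=&/", 2000, 6, hit=False)
```

Varying one argument at a time gives a grid of cases in which the effect of each parameter can be seen separately.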

My question was meant to collect some "common" setups, so that I can
create valid tests to compare the different algorithms.
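
As a sketch of the kind of comparison meant here: the comment in src/pattern.c suggests a Boyer-Moore method, and the code below pairs a naive scan with Boyer-Moore-Horspool, which is a simplified variant using only the bad-character rule and not the full Boyer-Moore algorithm. Both functions are illustrative and are not taken from HAProxy.

```python
def naive_find(text: bytes, pat: bytes) -> int:
    """Try every alignment left to right; return the first match index or -1."""
    n, m = len(text), len(pat)
    for i in range(n - m + 1):
        if text[i:i + m] == pat:
            return i
    return -1


def horspool_find(text: bytes, pat: bytes) -> int:
    """Boyer-Moore-Horspool search; return the first match index or -1."""
    n, m = len(text), len(pat)
    if m == 0:
        return 0
    # Bad-character table: for each byte, the distance from its last
    # occurrence in pat[:-1] to the end of the pattern (default: m).
    shift = [m] * 256
    for j in range(m - 1):
        shift[pat[j]] = m - 1 - j
    i = 0
    while i <= n - m:
        if text[i:i + m] == pat:
            return i
        # Skip ahead based on the text byte aligned with the pattern's last byte.
        i += shift[text[i + m - 1]]
    return -1


header = b"10.0.0.1, 192.168.4.5"
assert naive_find(header, b"192.168.4.5") == horspool_find(header, b"192.168.4.5")
```

With short patterns such as "admin" the skip distance is at most the pattern length, so how much is gained depends heavily on the pattern lengths that are common in real configurations.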

I am thinking of something like the examples below. As I have not used
_sub in the past, it is difficult for me alone to come up with valid
use cases that are actually used out there. It's okay to send examples
only to me, in case there are security or privacy concerns.

acl allow_from_int hdr_sub(x-forwarded-for) 192.168.4.5
acl admin_access   hdr_sub(user)            admin
acl test_url       urlp_sub(test)           1

Should UTF-* be considered a valid alphabet, or only ASCII?
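
One data point on that question, under the assumption that matching is done on raw bytes: UTF-8 is self-synchronizing, so a byte-wise search for a valid UTF-8 pattern can only match on character boundaries in valid UTF-8 text. What changes is the pattern length in bytes and the effective alphabet. The sample strings below are made up for illustration.

```python
text = "user=админ; role=ops"
pattern = "админ"

text_b = text.encode("utf-8")
pattern_b = pattern.encode("utf-8")

# Byte-wise search, as a byte-oriented matcher would do it.
byte_pos = text_b.find(pattern_b)
# The byte offset decodes cleanly, so it sits on a character boundary.
char_pos = len(text_b[:byte_pos].decode("utf-8"))

# Each Cyrillic letter takes two bytes in UTF-8, so the pattern the
# algorithm sees is twice as long as the character count suggests.
pattern_chars, pattern_bytes = len(pattern), len(pattern_b)
```

Case-insensitive matching is a separate question, since byte-wise case folding only covers ASCII letters.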

If _sub is a very rare case then it's okay as it is, isn't it?

Opinions?

Sat, 23 Jan 2021 at 03:18, Aleksandar Lazic <[email protected]>:

    Hi.

    I would like to take a look into the substring match implementation
    because of the comment there.

    http://git.haproxy.org/?p=haproxy.git;a=blob;f=src/pattern.c;h=8729769e5e549bcd4043ae9220ceea440445332a;hb=HEAD#l767

    "NB: Suboptimal, should be rewritten using a Boyer-Moore method."

    Now, before I take a deeper look into the different sub-string matching
    algorithms, I would like to know which patterns and lengths are a "common"
    use case for the users here.

    There are so many different algorithms, most of which are implemented in
    the Smart Tool ( https://github.com/smart-tool/smart ), so it would be
    interesting to know some metrics about the use cases.

    Thanks for sharing.
    Best regards

    Aleks
