Le 17/01/2022 à 13:27, Ruediger Pluem a écrit :
On 1/16/22 10:35 PM, William A Rowe Jr wrote:
4 day ago, you all saw this. 6 years ago, you all started using this on trunk.
Don't know what I can to do help this along and honor the library
author's wishes for
all of us to walk away from the dead fork, and use the maintained
fork. It's whatever
it is, I'm out of here and removing the backport application branch,
whoever 3rd upvotes
this be prepared to apply this for us all, thanks.
With regards to the de-optimization / stack usage probably stupid question:
Can't we use the TLS features that several compilers offer?
e.g. see at the end of the page at
https://stackoverflow.com/questions/18298280/how-to-declare-a-variable-as-thread-local-portably
If we have no thread_local we could fall back to the current unoptimized
implementation?
Regards
Rüdiger
What about something as stupid as the attached patch?
(just a POC to give the general idea)
Reserve some space on stack (in the spirit of the optimization for
PCRE1) and implement a malloc/free that is clever enough to use this
space instead a real malloc if pcre2 does not ask for too much space.
The only difference with what is done with PCRE1 is that in this case we
use on the stack the exact space needed for for POSIX_MALLOC_THRESHOLD
(10) matches. In the attach patch, we reserve "some space" on the stack
and we expect that it is large enough to avoid a malloc.
The space required by pcre2 is:
offsetof(pcre2_match_data, ovector) + 2*oveccount*sizeof(PCRE2_SIZE)
With:
typedef struct pcre2_real_match_data {
pcre2_memctl memctl;
const pcre2_real_code *code; /* The pattern used for the match */
PCRE2_SPTR subject; /* The subject that was matched */
PCRE2_SPTR mark; /* Pointer to last mark */
PCRE2_SIZE leftchar; /* Offset to leftmost code unit */
PCRE2_SIZE rightchar; /* Offset to rightmost code unit */
PCRE2_SIZE startchar; /* Offset to starting code unit */
uint8_t matchedby; /* Type of match (normal, JIT, DFA) */
uint8_t flags; /* Various flags */
uint16_t oveccount; /* Number of pairs */
int rc; /* The return code from the match */
PCRE2_SIZE ovector[131072]; /* Must be last in the structure */
} pcre2_real_match_data;
(my understanding is that pcre2_real_match_data and pcre2_match_data are
the same)
So we should be able to compute a MAX_PCRE2_STACK_DATA (see patch) that
is both large enough to store 10 or so matches, and not too big.
Not sure it is a clean solution, but I guess that it could work.
BTW, does someone have some numbers for the cost of the malloc we try to
avoid? (read: are we over engineering the solution?)
CJ
Index: util_pcre.c
===================================================================
--- util_pcre.c (révision 1896817)
+++ util_pcre.c (copie de travail)
@@ -272,6 +272,31 @@
eflags);
}
+//
+// Code called somewhere at server start to "register" our malloc/free functions
+// For PCRE2 memory allocation optimization
+//
+pcre2_general_context *global_pcre2_context =
+ pcre2_general_context_create(&pcre2_malloc, &pcre2_free, NULL)
+//
+//
+
+#define MAX_PCRE2_STACK_DATA 1024
+#define PCRE2_ON_STACK ((void *)(-1))
+
+void *pcre2_malloc(size_t size, void *data)
+{
+ if (size <= MAX_PCRE2_STACK_DATA)
+ return PCRE2_ON_STACK;
+
+ return malloc(size);
+}
+
+void pcre2_free(void *mem, void *data)
+{
+ return free(mem);
+}
+
AP_DECLARE(int) ap_regexec_len(const ap_regex_t *preg, const char *buff,
apr_size_t len, apr_size_t nmatch,
ap_regmatch_t *pmatch, int eflags)
@@ -282,11 +307,12 @@
#ifdef HAVE_PCRE2
pcre2_match_data *matchdata;
size_t *ovector;
+ char small_ovector[MAX_PCRE2_STACK_DATA];
#else
int small_ovector[POSIX_MALLOC_THRESHOLD * 3];
- int allocated_ovector = 0;
int *ovector = NULL;
#endif
+ int allocated_ovector = 0;
if ((eflags & AP_REG_NOTBOL) != 0)
options |= PCREn(NOTBOL);
@@ -304,9 +330,17 @@
*/
nlim = ((apr_size_t)preg->re_nsub + 1) > nmatch
? ((apr_size_t)preg->re_nsub + 1) : nmatch;
- matchdata = pcre2_match_data_create(nlim, NULL);
+
+ allocated_ovector = 1;
+ matchdata = pcre2_match_data_create(nlim, global_pcre2_context);
if (matchdata == NULL)
return AP_REG_ESPACE;
+ if (matchdata == PCRE2_ON_STACK)
+ {
+ allocated_ovector = 0;
+ matchdata = small_ovector;
+ }
+
ovector = pcre2_get_ovector_pointer(matchdata);
rc = pcre2_match((const pcre2_code *)preg->re_pcre,
(const unsigned char *)buff, len,
@@ -343,7 +377,8 @@
}
#ifdef HAVE_PCRE2
- pcre2_match_data_free(matchdata);
+ if (allocated_ovector)
+ pcre2_match_data_free(matchdata);
#else
if (allocated_ovector)
free(ovector);