Attached are a benchmark and a Makefile that I used to compare the speed of
svn_subst_translate_string() from trunk with that of the new
svn_subst_translate_string2(). The program reads a text file named
`2600.txt` in the current working directory and repeatedly calls
svn_subst_translate_string() on the contents. For `2600.txt`, I used
the plain text version of War and Peace from Project Gutenberg
(http://www.gutenberg.org/ebooks/2600.txt.utf8).
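
The measurement pattern described above (time a fixed number of repeated calls with clock()) can be sketched in isolation as follows. This is only an illustration: `workload_fn`, `time_repeated`, `ticks_to_seconds` and `count_calls` are hypothetical names that do not appear in the attached benchmark, and the workload is a trivial stand-in for the Subversion call.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type standing in for the code under test. */
typedef void (*workload_fn)(void *baton);

/* Call fn(baton) `iterations` times and return the processor time
   consumed by the loop, in clock() ticks. */
static clock_t time_repeated(workload_fn fn, void *baton, int iterations)
{
  clock_t before, after;
  int i;

  assert(fn != NULL && iterations >= 0);
  before = clock();
  for (i = 0; i < iterations; ++i)
    fn(baton);
  after = clock();
  return after - before;
}

/* Convert a clock() tick count to seconds. */
static double ticks_to_seconds(clock_t ticks)
{
  return (double) ticks / CLOCKS_PER_SEC;
}

/* Example workload: increments the int that baton points to. */
static void count_calls(void *baton)
{
  ++*(int *) baton;
}
```

For example, `time_repeated(count_calls, &calls, 100)` runs the stand-in workload 100 times, which is the iteration count the attached benchmark uses. On systems where CLOCKS_PER_SEC is 1000000, a reading of 7780000 ticks would correspond to about 7.78 seconds of processor time.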

The timings that I collected for tr...@1040115 (30 runs; each value is the
clock() tick count that the benchmark prints for its 100 calls) were:

trunk_at_1040115 <- c(7780000, 7910000, 7870000, 7660000, 7840000,
    7760000, 7620000, 7500000, 7860000, 7800000, 7640000, 7740000,
    7760000, 7850000, 8010000, 7800000, 7730000, 7700000, 7900000,
    7760000, 7790000, 7970000, 7700000, 7710000, 7990000, 7830000,
    7780000, 7810000, 7730000, 7600000)

The data for the "HEAD" sources (commit
6f828b0a4e07d1e14189b9b8c84bd0f884c59164 from my repo;
https://github.com/dtrebbien/subversion/tree/6f828b0a4e07d1e14189b9b8c84bd0f884c59164)
were:

HEAD <- c(8050000, 8230000, 7980000, 8150000, 7950000, 8600000,
    8080000, 8420000, 8000000, 8020000, 8420000, 7960000, 8010000,
    8200000, 8080000, 8490000, 8190000, 7920000, 7820000, 7780000,
    7880000, 8540000, 7970000, 8250000, 8830000, 8540000, 8310000,
    8270000, 8010000, 7990000)

Note: This is not "version 3" of the patch. It is essentially
tr...@1040115 plus "version 3" plus this changeset:
https://github.com/dtrebbien/subversion/commit/d22329a54dcf58cddc2b618f913597c6defbcb2d

The t-test allows us to conclude with high confidence that the mean
time to run the benchmark with libsvn_subr-1 compiled from
tr...@1040115 is less than the mean time to run the benchmark with
libsvn_subr-1 compiled from the HEAD sources:

> t.test(trunk_at_1040115, HEAD, alternative = "less", var.equal = TRUE, conf.level = 0.90)

        Two Sample t-test

data: trunk_at_1040115 and HEAD
t = -7.473, df = 58, p-value = 2.350e-10
alternative hypothesis: true difference in means is less than 0
90 percent confidence interval:
      -Inf -317939.7
sample estimates:
mean of x mean of y
  7780000   8164667
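
For readers who want to check the t value without R: with `var.equal = TRUE`, the reported statistic is the pooled-variance two-sample t statistic. The following is a minimal sketch of that computation, not the code I used; the function names are hypothetical, and `newton_sqrt` is a stand-in for `sqrt` from <math.h>, used only so that the sketch does not need to be linked with -lm.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples. */
static double mean(const double *x, size_t n)
{
  double sum = 0.0;
  size_t i;

  for (i = 0; i < n; ++i)
    sum += x[i];
  return sum / (double) n;
}

/* Sum of squared deviations of the samples from the value m. */
static double sum_sq_dev(const double *x, size_t n, double m)
{
  double ss = 0.0;
  size_t i;

  for (i = 0; i < n; ++i)
    ss += (x[i] - m) * (x[i] - m);
  return ss;
}

/* Square root by Newton's iteration; returns 0 for v <= 0. */
static double newton_sqrt(double v)
{
  double r = v > 1.0 ? v : 1.0;
  int i;

  if (v <= 0.0)
    return 0.0;
  for (i = 0; i < 200; ++i)
    r = 0.5 * (r + v / r);
  return r;
}

/* Two-sample t statistic under the equal-variance assumption.
   The matching degrees of freedom are nx + ny - 2. */
static double pooled_t(const double *x, size_t nx,
                       const double *y, size_t ny)
{
  double mx, my, pooled_var, std_err;

  assert(nx >= 2 && ny >= 2);
  mx = mean(x, nx);
  my = mean(y, ny);
  pooled_var = (sum_sq_dev(x, nx, mx) + sum_sq_dev(y, ny, my))
               / (double) (nx + ny - 2);
  std_err = newton_sqrt(pooled_var * (1.0 / (double) nx + 1.0 / (double) ny));
  return (mx - my) / std_err;
}
```

Passing the 30 tr...@1040115 timings as `x` and the 30 HEAD timings as `y` should give a value close to the t that R printed above, with 30 + 30 - 2 = 58 degrees of freedom. Turning t into a p-value still requires the t distribution, which this sketch does not implement.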

I realized, however, that this is not a fair comparison because the
HEAD sources simply call svn_subst_translate_string2() within
svn_subst_translate_string(), meaning that there is an extra layer of
indirection. After modifying the benchmark to call
svn_subst_translate_string2() directly, I generated these timings:

HEAD_new <- c(7850000, 7890000, 8080000, 7980000, 7820000, 7880000,
    7850000, 7540000, 8470000, 8230000, 8410000, 7880000, 7410000,
    7490000, 7420000, 7650000, 7430000, 7430000, 7530000, 7720000,
    7940000, 7780000, 8070000, 7840000, 7870000, 7970000, 7690000,
    7910000, 7860000, 7620000)

Now we cannot reject the null hypothesis that the mean time to run the
benchmark with libsvn_subr-1 compiled from tr...@1040115 is greater
than or equal to the mean time to run the modified benchmark with
libsvn_subr-1 compiled from the HEAD sources:

> t.test(trunk_at_1040115, HEAD_new, alternative = "less", var.equal = TRUE, conf.level = 0.90)

        Two Sample t-test

data: trunk_at_1040115 and HEAD_new
t = -0.6839, df = 58, p-value = 0.2484
alternative hypothesis: true difference in means is less than 0
90 percent confidence interval:
     -Inf 33129.55
sample estimates:
mean of x mean of y
  7780000   7817000

One other set of timings that I generated was for the modified
benchmark running with libsvn_subr-1 compiled from the HEAD sources,
slightly modified to set `repair` to TRUE:

HEAD_new_repair <- c(7660000, 7560000, 7570000, 7540000, 7670000,
    7790000, 7460000, 7840000, 8060000, 7790000, 8000000, 7830000,
    8370000, 8010000, 7730000, 7800000, 7900000, 7730000, 7730000,
    7790000, 7750000, 7930000, 7860000, 7810000, 7930000, 7840000,
    7890000, 7460000, 7790000, 7730000)

We cannot reject the null hypothesis that the mean time to run the
modified benchmark with libsvn_subr-1 compiled from the HEAD sources
is the same as the mean time to run the modified benchmark with
libsvn_subr-1 compiled from slightly-modified HEAD sources (`repair`
is set to TRUE):

> t.test(HEAD_new, HEAD_new_repair, var.equal = TRUE, conf.level = 0.90)

        Two Sample t-test

data: HEAD_new and HEAD_new_repair
t = 0.3815, df = 58, p-value = 0.7042
alternative hypothesis: true difference in means is not equal to 0
90 percent confidence interval:
 -77774.74 123774.74
sample estimates:
mean of x mean of y
  7817000   7794000

> t.test(trunk_at_1040115, HEAD_new_repair, alternative = "less", var.equal = TRUE, conf.level = 0.90)

        Two Sample t-test

data: trunk_at_1040115 and HEAD_new_repair
t = -0.3501, df = 58, p-value = 0.3638
alternative hypothesis: true difference in means is less than 0
90 percent confidence interval:
     -Inf 37836.36
sample estimates:
mean of x mean of y
  7780000   7794000

Therefore, I do not have evidence to support my earlier claim: "3.)
This penalizes repair translations."

My conclusion from all of this is that regardless of the value of
`repair`, my changes do not appear to decrease the performance of
svn_subst_translate_string() as long as svn_subst_translate_string2()
is called directly.

/* Benchmark: times 100 calls to svn_subst_translate_string() on the
   contents of 2600.txt. */
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <sys/stat.h>

#include <apr.h>
#include <apr_errno.h>
#include <apr_general.h>
#include <apr_pools.h>
#include <apr_strings.h>
#include <apr_tables.h>

#include "svn_types.h"
#include "svn_error_codes.h"
#include "svn_error.h"
#include "svn_string.h"
#include "svn_io.h"
#include "svn_subst.h"

#ifndef APR_STATUS_IS_SUCCESS
#define APR_STATUS_IS_SUCCESS(s) ((s) == APR_SUCCESS)
#endif

/* Scratch buffer for formatted error messages. */
static char s_1KiB_buf[1024];

int main(int argc, const char *const *argv, const char *const *env)
{
  apr_status_t apr_status = apr_app_initialize(&argc, &argv, &env);
  if (! APR_STATUS_IS_SUCCESS(apr_status)) {
    fprintf(stderr, "`apr_app_initialize` failed: %s\n",
            apr_strerror(apr_status, s_1KiB_buf,
                         sizeof s_1KiB_buf/sizeof s_1KiB_buf[0]));
    return EXIT_FAILURE;
  }
  atexit(apr_terminate);

  apr_pool_t *root_pool = NULL;
  apr_status = apr_pool_create(&root_pool, NULL);
  if (! APR_STATUS_IS_SUCCESS(apr_status)) {
    fprintf(stderr, "`apr_pool_create` failed: %s\n",
            apr_strerror(apr_status, s_1KiB_buf,
                         sizeof s_1KiB_buf/sizeof s_1KiB_buf[0]));
    return EXIT_FAILURE;
  }

  /* Size the input buffer from the file size, plus one byte for the
     terminating NUL. */
  struct stat st;
  if (-1 == stat("2600.txt", &st)) {
    fprintf(stderr, "Failed to stat `2600.txt`. errno = %d\n",
            (int) errno);
    apr_pool_destroy(root_pool);
    return EXIT_FAILURE;
  }

  char *const data = (char*) malloc((size_t) (st.st_size + 1));
  if (data == NULL) {
    fprintf(stderr, "`malloc` failed to allocate %d bytes.\n",
            (int) (st.st_size + 1));
    apr_pool_destroy(root_pool);
    return EXIT_FAILURE;
  }

  FILE *fp = fopen("2600.txt", "r");
  if (fp == NULL) {
    fprintf(stderr, "Failed to open `2600.txt` for reading\n");
    free(data);
    apr_pool_destroy(root_pool);
    return EXIT_FAILURE;
  }

  /* Read the whole file into memory before starting the clock. */
  off_t bytes_read = 0;
  char *p = data + bytes_read;
  size_t size = fread(p, 1, st.st_size - bytes_read, fp);
  bytes_read += size;
  p += size;
  while (bytes_read < st.st_size && ! ferror(fp) && ! feof(fp)) {
    size = fread(p, 1, st.st_size - bytes_read, fp);
    bytes_read += size;
    p += size;
  }
  *p = '\0';
  if (ferror(fp)) {
    fprintf(stderr, "An I/O error occurred.\n");
    fclose(fp);
    free(data);
    apr_pool_destroy(root_pool);
    return EXIT_FAILURE;
  }
  fclose(fp);

  /* Timed section: 100 translations, each in its own subpool. */
  const clock_t before_clocks = clock();
  for (size = 0; size < 100; ++size) {
    apr_pool_t *pool = NULL;
    apr_status = apr_pool_create(&pool, root_pool);
    if (! APR_STATUS_IS_SUCCESS(apr_status)) {
      fprintf(stderr, "`apr_pool_create` failed: %s\n",
              apr_strerror(apr_status, s_1KiB_buf,
                           sizeof s_1KiB_buf/sizeof s_1KiB_buf[0]));
      free(data);
      apr_pool_destroy(root_pool);
      return EXIT_FAILURE;
    }

    svn_string_t *translated_string = NULL;
    svn_error_t *svn_error =
      svn_subst_translate_string(&translated_string,
                                 svn_string_create(data, pool),
                                 "UTF-8", pool);
    if (svn_error) {
      fprintf(stderr, "`svn_subst_translate_string` failed: %s\n",
              svn_err_best_message(svn_error, s_1KiB_buf,
                                   sizeof s_1KiB_buf/sizeof s_1KiB_buf[0]));
      apr_pool_destroy(pool);
      free(data);
      apr_pool_destroy(root_pool);
      return EXIT_FAILURE;
    }
    apr_pool_destroy(pool);
  }
  const clock_t after_clocks = clock();

  printf("(after_clocks - before_clocks) = %d\n",
         (int) (after_clocks - before_clocks));

  free(data);
  apr_pool_destroy(root_pool);
  return EXIT_SUCCESS;
}

/* Modified benchmark: times 100 direct calls to
   svn_subst_translate_string2() on the contents of 2600.txt. */
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <sys/stat.h>

#include <apr.h>
#include <apr_errno.h>
#include <apr_general.h>
#include <apr_pools.h>
#include <apr_strings.h>
#include <apr_tables.h>

#include "svn_types.h"
#include "svn_error_codes.h"
#include "svn_error.h"
#include "svn_string.h"
#include "svn_io.h"
#include "svn_subst.h"

#ifndef APR_STATUS_IS_SUCCESS
#define APR_STATUS_IS_SUCCESS(s) ((s) == APR_SUCCESS)
#endif

/* Scratch buffer for formatted error messages. */
static char s_1KiB_buf[1024];

int main(int argc, const char *const *argv, const char *const *env)
{
  apr_status_t apr_status = apr_app_initialize(&argc, &argv, &env);
  if (! APR_STATUS_IS_SUCCESS(apr_status)) {
    fprintf(stderr, "`apr_app_initialize` failed: %s\n",
            apr_strerror(apr_status, s_1KiB_buf,
                         sizeof s_1KiB_buf/sizeof s_1KiB_buf[0]));
    return EXIT_FAILURE;
  }
  atexit(apr_terminate);

  apr_pool_t *root_pool = NULL;
  apr_status = apr_pool_create(&root_pool, NULL);
  if (! APR_STATUS_IS_SUCCESS(apr_status)) {
    fprintf(stderr, "`apr_pool_create` failed: %s\n",
            apr_strerror(apr_status, s_1KiB_buf,
                         sizeof s_1KiB_buf/sizeof s_1KiB_buf[0]));
    return EXIT_FAILURE;
  }

  /* Size the input buffer from the file size, plus one byte for the
     terminating NUL. */
  struct stat st;
  if (-1 == stat("2600.txt", &st)) {
    fprintf(stderr, "Failed to stat `2600.txt`. errno = %d\n",
            (int) errno);
    apr_pool_destroy(root_pool);
    return EXIT_FAILURE;
  }

  char *const data = (char*) malloc((size_t) (st.st_size + 1));
  if (data == NULL) {
    fprintf(stderr, "`malloc` failed to allocate %d bytes.\n",
            (int) (st.st_size + 1));
    apr_pool_destroy(root_pool);
    return EXIT_FAILURE;
  }

  FILE *fp = fopen("2600.txt", "r");
  if (fp == NULL) {
    fprintf(stderr, "Failed to open `2600.txt` for reading\n");
    free(data);
    apr_pool_destroy(root_pool);
    return EXIT_FAILURE;
  }

  /* Read the whole file into memory before starting the clock. */
  off_t bytes_read = 0;
  char *p = data + bytes_read;
  size_t size = fread(p, 1, st.st_size - bytes_read, fp);
  bytes_read += size;
  p += size;
  while (bytes_read < st.st_size && ! ferror(fp) && ! feof(fp)) {
    size = fread(p, 1, st.st_size - bytes_read, fp);
    bytes_read += size;
    p += size;
  }
  *p = '\0';
  if (ferror(fp)) {
    fprintf(stderr, "An I/O error occurred.\n");
    fclose(fp);
    free(data);
    apr_pool_destroy(root_pool);
    return EXIT_FAILURE;
  }
  fclose(fp);

  /* Timed section: 100 translations, each in its own subpool. */
  const clock_t before_clocks = clock();
  for (size = 0; size < 100; ++size) {
    apr_pool_t *pool = NULL;
    apr_status = apr_pool_create(&pool, root_pool);
    if (! APR_STATUS_IS_SUCCESS(apr_status)) {
      fprintf(stderr, "`apr_pool_create` failed: %s\n",
              apr_strerror(apr_status, s_1KiB_buf,
                           sizeof s_1KiB_buf/sizeof s_1KiB_buf[0]));
      free(data);
      apr_pool_destroy(root_pool);
      return EXIT_FAILURE;
    }

    svn_string_t *translated_string = NULL;
    svn_boolean_t translated_eol = -1;
    svn_error_t *svn_error =
      svn_subst_translate_string2(&translated_string, NULL, &translated_eol,
                                  svn_string_create(data, pool),
                                  "UTF-8", pool, pool);
    if (svn_error) {
      fprintf(stderr, "`svn_subst_translate_string2` failed: %s\n",
              svn_err_best_message(svn_error, s_1KiB_buf,
                                   sizeof s_1KiB_buf/sizeof s_1KiB_buf[0]));
      apr_pool_destroy(pool);
      free(data);
      apr_pool_destroy(root_pool);
      return EXIT_FAILURE;
    }
    apr_pool_destroy(pool);
  }
  const clock_t after_clocks = clock();

  printf("(after_clocks - before_clocks) = %d\n",
         (int) (after_clocks - before_clocks));

  free(data);
  apr_pool_destroy(root_pool);
  return EXIT_SUCCESS;
}

# Makefile for the benchmark. Switch SUBVERSION_1_PREFIX to choose which
# build of libsvn_subr-1 to compile and link against.
#SUBVERSION_1_PREFIX = /usr/local/stow/apache-subversion-tr...@1040115
SUBVERSION_1_PREFIX = /usr/local/stow/dtrebbien-subversion-HEAD

CPPFLAGS = $(shell pkg-config --cflags-only-I apr-1) \
           -I$(SUBVERSION_1_PREFIX)/include/subversion-1
CFLAGS = -g -O2 -Wall -Werror=implicit-function-declaration \
         $(shell pkg-config --cflags-only-other apr-1)
LDFLAGS = $(shell pkg-config --libs-only-L apr-1) \
          $(shell pkg-config --libs-only-other apr-1) \
          -L$(SUBVERSION_1_PREFIX)/lib
LIBS = $(shell pkg-config --libs-only-l apr-1) -lsvn_subr-1
EXEEXT =

bench_svn_subst_translate_string$(EXEEXT): bench_svn_subst_translate_string.o
	$(CC) -o $@ $(LDFLAGS) $+ $(LIBS)

bench_svn_subst_translate_string.o: Makefile

# To run (pick the LD_LIBRARY_PATH that matches SUBVERSION_1_PREFIX):
# export LD_LIBRARY_PATH=/usr/local/stow/apache-subversion-tr...@1040115/lib
# export LD_LIBRARY_PATH=/usr/local/stow/dtrebbien-subversion-HEAD/lib
# ldd bench_svn_subst_translate_string
# for j in $(seq 30); do ./bench_svn_subst_translate_string; done