Attached are a benchmark and a Makefile that I used to compare the speed of svn_subst_translate_string() from trunk with that of the new svn_subst_translate_string2(). The program reads a text file named `2600.txt` in the current working directory and calls svn_subst_translate_string() on its contents 100 times. For `2600.txt`, I used the plain text version of War and Peace from Project Gutenberg (http://www.gutenberg.org/ebooks/2600.txt.utf8).
Each number below is the value of (after_clocks - before_clocks) printed by one run of the benchmark, in clock() ticks. I ran each configuration 30 times.

The data that I generated for tr...@1040115 were:

trunk_at_1040115 <- c(7780000, 7910000, 7870000, 7660000, 7840000, 7760000,
                      7620000, 7500000, 7860000, 7800000, 7640000, 7740000,
                      7760000, 7850000, 8010000, 7800000, 7730000, 7700000,
                      7900000, 7760000, 7790000, 7970000, 7700000, 7710000,
                      7990000, 7830000, 7780000, 7810000, 7730000, 7600000)

The data for the "HEAD" sources (commit 6f828b0a4e07d1e14189b9b8c84bd0f884c59164 from my repo; https://github.com/dtrebbien/subversion/tree/6f828b0a4e07d1e14189b9b8c84bd0f884c59164) were:

HEAD <- c(8050000, 8230000, 7980000, 8150000, 7950000, 8600000,
          8080000, 8420000, 8000000, 8020000, 8420000, 7960000,
          8010000, 8200000, 8080000, 8490000, 8190000, 7920000,
          7820000, 7780000, 7880000, 8540000, 7970000, 8250000,
          8830000, 8540000, 8310000, 8270000, 8010000, 7990000)

Note: This is not "version 3" of the patch. It is essentially tr...@1040115 plus "version 3" plus this changeset:
https://github.com/dtrebbien/subversion/commit/d22329a54dcf58cddc2b618f913597c6defbcb2d

The t-test allows us to conclude with high confidence that the mean time to run the benchmark with libsvn_subr-1 compiled from tr...@1040115 is less than the mean time to run the benchmark with libsvn_subr-1 compiled from the HEAD sources:

> t.test(trunk_at_1040115, HEAD, alternative = "less", var.equal = TRUE, conf.level = 0.90)

        Two Sample t-test

data:  trunk_at_1040115 and HEAD
t = -7.473, df = 58, p-value = 2.350e-10
alternative hypothesis: true difference in means is less than 0
90 percent confidence interval:
      -Inf -317939.7
sample estimates:
mean of x mean of y
  7780000   8164667

I realized, however, that this is not a fair comparison: in the HEAD sources, svn_subst_translate_string() simply calls svn_subst_translate_string2(), so the benchmark was measuring an extra layer of indirection.

After modifying the benchmark to call svn_subst_translate_string2() directly, I generated these timings:

HEAD_new <- c(7850000, 7890000, 8080000, 7980000, 7820000, 7880000,
              7850000, 7540000, 8470000, 8230000, 8410000, 7880000,
              7410000, 7490000, 7420000, 7650000, 7430000, 7430000,
              7530000, 7720000, 7940000, 7780000, 8070000, 7840000,
              7870000, 7970000, 7690000, 7910000, 7860000, 7620000)

Now we cannot reject the null hypothesis that the mean time to run the benchmark with libsvn_subr-1 compiled from tr...@1040115 is greater than or equal to the mean time to run the modified benchmark with libsvn_subr-1 compiled from the HEAD sources:

> t.test(trunk_at_1040115, HEAD_new, alternative = "less", var.equal = TRUE, conf.level = 0.90)

        Two Sample t-test

data:  trunk_at_1040115 and HEAD_new
t = -0.6839, df = 58, p-value = 0.2484
alternative hypothesis: true difference in means is less than 0
90 percent confidence interval:
     -Inf 33129.55
sample estimates:
mean of x mean of y
  7780000   7817000

One other set of timings that I generated was for the modified benchmark running with libsvn_subr-1 compiled from the HEAD sources, slightly modified to set `repair` to TRUE:

HEAD_new_repair <- c(7660000, 7560000, 7570000, 7540000, 7670000, 7790000,
                     7460000, 7840000, 8060000, 7790000, 8000000, 7830000,
                     8370000, 8010000, 7730000, 7800000, 7900000, 7730000,
                     7730000, 7790000, 7750000, 7930000, 7860000, 7810000,
                     7930000, 7840000, 7890000, 7460000, 7790000, 7730000)

We cannot reject the null hypothesis that the mean time to run the modified benchmark with libsvn_subr-1 compiled from the HEAD sources is the same as the mean time to run the modified benchmark with libsvn_subr-1 compiled from the slightly modified HEAD sources (`repair` set to TRUE):

> t.test(HEAD_new, HEAD_new_repair, var.equal = TRUE, conf.level = 0.90)

        Two Sample t-test

data:  HEAD_new and HEAD_new_repair
t = 0.3815, df = 58, p-value = 0.7042
alternative hypothesis: true difference in means is not equal to 0
90 percent confidence interval:
 -77774.74 123774.74
sample estimates:
mean of x mean of y
  7817000   7794000

Likewise, we cannot reject the null hypothesis that the mean time for tr...@1040115 is greater than or equal to the mean time for the `repair` build:

> t.test(trunk_at_1040115, HEAD_new_repair, alternative = "less", var.equal = TRUE, conf.level = 0.90)

        Two Sample t-test

data:  trunk_at_1040115 and HEAD_new_repair
t = -0.3501, df = 58, p-value = 0.3638
alternative hypothesis: true difference in means is less than 0
90 percent confidence interval:
     -Inf 37836.36
sample estimates:
mean of x mean of y
  7780000   7794000

Therefore, I do not have evidence to support my earlier claim: "3.) This penalizes repair translations."

My conclusion from all of this is that, regardless of the value of `repair`, my changes do not appear to decrease the performance of svn_subst_translate_string() as long as svn_subst_translate_string2() is called directly.
/* Benchmark 1: times 100 calls to svn_subst_translate_string() on 2600.txt. */
#include <errno.h>   /* errno, used when stat() fails */
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <sys/stat.h>

#include <apr.h>
#include <apr_errno.h>
#include <apr_general.h>
#include <apr_pools.h>
#include <apr_strings.h>
#include <apr_tables.h>

#include "svn_types.h"
#include "svn_error_codes.h"
#include "svn_error.h"
#include "svn_string.h"
#include "svn_io.h"
#include "svn_subst.h"

#ifndef APR_STATUS_IS_SUCCESS
#define APR_STATUS_IS_SUCCESS(s) ((s) == APR_SUCCESS)
#endif

/* Scratch buffer for error messages. */
static char s_1KiB_buf[1024];

int main(int argc, const char *const *argv, const char *const *env)
{
  apr_status_t apr_status = apr_app_initialize(&argc, &argv, &env);
  if (! APR_STATUS_IS_SUCCESS(apr_status)) {
    fprintf(stderr, "`apr_app_initialize` failed: %s\n",
            apr_strerror(apr_status, s_1KiB_buf,
                         sizeof s_1KiB_buf/sizeof s_1KiB_buf[0]));
    return EXIT_FAILURE;
  }
  atexit(apr_terminate);

  apr_pool_t *root_pool = NULL;
  apr_status = apr_pool_create(&root_pool, NULL);
  if (! APR_STATUS_IS_SUCCESS(apr_status)) {
    fprintf(stderr, "`apr_pool_create` failed: %s\n",
            apr_strerror(apr_status, s_1KiB_buf,
                         sizeof s_1KiB_buf/sizeof s_1KiB_buf[0]));
    return EXIT_FAILURE;
  }

  /* Size the buffer from the file's size, plus one byte for the NUL. */
  struct stat st;
  if (-1 == stat("2600.txt", &st)) {
    fprintf(stderr, "Failed to stat `2600.txt`. errno = %d\n", (int) errno);
    apr_pool_destroy(root_pool);
    return EXIT_FAILURE;
  }

  char *const data = (char*) malloc((size_t) (st.st_size + 1));
  if (data == NULL) {
    fprintf(stderr, "`malloc` failed to allocate %d bytes.\n",
            (int) (st.st_size + 1));
    apr_pool_destroy(root_pool);
    return EXIT_FAILURE;
  }

  FILE *fp = fopen("2600.txt", "r");
  if (fp == NULL) {
    fprintf(stderr, "Failed to open `2600.txt` for reading\n");
    free(data);
    apr_pool_destroy(root_pool);
    return EXIT_FAILURE;
  }

  /* Read the whole file into `data`, retrying on short reads. */
  off_t bytes_read = 0;
  char *p = data + bytes_read;
  size_t size = fread(p, 1, st.st_size - bytes_read, fp);
  bytes_read += size;
  p += size;
  while (bytes_read < st.st_size && ! ferror(fp) && ! feof(fp)) {
    size = fread(p, 1, st.st_size - bytes_read, fp);
    bytes_read += size;
    p += size;
  }
  *p = '\0';
  if (ferror(fp)) {
    fprintf(stderr, "An I/O error occurred.\n");
    fclose(fp);
    free(data);
    apr_pool_destroy(root_pool);
    return EXIT_FAILURE;
  }
  fclose(fp);

  /* Timed section: 100 translations, each in its own subpool. */
  const clock_t before_clocks = clock();
  for (size = 0; size < 100; ++size) {
    apr_pool_t *pool = NULL;
    apr_status = apr_pool_create(&pool, root_pool);
    if (! APR_STATUS_IS_SUCCESS(apr_status)) {
      fprintf(stderr, "`apr_pool_create` failed: %s\n",
              apr_strerror(apr_status, s_1KiB_buf,
                           sizeof s_1KiB_buf/sizeof s_1KiB_buf[0]));
      free(data);
      apr_pool_destroy(root_pool);
      return EXIT_FAILURE;
    }

    svn_string_t *translated_string = NULL;
    svn_error_t *svn_error =
      svn_subst_translate_string(&translated_string,
                                 svn_string_create(data, pool),
                                 "UTF-8", pool);
    if (svn_error) {
      fprintf(stderr, "`svn_subst_translate_string` failed: %s\n",
              svn_err_best_message(svn_error, s_1KiB_buf,
                                   sizeof s_1KiB_buf/sizeof s_1KiB_buf[0]));
      apr_pool_destroy(pool);
      free(data);
      apr_pool_destroy(root_pool);
      return EXIT_FAILURE;
    }
    apr_pool_destroy(pool);
  }
  const clock_t after_clocks = clock();

  printf("(after_clocks - before_clocks) = %d\n",
         (int) (after_clocks - before_clocks));

  free(data);
  apr_pool_destroy(root_pool);
  return EXIT_SUCCESS;
}
/* Benchmark 2 (modified): times 100 direct calls to
   svn_subst_translate_string2() on 2600.txt. */
#include <errno.h>   /* errno, used when stat() fails */
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <sys/stat.h>

#include <apr.h>
#include <apr_errno.h>
#include <apr_general.h>
#include <apr_pools.h>
#include <apr_strings.h>
#include <apr_tables.h>

#include "svn_types.h"
#include "svn_error_codes.h"
#include "svn_error.h"
#include "svn_string.h"
#include "svn_io.h"
#include "svn_subst.h"

#ifndef APR_STATUS_IS_SUCCESS
#define APR_STATUS_IS_SUCCESS(s) ((s) == APR_SUCCESS)
#endif

/* Scratch buffer for error messages. */
static char s_1KiB_buf[1024];

int main(int argc, const char *const *argv, const char *const *env)
{
  apr_status_t apr_status = apr_app_initialize(&argc, &argv, &env);
  if (! APR_STATUS_IS_SUCCESS(apr_status)) {
    fprintf(stderr, "`apr_app_initialize` failed: %s\n",
            apr_strerror(apr_status, s_1KiB_buf,
                         sizeof s_1KiB_buf/sizeof s_1KiB_buf[0]));
    return EXIT_FAILURE;
  }
  atexit(apr_terminate);

  apr_pool_t *root_pool = NULL;
  apr_status = apr_pool_create(&root_pool, NULL);
  if (! APR_STATUS_IS_SUCCESS(apr_status)) {
    fprintf(stderr, "`apr_pool_create` failed: %s\n",
            apr_strerror(apr_status, s_1KiB_buf,
                         sizeof s_1KiB_buf/sizeof s_1KiB_buf[0]));
    return EXIT_FAILURE;
  }

  /* Size the buffer from the file's size, plus one byte for the NUL. */
  struct stat st;
  if (-1 == stat("2600.txt", &st)) {
    fprintf(stderr, "Failed to stat `2600.txt`. errno = %d\n", (int) errno);
    apr_pool_destroy(root_pool);
    return EXIT_FAILURE;
  }

  char *const data = (char*) malloc((size_t) (st.st_size + 1));
  if (data == NULL) {
    fprintf(stderr, "`malloc` failed to allocate %d bytes.\n",
            (int) (st.st_size + 1));
    apr_pool_destroy(root_pool);
    return EXIT_FAILURE;
  }

  FILE *fp = fopen("2600.txt", "r");
  if (fp == NULL) {
    fprintf(stderr, "Failed to open `2600.txt` for reading\n");
    free(data);
    apr_pool_destroy(root_pool);
    return EXIT_FAILURE;
  }

  /* Read the whole file into `data`, retrying on short reads. */
  off_t bytes_read = 0;
  char *p = data + bytes_read;
  size_t size = fread(p, 1, st.st_size - bytes_read, fp);
  bytes_read += size;
  p += size;
  while (bytes_read < st.st_size && ! ferror(fp) && ! feof(fp)) {
    size = fread(p, 1, st.st_size - bytes_read, fp);
    bytes_read += size;
    p += size;
  }
  *p = '\0';
  if (ferror(fp)) {
    fprintf(stderr, "An I/O error occurred.\n");
    fclose(fp);
    free(data);
    apr_pool_destroy(root_pool);
    return EXIT_FAILURE;
  }
  fclose(fp);

  /* Timed section: 100 translations, each in its own subpool. */
  const clock_t before_clocks = clock();
  for (size = 0; size < 100; ++size) {
    apr_pool_t *pool = NULL;
    apr_status = apr_pool_create(&pool, root_pool);
    if (! APR_STATUS_IS_SUCCESS(apr_status)) {
      fprintf(stderr, "`apr_pool_create` failed: %s\n",
              apr_strerror(apr_status, s_1KiB_buf,
                           sizeof s_1KiB_buf/sizeof s_1KiB_buf[0]));
      free(data);
      apr_pool_destroy(root_pool);
      return EXIT_FAILURE;
    }

    svn_string_t *translated_string = NULL;
    svn_boolean_t translated_eol = -1;
    svn_error_t *svn_error =
      svn_subst_translate_string2(&translated_string, NULL, &translated_eol,
                                  svn_string_create(data, pool),
                                  "UTF-8", pool, pool);
    if (svn_error) {
      fprintf(stderr, "`svn_subst_translate_string2` failed: %s\n",
              svn_err_best_message(svn_error, s_1KiB_buf,
                                   sizeof s_1KiB_buf/sizeof s_1KiB_buf[0]));
      apr_pool_destroy(pool);
      free(data);
      apr_pool_destroy(root_pool);
      return EXIT_FAILURE;
    }
    apr_pool_destroy(pool);
  }
  const clock_t after_clocks = clock();

  printf("(after_clocks - before_clocks) = %d\n",
         (int) (after_clocks - before_clocks));

  free(data);
  apr_pool_destroy(root_pool);
  return EXIT_SUCCESS;
}
#SUBVERSION_1_PREFIX = /usr/local/stow/apache-subversion-tr...@1040115
SUBVERSION_1_PREFIX = /usr/local/stow/dtrebbien-subversion-HEAD

CPPFLAGS = $(shell pkg-config --cflags-only-I apr-1) -I$(SUBVERSION_1_PREFIX)/include/subversion-1
CFLAGS = -g -O2 -Wall -Werror=implicit-function-declaration $(shell pkg-config --cflags-only-other apr-1)
LDFLAGS = $(shell pkg-config --libs-only-L apr-1) $(shell pkg-config --libs-only-other apr-1) -L$(SUBVERSION_1_PREFIX)/lib
LIBS = $(shell pkg-config --libs-only-l apr-1) -lsvn_subr-1
EXEEXT =

bench_svn_subst_translate_string$(EXEEXT): bench_svn_subst_translate_string.o
	$(CC) -o $@ $(LDFLAGS) $+ $(LIBS)

bench_svn_subst_translate_string.o: Makefile

# export LD_LIBRARY_PATH=/usr/local/stow/apache-subversion-tr...@1040115/lib
# export LD_LIBRARY_PATH=/usr/local/stow/dtrebbien-subversion-HEAD/lib
# ldd bench_svn_subst_translate_string
# for j in $(seq 30); do ./bench_svn_subst_translate_string; done