Hello Albert, please note that in the previous patch, I made a mistake with the move constructor (the ref count should not be moved/swapped since it is tied to the instance, not its value). Attached is a fixed patch.
I found the issue during further performance tests which however only continued to strengthen to zero sum result. In any case, I extend the Python-based perftest script with a C++-based driver to be able to reliably measure single-page runtimes for this, which I will also attach if anyone is interested in this. Best regards, Adam. Am 03.03.2018 um 20:29 schrieb Albert Astals Cid: > Ha! didn't realize you had already done all that, will read/answer the other > email. > > Cheers, > Albert > > El dissabte, 3 de març de 2018, a les 20:25:32 CET, Albert Astals Cid va > escriure: >> El dimecres, 21 de febrer de 2018, a les 7:13:57 CET, Adam Reichold va >> >> escriure: >>> Hello again, >>> >>> Am 21.02.2018 um 00:31 schrieb Albert Astals Cid: >>>> El dimarts, 20 de febrer de 2018, a les 8:58:24 CET, Adam Reichold va >>>> >>>> escriure: >>>>> Hello again, >>>>> >>>>> Am 18.02.2018 um 23:23 schrieb Adam Reichold: >>>>>> Am 18.02.2018 um 23:08 schrieb Albert Astals Cid: >>>>>>> El diumenge, 18 de febrer de 2018, a les 16:55:37 CET, Adam Reichold >>>>>>> va >>>>>>> >>>>>>> escriure: >>>>>>>> Hello, >>>>>>>> >>>>>>>> the attached patch replaced Poppler's internal hash table >>>>>>>> implementation >>>>>>>> GooHash by std::unordered_map found in the C++ standard library >>>>>>>> since >>>>>>>> C++11. This continues Poppler's slow drift towards standard library >>>>>>>> containers and removes one of the home-grown data structures with >>>>>>>> the >>>>>>>> main goals of reducing code size and improving the long term >>>>>>>> maintainability of the code base. >>>>>>> >>>>>>> Do you have any benchmarks on "rendering" speed and code size? >>>>>> >>>>>> Sorry, not at hand. I will try to prepare them during the week. >>>>> >>>>> I did run Splash rendering benchmarks of this branch against master >>>>> with >>>>> the result of rendering the circa 2400 documents of the TeXLive >>>> >>>>> documentation present on my machine being: >>>> I'm wondering if those 2400 documents are diverse enough, which they may >>>> not be given they are all coming from "the same place". >>> >>> They seem pretty diverse w.r.t. content, some being text heavy and some >>> graphics rich. But I guess they are definitely not diverse technically >>> as all are prepared using TeX itself. >>> >>> The main problem on my side is that I have failed to find my DVD copy of >>> the Poppler regtest document collection until now. :-\ In any case, I am >>> reasonably confident on the zero sum result since GooHash does not seem >>> to live in any performance critical code path. >>> >>>>> Cumulative run time: >>>>> Result: 90.95 min ∓ 1.1 % >>>>> Reference: 91.57 min ∓ 1.2 % >>>>> Deviation: -0.0 % >>>>> >>>>> Cumulative memory usage: >>>>> Result: 37.2 MB ∓ 0.7 % >>>>> Reference: 37.0 MB ∓ 0.7 % >>>>> Deviation: +0.0 % >>>>> >>>>> (Where result is this patch and the reference is master.) (The >>>>> measurement was taken using the perftest script which I proposed here >>>>> some time ago for which I'll attach the patch again for >>>>> reproduceability.) >>>> >>>> Did you really attach this before? i really don't remember it :D >>> >>> This was a very long-winded thread ending on 2016-01-01 centered around >>> the regtest framework. It went from custom driver using QTest's >>> benchmark functionality to custom backend in the regtest framework to >>> separate Python script. The script still has problems to reliably >>> measure really short things like running "pdftotext" for which a custom >>> C++ driver might be needed, but for long-running stuff like "pdftoppm", >>> the results seem reasonable. >>> >>>> Anyhow it seems the change is mostly a zero-sum runtime wise. >>> >>> Indeed. (Although thinking really really long term, I think a Poppler >>> that is using std::vector, std::string and friends everywhere, could >>> loose a lot of distributed fat in the form of a lot of memory >>> indirections and benefit from the highly optimized standard library. But >>> it just is not a single patch thing to reap any of these benefits.) >>> >>>> Which bring us to the code side. First i know it sucks but your spacing >>>> is >>>> broken, please check the whole patch >>>> >>>> - nameToGID->removeInt(macGlyphNames[j]); >>>> - nameToGID->add(new GooString(macGlyphNames[j]), i); >>>> + nameToGID[macGlyphNames[j]] = i; >>>> >>>> i could fix it, but it's better (for me) if you do :D >>> >>> Definitely for me to fix. The main problem is that the spacing in those >>> files was already broken and I am unsure whether I should fix the >>> surrounding spacing even if I am not touching it otherwise. Due to that, >>> there sometimes is no correct way in the current patch. If you do not >>> say otherwise, I will try to make at least the touched blocks of code >>> consistent. >> >> Are you sure the spacing is broken? I'd say it's just xpdf weird spacing >> rules of using 2 soft spaces and then hard tabs at 8. >> >>>> Now onto the code, >>>> >>>> const auto emplaceRangeMap = [&](const char* encodingName, GBool >>>> unicodeOut,> >>>> >>>> UnicodeMapRange* ranges, int len) { >>>> >>>> residentUnicodeMaps.emplace( >>>> >>>> std::piecewise_construct, >>>> std::forward_as_tuple(encodingName), >>>> std::forward_as_tuple(encodingName, unicodeOut, ranges, len) >>>> >>>> ); >>>> >>>> }; >>>> >>>> It's the first time in my life i see std::piecewise_construct and >>>> std::forward_as_tuple, yes that probably makes me a bad C++ developer, >>>> but >>>> given there's lots of us around, it doesn't make me happy now i don't >>>> understand what the code does. >>> >>> The problem is that most internal Poppler types lack proper construction >>> and assignment operators, hence I need to work harder to construct those >>> UnicodeMap instances in-place inside the map (GooHash just stored a >>> pointer to it, so there was no problem.) >>> >>> The whole piecewise_construct and forward_as_tuple dance just ensures, >>> that those parameters to emplace are properly grouped before being >>> unpacked to become the parameters of std::string and UnicodeMap >>> construction again. If UnicodeMap was movable, this would probably look >>> like >>> >>> residentUnicodeMaps.emplace( >>> >>> encodingName, >>> UnicodeMap{encodingName, unicodeOut, ranges, len} >>> >>> ); >>> >>> If you like, I can try to make Unicode a move-only type and simplify the >>> mentioned helper functions? >> >> you can give it a try, not sure how easy that's going to be. >> >>>> Then there's the benefit/risk ratio. The code using GooHash is mostly >>>> rocksolid and we haven't probably touched any line in it for a long time >>>> and we have probably neither written new code that uses GooHash. >>>> >>>> In that regard we risk breaking working code. >>>> >>>> The benefit is not more speed nor less memory usage as your measurements >>>> show. >>>> >>>> Mostly the benefit is "removing code from maintainership", which i agree >>>> is a good thing, but as mentioned before it's some code "we mostly >>>> ignore", so it's not that much of a good thing. >>> >>> I very much agree with the risk assessment. >>> >>> But I also think the code will ossify (or maybe already is?) due to >>> those custom data structures and the less than idiomatic C++ usage. >>> Hence I think, Poppler would not just loose code, but the remaining code >>> should become easier to maintain. (Of course, the piecewise_construct >>> fiasco shows that this has intermediate costs. But again, I think this >>> is just an incentive to provide types with the usual C++ semantics which >>> should make all code using those types better.) >> >> Yeah, i guess you're right. >> >> Cheers, >> Albert >> >>> Best regards, Adam. >>> >>>> So i'm hesitant of what to do. It really sounds like a nice thing to not >>>> have custom structures, but on the other hand it's not like they get >>>> much >>>> in the way of coding. >>>> >>>> I'd really appreciate other people's comments on this. >>>> >>>> Cheers, >>>> >>>> Albert >>>>> >>>>> I'll also attach the detailed comparison, but the gist seems to be that >>>>> if there are significant changes, the run time is reduced but the >>>>> memory >>>>> usage is increased in the majority of cases. But this does not seem to >>>>> show up in the cumulative results. >>>>> >>>>> Best regards, Adam. >>>>> >>>>> P.S.: One could try to improve the memory usage by tuning the load >>>>> factor or calling shrink_to_fit where appropriate. Would you like me to >>>>> try to do this? >>>>> >>>>> P.P.S.: One obvious area for improvement would be better >>>>> interoperability between GooString and std::string, i.e. not converting >>>>> them as C strings so that the length information does not need to be >>>>> recomputed. I will try to prepare this as a separate patch on top of >>>>> this one or should I include this here? >>>>> >>>>> Best regards, Adam. >>>>> >>>>>> Concerning code size, a release build of libpoppler.so goes from >>>>>> >>>>>> text data bss dec hex filename >>>>>> >>>>>> 2464034 288852 360 2753246 2a02de libpoppler.so.73.0.0 >>>>>> >>>>>> for the current master to >>>>>> >>>>>> text data bss dec hex filename >>>>>> >>>>>> 2482129 288756 360 2771245 2a492d libpoppler.so.73.0.0 >>>>>> >>>>>> with the patch applied, i.e. a 0.65% increase in binary size. >>>>>> >>>>>> >>>>>> Please note that in my original message, I was referring only to >>>>>> source >>>>>> code size, i.e. >>>>>> >>>>>> git diff --stat master...remove-goo-hash >>>>>> >>>>>> 18 files changed, 168 insertions(+), 803 deletions(-) >>>>>> >>>>>>> Cheers, >>>>>>> >>>>>>> Albert >>>>>> >>>>>> Best regards, Adam. >>>>>> >>>>>>>> Best regards, Adam. >>>>>>> >>>>>>> _______________________________________________ >>>>>>> poppler mailing list >>>>>>> poppler@lists.freedesktop.org >>>>>>> https://lists.freedesktop.org/mailman/listinfo/poppler >>>>>> >>>>>> _______________________________________________ >>>>>> poppler mailing list >>>>>> poppler@lists.freedesktop.org >>>>>> https://lists.freedesktop.org/mailman/listinfo/poppler >>>> >>>> _______________________________________________ >>>> poppler mailing list >>>> poppler@lists.freedesktop.org >>>> https://lists.freedesktop.org/mailman/listinfo/poppler >> >> _______________________________________________ >> poppler mailing list >> poppler@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/poppler > > > > > _______________________________________________ > poppler mailing list > poppler@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/poppler >
From d71ecca49f09b39c109378a9c26e2b223d91cb92 Mon Sep 17 00:00:00 2001 From: Adam Reichold <adam.reich...@t-online.de> Date: Tue, 20 Feb 2018 08:41:04 +0100 Subject: [PATCH] Add simple Python script to measure and comprare rendering performance of Poppler builds. --- CMakeLists.txt | 1 + perftest/CMakeLists.txt | 4 + perftest/compare.py | 114 ++++++++++++++++++ perftest/driver.cc | 291 ++++++++++++++++++++++++++++++++++++++++++++++ perftest/measure.py | 120 +++++++++++++++++++ perftest/poppler-perftest | 40 +++++++ perftest/util.py | 20 ++++ 7 files changed, 590 insertions(+) create mode 100644 perftest/CMakeLists.txt create mode 100644 perftest/compare.py create mode 100644 perftest/driver.cc create mode 100644 perftest/measure.py create mode 100755 perftest/poppler-perftest create mode 100644 perftest/util.py diff --git a/CMakeLists.txt b/CMakeLists.txt index 5e3d6a17..5576d16a 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -661,6 +661,7 @@ if(ENABLE_GLIB) add_subdirectory(glib) endif() add_subdirectory(test) +add_subdirectory(perftest) if(ENABLE_QT5) add_subdirectory(qt5) endif() diff --git a/perftest/CMakeLists.txt b/perftest/CMakeLists.txt new file mode 100644 index 00000000..84e1e3a2 --- /dev/null +++ b/perftest/CMakeLists.txt @@ -0,0 +1,4 @@ +if (ENABLE_SPLASH) + add_executable(driver driver.cc) + target_link_libraries(driver poppler) +endif () diff --git a/perftest/compare.py b/perftest/compare.py new file mode 100644 index 00000000..58eb8c9e --- /dev/null +++ b/perftest/compare.py @@ -0,0 +1,114 @@ +import pickle +import zlib + +from util import reference, reldev, maxabs + +def collect_stats3a(stats, entry): + if stats is None: + stats = (0, 0) + + sum, acc = stats + mean, stdev = entry + stats = (sum + mean, acc + abs(stdev / mean)) + + return stats + +def collect_stats3b(stats, entry): + if stats is None: + stats = 0 + + stats += entry + + return stats + +def collect_stats2(stats, entry): + if stats is None: + stats = { 'run_time': None, 'memory_usage': None } + + stats['run_time'] = collect_stats3a(stats['run_time'], entry['run_time']) + stats['memory_usage'] = collect_stats3b(stats['memory_usage'], entry['memory_usage']) + + return stats + +def collect_stats1(stats, entry, ref_entry): + if stats is None: + stats = { 'results': None, 'ref_results': None } + + stats['results'] = collect_stats2(stats['results'], entry) + stats['ref_results'] = collect_stats2(stats['ref_results'], ref_entry) + + return stats + +def print_stats(count, stats): + run_time_sum, ref_run_time_sum = stats['results']['run_time'][0], stats['ref_results']['run_time'][0] + run_time_acc, ref_run_time_acc = stats['results']['run_time'][1] / count, stats['ref_results']['run_time'][1] / count + run_time_reldev = reldev(run_time_sum, ref_run_time_sum) + + memory_usage_sum, ref_memory_usage_sum = stats['results']['memory_usage'], stats['ref_results']['memory_usage'] + memory_usage_reldev = reldev(memory_usage_sum, ref_memory_usage_sum) + + print('\tCumulative run time:') + print('\t\tResult: %.2f min â %.1f %%' % (run_time_sum * 1.0e-6 / 60.0, run_time_acc * 100.0)) + print('\t\tReference: %.2f min â %.1f %%' % (ref_run_time_sum * 1.0e-6 / 60.0, ref_run_time_acc * 100.0)) + print('\t\tDeviation: %+.2f %%' % (run_time_reldev)) + print('\tCumulative memory usage:') + print('\t\tResult: %.1f MB %%' % (memory_usage_sum / 1024.0 / 1024.0)) + print('\t\tReference: %.1f MB %%' % (ref_memory_usage_sum / 1024.0 / 1024.0)) + print('\t\tDeviation: %+.2f %%' % (memory_usage_reldev)) + +def compare(args): + with open(args.results, 'rb') as file: + data = file.read() + data = zlib.decompress(data) + results = pickle.loads(data) + with open(args.reference_results, 'rb') as file: + data = file.read() + data = zlib.decompress(data) + ref_results = pickle.loads(data) + + count = 0 + stats = None + count_above_threshold = 0 + stats_above_threshold = None + + for document in results.keys() & ref_results.keys(): + entries = results[document] + ref_entries = ref_results[document] + + for page in entries.keys() & ref_entries.keys(): + entry = entries[page] + ref_entry = ref_entries[page] + + count += 1 + stats = collect_stats1(stats, entry, ref_entry) + + run_time_mean, run_time_stdev = entry['run_time'] + ref_run_time_mean, ref_run_time_stdev = ref_entry['run_time'] + run_time_reldev = reldev(run_time_mean, ref_run_time_mean) + + memory_usage = entry['memory_usage'] + ref_memory_usage = ref_entry['memory_usage'] + memory_usage_reldev = reldev(memory_usage, ref_memory_usage) + + if maxabs(run_time_reldev, memory_usage_reldev) <= args.threshold: + continue + + count_above_threshold += 1 + stats_above_threshold = collect_stats1(stats, entry, ref_entry) + + print('%s:' % (reference(document, page))) + print('\tRun time:') + print('\t\tResult: %.2f â %.3f s' % (run_time_mean * 1.0e-6, run_time_stdev * 1.0e-6)) + print('\t\tReference: %.2f â %.3f s' % (ref_run_time_mean * 1.0e-6, ref_run_time_stdev * 1.0e-6)) + print('\t\tDeviation: %.1f %%' % (run_time_reldev * 100.0)) + print('\tMemory usage:') + print('\t\tResult: %.1f kB' % (memory_usage / 1024.0)) + print('\t\tReference: %.1f kB' % (ref_memory_usage / 1024.0)) + print('\t\tDeviation: %.1f %%' % (memory_usage_reldev * 100.0)) + + print('%d matching result(s):' % (count)) + print_stats(count, stats) + + if count_above_threshold != 0: + print('%d matching result(s) above the given threshold %.1f %%:' % (count_above_threshold, args.threshold * 100.0)) + print_stats(count_above_threshold, stats_above_threshold) diff --git a/perftest/driver.cc b/perftest/driver.cc new file mode 100644 index 00000000..f031f697 --- /dev/null +++ b/perftest/driver.cc @@ -0,0 +1,291 @@ +#include <string> +#include <memory> +#include <functional> +#include <vector> +#include <iostream> +#include <algorithm> +#include <numeric> + +#include <cstdlib> +#include <cmath> + +#include <time.h> + +#include "PDFDoc.h" +#include "SplashOutputDev.h" +#include "splash/SplashBitmap.h" +#include "TextOutputDev.h" + +namespace +{ + +std::unique_ptr<PDFDoc> openDocument(const char *filePath) { + std::unique_ptr<PDFDoc> document{ + new PDFDoc(new GooString(filePath), nullptr, nullptr, nullptr) + }; + + if (!document->isOk()) { + document.reset(); + } + + return document; +} + +std::unique_ptr<SplashOutputDev> openSplashDevice(PDFDoc *document, SplashColorPtr paperColor) { + std::unique_ptr<SplashOutputDev> device{ + new SplashOutputDev(splashModeXBGR8, 4, gFalse, paperColor) + }; + + if (device) { + device->startDoc(document); + } + + return device; +} + +void displayPageUsingSplash(PDFDoc *document, SplashOutputDev *device, int page, double resolution) { + document->displayPage( + device, + page, + resolution, resolution, 0, + gFalse, gFalse, gFalse + ); + + delete device->takeBitmap(); +} + +bool renderAllPagesUsingSplash(PDFDoc *document, SplashColorPtr paperColor, double resolution) { + const auto device = openSplashDevice(document, paperColor); + + if (!device) { + return false; + } + + for (int page = 1, pageCount = document->getNumPages(); page <= pageCount; ++page) { + displayPageUsingSplash(document, device.get(), page, resolution); + } + + return true; +} + +bool renderPageUsingSplash(PDFDoc *document, int page, SplashColorPtr paperColor, double resolution) { + const auto device = openSplashDevice(document, paperColor); + + if (!device) { + return false; + } + + displayPageUsingSplash(document, device.get(), page, resolution); + + delete device->takeBitmap(); + + return true; +} + +std::unique_ptr<TextOutputDev> openTextDevice() { + std::unique_ptr<TextOutputDev> device{ + new TextOutputDev(nullptr, gTrue, 0.0, gFalse, gFalse) + }; + + if (!device->isOk()) { + device.reset(); + } + + return device; +} + +void displayPageAsText(PDFDoc *document, TextOutputDev *device, int page) { + document->displayPage( + device, + page, + 72.0, 72.0, 0, + gFalse, gFalse, gFalse + ); + + delete device->makeWordList(); +} + +bool renderAllPagesAsText(PDFDoc *document) { + const auto device = openTextDevice(); + + if (!device) { + return false; + } + + for (int page = 1, pageCount = document->getNumPages(); page <= pageCount; ++page) { + displayPageAsText(document, device.get(), page); + } + + return true; +} + +bool renderPageAsText(PDFDoc *document, int page) { + const auto device = openTextDevice(); + + if (!device) { + return false; + } + + displayPageAsText(document, device.get(), page); + + return true; +} + +double compute_accuracy(const std::vector<double>& values) { + if (values.size() < 2) { + return std::numeric_limits<double>::max(); + } + + const auto sum = std::accumulate(values.begin(), values.end(), 0.0); + const auto mean = sum / values.size(); + + const auto variance = std::accumulate( + values.begin(), values.end(), 0.0, + [mean](double variance, double value) { + return variance + (value - mean) * (value - mean); + } + ); + const auto stdev = std::sqrt(variance / (values.size() - 1)); + + return std::abs(stdev / mean); +} + +bool check_page_count(const char* filePath) { + const auto document = openDocument(filePath); + + if (!document) { + return false; + } + + std::cout << document->getNumPages(); + + return true; +} + +bool measure_action(const std::function<bool()>& action, int warmUpIterations, int minIterations, int maxIterations, double targetAccuracy) { + std::vector<double> runtimes; + runtimes.reserve(maxIterations); + + for (int iteration = 1; iteration <= warmUpIterations; ++iteration) { + action(); + } + + for (int iteration = 1; iteration <= maxIterations; ++iteration) { + struct timespec before; + ::clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &before); + + if (!action()) { + return false; + } + + struct timespec after; + ::clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &after); + + runtimes.emplace_back( + (after.tv_sec - before.tv_sec) * 1.0e+6 + + (after.tv_nsec - before.tv_nsec) * 1.0e-3 + ); + + if (iteration >= minIterations && compute_accuracy(runtimes) <= targetAccuracy) { + break; + } + } + + std::cout << "["; + + for (auto runtime = runtimes.begin(); runtime != runtimes.end(); ++runtime) { + if (runtime != runtimes.begin()) { + std::cout << ','; + } + std::cout << *runtime; + } + + std::cout << "]"; + + return true; +} + +} + +int main(int argc, char** argv) { + std::unique_ptr<GlobalParams> globalParams{ + new GlobalParams() + }; + + if (!globalParams) { + return EXIT_FAILURE; + } + + ::globalParams = globalParams.get(); + + if (argc == 2) { + return check_page_count(argv[1]) ? EXIT_SUCCESS : EXIT_FAILURE; + } + else if (argc != 8) { + return EXIT_FAILURE; + } + + const auto warmUpIterations = std::atoi(argv[1]); + const auto minIterations = std::atoi(argv[2]); + const auto maxIterations = std::atoi(argv[3]); + const auto targetAccuracy = std::atof(argv[4]); + const auto mode = std::string{argv[5]}; + const auto filePath = argv[6]; + const auto page = std::atoi(argv[7]); + + if (warmUpIterations < 1 || minIterations < 1 || maxIterations < 1 || minIterations > maxIterations) { + return EXIT_FAILURE; + } + + if (mode != "splash" && mode != "text") { + return EXIT_FAILURE; + } + + SplashColor paperColor; + paperColor[0] = 0xFF; + paperColor[1] = 0xFF; + paperColor[2] = 0xFF; + + const auto resolution = 72.0; + + const auto document = openDocument(filePath); + + if (!document) { + return EXIT_FAILURE; + } + + const auto pageCount = document->getNumPages(); + + if (pageCount < 1 || pageCount < page) { + return EXIT_FAILURE; + } + + std::function<bool()> action; + + if (mode == "splash" && page < 1) { + action = [&document, &paperColor, resolution]() { + return renderAllPagesUsingSplash(document.get(), paperColor, resolution); + }; + } + else if (mode == "splash") { + action = [&document, page, &paperColor, resolution]() { + return renderPageUsingSplash(document.get(), page, paperColor, resolution); + }; + } + else if (mode == "text" && page < 1) { + action = [&document]() { + return renderAllPagesAsText(document.get()); + }; + } + else if (mode == "text") { + action = [&document, page]() { + return renderPageAsText(document.get(), page); + }; + } + + if (!action) { + return EXIT_FAILURE; + } + + return measure_action(action, warmUpIterations, minIterations, maxIterations, targetAccuracy) ? EXIT_SUCCESS : EXIT_FAILURE; +} diff --git a/perftest/measure.py b/perftest/measure.py new file mode 100644 index 00000000..7d0f6202 --- /dev/null +++ b/perftest/measure.py @@ -0,0 +1,120 @@ +import json +import multiprocessing +import os +import pickle +import statistics +import subprocess +import sys +import time +import zlib + +from util import rewrite, reference + +def check_page_count(document): + global driver + + try: + return int(subprocess.check_output([ driver, document ], stderr = subprocess.DEVNULL)) + except (subprocess.CalledProcessError, ValueError): + return 0 + +def measure_command(command): + try: + process = subprocess.Popen(command, stdout = subprocess.PIPE, stderr = subprocess.DEVNULL) + + _, status, resources = os.wait4(process.pid, 0) + + if not os.WIFEXITED(status) or os.WEXITSTATUS(status) != 0: + return ' '.join(command) + + run_times = json.load(process.stdout) + + run_time = (statistics.mean(run_times), statistics.stdev(run_times)) + memory_usage = resources.ru_maxrss + + return (run_time, memory_usage) + except: + return sys.exc_info()[0] + +def measure_task(task): + global driver, mode, warm_up_iterations, min_iterations, max_iterations, target_accuracy + + document, page = task + + command = [ + driver, + str(warm_up_iterations), str(min_iterations), str(max_iterations), + str(target_accuracy), + mode, + document, + str(page) if page else '0' + ] + + return (document, page, measure_command(command)) + +def measure(args): + global driver, mode, warm_up_iterations, min_iterations, max_iterations, target_accuracy + + driver = args.driver + mode = args.mode + warm_up_iterations = args.warm_up_iterations + min_iterations = args.min_iterations + max_iterations = args.max_iterations + target_accuracy = args.target_accuracy + + todo = 0 + + written = rewrite(0, 'Scanning...') + + tasks = [] + + for path, _, files in os.walk(args.documents): + for file in files: + file_path = os.path.join(path, file) + + page_count = check_page_count(file_path) + + if page_count == 0: + continue + + pages = range(1, page_count + 1) if args.pages else [ None ] + + for page in pages: + tasks.append((file_path, page)) + + todo +=1 + if todo % 100 == 0: + written = rewrite(written, 'Found %d...' % todo) + + done = 0 + + written = rewrite(written, '%d/%d (%.1f%%): Measuring...' % (done, todo, 0)) + begin = time.time() + + with multiprocessing.Pool() as pool: + results = {} + + for result in pool.imap(measure_task, tasks): + document, page, measurement = result + + try: + run_time, memory_usage = measurement + except: + rewrite(written, 'Measurement failed: %s\n' % (measurement)) + sys.exit(1) + + entry = results.setdefault(document, {}).setdefault(page, {}) + entry['run_time'] = run_time + entry['memory_usage'] = memory_usage + + done += 1 + if done % max(1, todo // 500) == 0: + written = rewrite(written, '%d/%d (%.1f%%): Measured %s...' % (done, todo, 100 * done / todo, reference(document, page))) + + end = time.time() + rewrite(written, '%d/%d (%.1f%%): Measurement took %s.\n' % (done, todo, 100, time.strftime('%T', time.gmtime(end - begin)))) + + with open(args.results, 'wb') as file: + data = pickle.dumps(results) + data = zlib.compress(data) + file.write(data) diff --git a/perftest/poppler-perftest b/perftest/poppler-perftest new file mode 100755 index 00000000..517ba48c --- /dev/null +++ b/perftest/poppler-perftest @@ -0,0 +1,40 @@ +#!/usr/bin/env python3 + +import argparse +import sys + +from measure import measure +from compare import compare + +def main(args): + parser = argparse.ArgumentParser() + subparsers = parser.add_subparsers() + + measure_parser = subparsers.add_parser('measure') + measure_parser.set_defaults(func=measure) + measure_parser.add_argument('--driver', default='./driver') + measure_parser.add_argument('--mode', choices=[ 'splash', 'text' ], default='splash') + measure_parser.add_argument('--pages', action='store_true') + measure_parser.add_argument('--warm_up_iterations', type = int, default=5) + measure_parser.add_argument('--min_iterations', type=int, default=5) + measure_parser.add_argument('--max_iterations', type=int, default=25) + measure_parser.add_argument('--target_accuracy', type=float, default=0.01) + measure_parser.add_argument('documents') + measure_parser.add_argument('results') + + compare_parser = subparsers.add_parser('compare') + compare_parser.set_defaults(func=compare) + compare_parser.add_argument('--threshold', type=float, default=0.05) + compare_parser.add_argument('results') + compare_parser.add_argument('reference_results') + + try: + args = parser.parse_args(args) + args.func(args) + return 0 + except AttributeError: + parser.print_help() + return 1 + +if __name__ == '__main__': + sys.exit(main(sys.argv[1:])) diff --git a/perftest/util.py b/perftest/util.py new file mode 100644 index 00000000..2d14a37f --- /dev/null +++ b/perftest/util.py @@ -0,0 +1,20 @@ +import math +import sys + +def rewrite(written, message): + sys.stdout.write('\r' * written + ' ' * written + '\r') + written = sys.stdout.write(message) + sys.stdout.flush() + return written + +def reference(document, page): + if page is not None: + return '%s[%d]' % (document, page) + else: + return document + +def reldev(x, y): + return (x - y) / y + +def maxabs(x, y): + return max(abs(x), abs(y)) -- 2.16.2
From 6991d25814abb2be7692d9def951210f3515c05d Mon Sep 17 00:00:00 2001 From: Adam Reichold <adam.reich...@t-online.de> Date: Tue, 20 Feb 2018 09:07:17 +0100 Subject: [PATCH 1/8] Add conversion methods between GooString and std::string. --- goo/GooString.h | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/goo/GooString.h b/goo/GooString.h index 49f2888e..c694de10 100644 --- a/goo/GooString.h +++ b/goo/GooString.h @@ -38,6 +38,7 @@ #include <stdarg.h> #include <stdlib.h> // for NULL +#include <string> #include "gtypes.h" #ifdef __clang__ @@ -170,6 +171,10 @@ public: // The caller owns the return value GooString *sanitizedName(GBool psmode) const; + // Conversion from and to std::string + explicit GooString(const std::string& str) : GooString(str.data(), str.size()) {} + std::string toStr() const { return std::string(getCString(), getLength()); } + private: GooString(const GooString &other); GooString& operator=(const GooString &other); -- 2.16.2 From 1dc0ddbc864ea5f59904a0802e5cade12621b432 Mon Sep 17 00:00:00 2001 From: Adam Reichold <adam.reich...@t-online.de> Date: Sun, 18 Feb 2018 16:10:16 +0100 Subject: [PATCH 2/8] Replace GooHash by std::unordered_map in FoFiTrueType. --- fofi/FoFiTrueType.cc | 38 ++++++++++++++------------------------ fofi/FoFiTrueType.h | 7 ++++--- 2 files changed, 18 insertions(+), 27 deletions(-) diff --git a/fofi/FoFiTrueType.cc b/fofi/FoFiTrueType.cc index 76299acc..807f7c7a 100644 --- a/fofi/FoFiTrueType.cc +++ b/fofi/FoFiTrueType.cc @@ -43,7 +43,6 @@ #include "goo/gmem.h" #include "goo/GooLikely.h" #include "goo/GooString.h" -#include "goo/GooHash.h" #include "FoFiType1C.h" #include "FoFiTrueType.h" #include "poppler/Error.h" @@ -305,7 +304,6 @@ FoFiTrueType::FoFiTrueType(char *fileA, int lenA, GBool freeFileDataA, int faceI nTables = 0; cmaps = nullptr; nCmaps = 0; - nameToGID = nullptr; parsedOk = gFalse; faceIndex = faceIndexA; gsubFeatureTable = 0; @@ -317,9 +315,6 @@ FoFiTrueType::FoFiTrueType(char *fileA, int lenA, GBool freeFileDataA, int faceI FoFiTrueType::~FoFiTrueType() { gfree(tables); gfree(cmaps); - if (nameToGID) { - delete nameToGID; - } } int FoFiTrueType::getNumCmaps() { @@ -442,11 +437,12 @@ int FoFiTrueType::mapCodeToGID(int i, Guint c) { return gid; } -int FoFiTrueType::mapNameToGID(char *name) { - if (!nameToGID) { +int FoFiTrueType::mapNameToGID(char *name) const { + const auto gid = nameToGID.find(name); + if (gid == nameToGID.end()) { return 0; } - return nameToGID->lookupInt(name); + return gid->second; } GBool FoFiTrueType::getCFFBlock(char **start, int *length) { @@ -1450,7 +1446,7 @@ void FoFiTrueType::parse() { } void FoFiTrueType::readPostTable() { - GooString *name; + std::string name; int tablePos, postFmt, stringIdx, stringPos; GBool ok; int i, j, n, m; @@ -1465,12 +1461,12 @@ void FoFiTrueType::readPostTable() { goto err; } if (postFmt == 0x00010000) { - nameToGID = new GooHash(gTrue); + nameToGID.reserve(258); for (i = 0; i < 258; ++i) { - nameToGID->add(new GooString(macGlyphNames[i]), i); + nameToGID.emplace(macGlyphNames[i], i); } } else if (postFmt == 0x00020000) { - nameToGID = new GooHash(gTrue); + nameToGID.reserve(258); n = getU16BE(tablePos + 32, &ok); if (!ok) { goto err; @@ -1484,8 +1480,7 @@ void FoFiTrueType::readPostTable() { ok = gTrue; j = getU16BE(tablePos + 34 + 2*i, &ok); if (j < 258) { - nameToGID->removeInt(macGlyphNames[j]); - nameToGID->add(new GooString(macGlyphNames[j]), i); + nameToGID[macGlyphNames[j]] = i; } else { j -= 258; if (j != stringIdx) { @@ -1500,23 +1495,21 @@ void FoFiTrueType::readPostTable() { if (!ok || !checkRegion(stringPos + 1, m)) { continue; } - name = new GooString((char *)&file[stringPos + 1], m); - nameToGID->removeInt(name); - nameToGID->add(name, i); + name.assign((char *)&file[stringPos + 1], m); + nameToGID[name] = i; ++stringIdx; stringPos += 1 + m; } } } else if (postFmt == 0x00028000) { - nameToGID = new GooHash(gTrue); + nameToGID.reserve(258); for (i = 0; i < nGlyphs; ++i) { j = getU8(tablePos + 32 + i, &ok); if (!ok) { continue; } if (j < 258) { - nameToGID->removeInt(macGlyphNames[j]); - nameToGID->add(new GooString(macGlyphNames[j]), i); + nameToGID[macGlyphNames[j]] = i; } } } @@ -1524,10 +1517,7 @@ void FoFiTrueType::readPostTable() { return; err: - if (nameToGID) { - delete nameToGID; - nameToGID = nullptr; - } + nameToGID.clear(); } int FoFiTrueType::seekTable(const char *tag) { diff --git a/fofi/FoFiTrueType.h b/fofi/FoFiTrueType.h index 0c42c837..432fe4bb 100644 --- a/fofi/FoFiTrueType.h +++ b/fofi/FoFiTrueType.h @@ -32,11 +32,12 @@ #endif #include "stddef.h" +#include <unordered_map> +#include <string> #include "goo/gtypes.h" #include "FoFiBase.h" class GooString; -class GooHash; struct TrueTypeTable; struct TrueTypeCmap; @@ -82,7 +83,7 @@ public: // Returns the GID corresponding to <name> according to the post // table. Returns 0 if there is no mapping for <name> or if the // font does not have a post table. - int mapNameToGID(char *name); + int mapNameToGID(char *name) const; // Return the mapping from CIDs to GIDs, and return the number of // CIDs in *<nCIDs>. This is only useful for CID fonts. (Only @@ -199,7 +200,7 @@ private: int nGlyphs; int locaFmt; int bbox[4]; - GooHash *nameToGID; + std::unordered_map<std::string,int> nameToGID; GBool openTypeCFF; GBool parsedOk; -- 2.16.2 From 652767270234c4408af5a1fa5508b95d253e2e8b Mon Sep 17 00:00:00 2001 From: Adam Reichold <adam.reich...@t-online.de> Date: Sun, 18 Feb 2018 16:10:59 +0100 Subject: [PATCH 3/8] Replace GooHash by std::unordered_map in GlobalParams. --- poppler/CharCodeToUnicode.cc | 8 +- poppler/CharCodeToUnicode.h | 2 +- poppler/GlobalParams.cc | 217 ++++++++++++++----------------------------- poppler/GlobalParams.h | 33 +++---- 4 files changed, 93 insertions(+), 167 deletions(-) diff --git a/poppler/CharCodeToUnicode.cc b/poppler/CharCodeToUnicode.cc index 17df0292..a5fbe0c3 100644 --- a/poppler/CharCodeToUnicode.cc +++ b/poppler/CharCodeToUnicode.cc @@ -125,7 +125,7 @@ CharCodeToUnicode *CharCodeToUnicode::makeIdentityMapping() { return ctu; } -CharCodeToUnicode *CharCodeToUnicode::parseCIDToUnicode(GooString *fileName, +CharCodeToUnicode *CharCodeToUnicode::parseCIDToUnicode(const char *fileName, GooString *collection) { FILE *f; Unicode *mapA; @@ -134,8 +134,8 @@ CharCodeToUnicode *CharCodeToUnicode::parseCIDToUnicode(GooString *fileName, Unicode u; CharCodeToUnicode *ctu; - if (!(f = openFile(fileName->getCString(), "r"))) { - error(errIO, -1, "Couldn't open cidToUnicode file '{0:t}'", + if (!(f = openFile(fileName, "r"))) { + error(errIO, -1, "Couldn't open cidToUnicode file '{0:s}'", fileName); return nullptr; } @@ -152,7 +152,7 @@ CharCodeToUnicode *CharCodeToUnicode::parseCIDToUnicode(GooString *fileName, if (sscanf(buf, "%x", &u) == 1) { mapA[mapLenA] = u; } else { - error(errSyntaxWarning, -1, "Bad line ({0:d}) in cidToUnicode file '{1:t}'", + error(errSyntaxWarning, -1, "Bad line ({0:d}) in cidToUnicode file '{1:s}'", (int)(mapLenA + 1), fileName); mapA[mapLenA] = 0; } diff --git a/poppler/CharCodeToUnicode.h b/poppler/CharCodeToUnicode.h index 13572217..54e9139f 100644 --- a/poppler/CharCodeToUnicode.h +++ b/poppler/CharCodeToUnicode.h @@ -55,7 +55,7 @@ public: // Read the CID-to-Unicode mapping for <collection> from the file // specified by <fileName>. Sets the initial reference count to 1. // Returns NULL on failure. - static CharCodeToUnicode *parseCIDToUnicode(GooString *fileName, + static CharCodeToUnicode *parseCIDToUnicode(const char *fileName, GooString *collection); // Create a Unicode-to-Unicode mapping from the file specified by diff --git a/poppler/GlobalParams.cc b/poppler/GlobalParams.cc index 50b7eee2..8532285f 100644 --- a/poppler/GlobalParams.cc +++ b/poppler/GlobalParams.cc @@ -65,7 +65,6 @@ #include "goo/gmem.h" #include "goo/GooString.h" #include "goo/GooList.h" -#include "goo/GooHash.h" #include "goo/gfile.h" #include "Error.h" #include "NameToCharCode.h" @@ -548,9 +547,6 @@ Plugin::~Plugin() { GlobalParams::GlobalParams(const char *customPopplerDataDir) : popplerDataDir(customPopplerDataDir) { - UnicodeMap *map; - int i; - #ifdef MULTITHREADED gInitMutex(&mutex); gInitMutex(&unicodeMapCacheMutex); @@ -562,7 +558,7 @@ GlobalParams::GlobalParams(const char *customPopplerDataDir) // scan the encoding in reverse because we want the lowest-numbered // index for each char name ('space' is encoded twice) macRomanReverseMap = new NameToCharCode(); - for (i = 255; i >= 0; --i) { + for (int i = 255; i >= 0; --i) { if (macRomanEncoding[i]) { macRomanReverseMap->add(macRomanEncoding[i], (CharCode)i); } @@ -573,13 +569,7 @@ GlobalParams::GlobalParams(const char *customPopplerDataDir) #endif nameToUnicodeZapfDingbats = new NameToCharCode(); nameToUnicodeText = new NameToCharCode(); - cidToUnicodes = new GooHash(gTrue); - unicodeToUnicodes = new GooHash(gTrue); - residentUnicodeMaps = new GooHash(); - unicodeMaps = new GooHash(gTrue); - cMapDirs = new GooHash(gTrue); toUnicodeDirs = new GooList(); - fontFiles = new GooHash(gTrue); sysFonts = new SysFontList(); psExpandSmaller = gFalse; psShrinkLarger = gTrue; @@ -612,31 +602,39 @@ GlobalParams::GlobalParams(const char *customPopplerDataDir) #endif // set up the initial nameToUnicode tables - for (i = 0; nameToUnicodeZapfDingbatsTab[i].name; ++i) { + for (int i = 0; nameToUnicodeZapfDingbatsTab[i].name; ++i) { nameToUnicodeZapfDingbats->add(nameToUnicodeZapfDingbatsTab[i].name, nameToUnicodeZapfDingbatsTab[i].u); } - for (i = 0; nameToUnicodeTextTab[i].name; ++i) { + for (int i = 0; nameToUnicodeTextTab[i].name; ++i) { nameToUnicodeText->add(nameToUnicodeTextTab[i].name, nameToUnicodeTextTab[i].u); } // set up the residentUnicodeMaps table - map = new UnicodeMap("Latin1", gFalse, - latin1UnicodeMapRanges, latin1UnicodeMapLen); - residentUnicodeMaps->add(map->getEncodingName(), map); - map = new UnicodeMap("ASCII7", gFalse, - ascii7UnicodeMapRanges, ascii7UnicodeMapLen); - residentUnicodeMaps->add(map->getEncodingName(), map); - map = new UnicodeMap("Symbol", gFalse, - symbolUnicodeMapRanges, symbolUnicodeMapLen); - residentUnicodeMaps->add(map->getEncodingName(), map); - map = new UnicodeMap("ZapfDingbats", gFalse, zapfDingbatsUnicodeMapRanges, - zapfDingbatsUnicodeMapLen); - residentUnicodeMaps->add(map->getEncodingName(), map); - map = new UnicodeMap("UTF-8", gTrue, &mapUTF8); - residentUnicodeMaps->add(map->getEncodingName(), map); - map = new UnicodeMap("UTF-16", gTrue, &mapUTF16); - residentUnicodeMaps->add(map->getEncodingName(), map); + + const auto emplaceRangeMap = [&](const char* encodingName, GBool unicodeOut, UnicodeMapRange* ranges, int len) { + residentUnicodeMaps.emplace( + std::piecewise_construct, + std::forward_as_tuple(encodingName), + std::forward_as_tuple(encodingName, unicodeOut, ranges, len) + ); + }; + + const auto emplaceFuncMap = [&](const char* encodingName, GBool unicodeOut, UnicodeMapFunc func) { + residentUnicodeMaps.emplace( + std::piecewise_construct, + std::forward_as_tuple(encodingName), + std::forward_as_tuple(encodingName, unicodeOut, func) + ); + }; + + residentUnicodeMaps.reserve(6); + emplaceRangeMap("Latin1", gFalse, latin1UnicodeMapRanges, latin1UnicodeMapLen); + emplaceRangeMap("ASCII7", gFalse, ascii7UnicodeMapRanges, ascii7UnicodeMapLen); + emplaceRangeMap("Symbol", gFalse, symbolUnicodeMapRanges, symbolUnicodeMapLen); + emplaceRangeMap("ZapfDingbats", gFalse, zapfDingbatsUnicodeMapRanges, zapfDingbatsUnicodeMapLen); + emplaceFuncMap("UTF-8", gTrue, &mapUTF8); + emplaceFuncMap("UTF-16", gTrue, &mapUTF16); scanEncodingDirs(); } @@ -717,34 +715,16 @@ void GlobalParams::parseNameToUnicode(GooString *name) { fclose(f); } -void GlobalParams::addCIDToUnicode(GooString *collection, - GooString *fileName) { - GooString *old; - - if ((old = (GooString *)cidToUnicodes->remove(collection))) { - delete old; - } - cidToUnicodes->add(collection->copy(), fileName->copy()); +void GlobalParams::addCIDToUnicode(GooString *collection, GooString *fileName) { + cidToUnicodes[collection->toStr()] = fileName->toStr(); } -void GlobalParams::addUnicodeMap(GooString *encodingName, GooString *fileName) -{ - GooString *old; - - if ((old = (GooString *)unicodeMaps->remove(encodingName))) { - delete old; - } - unicodeMaps->add(encodingName->copy(), fileName->copy()); +void GlobalParams::addUnicodeMap(GooString *encodingName, GooString *fileName) { + unicodeMaps[encodingName->toStr()] = fileName->toStr(); } void GlobalParams::addCMapDir(GooString *collection, GooString *dir) { - GooList *list; - - if (!(list = (GooList *)cMapDirs->lookup(collection))) { - list = new GooList(); - cMapDirs->add(collection->copy(), list); - } - list->append(dir->copy()); + cMapDirs.emplace(collection->toStr(), dir->toStr()); } GBool GlobalParams::parseYesNo2(const char *token, GBool *flag) { @@ -765,28 +745,13 @@ GlobalParams::~GlobalParams() { delete nameToUnicodeZapfDingbats; delete nameToUnicodeText; - deleteGooHash(cidToUnicodes, GooString); - deleteGooHash(unicodeToUnicodes, GooString); - deleteGooHash(residentUnicodeMaps, UnicodeMap); - deleteGooHash(unicodeMaps, GooString); deleteGooList(toUnicodeDirs, GooString); - deleteGooHash(fontFiles, GooString); #ifdef _WIN32 deleteGooHash(substFiles, GooString); #endif delete sysFonts; delete textEncoding; - GooHashIter *iter; - GooString *key; - cMapDirs->startIter(&iter); - void *val; - while (cMapDirs->getNext(&iter, &key, &val)) { - GooList* list = (GooList*)val; - deleteGooList(list, GooString); - } - delete cMapDirs; - delete cidToUnicodeCache; delete unicodeToUnicodeCache; delete unicodeMapCache; @@ -827,55 +792,48 @@ Unicode GlobalParams::mapNameToUnicodeText(const char *charName) { } UnicodeMap *GlobalParams::getResidentUnicodeMap(GooString *encodingName) { - UnicodeMap *map; + UnicodeMap *map = nullptr; lockGlobalParams; - map = (UnicodeMap *)residentUnicodeMaps->lookup(encodingName); - unlockGlobalParams; - if (map) { + const auto unicodeMap = residentUnicodeMaps.find(encodingName->toStr()); + if (unicodeMap != residentUnicodeMaps.end()) { + map = &unicodeMap->second; map->incRefCnt(); } + unlockGlobalParams; + return map; } FILE *GlobalParams::getUnicodeMapFile(GooString *encodingName) { - GooString *fileName; - FILE *f; + FILE *file = nullptr; lockGlobalParams; - if ((fileName = (GooString *)unicodeMaps->lookup(encodingName))) { - f = openFile(fileName->getCString(), "r"); - } else { - f = nullptr; + const auto unicodeMap = unicodeMaps.find(encodingName->toStr()); + if (unicodeMap != unicodeMaps.end()) { + file = openFile(unicodeMap->second.c_str(), "r"); } unlockGlobalParams; - return f; + + return file; } FILE *GlobalParams::findCMapFile(GooString *collection, GooString *cMapName) { - GooList *list; - GooString *dir; - GooString *fileName; - FILE *f; - int i; + FILE *file = nullptr; lockGlobalParams; - if (!(list = (GooList *)cMapDirs->lookup(collection))) { - unlockGlobalParams; - return nullptr; - } - for (i = 0; i < list->getLength(); ++i) { - dir = (GooString *)list->get(i); - fileName = appendToPath(dir->copy(), cMapName->getCString()); - f = openFile(fileName->getCString(), "r"); - delete fileName; - if (f) { - unlockGlobalParams; - return f; + const auto cMapDirs = this->cMapDirs.equal_range(collection->toStr()); + for (auto cMapDir = cMapDirs.first; cMapDir != cMapDirs.second; ++cMapDir) { + auto* const path = new GooString(cMapDir->second); + appendToPath(path, cMapName->getCString()); + file = openFile(path->getCString(), "r"); + delete path; + if (file) { + break; } } unlockGlobalParams; - return nullptr; + return file; } FILE *GlobalParams::findToUnicodeFile(GooString *name) { @@ -1072,17 +1030,16 @@ static FcPattern *buildFcPattern(GfxFont *font, GooString *base14Name) #endif GooString *GlobalParams::findFontFile(GooString *fontName) { - GooString *path; + GooString *path = nullptr; setupBaseFonts(nullptr); lockGlobalParams; - if ((path = (GooString *)fontFiles->lookup(fontName))) { - path = path->copy(); - unlockGlobalParams; - return path; + const auto fontFile = fontFiles.find(fontName->toStr()); + if (fontFile != fontFiles.end()) { + path = new GooString(fontFile->second); } unlockGlobalParams; - return nullptr; + return path; } /* if you can't or don't want to use Fontconfig, you need to implement @@ -1090,7 +1047,7 @@ GooString *GlobalParams::findFontFile(GooString *fontName) { */ #ifdef WITH_FONTCONFIGURATION_FONTCONFIG // not needed for fontconfig -void GlobalParams::setupBaseFonts(char *dir) { +void GlobalParams::setupBaseFonts(char *) { } GooString *GlobalParams::findBase14FontFile(GooString *base14Name, GfxFont *font) { @@ -1453,40 +1410,15 @@ GBool GlobalParams::getErrQuiet() { } CharCodeToUnicode *GlobalParams::getCIDToUnicode(GooString *collection) { - GooString *fileName; CharCodeToUnicode *ctu; lockGlobalParams; if (!(ctu = cidToUnicodeCache->getCharCodeToUnicode(collection))) { - if ((fileName = (GooString *)cidToUnicodes->lookup(collection)) && - (ctu = CharCodeToUnicode::parseCIDToUnicode(fileName, collection))) { - cidToUnicodeCache->add(ctu); - } - } - unlockGlobalParams; - return ctu; -} - -CharCodeToUnicode *GlobalParams::getUnicodeToUnicode(GooString *fontName) { - lockGlobalParams; - GooHashIter *iter; - unicodeToUnicodes->startIter(&iter); - GooString *fileName = nullptr; - GooString *fontPattern; - void *val; - while (!fileName && unicodeToUnicodes->getNext(&iter, &fontPattern, &val)) { - if (strstr(fontName->getCString(), fontPattern->getCString())) { - unicodeToUnicodes->killIter(&iter); - fileName = (GooString*)val; - } - } - CharCodeToUnicode *ctu = nullptr; - if (fileName) { - ctu = unicodeToUnicodeCache->getCharCodeToUnicode(fileName); - if (!ctu) { - ctu = CharCodeToUnicode::parseUnicodeToUnicode(fileName); - if (ctu) - unicodeToUnicodeCache->add(ctu); + const auto cidToUnicode = cidToUnicodes.find(collection->toStr()); + if (cidToUnicode != cidToUnicodes.end()) { + if((ctu = CharCodeToUnicode::parseCIDToUnicode(cidToUnicode->second.c_str(), collection))) { + cidToUnicodeCache->add(ctu); + } } } unlockGlobalParams; @@ -1523,20 +1455,13 @@ UnicodeMap *GlobalParams::getTextEncoding() { GooList *GlobalParams::getEncodingNames() { - GooList *result = new GooList; - GooHashIter *iter; - GooString *key; - void *val; - residentUnicodeMaps->startIter(&iter); - while (residentUnicodeMaps->getNext(&iter, &key, &val)) { - result->append(key); + auto* const result = new GooList; + for (const auto& unicodeMap : residentUnicodeMaps) { + result->append(new GooString(unicodeMap.first)); } - residentUnicodeMaps->killIter(&iter); - unicodeMaps->startIter(&iter); - while (unicodeMaps->getNext(&iter, &key, &val)) { - result->append(key); + for (const auto& unicodeMap : unicodeMaps) { + result->append(new GooString(unicodeMap.first)); } - unicodeMaps->killIter(&iter); return result; } @@ -1546,7 +1471,7 @@ GooList *GlobalParams::getEncodingNames() void GlobalParams::addFontFile(GooString *fontName, GooString *path) { lockGlobalParams; - fontFiles->add(fontName, path); + fontFiles[fontName->toStr()] = path->toStr(); unlockGlobalParams; } diff --git a/poppler/GlobalParams.h b/poppler/GlobalParams.h index e3d660cf..09a39814 100644 --- a/poppler/GlobalParams.h +++ b/poppler/GlobalParams.h @@ -44,6 +44,9 @@ #include <stdio.h> #include "goo/gtypes.h" #include "CharTypes.h" +#include "UnicodeMap.h" +#include <unordered_map> +#include <string> #ifdef MULTITHREADED #include "goo/GooMutex.h" @@ -51,11 +54,9 @@ class GooString; class GooList; -class GooHash; class NameToCharCode; class CharCodeToUnicode; class CharCodeToUnicodeCache; -class UnicodeMap; class UnicodeMapCache; class CMap; class CMapCache; @@ -147,7 +148,7 @@ public: GBool getErrQuiet(); CharCodeToUnicode *getCIDToUnicode(GooString *collection); - CharCodeToUnicode *getUnicodeToUnicode(GooString *fontName); + CharCodeToUnicode *getUnicodeToUnicode(GooString *) { return nullptr; } UnicodeMap *getUnicodeMap(GooString *encodingName); CMap *getCMap(GooString *collection, GooString *cMapName, Stream *stream = NULL); UnicodeMap *getTextEncoding(); @@ -199,24 +200,24 @@ private: nameToUnicodeZapfDingbats; NameToCharCode * // mapping from char name to Unicode for text nameToUnicodeText; // extraction - GooHash *cidToUnicodes; // files for mappings from char collections - // to Unicode, indexed by collection name - // [GooString] - GooHash *unicodeToUnicodes; // files for Unicode-to-Unicode mappings, - // indexed by font name pattern [GooString] - GooHash *residentUnicodeMaps; // mappings from Unicode to char codes, - // indexed by encoding name [UnicodeMap] - GooHash *unicodeMaps; // files for mappings from Unicode to char - // codes, indexed by encoding name [GooString] - GooHash *cMapDirs; // list of CMap dirs, indexed by collection - // name [GooList[GooString]] + // files for mappings from char collections + // to Unicode, indexed by collection name + std::unordered_map<std::string, std::string> cidToUnicodes; + // mappings from Unicode to char codes, + // indexed by encoding name + std::unordered_map<std::string, UnicodeMap> residentUnicodeMaps; + // files for mappings from Unicode to char + // codes, indexed by encoding name + std::unordered_map<std::string, std::string> unicodeMaps; + // list of CMap dirs, indexed by collection + std::unordered_multimap<std::string, std::string> cMapDirs; GooList *toUnicodeDirs; // list of ToUnicode CMap dirs [GooString] GBool baseFontsInitialized; #ifdef _WIN32 GooHash *substFiles; // windows font substitutes (for CID fonts) #endif - GooHash *fontFiles; // font files: font name mapped to path - // [GString] + // font files: font name mapped to path + std::unordered_map<std::string, std::string> fontFiles; SysFontList *sysFonts; // system fonts GBool psExpandSmaller; // expand smaller pages to fill paper GBool psShrinkLarger; // shrink larger pages to fit paper -- 2.16.2 From c92c4731743ab74386593eb489b9e1578187f1b1 Mon Sep 17 00:00:00 2001 From: Adam Reichold <adam.reich...@t-online.de> Date: Sun, 18 Feb 2018 16:11:29 +0100 Subject: [PATCH 4/8] Replace GooHash by std::unordered_map in OutputDev. --- poppler/Gfx.cc | 19 +++---------------- poppler/OutputDev.cc | 14 +++----------- poppler/OutputDev.h | 14 ++++++++------ poppler/PSOutputDev.h | 2 +- poppler/ProfileData.cc | 9 --------- poppler/ProfileData.h | 24 +++++++++--------------- test/pdf-inspector.cc | 16 ++++------------ 7 files changed, 28 insertions(+), 70 deletions(-) diff --git a/poppler/Gfx.cc b/poppler/Gfx.cc index 4f6c33f8..0d7b072e 100644 --- a/poppler/Gfx.cc +++ b/poppler/Gfx.cc @@ -59,7 +59,6 @@ #include <memory> #include "goo/gmem.h" #include "goo/GooTimer.h" -#include "goo/GooHash.h" #include "GlobalParams.h" #include "CharTypes.h" #include "Object.h" @@ -739,21 +738,9 @@ void Gfx::go(GBool topLevel) { // Update the profile information if (unlikely(profileCommands)) { - GooHash *hash; - - hash = out->getProfileHash (); - if (hash) { - GooString *cmd_g; - ProfileData *data_p; - - cmd_g = new GooString (obj.getCmd()); - data_p = (ProfileData *)hash->lookup (cmd_g); - if (data_p == nullptr) { - data_p = new ProfileData(); - hash->add (cmd_g, data_p); - } - - data_p->addElement(timer->getElapsed ()); + if (auto* const hash = out->getProfileHash()) { + auto& data = (*hash)[obj.getCmd()]; + data.addElement(timer->getElapsed()); } delete timer; } diff --git a/poppler/OutputDev.cc b/poppler/OutputDev.cc index 8afa7052..fe09bd3d 100644 --- a/poppler/OutputDev.cc +++ b/poppler/OutputDev.cc @@ -36,7 +36,6 @@ #include "Stream.h" #include "GfxState.h" #include "OutputDev.h" -#include "goo/GooHash.h" //------------------------------------------------------------------------ // OutputDev @@ -179,18 +178,11 @@ void OutputDev::opiEnd(GfxState *state, Dict *opiDict) { #endif void OutputDev::startProfile() { - if (profileHash) - delete profileHash; - - profileHash = new GooHash (true); + profileHash.reset(new std::unordered_map<std::string, ProfileData>); } -GooHash *OutputDev::endProfile() { - GooHash *profile = profileHash; - - profileHash = nullptr; - - return profile; +std::unique_ptr<std::unordered_map<std::string, ProfileData>> OutputDev::endProfile() { + return std::move(profileHash); } #ifdef USE_CMS diff --git a/poppler/OutputDev.h b/poppler/OutputDev.h index aca84cda..7e29b0f9 100644 --- a/poppler/OutputDev.h +++ b/poppler/OutputDev.h @@ -42,10 +42,13 @@ #include "CharTypes.h" #include "Object.h" #include "PopplerCache.h" +#include "ProfileData.h" +#include <memory> +#include <unordered_map> +#include <string> class Annot; class Dict; -class GooHash; class GooString; class GfxState; class Gfx; @@ -79,7 +82,6 @@ public: : iccColorSpaceCache(5) #endif { - profileHash = nullptr; } // Destructor. @@ -327,9 +329,9 @@ public: virtual void psXObject(Stream * /*psStream*/, Stream * /*level1Stream*/) {} //----- Profiling - virtual void startProfile(); - virtual GooHash *getProfileHash() {return profileHash; } - virtual GooHash *endProfile(); + void startProfile(); + std::unordered_map<std::string, ProfileData>* getProfileHash() const { return profileHash.get(); } + std::unique_ptr<std::unordered_map<std::string, ProfileData>> endProfile(); //----- transparency groups and soft masks virtual GBool checkTransparencyGroup(GfxState * /*state*/, GBool /*knockout*/) { return gTrue; } @@ -359,7 +361,7 @@ private: double defCTM[6]; // default coordinate transform matrix double defICTM[6]; // inverse of default CTM - GooHash *profileHash; + std::unique_ptr<std::unordered_map<std::string, ProfileData>> profileHash; #ifdef USE_CMS PopplerCache iccColorSpaceCache; diff --git a/poppler/PSOutputDev.h b/poppler/PSOutputDev.h index 330b81b3..4ec2cdd7 100644 --- a/poppler/PSOutputDev.h +++ b/poppler/PSOutputDev.h @@ -47,7 +47,7 @@ #include <map> #include <vector> -class GHooash; +class GooHash; class PDFDoc; class XRef; class Function; diff --git a/poppler/ProfileData.cc b/poppler/ProfileData.cc index a0c44747..1044913c 100644 --- a/poppler/ProfileData.cc +++ b/poppler/ProfileData.cc @@ -12,21 +12,12 @@ #pragma implementation #endif -#include <stdlib.h> -#include <stddef.h> #include "ProfileData.h" //------------------------------------------------------------------------ // ProfileData //------------------------------------------------------------------------ -ProfileData::ProfileData() { - count = 0; - total = 0.0; - min = 0.0; - max = 0.0; -} - void ProfileData::addElement (double elapsed) { if (count == 0) { diff --git a/poppler/ProfileData.h b/poppler/ProfileData.h index 418ee010..30b9a484 100644 --- a/poppler/ProfileData.h +++ b/poppler/ProfileData.h @@ -19,23 +19,17 @@ class ProfileData { public: - - // Constructor. - ProfileData (); - - // Destructor. - ~ProfileData() {} - void addElement (double elapsed); - int getCount () { return count; } - double getTotal () { return total; } - double getMin () { return max; } - double getMax () { return max; } + + int getCount () const { return count; } + double getTotal () const { return total; } + double getMin () const { return max; } + double getMax () const { return max; } private: - int count; // size of <elems> array - double total; // number of elements in array - double min; // reference count - double max; // reference count + int count = 0; // size of <elems> array + double total = 0.0; // number of elements in array + double min = 0.0; // reference count + double max = 0.0; // reference count }; #endif diff --git a/test/pdf-inspector.cc b/test/pdf-inspector.cc index d672add1..a1015aaf 100644 --- a/test/pdf-inspector.cc +++ b/test/pdf-inspector.cc @@ -13,7 +13,6 @@ #endif #include <goo/gmem.h> -#include <goo/GooHash.h> #include <goo/GooTimer.h> #include <splash/SplashTypes.h> #include <splash/SplashBitmap.h> @@ -214,10 +213,6 @@ PdfInspector::on_analyze_clicked (GtkWidget *widget, PdfInspector *inspector) void PdfInspector::analyze_page (int page) { - GooHashIter *iter; - GooHash *hash; - GooString *key; - void *p; GtkWidget *label; char *text; cairo_t *cr; @@ -245,24 +240,21 @@ PdfInspector::analyze_page (int page) g_free (text); // Individual times; - hash = output->endProfile (); - hash->startIter(&iter); - while (hash->getNext(&iter, &key, &p)) + auto hash = output->endProfile (); + for (const auto& kvp : *hash) { GtkTreeIter tree_iter; - ProfileData *data_p = (ProfileData *) p; + const auto* const data_p = &kvp.second; gtk_list_store_append (GTK_LIST_STORE (model), &tree_iter); gtk_list_store_set (GTK_LIST_STORE (model), &tree_iter, - OP_STRING, key->getCString(), + OP_STRING, kvp.first.c_str (), OP_COUNT, data_p->getCount (), OP_TOTAL, data_p->getTotal (), OP_MIN, data_p->getMin (), OP_MAX, data_p->getMax (), -1); } - hash->killIter(&iter); - deleteGooHash (hash, ProfileData); } void -- 2.16.2 From 347ced8b22a3cb95c52f472fe88b17456e1deb39 Mon Sep 17 00:00:00 2001 From: Adam Reichold <adam.reich...@t-online.de> Date: Sun, 18 Feb 2018 16:11:55 +0100 Subject: [PATCH 5/8] Replace GooHash by std::unordered_map in PSOutputDev. --- poppler/PSOutputDev.cc | 30 ++++++++++-------------------- poppler/PSOutputDev.h | 8 +++++--- 2 files changed, 15 insertions(+), 23 deletions(-) diff --git a/poppler/PSOutputDev.cc b/poppler/PSOutputDev.cc index 2f55557c..ca3a55b5 100644 --- a/poppler/PSOutputDev.cc +++ b/poppler/PSOutputDev.cc @@ -52,7 +52,6 @@ #include <algorithm> #include "goo/GooString.h" #include "goo/GooList.h" -#include "goo/GooHash.h" #include "poppler-config.h" #include "GlobalParams.h" #include "Object.h" @@ -1107,8 +1106,6 @@ PSOutputDev::PSOutputDev(const char *fileName, PDFDoc *doc, customCodeCbkData = customCodeCbkDataA; fontIDs = nullptr; - fontNames = new GooHash(gTrue); - fontMaxValidGlyph = new GooHash(gTrue); t1FontNames = nullptr; font8Info = nullptr; font16Enc = nullptr; @@ -1176,8 +1173,6 @@ PSOutputDev::PSOutputDev(PSOutputFunc outputFuncA, void *outputStreamA, customCodeCbkData = customCodeCbkDataA; fontIDs = nullptr; - fontNames = new GooHash(gTrue); - fontMaxValidGlyph = new GooHash(gTrue); t1FontNames = nullptr; font8Info = nullptr; font16Enc = nullptr; @@ -1408,7 +1403,7 @@ void PSOutputDev::postInit() fontIDLen = 0; fontIDs = (Ref *)gmallocn(fontIDSize, sizeof(Ref)); for (i = 0; i < 14; ++i) { - fontNames->add(new GooString(psBase14SubstFonts[i].psName), 1); + fontNames.emplace(psBase14SubstFonts[i].psName); } t1FontNameSize = 64; t1FontNameLen = 0; @@ -1498,8 +1493,6 @@ PSOutputDev::~PSOutputDev() { if (fontIDs) { gfree(fontIDs); } - delete fontNames; - delete fontMaxValidGlyph; if (t1FontNames) { for (i = 0; i < t1FontNameLen; ++i) { delete t1FontNames[i].psName; @@ -2128,10 +2121,9 @@ void PSOutputDev::setupEmbeddedType1Font(Ref *id, GooString *psName) { GBool writePadding = gTrue; // check if font is already embedded - if (fontNames->lookupInt(psName)) { + if (!fontNames.emplace(psName->toStr()).second) { return; } - fontNames->add(psName->copy(), 1); // get the font stream and info Object obj1, obj2, obj3; @@ -2307,10 +2299,9 @@ void PSOutputDev::setupExternalType1Font(GooString *fileName, GooString *psName) FILE *fontFile; int c; - if (fontNames->lookupInt(psName)) { + if (!fontNames.emplace(psName->toStr()).second) { return; } - fontNames->add(psName->copy(), 1); // beginning comment writePSFmt("%%BeginResource: font {0:t}\n", psName); @@ -2546,8 +2537,9 @@ void PSOutputDev::setupExternalTrueTypeFont(GfxFont *font, GooString *fileName, void PSOutputDev::updateFontMaxValidGlyph(GfxFont *font, int maxValidGlyph) { if (maxValidGlyph >= 0 && font->getName()) { - if (maxValidGlyph > fontMaxValidGlyph->lookupInt(font->getName())) { - fontMaxValidGlyph->replace(font->getName()->copy(), maxValidGlyph); + auto& fontMaxValidGlyph = this->fontMaxValidGlyph[font->getName()->toStr()]; + if (fontMaxValidGlyph < maxValidGlyph) { + fontMaxValidGlyph = maxValidGlyph; } } } @@ -2871,16 +2863,14 @@ GooString *PSOutputDev::makePSFontName(GfxFont *font, Ref *id) { if ((s = font->getEmbeddedFontName())) { psName = filterPSName(s); - if (!fontNames->lookupInt(psName)) { - fontNames->add(psName->copy(), 1); + if (fontNames.emplace(psName->toStr()).second) { return psName; } delete psName; } if ((s = font->getName())) { psName = filterPSName(s); - if (!fontNames->lookupInt(psName)) { - fontNames->add(psName->copy(), 1); + if (fontNames.emplace(psName->toStr()).second) { return psName; } delete psName; @@ -2895,7 +2885,7 @@ GooString *PSOutputDev::makePSFontName(GfxFont *font, Ref *id) { psName->append('_')->append(s); delete s; } - fontNames->add(psName->copy(), 1); + fontNames.emplace(psName->toStr()); return psName; } @@ -5041,7 +5031,7 @@ void PSOutputDev::drawString(GfxState *state, GooString *s) { if (!(font = state->getFont())) { return; } - maxGlyphInt = (font->getName()? fontMaxValidGlyph->lookupInt(font->getName()): 0); + maxGlyphInt = (font->getName() ? fontMaxValidGlyph[font->getName()->toStr()] : 0); if (maxGlyphInt < 0) maxGlyphInt = 0; maxGlyph = (CharCode) maxGlyphInt; wMode = font->getWMode(); diff --git a/poppler/PSOutputDev.h b/poppler/PSOutputDev.h index 4ec2cdd7..98ac9c98 100644 --- a/poppler/PSOutputDev.h +++ b/poppler/PSOutputDev.h @@ -46,8 +46,10 @@ #include <set> #include <map> #include <vector> +#include <unordered_set> +#include <unordered_map> +#include <string> -class GooHash; class PDFDoc; class XRef; class Function; @@ -474,8 +476,8 @@ private: int fontIDLen; // number of entries in fontIDs array int fontIDSize; // size of fontIDs array std::set<int> resourceIDs; // list of object IDs of objects containing Resources we've already set up - GooHash *fontNames; // all used font names - GooHash *fontMaxValidGlyph; // max valid glyph of each font + std::unordered_set<std::string> fontNames; // all used font names + std::unordered_map<std::string, int> fontMaxValidGlyph; // max valid glyph of each font PST1FontName *t1FontNames; // font names for Type 1/1C fonts int t1FontNameLen; // number of entries in t1FontNames array int t1FontNameSize; // size of t1FontNames array -- 2.16.2 From 69dcf09676e7223e123986fe436493d91fe397a6 Mon Sep 17 00:00:00 2001 From: Adam Reichold <adam.reich...@t-online.de> Date: Sun, 18 Feb 2018 16:23:36 +0100 Subject: [PATCH 6/8] Also replace the Win32-specific usage of GooHash by std::unordered_map on a best-effort basis. --- poppler/GlobalParams.cc | 6 ------ poppler/GlobalParams.h | 3 ++- poppler/GlobalParamsWin.cc | 23 +++++++++++------------ 3 files changed, 13 insertions(+), 19 deletions(-) diff --git a/poppler/GlobalParams.cc b/poppler/GlobalParams.cc index 8532285f..c119f9df 100644 --- a/poppler/GlobalParams.cc +++ b/poppler/GlobalParams.cc @@ -564,9 +564,6 @@ GlobalParams::GlobalParams(const char *customPopplerDataDir) } } -#ifdef _WIN32 - substFiles = new GooHash(gTrue); -#endif nameToUnicodeZapfDingbats = new NameToCharCode(); nameToUnicodeText = new NameToCharCode(); toUnicodeDirs = new GooList(); @@ -746,9 +743,6 @@ GlobalParams::~GlobalParams() { delete nameToUnicodeZapfDingbats; delete nameToUnicodeText; deleteGooList(toUnicodeDirs, GooString); -#ifdef _WIN32 - deleteGooHash(substFiles, GooString); -#endif delete sysFonts; delete textEncoding; diff --git a/poppler/GlobalParams.h b/poppler/GlobalParams.h index 09a39814..69435bfc 100644 --- a/poppler/GlobalParams.h +++ b/poppler/GlobalParams.h @@ -214,7 +214,8 @@ private: GooList *toUnicodeDirs; // list of ToUnicode CMap dirs [GooString] GBool baseFontsInitialized; #ifdef _WIN32 - GooHash *substFiles; // windows font substitutes (for CID fonts) + // windows font substitutes (for CID fonts) + std::unordered_map<std::string, std::string> substFiles; #endif // font files: font name mapped to path std::unordered_map<std::string, std::string> fontFiles; diff --git a/poppler/GlobalParamsWin.cc b/poppler/GlobalParamsWin.cc index 1734f6eb..725709e9 100644 --- a/poppler/GlobalParamsWin.cc +++ b/poppler/GlobalParamsWin.cc @@ -37,7 +37,6 @@ description for all fonts available in Windows. That's how MuPDF works. #include "goo/gmem.h" #include "goo/GooString.h" #include "goo/GooList.h" -#include "goo/GooHash.h" #include "goo/gfile.h" #include "Error.h" #include "NameToCharCode.h" @@ -479,7 +478,7 @@ void GlobalParams::setupBaseFonts(char * dir) addFontFile(new GooString(obj1.getName()), obj3.getString()->copy()); // Aliases } else if (obj2.isName()) { - substFiles->add(new GooString(obj1.getName()), new GooString(obj2.getName())); + substFiles.emplace(obj1.getName(), obj2.getName()); } } obj1 = parser->getObj(); @@ -496,8 +495,8 @@ void GlobalParams::setupBaseFonts(char * dir) } } -static const char *findSubstituteName(GfxFont *font, GooHash *fontFiles, - GooHash *substFiles, +static const char *findSubstituteName(GfxFont *font, const std::unordered_map<std::string, std::string>& fontFiles, + const std::unordered_map<std::string, std::string>& substFiles, const char *origName) { assert(origName); @@ -514,10 +513,10 @@ static const char *findSubstituteName(GfxFont *font, GooHash *fontFiles, name2->del(n - 11, 11); n -= 11; } - GooString *substName = (GooString *)substFiles->lookup(name2); - if (substName != nullptr) { + const auto substFile = substFiles.find(name2->getCString()); + if (substFile != substFiles.end()) { delete name2; - return substName->getCString(); + return substFile->second.c_str(); } /* TODO: try to at least guess bold/italic/bolditalic from the name */ @@ -537,10 +536,10 @@ static const char *findSubstituteName(GfxFont *font, GooHash *fontFiles, else if ( !collection->cmp("Adobe-Korea1") ) name3 = DEFAULT_CID_FONT_AK1_MSWIN; - if (name3 && fontFiles->lookup(name3)) + if (name3 && fontFiles.count(name3) != 0) return name3; - if (fontFiles->lookup(DEFAULT_CID_FONT_MSWIN)) + if (fontFiles.count(DEFAULT_CID_FONT_MSWIN) != 0) return DEFAULT_CID_FONT_MSWIN; } return DEFAULT_SUBSTITUTE_FONT; @@ -572,10 +571,10 @@ GooString *GlobalParams::findSystemFontFile(GfxFont *font, GooString *substFontName = new GooString(findSubstituteName(font, fontFiles, substFiles, fontName->getCString())); - GooString *path2 = nullptr; error(errSyntaxError, -1, "Couldn't find a font for '{0:t}', subst is '{1:t}'", fontName, substFontName); - if ((path2 = (GooString *)fontFiles->lookup(substFontName))) { - path = new GooString(path2); + const auto fontFile = fontFiles.find(substFontName); + if (fontFile != fontFiles.end()) { + path = new GooString(fontFile->second.c_str()); if (substituteFontName) substituteFontName->Set(path->getCString()); if (!strcasecmp(path->getCString() + path->getLength() - 4, ".ttc")) { -- 2.16.2 From 82521c530c5ff61f0eda8763a0571674f4a6d7ef Mon Sep 17 00:00:00 2001 From: Adam Reichold <adam.reich...@t-online.de> Date: Sun, 18 Feb 2018 16:12:19 +0100 Subject: [PATCH 7/8] Remove GooHash after replacing it by std::unordered_map. --- CMakeLists.txt | 2 - goo/GooHash.cc | 402 --------------------------------------------------------- goo/GooHash.h | 92 ------------- 3 files changed, 496 deletions(-) delete mode 100644 goo/GooHash.cc delete mode 100644 goo/GooHash.h diff --git a/CMakeLists.txt b/CMakeLists.txt index 5e3d6a17..5156874b 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -308,7 +308,6 @@ configure_file(poppler/poppler-config.h.cmake ${CMAKE_CURRENT_BINARY_DIR}/popple set(poppler_SRCS goo/gfile.cc goo/gmempp.cc - goo/GooHash.cc goo/GooList.cc goo/GooTimer.cc goo/GooString.cc @@ -572,7 +571,6 @@ if(ENABLE_XPDF_HEADERS) ${CMAKE_CURRENT_BINARY_DIR}/poppler/poppler-config.h DESTINATION include/poppler) install(FILES - goo/GooHash.h goo/GooList.h goo/GooTimer.h goo/GooMutex.h diff --git a/goo/GooHash.cc b/goo/GooHash.cc deleted file mode 100644 index cd39a5ae..00000000 --- a/goo/GooHash.cc +++ /dev/null @@ -1,402 +0,0 @@ -//======================================================================== -// -// GooHash.cc -// -// Copyright 2001-2003 Glyph & Cog, LLC -// -//======================================================================== - -//======================================================================== -// -// Modified under the Poppler project - http://poppler.freedesktop.org -// -// All changes made under the Poppler project to this file are licensed -// under GPL version 2 or later -// -// Copyright (C) 2017 Albert Astals Cid <aa...@kde.org> -// -// To see a description of the changes please see the Changelog file that -// came with your tarball or type make ChangeLog if you are building from git -// -//======================================================================== - -#include <config.h> - -#ifdef USE_GCC_PRAGMAS -#pragma implementation -#endif - -#include "gmem.h" -#include "GooString.h" -#include "GooHash.h" -#include "GooLikely.h" - -//------------------------------------------------------------------------ - -struct GooHashBucket { - GooString *key; - union { - void *p; - int i; - } val; - GooHashBucket *next; -}; - -struct GooHashIter { - int h; - GooHashBucket *p; -}; - -//------------------------------------------------------------------------ - -GooHash::GooHash(GBool deleteKeysA) { - int h; - - deleteKeys = deleteKeysA; - size = 7; - tab = (GooHashBucket **)gmallocn(size, sizeof(GooHashBucket *)); - for (h = 0; h < size; ++h) { - tab[h] = nullptr; - } - len = 0; -} - -GooHash::~GooHash() { - GooHashBucket *p; - int h; - - for (h = 0; h < size; ++h) { - while (tab[h]) { - p = tab[h]; - tab[h] = p->next; - if (deleteKeys) { - delete p->key; - } - delete p; - } - } - gfree(tab); -} - -void GooHash::add(GooString *key, void *val) { - GooHashBucket *p; - int h; - - // expand the table if necessary - if (len >= size) { - expand(); - } - - // add the new symbol - p = new GooHashBucket; - p->key = key; - p->val.p = val; - h = hash(key); - p->next = tab[h]; - tab[h] = p; - ++len; -} - -void GooHash::add(GooString *key, int val) { - GooHashBucket *p; - int h; - - // expand the table if necessary - if (len >= size) { - expand(); - } - - // add the new symbol - p = new GooHashBucket; - p->key = key; - p->val.i = val; - h = hash(key); - p->next = tab[h]; - tab[h] = p; - ++len; -} - -void GooHash::replace(GooString *key, void *val) { - GooHashBucket *p; - int h; - - if ((p = find(key, &h))) { - p->val.p = val; - if (deleteKeys) { - delete key; - } - } else { - add(key, val); - } -} - -void GooHash::replace(GooString *key, int val) { - GooHashBucket *p; - int h; - - if ((p = find(key, &h))) { - p->val.i = val; - if (deleteKeys) { - delete key; - } - } else { - add(key, val); - } -} - -void *GooHash::lookup(GooString *key) { - GooHashBucket *p; - int h; - - if (!(p = find(key, &h))) { - return nullptr; - } - return p->val.p; -} - -int GooHash::lookupInt(GooString *key) { - GooHashBucket *p; - int h; - - if (!(p = find(key, &h))) { - return 0; - } - return p->val.i; -} - -void *GooHash::lookup(const char *key) { - GooHashBucket *p; - int h; - - if (!(p = find(key, &h))) { - return nullptr; - } - return p->val.p; -} - -int GooHash::lookupInt(const char *key) { - GooHashBucket *p; - int h; - - if (!(p = find(key, &h))) { - return 0; - } - return p->val.i; -} - -void *GooHash::remove(GooString *key) { - GooHashBucket *p; - GooHashBucket **q; - void *val; - int h; - - if (!(p = find(key, &h))) { - return nullptr; - } - q = &tab[h]; - while (*q != p) { - q = &((*q)->next); - } - *q = p->next; - if (deleteKeys) { - delete p->key; - } - val = p->val.p; - delete p; - --len; - return val; -} - -int GooHash::removeInt(GooString *key) { - GooHashBucket *p; - GooHashBucket **q; - int val; - int h; - - if (!(p = find(key, &h))) { - return 0; - } - q = &tab[h]; - while (*q != p) { - q = &((*q)->next); - } - *q = p->next; - if (deleteKeys) { - delete p->key; - } - val = p->val.i; - delete p; - --len; - return val; -} - -void *GooHash::remove(const char *key) { - GooHashBucket *p; - GooHashBucket **q; - void *val; - int h; - - if (!(p = find(key, &h))) { - return nullptr; - } - q = &tab[h]; - while (*q != p) { - q = &((*q)->next); - } - *q = p->next; - if (deleteKeys) { - delete p->key; - } - val = p->val.p; - delete p; - --len; - return val; -} - -int GooHash::removeInt(const char *key) { - GooHashBucket *p; - GooHashBucket **q; - int val; - int h; - - if (!(p = find(key, &h))) { - return 0; - } - q = &tab[h]; - while (*q != p) { - q = &((*q)->next); - } - *q = p->next; - if (deleteKeys) { - delete p->key; - } - val = p->val.i; - delete p; - --len; - return val; -} - -void GooHash::startIter(GooHashIter **iter) { - *iter = new GooHashIter; - (*iter)->h = -1; - (*iter)->p = nullptr; -} - -GBool GooHash::getNext(GooHashIter **iter, GooString **key, void **val) { - if (!*iter) { - return gFalse; - } - if ((*iter)->p) { - (*iter)->p = (*iter)->p->next; - } - while (!(*iter)->p) { - if (++(*iter)->h == size) { - delete *iter; - *iter = nullptr; - return gFalse; - } - (*iter)->p = tab[(*iter)->h]; - } - *key = (*iter)->p->key; - *val = (*iter)->p->val.p; - return gTrue; -} - -GBool GooHash::getNext(GooHashIter **iter, GooString **key, int *val) { - if (!*iter) { - return gFalse; - } - if ((*iter)->p) { - (*iter)->p = (*iter)->p->next; - } - while (!(*iter)->p) { - if (++(*iter)->h == size) { - delete *iter; - *iter = nullptr; - return gFalse; - } - (*iter)->p = tab[(*iter)->h]; - } - *key = (*iter)->p->key; - *val = (*iter)->p->val.i; - return gTrue; -} - -void GooHash::killIter(GooHashIter **iter) { - delete *iter; - *iter = nullptr; -} - -void GooHash::expand() { - GooHashBucket **oldTab; - GooHashBucket *p; - int oldSize, h, i; - - oldSize = size; - oldTab = tab; - size = 2*size + 1; - tab = (GooHashBucket **)gmallocn(size, sizeof(GooHashBucket *)); - for (h = 0; h < size; ++h) { - tab[h] = nullptr; - } - for (i = 0; i < oldSize; ++i) { - while (oldTab[i]) { - p = oldTab[i]; - oldTab[i] = oldTab[i]->next; - h = hash(p->key); - p->next = tab[h]; - tab[h] = p; - } - } - gfree(oldTab); -} - -GooHashBucket *GooHash::find(GooString *key, int *h) { - GooHashBucket *p; - - if (unlikely(!key)) - return nullptr; - - *h = hash(key); - for (p = tab[*h]; p; p = p->next) { - if (!p->key->cmp(key)) { - return p; - } - } - return nullptr; -} - -GooHashBucket *GooHash::find(const char *key, int *h) { - GooHashBucket *p; - - *h = hash(key); - for (p = tab[*h]; p; p = p->next) { - if (!p->key->cmp(key)) { - return p; - } - } - return nullptr; -} - -int GooHash::hash(GooString *key) { - const char *p; - unsigned int h; - int i; - - h = 0; - for (p = key->getCString(), i = 0; i < key->getLength(); ++p, ++i) { - h = 17 * h + (int)(*p & 0xff); - } - return (int)(h % size); -} - -int GooHash::hash(const char *key) { - const char *p; - unsigned int h; - - h = 0; - for (p = key; *p; ++p) { - h = 17 * h + (int)(*p & 0xff); - } - return (int)(h % size); -} diff --git a/goo/GooHash.h b/goo/GooHash.h deleted file mode 100644 index eda19e31..00000000 --- a/goo/GooHash.h +++ /dev/null @@ -1,92 +0,0 @@ -//======================================================================== -// -// GooHash.h -// -// Copyright 2001-2003 Glyph & Cog, LLC -// -//======================================================================== - -//======================================================================== -// -// Modified under the Poppler project - http://poppler.freedesktop.org -// -// All changes made under the Poppler project to this file are licensed -// under GPL version 2 or later -// -// Copyright (C) 2012 Albert Astals Cid <aa...@kde.org> -// -// To see a description of the changes please see the Changelog file that -// came with your tarball or type make ChangeLog if you are building from git -// -//======================================================================== - -#ifndef GHASH_H -#define GHASH_H - -#ifdef USE_GCC_PRAGMAS -#pragma interface -#endif - -#include "gtypes.h" - -class GooString; -struct GooHashBucket; -struct GooHashIter; - -//------------------------------------------------------------------------ - -class GooHash { -public: - - GooHash(GBool deleteKeysA = gFalse); - ~GooHash(); - void add(GooString *key, void *val); - void add(GooString *key, int val); - void replace(GooString *key, void *val); - void replace(GooString *key, int val); - void *lookup(GooString *key); - int lookupInt(GooString *key); - void *lookup(const char *key); - int lookupInt(const char *key); - void *remove(GooString *key); - int removeInt(GooString *key); - void *remove(const char *key); - int removeInt(const char *key); - int getLength() { return len; } - void startIter(GooHashIter **iter); - GBool getNext(GooHashIter **iter, GooString **key, void **val); - GBool getNext(GooHashIter **iter, GooString **key, int *val); - void killIter(GooHashIter **iter); - -private: - GooHash(const GooHash &other); - GooHash& operator=(const GooHash &other); - - void expand(); - GooHashBucket *find(GooString *key, int *h); - GooHashBucket *find(const char *key, int *h); - int hash(GooString *key); - int hash(const char *key); - - GBool deleteKeys; // set if key strings should be deleted - int size; // number of buckets - int len; // number of entries - GooHashBucket **tab; -}; - -#define deleteGooHash(hash, T) \ - do { \ - GooHash *_hash = (hash); \ - { \ - GooHashIter *_iter; \ - GooString *_key; \ - void *_p; \ - _hash->startIter(&_iter); \ - while (_hash->getNext(&_iter, &_key, &_p)) { \ - delete (T*)_p; \ - } \ - delete _hash; \ - } \ - } while(0) - -#endif -- 2.16.2 From 44795ac72f379f384a41df69c96fe246ec893e32 Mon Sep 17 00:00:00 2001 From: Adam Reichold <adam.reich...@t-online.de> Date: Wed, 21 Feb 2018 20:15:22 +0100 Subject: [PATCH 8/8] Make UnicodeMap a move-only type to simplify the initialization of residentUnicodeMaps s.t. it is closer to the GooHash-based version. --- poppler/GlobalParams.cc | 35 ++++++---------- poppler/UnicodeMap.cc | 105 +++++++++++++++++++++++++++++++++++------------- poppler/UnicodeMap.h | 15 ++++--- 3 files changed, 95 insertions(+), 60 deletions(-) diff --git a/poppler/GlobalParams.cc b/poppler/GlobalParams.cc index c119f9df..0606a389 100644 --- a/poppler/GlobalParams.cc +++ b/poppler/GlobalParams.cc @@ -608,30 +608,19 @@ GlobalParams::GlobalParams(const char *customPopplerDataDir) } // set up the residentUnicodeMaps table - - const auto emplaceRangeMap = [&](const char* encodingName, GBool unicodeOut, UnicodeMapRange* ranges, int len) { - residentUnicodeMaps.emplace( - std::piecewise_construct, - std::forward_as_tuple(encodingName), - std::forward_as_tuple(encodingName, unicodeOut, ranges, len) - ); - }; - - const auto emplaceFuncMap = [&](const char* encodingName, GBool unicodeOut, UnicodeMapFunc func) { - residentUnicodeMaps.emplace( - std::piecewise_construct, - std::forward_as_tuple(encodingName), - std::forward_as_tuple(encodingName, unicodeOut, func) - ); - }; - residentUnicodeMaps.reserve(6); - emplaceRangeMap("Latin1", gFalse, latin1UnicodeMapRanges, latin1UnicodeMapLen); - emplaceRangeMap("ASCII7", gFalse, ascii7UnicodeMapRanges, ascii7UnicodeMapLen); - emplaceRangeMap("Symbol", gFalse, symbolUnicodeMapRanges, symbolUnicodeMapLen); - emplaceRangeMap("ZapfDingbats", gFalse, zapfDingbatsUnicodeMapRanges, zapfDingbatsUnicodeMapLen); - emplaceFuncMap("UTF-8", gTrue, &mapUTF8); - emplaceFuncMap("UTF-16", gTrue, &mapUTF16); + UnicodeMap map = {"Latin1", gFalse, latin1UnicodeMapRanges, latin1UnicodeMapLen}; + residentUnicodeMaps.emplace(map.getEncodingName()->toStr(), std::move(map)); + map = {"ASCII7", gFalse, ascii7UnicodeMapRanges, ascii7UnicodeMapLen}; + residentUnicodeMaps.emplace(map.getEncodingName()->toStr(), std::move(map)); + map = {"Symbol", gFalse, symbolUnicodeMapRanges, symbolUnicodeMapLen}; + residentUnicodeMaps.emplace(map.getEncodingName()->toStr(), std::move(map)); + map = {"ZapfDingbats", gFalse, zapfDingbatsUnicodeMapRanges, zapfDingbatsUnicodeMapLen}; + residentUnicodeMaps.emplace(map.getEncodingName()->toStr(), std::move(map)); + map = {"UTF-8", gTrue, &mapUTF8}; + residentUnicodeMaps.emplace(map.getEncodingName()->toStr(), std::move(map)); + map = {"UTF-16", gTrue, &mapUTF16}; + residentUnicodeMaps.emplace(map.getEncodingName()->toStr(), std::move(map)); scanEncodingDirs(); } diff --git a/poppler/UnicodeMap.cc b/poppler/UnicodeMap.cc index dabdce2a..b16a442b 100644 --- a/poppler/UnicodeMap.cc +++ b/poppler/UnicodeMap.cc @@ -139,9 +139,6 @@ UnicodeMap::UnicodeMap(GooString *encodingNameA) { eMaps = nullptr; eMapsLen = 0; refCnt = 1; -#ifdef MULTITHREADED - gInitMutex(&mutex); -#endif } UnicodeMap::UnicodeMap(const char *encodingNameA, GBool unicodeOutA, @@ -154,9 +151,6 @@ UnicodeMap::UnicodeMap(const char *encodingNameA, GBool unicodeOutA, eMaps = nullptr; eMapsLen = 0; refCnt = 1; -#ifdef MULTITHREADED - gInitMutex(&mutex); -#endif } UnicodeMap::UnicodeMap(const char *encodingNameA, GBool unicodeOutA, @@ -168,9 +162,6 @@ UnicodeMap::UnicodeMap(const char *encodingNameA, GBool unicodeOutA, eMaps = nullptr; eMapsLen = 0; refCnt = 1; -#ifdef MULTITHREADED - gInitMutex(&mutex); -#endif } UnicodeMap::~UnicodeMap() { @@ -181,32 +172,88 @@ UnicodeMap::~UnicodeMap() { if (eMaps) { gfree(eMaps); } -#ifdef MULTITHREADED - gDestroyMutex(&mutex); -#endif +} + +UnicodeMap::UnicodeMap(UnicodeMap &&other) noexcept + : encodingName{other.encodingName} + , kind{other.kind} + , unicodeOut{other.unicodeOut} + , len{other.len} + , eMaps{other.eMaps} + , eMapsLen{other.eMapsLen} + , refCnt{1} +{ + switch (kind) { + case unicodeMapUser: + case unicodeMapResident: + ranges = other.ranges; + other.ranges = nullptr; + break; + case unicodeMapFunc: + func = other.func; + break; + } + other.encodingName = nullptr; + other.eMaps = nullptr; +} + +UnicodeMap& UnicodeMap::operator=(UnicodeMap &&other) noexcept +{ + if (this != &other) + swap(other); + return *this; +} + +void UnicodeMap::swap(UnicodeMap &other) noexcept +{ + using std::swap; + swap(encodingName, other.encodingName); + swap(unicodeOut, other.unicodeOut); + switch (kind) { + case unicodeMapUser: + case unicodeMapResident: + switch (other.kind) { + case unicodeMapUser: + case unicodeMapResident: + swap(ranges, other.ranges); + break; + case unicodeMapFunc: + { + const auto tmp = ranges; + func = other.func; + other.ranges = tmp; + break; + } + } + break; + case unicodeMapFunc: + switch (other.kind) { + case unicodeMapUser: + case unicodeMapResident: + { + const auto tmp = func; + ranges = other.ranges; + other.func = tmp; + break; + } + case unicodeMapFunc: + swap(func, other.func); + break; + } + break; + } + swap(kind, other.kind); + swap(len, other.len); + swap(eMaps, other.eMaps); + swap(eMapsLen, other.eMapsLen); } void UnicodeMap::incRefCnt() { -#ifdef MULTITHREADED - gLockMutex(&mutex); -#endif - ++refCnt; -#ifdef MULTITHREADED - gUnlockMutex(&mutex); -#endif + refCnt.fetch_add(1); } void UnicodeMap::decRefCnt() { - GBool done; - -#ifdef MULTITHREADED - gLockMutex(&mutex); -#endif - done = --refCnt == 0; -#ifdef MULTITHREADED - gUnlockMutex(&mutex); -#endif - if (done) { + if (refCnt.fetch_sub(1) == 1) { delete this; } } diff --git a/poppler/UnicodeMap.h b/poppler/UnicodeMap.h index f3444aa5..b741ff61 100644 --- a/poppler/UnicodeMap.h +++ b/poppler/UnicodeMap.h @@ -33,10 +33,7 @@ #include "poppler-config.h" #include "goo/gtypes.h" #include "CharTypes.h" - -#ifdef MULTITHREADED -#include "goo/GooMutex.h" -#endif +#include <atomic> class GooString; @@ -75,6 +72,11 @@ public: UnicodeMap(const char *encodingNameA, GBool unicodeOutA, UnicodeMapFunc funcA); + UnicodeMap(UnicodeMap &&other) noexcept; + UnicodeMap& operator=(UnicodeMap &&other) noexcept; + + void swap(UnicodeMap& other) noexcept; + ~UnicodeMap(); UnicodeMap(const UnicodeMap &) = delete; @@ -111,10 +113,7 @@ private: int len; // (user, resident) UnicodeMapExt *eMaps; // (user) int eMapsLen; // (user) - int refCnt; -#ifdef MULTITHREADED - GooMutex mutex; -#endif + std::atomic_int refCnt; }; //------------------------------------------------------------------------ -- 2.16.2
signature.asc
Description: OpenPGP digital signature
_______________________________________________ poppler mailing list poppler@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/poppler