Re: [poppler] [RFC] Replace GooHash by std::unordered_map
El dimecres, 2 de maig de 2018, a les 19:11:19 CEST, Adam Reichold va escriure: > Hello, > > Am 01.05.2018 um 23:57 schrieb Albert Astals Cid: > > El diumenge, 4 de març de 2018, a les 22:59:20 CEST, Albert Astals Cid va > > > > escriure: > >> El dissabte, 3 de març de 2018, a les 20:51:55 CET, Adam Reichold va > > > > escriure: > >>> Hello Albert, > >>> > >>> please note that in the previous patch, I made a mistake with the move > >>> constructor (the ref count should not be moved/swapped since it is tied > >>> to the instance, not its value). Attached is a fixed patch. > >>> > >>> I found the issue during further performance tests which however only > >>> continued to strengthen to zero sum result. In any case, I extend the > >>> Python-based perftest script with a C++-based driver to be able to > >>> reliably measure single-page runtimes for this, which I will also attach > >>> if anyone is interested in this. > >> > >> Ok, so it seems there's a mild agreement (or at least no strong > >> disagreement) that this may be the way forward. I will first wait to get > >> poppler 0.63 out of the door (that is already late several months, > >> between > >> freedesktop shutting down ssh access, them forgetting to tell us it was > >> open again, my disk dying and then me procastinating as usual) and once > >> it's out I'll come back and reevaluate this. > > > > I was trying to apply the patch for running a quick regtest but running > > git am over it gave me some warning/error I wasn't sure how to get out of > > correctly. regtest results are good, i guess i'll have a second quick look at the patches and unless someone disagrees i'll push them. Cheers, Albert > > > > Could you try rebasing the patch on top of current master? > > > > Cheers, > > > > Albert > > Attached is a version rebased against the current master. > > Best regards, > Adam > > >> Cheers, > >> > >> Albert > >>> > >>> Best regards, Adam. > >>> > >>> Am 03.03.2018 um 20:29 schrieb Albert Astals Cid: > Ha! didn't realize you had already done all that, will read/answer the > other email. > > Cheers, > > Albert > > El dissabte, 3 de març de 2018, a les 20:25:32 CET, Albert Astals Cid > va > > escriure: > > El dimecres, 21 de febrer de 2018, a les 7:13:57 CET, Adam Reichold va > > > > escriure: > >> Hello again, > >> > >> Am 21.02.2018 um 00:31 schrieb Albert Astals Cid: > >>> El dimarts, 20 de febrer de 2018, a les 8:58:24 CET, Adam Reichold > >>> va > >>> > >>> escriure: > Hello again, > > Am 18.02.2018 um 23:23 schrieb Adam Reichold: > > Am 18.02.2018 um 23:08 schrieb Albert Astals Cid: > >> El diumenge, 18 de febrer de 2018, a les 16:55:37 CET, Adam > >> Reichold > >> va > >> > >> escriure: > >>> Hello, > >>> > >>> the attached patch replaced Poppler's internal hash table > >>> implementation > >>> GooHash by std::unordered_map found in the C++ standard library > >>> since > >>> C++11. This continues Poppler's slow drift towards standard > >>> library > >>> containers and removes one of the home-grown data structures > >>> with > >>> the > >>> main goals of reducing code size and improving the long term > >>> maintainability of the code base. > >> > >> Do you have any benchmarks on "rendering" speed and code size? > > > > Sorry, not at hand. I will try to prepare them during the week. > > I did run Splash rendering benchmarks of this branch against master > with > the result of rendering the circa 2400 documents of the TeXLive > >>> > documentation present on my machine being: > >>> I'm wondering if those 2400 documents are diverse enough, which they > >>> may > >>> not be given they are all coming from "the same place". > >> > >> They seem pretty diverse w.r.t. content, some being text heavy and > >> some > >> graphics rich. But I guess they are definitely not diverse > >> technically > >> as all are prepared using TeX itself. > >> > >> The main problem on my side is that I have failed to find my DVD copy > >> of > >> the Poppler regtest document collection until now. :-\ In any case, I > >> am > >> reasonably confident on the zero sum result since GooHash does not > >> seem > >> to live in any performance critical code path. > >> > Cumulative run time: > Result: 90.95 min ∓ 1.1 % > Reference: 91.57 min ∓ 1.2 % > Deviation: -0.0 % > > Cumulative memory usage: > Result: 37.2 MB ∓ 0.7 % > Reference: 37.0 MB ∓ 0.7 % > Deviation: +0.0 % > >
Re: [poppler] [RFC] Replace GooHash by std::unordered_map
Hello, Am 01.05.2018 um 23:57 schrieb Albert Astals Cid: > El diumenge, 4 de març de 2018, a les 22:59:20 CEST, Albert Astals Cid va > escriure: >> El dissabte, 3 de març de 2018, a les 20:51:55 CET, Adam Reichold va > escriure: >>> Hello Albert, >>> >>> please note that in the previous patch, I made a mistake with the move >>> constructor (the ref count should not be moved/swapped since it is tied >>> to the instance, not its value). Attached is a fixed patch. >>> >>> I found the issue during further performance tests which however only >>> continued to strengthen to zero sum result. In any case, I extend the >>> Python-based perftest script with a C++-based driver to be able to >>> reliably measure single-page runtimes for this, which I will also attach >>> if anyone is interested in this. >> >> Ok, so it seems there's a mild agreement (or at least no strong >> disagreement) that this may be the way forward. I will first wait to get >> poppler 0.63 out of the door (that is already late several months, between >> freedesktop shutting down ssh access, them forgetting to tell us it was >> open again, my disk dying and then me procastinating as usual) and once >> it's out I'll come back and reevaluate this. > > I was trying to apply the patch for running a quick regtest but running git > am > over it gave me some warning/error I wasn't sure how to get out of correctly. > > Could you try rebasing the patch on top of current master? > > Cheers, > Albert Attached is a version rebased against the current master. Best regards, Adam >> >> Cheers, >> Albert >> >>> Best regards, Adam. >>> >>> Am 03.03.2018 um 20:29 schrieb Albert Astals Cid: Ha! didn't realize you had already done all that, will read/answer the other email. Cheers, Albert El dissabte, 3 de març de 2018, a les 20:25:32 CET, Albert Astals Cid va escriure: > El dimecres, 21 de febrer de 2018, a les 7:13:57 CET, Adam Reichold va > > escriure: >> Hello again, >> >> Am 21.02.2018 um 00:31 schrieb Albert Astals Cid: >>> El dimarts, 20 de febrer de 2018, a les 8:58:24 CET, Adam Reichold va >>> >>> escriure: Hello again, Am 18.02.2018 um 23:23 schrieb Adam Reichold: > Am 18.02.2018 um 23:08 schrieb Albert Astals Cid: >> El diumenge, 18 de febrer de 2018, a les 16:55:37 CET, Adam >> Reichold >> va >> >> escriure: >>> Hello, >>> >>> the attached patch replaced Poppler's internal hash table >>> implementation >>> GooHash by std::unordered_map found in the C++ standard library >>> since >>> C++11. This continues Poppler's slow drift towards standard >>> library >>> containers and removes one of the home-grown data structures with >>> the >>> main goals of reducing code size and improving the long term >>> maintainability of the code base. >> >> Do you have any benchmarks on "rendering" speed and code size? > > Sorry, not at hand. I will try to prepare them during the week. I did run Splash rendering benchmarks of this branch against master with the result of rendering the circa 2400 documents of the TeXLive >>> documentation present on my machine being: >>> I'm wondering if those 2400 documents are diverse enough, which they >>> may >>> not be given they are all coming from "the same place". >> >> They seem pretty diverse w.r.t. content, some being text heavy and >> some >> graphics rich. But I guess they are definitely not diverse technically >> as all are prepared using TeX itself. >> >> The main problem on my side is that I have failed to find my DVD copy >> of >> the Poppler regtest document collection until now. :-\ In any case, I >> am >> reasonably confident on the zero sum result since GooHash does not >> seem >> to live in any performance critical code path. >> Cumulative run time: Result: 90.95 min ∓ 1.1 % Reference: 91.57 min ∓ 1.2 % Deviation: -0.0 % Cumulative memory usage: Result: 37.2 MB ∓ 0.7 % Reference: 37.0 MB ∓ 0.7 % Deviation: +0.0 % (Where result is this patch and the reference is master.) (The measurement was taken using the perftest script which I proposed here some time ago for which I'll attach the patch again for reproduceability.) >>> >>> Did you really attach this before? i really don't remember it :D >> >> This was a very long-winded thread ending on 2016-01-01 centered >> around >> the regtest framework. It went from custom driver using QTest's >> benchmark functionality to custom backend in the regtest