I'll take a look and see if I can work out what is going on. On 22/01/15 16:58, suzuki toshiya wrote: > Dear Adrian, > > I've tested with The_Miseries_of_Human_Life.pdf, but > strangely I could not reproduce the slowdown issue. > Furthermore, pdftocairo with JBIG2 patch finishes > the rendering faster than one without JBIG2 patch. > Maybe my machine (Core2Duo 1GHz, RAM 2GB) is quite > slower than yours - in fact, my machine spent 3-5 minutes > to render the PDF by pdftocairo. > > I uploaded Linux/i386 static binaries built with "-pg -g -O2" at: > http://gyvern.ipc.hiroshima-u.ac.jp/~mpsuzuki/JBIG2/pdftocairo > http://gyvern.ipc.hiroshima-u.ac.jp/~mpsuzuki/JBIG2/pdftocairo-jbig2 > > Also I uploaded sample profiles (gmon.out) at: > http://gyvern.ipc.hiroshima-u.ac.jp/~mpsuzuki/JBIG2/pdftocairo-4m20s.gmon.out > > http://gyvern.ipc.hiroshima-u.ac.jp/~mpsuzuki/JBIG2/pdftocairo-jbig2-3m03s.gmon.out > > > > Their analysis outputs by gprof are at: > http://gyvern.ipc.hiroshima-u.ac.jp/~mpsuzuki/JBIG2/pdftocairo-4m20s.gmon.log > > http://gyvern.ipc.hiroshima-u.ac.jp/~mpsuzuki/JBIG2/pdftocairo-jbig2-3m03s.gmon.log > > > If I made the binaries without profiler (without -pg) > the processing times are almost same. So I think this is > not because the profiler made the rendering slower. > > Comparing the profile results, the remarkable difference > would be that 3 functions OUT of poppler spent long time > in original pdftocairo: > * _cairo_pdf_surface_show_page() spent 35.77s (spent 0.08s in patched) > * longest_match() (of zlib) spent 18.72s (spent 0.72s in patched) > * _cairo_image_analyze_color() spent 13.02s (not called in patched) > > I guess they are used to reencoding of pixel data from > JBIG2 to Deflate. If it's required, I will make a profiled > Cairo library and check more detail. > > -- > > At present, I don't have an access to the machines that > can render The_Miseries_of_Human_Life.pdf within 1 min. > So, please help me to analyze the cause of slowdown. > > Could you build the binary with profiler and show the > profiled results? Of course you can use my binaries to > check if our difference is because of the machines, or, > because of different configurations. > > Regards, > mpsuzuki > > > suzuki toshiya wrote: >> Dear Adrian, >> >> Great Thank you for benchmark evaluation. >> I will try to analyze the slowdown and >> improve the performance. >> >> Regards, >> mpsuzuki >> >> Adrian Johnson wrote: >>> I've tested the lastest patch. It works fine but I noticed about a 50% >>> slow down with pdfs containing a lot of JBIG2 images. >>> >>> I've uploaded a couple of sample pdfs here: >>> http://people.freedesktop.org/~ajohnson/jbig2/ >>> >>> Below are the timings for each document using git master and git master >>> with the jbig2 patch. I used the command line >>> pdftocairo -pdf input.pdf output.pdf >>> >>> >>> - AnnualReport.pdf >>> master >>> real 0m13.562s >>> user 0m12.620s >>> sys 0m0.820s >>> >>> jbig2 >>> real 0m21.045s >>> user 0m20.000s >>> sys 0m1.060s >>> >>> >>> - The_Miseries_of_Human_Life.pdf >>> master >>> real 0m32.690s >>> user 0m29.716s >>> sys 0m2.832s >>> >>> jbig2 >>> real 0m42.793s >>> user 0m39.680s >>> sys 0m2.848s >>> >>> >>> While it is still worth a small performance reduction to get the >>> significant reduction in the output file size, it may be worth seeing if >>> there is a simple solution to the slowdown. My first guess is to try >>> attaching the global data to only one image instead of to every image. >>> >>> >>> On 20/01/15 02:42, suzuki toshiya wrote: >>>> Dear Carlos, >>>> >>>> Sorry! I had a silly mistake, I should pass ref->getRef() >>>> (in my previous patch, I passed ref->isRef()). >>>> I attach corrected patch. >>>> >>>> Regards, >>>> mpsuzuki >>>> >>>> suzuki toshiya wrote: >>>>> Dear Carlos, >>>>> >>>>> Thank you for finding out the points! I attached revised patch. >>>>> The difference from my previous revision is following. >>>>> >>>>> Regards, >>>>> mpsuzuki >>>>> >>>>> >>>>> diff --git a/poppler/CairoOutputDev.cc b/poppler/CairoOutputDev.cc >>>>> index a428d2c..17b6b3b 100644 >>>>> --- a/poppler/CairoOutputDev.cc >>>>> +++ b/poppler/CairoOutputDev.cc >>>>> @@ -2833,13 +2833,12 @@ void CairoOutputDev::setMimeData(GfxState >>>>> *state, Stream *str, Object *ref, >>>>> #endif >>>>> >>>>> if (getStreamData (str->getNextStream(), &strBuffer, &len)) { >>>>> - cairo_status_t status; >>>>> + cairo_status_t status = CAIRO_STATUS_SUCCESS; >>>>> >>>>> #if CAIRO_VERSION >= CAIRO_VERSION_ENCODE(1, 11, 2) >>>>> if (ref && ref->isRef()) { >>>>> - Ref imgRef = ref->getRef(); >>>>> status = setMimeIdFromRef(image, CAIRO_MIME_TYPE_UNIQUE_ID, >>>>> - "poppler-surface-", imgRef); >>>>> + "poppler-surface-", ref->isRef()); >>>>> } >>>>> #endif >>>>> if (!status) { >>>>> >>>>> >>>>> >>>>> Carlos Garcia Campos wrote: >>>>>> Adrian Johnson <[email protected]> writes: >>>>>> >>>>>>> The latest patch looks good. I'll wait a week before committing >>>>>>> to see >>>>>>> if anyone else has any comments. This will give me time to do some >>>>>>> testing of the patch. >>>>>> Looks good to me as well in general. There's only one thing I've >>>>>> noticed. >>>>>> >>>>>>> @@ -2746,31 +2827,28 @@ void CairoOutputDev::setMimeData(GfxState >>>>>>> *state, Stream *str, Object *ref, >>>>>>> if (!colorMapHasIdentityDecodeMap(colorMap)) >>>>>>> return; >>>>>>> >>>>>>> +#if CAIRO_VERSION >= CAIRO_VERSION_ENCODE(1, 14, 0) >>>>>>> + if (strKind == strJBIG2 && !setMimeDataForJBIG2Globals(str, >>>>>>> image)) >>>>>>> + return; >>>>>>> +#endif >>>>>>> + >>>>>>> if (getStreamData (str->getNextStream(), &strBuffer, &len)) { >>>>>>> - cairo_status_t st; >>>>>>> + cairo_status_t status; >>>>>>> >>>>>>> #if CAIRO_VERSION >= CAIRO_VERSION_ENCODE(1, 11, 2) >>>>>>> if (ref && ref->isRef()) { >>>>>>> Ref imgRef = ref->getRef(); >>>>>>> - GooString *surfaceId = new GooString("poppler-surface-"); >>>>>>> - surfaceId->appendf("{0:d}-{1:d}", imgRef.gen, imgRef.num); >>>>>>> - char *idBuffer = copyString(surfaceId->getCString()); >>>>>>> - st = cairo_surface_set_mime_data (image, >>>>>>> CAIRO_MIME_TYPE_UNIQUE_ID, >>>>>>> - (const unsigned char >>>>>>> *)idBuffer, >>>>>>> - surfaceId->getLength(), >>>>>>> - gfree, idBuffer); >>>>>>> - if (st) >>>>>>> - gfree(idBuffer); >>>>>>> - delete surfaceId; >>>>>>> + status = setMimeIdFromRef(image, CAIRO_MIME_TYPE_UNIQUE_ID, >>>>>>> + "poppler-surface-", imgRef); >>>>>> We could pass ref->getRef() directly here. >>>>>> >>>>>>> } >>>>>>> #endif >>>>>>> + if (!status) { >>>>>> status might be uninitialized at this point if cairo version is not >>>>>> recent enough or the ref && ref->isRef() condition is False. >>>>>> >>>>>>> + status = cairo_surface_set_mime_data (image, mime_type, >>>>>>> + (const unsigned char *)strBuffer, len, >>>>>>> + gfree, strBuffer); >>>>>>> + } >>>>>> >>>>>>> - st = cairo_surface_set_mime_data (image, >>>>>>> - str->getKind() == strDCT ? >>>>>>> - CAIRO_MIME_TYPE_JPEG : CAIRO_MIME_TYPE_JP2, >>>>>>> - (const unsigned char *)strBuffer, len, >>>>>>> - gfree, strBuffer); >>>>>>> - if (st) >>>>>>> + if (status) >>>>>>> gfree (strBuffer); >>>>>>> } >>>>>>> } >>>>>>> On 11/01/15 23:05, suzuki toshiya wrote: >>>>>>>> Dear Adrian, >>>>>>>> >>>>>>>> OK - considering that there is a lag to the day that Linux >>>>>>>> distribution ship the poppler stable branch including >>>>>>>> CairoOutputDev with JBIG2 support even if I support some >>>>>>>> snapshot of Cairo in current development branch of poppler >>>>>>>> (and there would be no need to support Cairo snapshot in >>>>>>>> the day), I decided to follow original version checking. >>>>>>>> >>>>>>>> Also I removed the default value (NULL) for Ref object to >>>>>>>> JBIG2Stream constructor and removed NULL pointer checking >>>>>>>> in JBIG2Stream constructor (as same as the argument to pass >>>>>>>> Globals object). >>>>>>>> >>>>>>>> I replaced "st" in CairoOutputDev, by "status". >>>>>>>> >>>>>>>> Here I attached revised patch - oh, I'm quite sorry, my >>>>>>>> previous posts were misunderstanding this year as 2014. >>>>>>>> >>>>>>>> Regards, >>>>>>>> mpsuzuki >>>>>>>> >>>>>>>> Adrian Johnson wrote: >>>>>>>>> On 08/01/15 13:35, suzuki toshiya wrote: >>>>>>>>>> Dear Adrian, >>>>>>>>>> >>>>>>>>>> Great Thank you for prompt review! I attached the revised patch, >>>>>>>>>> but please let me discuss a few points. >>>>>>>>>> >>>>>>>>>> Adrian Johnson wrote: >>>>>>>>>>> On 07/01/15 02:58, suzuki toshiya wrote: >>>>>>>>>>>> Hi all, >>>>>>>>>>>> >>>>>>>>>>>> Here I submit a preliminary patch for git head, >>>>>>>>>>>> which enables JBIG2 embedding to PDF surface via Cairo. >>>>>>>>>>> +static cairo_status_t setJBIG2GlobalsIdByRefNums(int refNum, >>>>>>>>>>> int >>>>>>>>>>> refGen, >>>>>>>>>>> + >>>>>>>>>>> cairo_surface_t >>>>>>>>>>> *image) >>>>>>>>>>> +{ >>>>>>>>>>> + GooString *globalsStrId; >>>>>>>>>>> + char *idBuffer; >>>>>>>>>>> + cairo_status_t st; >>>>>>>>>>> + >>>>>>>>>>> + >>>>>>>>>>> + globalsStrId = new GooString; >>>>>>>>>>> + globalsStrId->appendf("{0:d}-{1:d}", refNum, refGen); >>>>>>>>>>> + idBuffer = copyString(globalsStrId->getCString()); >>>>>>>>>>> + st = cairo_surface_set_mime_data (image, >>>>>>>>>>> CAIRO_MIME_TYPE_JBIG2_GLOBAL_ID, >>>>>>>>>>> >>>>>>>>>>> The cairo JBIG2 mime types are new to 1.14.0 so will need to be >>>>>>>>>>> inside a >>>>>>>>>>> #if CAIRO_VERSION >= CAIRO_VERSION_ENCODE(1, 14, 0). >>>>>>>>>> If it's acceptable, I want to use the conditional like >>>>>>>>>> #ifdef CAIRO_MIME_TYPE_JBIG2_GLOBAL_ID >>>>>>>>>> The official release of cairo including JBIG2 support is 1.14.0, >>>>>>>>>> however, some distributions use a snapshot before 1.14 which >>>>>>>>>> supports JBIG2. For example, Ubuntu 14.04 (long time support) >>>>>>>>>> ships with 1.13.0~20140204. >>>>>>>>>> >>>>>>>>>> http://packages.ubuntu.com/search?suite=trusty§ion=all&arch=any&keywords=libcairo2&searchon=names >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> It includes JBIG2 support. Thus I prefer checking by the JBIG2 >>>>>>>>>> macros, instead of the version checking. >>>>>>>>> I'm not keen on making an exception for this case. The target >>>>>>>>> audience >>>>>>>>> for this is very small. It would only help users that compile a >>>>>>>>> new >>>>>>>>> version of poppler on an old distribution without also recompiling >>>>>>>>> cairo. JBIG2 support for pdftocairo is a very minor feature >>>>>>>>> that not >>>>>>>>> many people would be using (so far no one else has requested this >>>>>>>>> feature). >>>>>>>>> >>>>>>>>> If you really want to support distributions using a cairo 1.13 >>>>>>>>> snapshot >>>>>>>>> I suggest adding a comment before the first occurrence of >>>>>>>>> #ifdef CAIRO_MIME_TYPE_JBIG2_GLOBAL_ID to explain why the >>>>>>>>> version macro >>>>>>>>> is not used. Also include >>>>>>>>> "#if CAIRO_VERSION >= CAIRO_VERSION_ENCODE(1, 14, 0)" in the >>>>>>>>> comment so >>>>>>>>> that: >>>>>>>>> a) searching for CAIRO_VERSION will find it, and >>>>>>>>> b) to document that JBIG2 support is in 1.14 or later. >>>>>>>>> >>>>>>>>>> BTW, when the string for UNIQUE_ID is constructed, its syntax >>>>>>>>>> is RefGen-RefNum. In PDF data, RefNum then RefGen is popular >>>>>>>>>> syntax. Is there any reason why the order is reverted? >>>>>>>>> No reason. Feel free to change it to num-gen order. >>>>>>>>> >>>>>>>>>>> +GBool CairoOutputDev::setMimeDataForJBIG2Globals(Stream *str, >>>>>>>>>>> + >>>>>>>>>>> cairo_surface_t >>>>>>>>>>> *image) >>>>>>>>>>> +{ >>>>>>>>>>> + JBIG2Stream *jbig2Str = static_cast<JBIG2Stream *>(str); >>>>>>>>>>> + Object* globalsStr = jbig2Str->getGlobalsStream(); >>>>>>>>>>> + char *globalsBuffer; >>>>>>>>>>> + int globalsLength; >>>>>>>>>>> + >>>>>>>>>>> + // According to PDF spec, Globals should be stream (or >>>>>>>>>>> nothing) >>>>>>>>>>> + >>>>>>>>>>> + if (globalsStr->isNull()) >>>>>>>>>>> + return gTrue; >>>>>>>>>>> + >>>>>>>>>>> + if (!globalsStr->isStream()) >>>>>>>>>>> + return gFalse; >>>>>>>>>>> + >>>>>>>>>>> + // See JBIG2Stream constructor for valid reference number >>>>>>>>>>> + if (!jbig2Str->getGlobalsStreamRefNum()) >>>>>>>>>>> + return gFalse; >>>>>>>>>>> >>>>>>>>>>> The globalsStr->isStream() check should be sufficient. The >>>>>>>>>>> reference >>>>>>>>>>> number should always be valid of the global stream is valid. >>>>>>>>>> Umm. Requesting "both of JBIG2Stream::globalsStream and Ref to >>>>>>>>>> it must be always valid as far as GlobalsStream is used" is >>>>>>>>>> intuitive solution, but public JBIG2Stream constructor has not >>>>>>>>>> requested the Ref to GlobalsStream before in earlier version. >>>>>>>>>> >>>>>>>>>> If some developers use JBIG2Stream constructor with valid >>>>>>>>>> GlobalsStream but without Ref to it (as the syntax before my >>>>>>>>>> patch), there is the situation that JBIG2Stream::globalsStream >>>>>>>>>> is valid but Ref to it is invalid (if I supply the default >>>>>>>>>> value for the argument to pass Ref). How do we detect or prevent >>>>>>>>>> such case? >>>>>>>>>> >>>>>>>>>> A) Supplying the default value (NULL pointer) in the argument >>>>>>>>>> to pass the Ref, and JBIG2Stream returns an error when >>>>>>>>>> globalsStream is valid stream but its Ref is not passed - and >>>>>>>>>> prevent the case that globalsStream is valid but its Ref is >>>>>>>>>> invalid. >>>>>>>>>> >>>>>>>>>> B) Not supplying the default value, and prevent the construction >>>>>>>>>> without the Ref to the globalsStream. >>>>>>>>>> >>>>>>>>>> C) Add new boolean value in JBIG2Stream class, to indicate if >>>>>>>>>> Ref to globalsStream is explicitly set. >>>>>>>>>> >>>>>>>>>> D) Initialize Ref with irregular value (e.g. 0,0 or 0,65535) >>>>>>>>>> internally when Ref is not passed, and deal the case as Ref to >>>>>>>>>> the globalsStream is not initialized (my last patch design) >>>>>>>>>> >>>>>>>>>> Which is the best? Or, no need to case such case? >>>>>>>>> JBIG2Stream is not part of the public API so there is no need to >>>>>>>>> preserve compatibility. >>>>>>>>> >>>>>>>>> Something else I just noticed - could you rename "st" to "status". >>>>>>>>> CairoOutputdev.cc always uses the name "status" for cairo >>>>>>>>> status so we >>>>>>>>> should be consistent with this convention. >>>>>>>>> >>>>>>> _______________________________________________ >>>>>>> poppler mailing list >>>>>>> [email protected] >>>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler >>>>> >>>>> ------------------------------------------------------------------------ >>>>> >>>>> >>>>> _______________________________________________ >>>>> poppler mailing list >>>>> [email protected] >>>>> http://lists.freedesktop.org/mailman/listinfo/poppler >>>> >>>> >>>> _______________________________________________ >>>> poppler mailing list >>>> [email protected] >>>> http://lists.freedesktop.org/mailman/listinfo/poppler >>>> >>> >> >> _______________________________________________ >> poppler mailing list >> [email protected] >> http://lists.freedesktop.org/mailman/listinfo/poppler >> > _______________________________________________ > poppler mailing list > [email protected] > http://lists.freedesktop.org/mailman/listinfo/poppler
_______________________________________________ poppler mailing list [email protected] http://lists.freedesktop.org/mailman/listinfo/poppler
