Dear Adrian, Great Thank you for benchmark evaluation. I will try to analyze the slowdown and improve the performance.
Regards, mpsuzuki Adrian Johnson wrote: > I've tested the lastest patch. It works fine but I noticed about a 50% > slow down with pdfs containing a lot of JBIG2 images. > > I've uploaded a couple of sample pdfs here: > http://people.freedesktop.org/~ajohnson/jbig2/ > > Below are the timings for each document using git master and git master > with the jbig2 patch. I used the command line > pdftocairo -pdf input.pdf output.pdf > > > - AnnualReport.pdf > master > real 0m13.562s > user 0m12.620s > sys 0m0.820s > > jbig2 > real 0m21.045s > user 0m20.000s > sys 0m1.060s > > > - The_Miseries_of_Human_Life.pdf > master > real 0m32.690s > user 0m29.716s > sys 0m2.832s > > jbig2 > real 0m42.793s > user 0m39.680s > sys 0m2.848s > > > While it is still worth a small performance reduction to get the > significant reduction in the output file size, it may be worth seeing if > there is a simple solution to the slowdown. My first guess is to try > attaching the global data to only one image instead of to every image. > > > On 20/01/15 02:42, suzuki toshiya wrote: >> Dear Carlos, >> >> Sorry! I had a silly mistake, I should pass ref->getRef() >> (in my previous patch, I passed ref->isRef()). >> I attach corrected patch. >> >> Regards, >> mpsuzuki >> >> suzuki toshiya wrote: >>> Dear Carlos, >>> >>> Thank you for finding out the points! I attached revised patch. >>> The difference from my previous revision is following. >>> >>> Regards, >>> mpsuzuki >>> >>> >>> diff --git a/poppler/CairoOutputDev.cc b/poppler/CairoOutputDev.cc >>> index a428d2c..17b6b3b 100644 >>> --- a/poppler/CairoOutputDev.cc >>> +++ b/poppler/CairoOutputDev.cc >>> @@ -2833,13 +2833,12 @@ void CairoOutputDev::setMimeData(GfxState *state, >>> Stream *str, Object *ref, >>> #endif >>> >>> if (getStreamData (str->getNextStream(), &strBuffer, &len)) { >>> - cairo_status_t status; >>> + cairo_status_t status = CAIRO_STATUS_SUCCESS; >>> >>> #if CAIRO_VERSION >= CAIRO_VERSION_ENCODE(1, 11, 2) >>> if (ref && ref->isRef()) { >>> - Ref imgRef = ref->getRef(); >>> status = setMimeIdFromRef(image, CAIRO_MIME_TYPE_UNIQUE_ID, >>> - "poppler-surface-", imgRef); >>> + "poppler-surface-", ref->isRef()); >>> } >>> #endif >>> if (!status) { >>> >>> >>> >>> Carlos Garcia Campos wrote: >>>> Adrian Johnson <[email protected]> writes: >>>> >>>>> The latest patch looks good. I'll wait a week before committing to see >>>>> if anyone else has any comments. This will give me time to do some >>>>> testing of the patch. >>>> Looks good to me as well in general. There's only one thing I've >>>> noticed. >>>> >>>>> @@ -2746,31 +2827,28 @@ void CairoOutputDev::setMimeData(GfxState *state, >>>>> Stream *str, Object *ref, >>>>> if (!colorMapHasIdentityDecodeMap(colorMap)) >>>>> return; >>>>> >>>>> +#if CAIRO_VERSION >= CAIRO_VERSION_ENCODE(1, 14, 0) >>>>> + if (strKind == strJBIG2 && !setMimeDataForJBIG2Globals(str, image)) >>>>> + return; >>>>> +#endif >>>>> + >>>>> if (getStreamData (str->getNextStream(), &strBuffer, &len)) { >>>>> - cairo_status_t st; >>>>> + cairo_status_t status; >>>>> >>>>> #if CAIRO_VERSION >= CAIRO_VERSION_ENCODE(1, 11, 2) >>>>> if (ref && ref->isRef()) { >>>>> Ref imgRef = ref->getRef(); >>>>> - GooString *surfaceId = new GooString("poppler-surface-"); >>>>> - surfaceId->appendf("{0:d}-{1:d}", imgRef.gen, imgRef.num); >>>>> - char *idBuffer = copyString(surfaceId->getCString()); >>>>> - st = cairo_surface_set_mime_data (image, CAIRO_MIME_TYPE_UNIQUE_ID, >>>>> - (const unsigned char *)idBuffer, >>>>> - surfaceId->getLength(), >>>>> - gfree, idBuffer); >>>>> - if (st) >>>>> - gfree(idBuffer); >>>>> - delete surfaceId; >>>>> + status = setMimeIdFromRef(image, CAIRO_MIME_TYPE_UNIQUE_ID, >>>>> + "poppler-surface-", imgRef); >>>> We could pass ref->getRef() directly here. >>>> >>>>> } >>>>> #endif >>>>> + if (!status) { >>>> status might be uninitialized at this point if cairo version is not >>>> recent enough or the ref && ref->isRef() condition is False. >>>> >>>>> + status = cairo_surface_set_mime_data (image, mime_type, >>>>> + (const unsigned char *)strBuffer, >>>>> len, >>>>> + gfree, strBuffer); >>>>> + } >>>> >>>>> - st = cairo_surface_set_mime_data (image, >>>>> - str->getKind() == strDCT ? >>>>> - CAIRO_MIME_TYPE_JPEG : >>>>> CAIRO_MIME_TYPE_JP2, >>>>> - (const unsigned char *)strBuffer, len, >>>>> - gfree, strBuffer); >>>>> - if (st) >>>>> + if (status) >>>>> gfree (strBuffer); >>>>> } >>>>> } >>>>> On 11/01/15 23:05, suzuki toshiya wrote: >>>>>> Dear Adrian, >>>>>> >>>>>> OK - considering that there is a lag to the day that Linux >>>>>> distribution ship the poppler stable branch including >>>>>> CairoOutputDev with JBIG2 support even if I support some >>>>>> snapshot of Cairo in current development branch of poppler >>>>>> (and there would be no need to support Cairo snapshot in >>>>>> the day), I decided to follow original version checking. >>>>>> >>>>>> Also I removed the default value (NULL) for Ref object to >>>>>> JBIG2Stream constructor and removed NULL pointer checking >>>>>> in JBIG2Stream constructor (as same as the argument to pass >>>>>> Globals object). >>>>>> >>>>>> I replaced "st" in CairoOutputDev, by "status". >>>>>> >>>>>> Here I attached revised patch - oh, I'm quite sorry, my >>>>>> previous posts were misunderstanding this year as 2014. >>>>>> >>>>>> Regards, >>>>>> mpsuzuki >>>>>> >>>>>> Adrian Johnson wrote: >>>>>>> On 08/01/15 13:35, suzuki toshiya wrote: >>>>>>>> Dear Adrian, >>>>>>>> >>>>>>>> Great Thank you for prompt review! I attached the revised patch, >>>>>>>> but please let me discuss a few points. >>>>>>>> >>>>>>>> Adrian Johnson wrote: >>>>>>>>> On 07/01/15 02:58, suzuki toshiya wrote: >>>>>>>>>> Hi all, >>>>>>>>>> >>>>>>>>>> Here I submit a preliminary patch for git head, >>>>>>>>>> which enables JBIG2 embedding to PDF surface via Cairo. >>>>>>>>> +static cairo_status_t setJBIG2GlobalsIdByRefNums(int refNum, int >>>>>>>>> refGen, >>>>>>>>> + cairo_surface_t >>>>>>>>> *image) >>>>>>>>> +{ >>>>>>>>> + GooString *globalsStrId; >>>>>>>>> + char *idBuffer; >>>>>>>>> + cairo_status_t st; >>>>>>>>> + >>>>>>>>> + >>>>>>>>> + globalsStrId = new GooString; >>>>>>>>> + globalsStrId->appendf("{0:d}-{1:d}", refNum, refGen); >>>>>>>>> + idBuffer = copyString(globalsStrId->getCString()); >>>>>>>>> + st = cairo_surface_set_mime_data (image, >>>>>>>>> CAIRO_MIME_TYPE_JBIG2_GLOBAL_ID, >>>>>>>>> >>>>>>>>> The cairo JBIG2 mime types are new to 1.14.0 so will need to be >>>>>>>>> inside a >>>>>>>>> #if CAIRO_VERSION >= CAIRO_VERSION_ENCODE(1, 14, 0). >>>>>>>> If it's acceptable, I want to use the conditional like >>>>>>>> #ifdef CAIRO_MIME_TYPE_JBIG2_GLOBAL_ID >>>>>>>> The official release of cairo including JBIG2 support is 1.14.0, >>>>>>>> however, some distributions use a snapshot before 1.14 which >>>>>>>> supports JBIG2. For example, Ubuntu 14.04 (long time support) >>>>>>>> ships with 1.13.0~20140204. >>>>>>>> >>>>>>>> http://packages.ubuntu.com/search?suite=trusty§ion=all&arch=any&keywords=libcairo2&searchon=names >>>>>>>> >>>>>>>> >>>>>>>> It includes JBIG2 support. Thus I prefer checking by the JBIG2 >>>>>>>> macros, instead of the version checking. >>>>>>> I'm not keen on making an exception for this case. The target audience >>>>>>> for this is very small. It would only help users that compile a new >>>>>>> version of poppler on an old distribution without also recompiling >>>>>>> cairo. JBIG2 support for pdftocairo is a very minor feature that not >>>>>>> many people would be using (so far no one else has requested this >>>>>>> feature). >>>>>>> >>>>>>> If you really want to support distributions using a cairo 1.13 snapshot >>>>>>> I suggest adding a comment before the first occurrence of >>>>>>> #ifdef CAIRO_MIME_TYPE_JBIG2_GLOBAL_ID to explain why the version macro >>>>>>> is not used. Also include >>>>>>> "#if CAIRO_VERSION >= CAIRO_VERSION_ENCODE(1, 14, 0)" in the comment so >>>>>>> that: >>>>>>> a) searching for CAIRO_VERSION will find it, and >>>>>>> b) to document that JBIG2 support is in 1.14 or later. >>>>>>> >>>>>>>> BTW, when the string for UNIQUE_ID is constructed, its syntax >>>>>>>> is RefGen-RefNum. In PDF data, RefNum then RefGen is popular >>>>>>>> syntax. Is there any reason why the order is reverted? >>>>>>> No reason. Feel free to change it to num-gen order. >>>>>>> >>>>>>>>> +GBool CairoOutputDev::setMimeDataForJBIG2Globals(Stream *str, >>>>>>>>> + cairo_surface_t >>>>>>>>> *image) >>>>>>>>> +{ >>>>>>>>> + JBIG2Stream *jbig2Str = static_cast<JBIG2Stream *>(str); >>>>>>>>> + Object* globalsStr = jbig2Str->getGlobalsStream(); >>>>>>>>> + char *globalsBuffer; >>>>>>>>> + int globalsLength; >>>>>>>>> + >>>>>>>>> + // According to PDF spec, Globals should be stream (or nothing) >>>>>>>>> + >>>>>>>>> + if (globalsStr->isNull()) >>>>>>>>> + return gTrue; >>>>>>>>> + >>>>>>>>> + if (!globalsStr->isStream()) >>>>>>>>> + return gFalse; >>>>>>>>> + >>>>>>>>> + // See JBIG2Stream constructor for valid reference number >>>>>>>>> + if (!jbig2Str->getGlobalsStreamRefNum()) >>>>>>>>> + return gFalse; >>>>>>>>> >>>>>>>>> The globalsStr->isStream() check should be sufficient. The reference >>>>>>>>> number should always be valid of the global stream is valid. >>>>>>>> Umm. Requesting "both of JBIG2Stream::globalsStream and Ref to >>>>>>>> it must be always valid as far as GlobalsStream is used" is >>>>>>>> intuitive solution, but public JBIG2Stream constructor has not >>>>>>>> requested the Ref to GlobalsStream before in earlier version. >>>>>>>> >>>>>>>> If some developers use JBIG2Stream constructor with valid >>>>>>>> GlobalsStream but without Ref to it (as the syntax before my >>>>>>>> patch), there is the situation that JBIG2Stream::globalsStream >>>>>>>> is valid but Ref to it is invalid (if I supply the default >>>>>>>> value for the argument to pass Ref). How do we detect or prevent >>>>>>>> such case? >>>>>>>> >>>>>>>> A) Supplying the default value (NULL pointer) in the argument >>>>>>>> to pass the Ref, and JBIG2Stream returns an error when >>>>>>>> globalsStream is valid stream but its Ref is not passed - and >>>>>>>> prevent the case that globalsStream is valid but its Ref is >>>>>>>> invalid. >>>>>>>> >>>>>>>> B) Not supplying the default value, and prevent the construction >>>>>>>> without the Ref to the globalsStream. >>>>>>>> >>>>>>>> C) Add new boolean value in JBIG2Stream class, to indicate if >>>>>>>> Ref to globalsStream is explicitly set. >>>>>>>> >>>>>>>> D) Initialize Ref with irregular value (e.g. 0,0 or 0,65535) >>>>>>>> internally when Ref is not passed, and deal the case as Ref to >>>>>>>> the globalsStream is not initialized (my last patch design) >>>>>>>> >>>>>>>> Which is the best? Or, no need to case such case? >>>>>>> JBIG2Stream is not part of the public API so there is no need to >>>>>>> preserve compatibility. >>>>>>> >>>>>>> Something else I just noticed - could you rename "st" to "status". >>>>>>> CairoOutputdev.cc always uses the name "status" for cairo status so we >>>>>>> should be consistent with this convention. >>>>>>> >>>>> _______________________________________________ >>>>> poppler mailing list >>>>> [email protected] >>>>> http://lists.freedesktop.org/mailman/listinfo/poppler >>> >>> ------------------------------------------------------------------------ >>> >>> _______________________________________________ >>> poppler mailing list >>> [email protected] >>> http://lists.freedesktop.org/mailman/listinfo/poppler >> >> >> _______________________________________________ >> poppler mailing list >> [email protected] >> http://lists.freedesktop.org/mailman/listinfo/poppler >> > _______________________________________________ poppler mailing list [email protected] http://lists.freedesktop.org/mailman/listinfo/poppler
