Albert Astals Cid wrote:
> On Thursday, 21 May 2009, Angus March wrote:
>
>> Albert Astals Cid wrote:
>>
>>> On Wednesday, 20 May 2009, Angus March wrote:
>>>
>>>> Albert Astals Cid wrote:
>>>>
>>>>> On Tuesday, 19 May 2009, Angus March wrote:
>>>>>
>>>>>> Adrian Johnson wrote:
>>>>>>
>>>>>>> Angus March wrote:
>>>>>>>
>>>>>>>> I tried using Poppler to get a Cairo surface and then saving the
>>>>>>>> surface to a PNG. Unfortunately, the resulting image was of
>>>>>>>> disastrously low quality.
>>>>>>>>
>>>>>>> Without seeing your code or the output you are getting I can only
>>>>>>> guess at what the problem might be. Did you alter the cairo scale to
>>>>>>> get the desired image dpi?
>>>>>>>
>>>>>> It was definitely an improvement, but I think the only thing that
>>>>>> did improve was the resolution. The old problems that caused me to
>>>>>> abandon Cairo persisted, which are: gradients have ugly stripes on
>>>>>> them, a background that should be white and opaque is black and
>>>>>> transparent, and some text that has a shadow in the PDF doesn't in the
>>>>>> image. I don't suppose you know of a way to deal with those problems.
>>>>>>
>>>>> ?
>>>>>
>>>>> I don't see anything obviously wrong.
>>>>>
>>>>> Basically it is:
>>>>> * Create PDFDoc
>>>>> * Create SplashOutputDev
>>>>> * Call SplashOutputDev::startDoc
>>>>> * Call PDFDoc::displayPageSlice
>>>>>
>>>> Well there definitely is something wrong, because it works with
>>>> pdftoppm. I thought of things like the __attribute__((constructor))
>>>> attribute, or static objects, but I don't see any evidence of the
>>>> attribute and I wouldn't know how to find a static object in all that
>>>> code. Maybe multiple processes cause problems for Splash.
>>>>
>>>>
>>>>
>>>> It's hard to know where to go.
>>>>
>>> Are the crashes you pasted from poppler compiled with -O2? If so, remove
>>> the -O2 and replace -g with -g3. Backtraces from an optimized poppler
>>> are really misleading.
>>>
>> I figured out a way to get my app to build from the poppler lib I
>> rolled myself (although I'd still like to know what the proper procedure
>> is to get it to build in debug, and install the Splash stuff) and I got
>> some valgrind reports that might be more helpful, but are fewer than
>> those I got when I was using the SUSE distro's lib:
>>
>> ==8577== Conditional jump or move depends on uninitialised value(s)
>> ==8577== at 0x53DACE4: FoFiType1C::parse() (FoFiType1C.cc:1848)
>> ==8577== by 0x53E10AB: FoFiType1C::make(char*, int) (FoFiType1C.cc:35)
>> ==8577== by 0x5369A58: Gfx8BitFont::Gfx8BitFont(XRef*, char*, Ref,
>> GooString*, GfxFontType, Dict*) (GfxFont.cc:699)
>> ==8577== by 0x536D72C: GfxFont::makeFont(XRef*, char*, Ref, Dict*)
>> (GfxFont.cc:143)
>> ==8577== by 0x536D933: GfxFontDict::GfxFontDict(XRef*, Ref*, Dict*)
>> (GfxFont.cc:2051)
>> ==8577== by 0x535AD21: GfxResources::GfxResources(XRef*, Dict*,
>> GfxResources*) (Gfx.cc:313)
>> ==8577== by 0x535DD6B: Gfx::Gfx(XRef*, OutputDev*, int, Dict*,
>> Catalog*, double, double, PDFRectangle*, PDFRectangle*, int, int
>> (*)(void*), void*) (Gfx.cc:502)
>> ==8577== by 0x539AF12: Page::createGfx(OutputDev*, double, double,
>> int, int, int, int, int, int, int, int, Catalog*, int (*)(void*), void*,
>> int (*)(Annot*, void*), void*) (Page.cc:404)
>> ==8577== by 0x539B173: Page::displaySlice(OutputDev*, double, double,
>> int, int, int, int, int, int, int, int, Catalog*, int (*)(void*), void*,
>> int (*)(Annot*, void*), void*) (Page.cc:433)
>> ==8577== by 0x40A756: pdf2jpg::GetSplash(int) (pdf2jpg.cpp:176)
>> ==8577== by 0x40A9B5: pdf2jpg::TopupJpegThreads(int, astring const&)
>> (pdf2jpg.cpp:156)
>> ==8577== by 0x40B3B1: pdf2jpg::Execute(int, char const*, char const*,
>> int) (pdf2jpg.cpp:99)
>> ==8577==
>>
>
> Are you positively sure this doesn't happen with pdftoppm? Doesn't make any
> sense.
>
It doesn't seem to happen with pdftoppm. I'll run valgrind on the debug
build of pdftoppm that I have here and see what it reports...
Well, it hasn't reported any problems so far. I'll check again tomorrow
morning, and then I guess I'll know for sure.
Also, I keep forgetting to point out that another problem my app has
is with Splash getting stuck in an infinite loop every so often,
requiring a kill -9.
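Until the hang itself is understood, one possible stopgap is to have the parent enforce a time limit on each forked worker instead of killing it by hand. The following is a minimal sketch, assuming POSIX fork/waitpid/kill. The name run_with_timeout, the polling interval, and the two stand-in jobs are hypothetical and are not part of Poppler or of the sample in this message.

```cpp
#include <cassert>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

// Hypothetical helper: fork a child that runs `work`, and send it SIGKILL
// if it has not terminated within roughly `timeout_ms` milliseconds.
// Returns true if the child terminated by itself within the limit,
// false if it had to be killed or if fork/waitpid failed.
bool run_with_timeout(void (*work)(), int timeout_ms) {
    pid_t pid = fork();
    if (pid < 0) return false;
    if (pid == 0) {                 // child process
        work();
        _exit(0);
    }
    const int step_ms = 10;         // polling interval (arbitrary choice)
    for (int waited = 0; waited < timeout_ms; waited += step_ms) {
        int status = 0;
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid) return true;  // child finished in time
        if (r < 0) return false;    // waitpid error
        struct timespec ts = {0, step_ms * 1000000L};
        nanosleep(&ts, nullptr);
    }
    kill(pid, SIGKILL);             // automated equivalent of kill -9
    waitpid(pid, nullptr, 0);       // reap the child so no zombie remains
    return false;
}

// Stand-in workloads for demonstration only.
void quick_job() {}
void stuck_job() { for (;;) pause(); }
```

Calling run_with_timeout(stuck_job, 200) kills the child after about 200 ms and returns false, while run_with_timeout(quick_job, 1000) returns true. This only contains the symptom; it does not explain why Splash loops.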
How about this: I'll send you a sample that reproduces the problems.
Compile it and run it through valgrind. For me, valgrind reported a few
problems within a short time. BTW, for the sake of simplicity, it doesn't
actually write any output files; it just gets the raw image data from Splash.
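One thing that may be worth ruling out, given that the sample opens the PDFDoc once and then calls fork() for every page: on POSIX systems a forked child shares the parent's open file description, including the file offset, so a read or seek in one process moves the position seen by the others. Whether this is what upsets Splash here is an assumption, not a diagnosis. The sketch below only demonstrates the POSIX behaviour with raw file descriptors; offset_after_child_read is a hypothetical name and no Poppler code is involved.

```cpp
#include <cassert>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical demo: the parent opens a 100-byte temporary file, the child
// reads `child_bytes` (at most 100) from the inherited descriptor, and the
// parent then reports its own file offset.
// Returns the parent's offset after the child has exited, or -1 on error.
long offset_after_child_read(int child_bytes) {
    char path[] = "/tmp/shared_offset_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path);                   // the file disappears once fd is closed

    char data[100] = {0};
    if (write(fd, data, sizeof data) != (ssize_t)sizeof data ||
        lseek(fd, 0, SEEK_SET) != 0) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) { close(fd); return -1; }
    if (pid == 0) {                 // child: read through the inherited fd
        char buf[100];
        ssize_t n = read(fd, buf, (size_t)child_bytes);
        _exit(n == child_bytes ? 0 : 1);
    }
    int status = 0;
    waitpid(pid, &status, 0);

    // The parent has not read anything, yet its offset has moved.
    long off = (long)lseek(fd, 0, SEEK_CUR);
    close(fd);
    return off;
}
```

With child_bytes = 40 the parent's offset comes back as 40. If that sharing turns out to matter, opening the document after fork() in each worker would give every process its own offset; that is something to test, not a confirmed fix.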
/***************************************************************************
* Copyright (C) 2008 by Angus *
* [email protected] *
* *
* This program is free software; you can redistribute it and/or modify *
* it under the terms of the GNU General Public License as published by *
* the Free Software Foundation; either version 2 of the License, or *
* (at your option) any later version. *
* *
* This program is distributed in the hope that it will be useful, *
* but WITHOUT ANY WARRANTY; without even the implied warranty of *
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the *
* GNU General Public License for more details. *
* *
* You should have received a copy of the GNU General Public License *
* along with this program; if not, write to the *
* Free Software Foundation, Inc., *
* 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. *
***************************************************************************/
#ifdef HAVE_CONFIG_H
#include <config.h>
#endif
#ifdef NDEBUG
#define verify(x) ((void)(x))
#else
#define verify(x) assert(x)
#endif
#include <string.h>
#include <assert.h>
#include <string>
using namespace std;
class SplashOutputDev;
class PDFDoc;
#include <splash/SplashBitmap.h>
/**
\brief A class that converts a pdf to a series of jpegs
@author Angus March <an...@linux-cgg2>
*/
class pdf2jpg{
public:
pdf2jpg();
~pdf2jpg();
void Execute(const char *pPDFName, const char *pJPGPrefix, int nMaxDimension);
private:
int m_nMaxThreadCount, m_nRunningProcessCount;
double m_pg_w, m_pg_h;
int m_w, m_h;
int m_nOutputWidth, m_nOutputHeight;
int m_nInputWidth, m_nInputHeight;
PDFDoc *m_pdoc;
SplashColor m_paperColor;
int m_resolution;
private:
bool TopupJpegThreads(int pg, const string &sFileName);
SplashOutputDev *GetSplash(int pg);
public:
static string m_sCurrentPDF;
};
#include <stdlib.h>
int main(int argc, char *argv[]) {
assert(argc == 2);
pdf2jpg thing;
thing.Execute(argv[1], "page", 1324);
return EXIT_SUCCESS;
}
class SplashOutputDev;
#include <poppler/Object.h>
#include <splash/SplashTypes.h>
/**
\brief Represents a thread in which images are resized and saved as jpegs
@author Angus March <an...@linux-cgg2>
*/
class JpegThread {
public:
JpegThread(int nInputWidth, int nInputHeight, int nOutputWidth, int nOutputHeight);
void GiveJob(SplashOutputDev *pSplash, const string &sFileName);
protected:
int m_nInputWidth, m_nInputHeight, m_nOutputWidth, m_nOutputHeight;
SplashOutputDev *m_pSplash;
string m_sFileName;
private:
JpegThread(const JpegThread &);
inline void Thread();
typedef float fpinterpolate;
inline static fpinterpolate R(fpinterpolate x);
static void Interpolate(char *outImage, const Guchar *pOriginalImage,
const fpinterpolate dInputX, const fpinterpolate dInputY,
const int nOutputX, const int nOutputY,
int nInputStride,
int nInputRowSize, int nInputColumnSize,
int nOutputColumnSize);
void Log(const char *pMessage) const;
};
#include <boost/scoped_ptr.hpp>
#include <poppler/GlobalParams.h>
#include <poppler/PDFDoc.h>
#include <poppler/SplashOutputDev.h>
#include <math.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
#include <sys/timeb.h>
string pdf2jpg::m_sCurrentPDF;
static void sig_handler(int) {
fprintf(stderr, "(%d) seg fault hit while trying to extract an image from %s. This is probably a corrupt pdf\n", (int)getpid(), pdf2jpg::m_sCurrentPDF.c_str());
exit(EXIT_FAILURE);
}
pdf2jpg::pdf2jpg() : m_nRunningProcessCount(0), m_pdoc(NULL) {
m_nMaxThreadCount = 4;
}
pdf2jpg::~pdf2jpg(){
delete m_pdoc;
}
void pdf2jpg::Execute(const char *pPDFName, const char *pJPGPrefix, int nMaxDimension) {
int &resolution = m_resolution;
m_sCurrentPDF = pPDFName;
signal(SIGSEGV, sig_handler);
boost::scoped_ptr<GlobalParams> bs(new GlobalParams());
globalParams = bs.get();
resolution = 150;
GooString *filename = new GooString(pPDFName);
m_pdoc = new PDFDoc(filename);
PDFDoc &doc = *m_pdoc;
if (!doc.isOk()) {
fprintf(stderr, "Error opening %s by PDFDoc (Poppler)\n", pPDFName);
return;
}
SplashColor &paperColor = m_paperColor;
paperColor[0] = 255; paperColor[1] = 255; paperColor[2] = 255;
bool bWorkerProcess = false;
{
double &pg_w = m_pg_w, &pg_h = m_pg_h;
int &w = m_w, &h = m_h;
pg_w = doc.getPageMediaWidth(1);
pg_h = doc.getPageMediaHeight(1);
pg_w = pg_w * (resolution / 72.0);
pg_h = pg_h * (resolution / 72.0);
if (doc.getPageRotate(1)) {
double tmp = pg_w;
pg_w = pg_h;
pg_h = tmp;
}
w = (int)ceil(pg_w);
h = (int)ceil(pg_h);
w = (0+w > pg_w ? (int)ceil(pg_w-0) : w);
h = (0+h > pg_h ? (int)ceil(pg_h-0) : h);
}
{
SplashOutputDev *splashOut = GetSplash(1);
SplashBitmap &bmp = *splashOut->getBitmap();
m_nInputWidth = bmp.getWidth();m_nInputHeight = bmp.getHeight();
if (m_nInputWidth > nMaxDimension || m_nInputHeight > nMaxDimension) {
double dScale;
if (m_nInputWidth > m_nInputHeight) {
dScale = nMaxDimension/double(m_nInputWidth);
m_nOutputWidth = nMaxDimension;m_nOutputHeight = m_nInputHeight*dScale;
}
else {
dScale = nMaxDimension/double(m_nInputHeight);
m_nOutputWidth = m_nInputWidth*dScale;m_nOutputHeight = nMaxDimension;
}
}
else {
m_nOutputWidth = m_nInputWidth, m_nOutputHeight = m_nInputHeight;
}
delete splashOut;
}
//time_t t0 = time(NULL);
//struct timeb tStartDoc;tStartDoc.time=0;tStartDoc.millitm = 0;
for (int pg = 1; pg <= doc.getNumPages() && !bWorkerProcess; ++pg) {
fprintf(stderr, "Working page: %d of %s\n", pg, pPDFName);
/*ftime(&t1);
tStartDoc.time += t1.time - t0.time;
if (t1.millitm < t0.millitm) {tStartDoc.time--; tStartDoc.millitm += t1.millitm - t0.millitm + 1000;}
else tStartDoc.millitm += t1.millitm - t0.millitm;*/
// a 5-byte buffer overflows once pg - 1 needs more than four digits
char pageno[16];snprintf(pageno, sizeof(pageno), "%04d", pg - 1);
bWorkerProcess = TopupJpegThreads(pg, (string)pJPGPrefix + pageno + ".jpg");
}
if (!bWorkerProcess)
for (; m_nRunningProcessCount > 0; --m_nRunningProcessCount)
verify(wait(NULL) > 0);
/*time_t secs = time(NULL) - t0;
time_t min = secs/60;
int othersecs = secs%60;
int nStartDocSecs = tStartDoc.time + tStartDoc.millitm/1000;
int nStartDocMillis = tStartDoc.millitm%1000;
if (nStartDocMillis < 0) nStartDocMillis = -nStartDocMillis;
int nStartDocMins = nStartDocSecs/60;
int nStartDocOtherSecs = nStartDocSecs%60;
char buf[1900];
assert(nStartDocMillis >= 0 && nStartDocMillis < 1000);
sprintf(buf, "Start Doc took: %dm%02d.%03ds and if you're interested nStartDocSecs: %d and tStartDoc.time was: %d and tStartDoc.millitm was: %d", nStartDocMins, nStartDocOtherSecs, nStartDocMillis, nStartDocSecs, int(tStartDoc.time), int(tStartDoc.millitm));
g_pLog->LogLine("quitting after " + LtoA(min) + "m" + LtoA(othersecs) + "s");
g_pLog->LogLine(buf);*/
}
/*!
\fn pdf2jpg::TopupJpegThreads()
*/
bool pdf2jpg::TopupJpegThreads(int pg, const string &sFileName) {
const int &nInputWidth = m_nInputWidth, &nInputHeight = m_nInputHeight, &nOutputWidth = m_nOutputWidth, &nOutputHeight = m_nOutputHeight;
if (m_nRunningProcessCount == m_nMaxThreadCount) {
verify(wait(NULL) > 0);
--m_nRunningProcessCount;
}
else assert(m_nRunningProcessCount < m_nMaxThreadCount);
pid_t pid = fork();
bool bWorkerProcess = pid == 0;
if (!bWorkerProcess) {
m_nRunningProcessCount++;
return false;
}
fprintf(stderr, "Forked: %d\n", (int)getpid());
boost::scoped_ptr<SplashOutputDev> splashOut(GetSplash(pg));
JpegThread jpeg(nInputWidth, nInputHeight, nOutputWidth, nOutputHeight);
jpeg.GiveJob(splashOut.get(), sFileName);
fprintf(stderr, "Quitting: %d\n", (int)getpid());
return true;
}
/*!
\fn pdf2jpg::GetSplash(int pg) const
*/
SplashOutputDev *pdf2jpg::GetSplash(int pg) {
SplashOutputDev *splashOut = new SplashOutputDev(splashModeRGB8, 4, gFalse, m_paperColor);
PDFDoc &doc = *m_pdoc;
splashOut->startDoc(doc.getXRef());
//struct timeb t0, t1;ftime(&t0);
doc.displayPageSlice(splashOut,
pg, m_resolution, m_resolution,
0,
gTrue, gFalse, gFalse,
0, 0, m_w, m_h
);
return splashOut;
}
#include <splash/SplashBitmap.h>
#include <poppler/SplashOutputDev.h>
#include <stdint.h>
JpegThread::JpegThread(int nInputWidth, int nInputHeight, int nOutputWidth, int nOutputHeight) : m_nInputWidth(nInputWidth), m_nInputHeight(nInputHeight), m_nOutputWidth(nOutputWidth), m_nOutputHeight(nOutputHeight) {
}
/*!
\fn JpegThread::GiveJob(const char *pImage)
*/
void JpegThread::GiveJob(SplashOutputDev *pSplash, const string &sFileName) {
m_pSplash = pSplash;
m_sFileName = sFileName;
Thread();
}
/*!
\fn JpegThread::Thread()
*/
void JpegThread::Thread() {
m_nOutputWidth = m_nInputWidth; m_nOutputHeight = m_nInputHeight;
char *pImage = (char *)malloc(3*m_nOutputWidth*m_nOutputHeight);
bool bScaling = m_nInputWidth > m_nOutputWidth || m_nInputHeight > m_nOutputHeight;//indicates we are downsampling
double dScale;
if (bScaling) {
if (m_nInputWidth > m_nInputHeight)
dScale = m_nOutputWidth/double(m_nInputWidth);
else
dScale = m_nOutputHeight/double(m_nInputHeight);
}
#ifndef NDEBUG
else dScale = -1;
#endif
{
assert(m_pSplash != NULL);
SplashBitmap &bmp = *m_pSplash->getBitmap();
Guchar *pImageOriginal = bmp.getDataPtr();
int nRowLength = bmp.getRowSize();
assert(nRowLength < 100000);
assert(m_nOutputHeight < 100000 && m_nOutputWidth < 100000);
if (bScaling) {
assert(dScale > 0);
for(int nOutputY = 0; nOutputY < m_nOutputHeight; nOutputY++) {
double dInputY = nOutputY/dScale;
// for each pixel in a destination row
for(int nOutputX = 0; nOutputX < m_nOutputWidth; nOutputX++) {
double dInputX = nOutputX/dScale;
//Interpolate(pImage, pImageOriginal, dInputX, dInputY, nOutputX, nOutputY, nRowLength, m_nInputHeight, m_nInputWidth, m_nOutputWidth);
}
}
}
else {
assert(pImage != NULL);
char *pOut = pImage;
Guchar *pIn = pImageOriginal;
for(int y = 0; y < m_nOutputHeight; y++) {
memcpy(pOut, pIn, 3*m_nOutputWidth);
pOut += 3*m_nOutputWidth;
assert(nRowLength == bmp.getRowSize());
pIn += nRowLength;
}
}
//verify(write_jpeg(pImage, m_nOutputWidth, m_nOutputHeight, m_sFileName));
}
free(pImage);
}
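The unscaled branch of JpegThread::Thread() copies the bitmap row by row because each source row occupies getRowSize() bytes, which may include padding beyond 3 * width. A minimal standalone sketch of that packing step follows. The name pack_rgb_rows is hypothetical, plain buffers stand in for SplashBitmap, a positive row stride is assumed, and the clamp to the source width is an added precaution that the sample above does not perform.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copy `rows` rows of 8-bit RGB pixels from a padded
// source buffer (src_stride bytes per row, assumed positive) into a tightly
// packed vector. copy_width is clamped to src_width so that no source row
// is read past its last pixel.
std::vector<unsigned char> pack_rgb_rows(const unsigned char *src,
                                         int src_width, int src_stride,
                                         int rows, int copy_width) {
    const int width = std::min(copy_width, src_width);
    if (src == nullptr || width <= 0 || rows <= 0)
        return std::vector<unsigned char>();

    const std::size_t packed_row = static_cast<std::size_t>(3) * width;
    std::vector<unsigned char> out(packed_row * rows);
    for (int y = 0; y < rows; ++y) {
        std::memcpy(out.data() + packed_row * y,
                    src + static_cast<std::size_t>(src_stride) * y,
                    packed_row);
    }
    return out;
}
```

In the sample, the copy width comes from the first page's bitmap, so a later page with a narrower bitmap could be over-read by the memcpy; clamping as above, or re-reading getWidth() for each page, would avoid that. Whether it has any bearing on the valgrind report is not established.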
_______________________________________________
poppler mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/poppler