Albert Astals Cid wrote:
> On Thursday, 21 May 2009, Angus March wrote:
>   
>> Albert Astals Cid wrote:
>>     
>>> On Wednesday, 20 May 2009, Angus March wrote:
>>>       
>>>> Albert Astals Cid wrote:
>>>>         
>>>>> On Tuesday, 19 May 2009, Angus March wrote:
>>>>>           
>>>>>> Adrian Johnson wrote:
>>>>>>             
>>>>>>> Angus March wrote:
>>>>>>>               
>>>>>>>> I tried using Poppler to get a Cairo surface and then saving the
>>>>>>>> surface to a PNG. Unfortunately, the resulting image was of
>>>>>>>> disastrously low quality.
>>>>>>>>                 
>>>>>>> Without seeing your code or the output you are getting I can only
>>>>>>> guess at what the problem might be. Did you alter the cairo scale to
>>>>>>> get the desired image dpi?
>>>>>>>               
>>>>>>     It was definitely an improvement, but I think the only thing that
>>>>>> did improve was the resolution. The old problems that caused me to
>>>>>> abandon Cairo persisted, which are: gradients have ugly stripes on
>>>>>> them, a background that should be white and opaque is black and
>>>>>> transparent, and some text that has a shadow in the PDF doesn't in the
>>>>>> image. I don't suppose you know of a way to deal with those problems?
>>>>>>             
>>>>> ?
>>>>>
>>>>> I don't see anything obviously wrong.
>>>>>
>>>>> Basically it is:
>>>>>  * Create PDFDoc
>>>>>  * Create SplashOutputDev
>>>>>  * Call SplashOutputDev::startDoc
>>>>>  * Call PDFDoc::displayPageSlice
>>>>>           
>>>>     Well there definitely is something wrong, because it works with
>>>> pdftoppm. I thought of things like the __attribute__((constructor))
>>>> attribute, or static objects, but I don't see any evidence of the
>>>> attribute and I wouldn't know how to find a static object in all that
>>>> code. Maybe multiple processes cause problems for Splash.
>>>>
>>>>
>>>>
>>>> It's hard to know where to go.
>>>>         
>>> Are the crashes you pasted from poppler compiled with -O2? If so, remove
>>> the -O2 and replace -g with -g3. Backtraces from an optimized poppler are
>>> really misleading.
>>>       
>>     I figured out a way to get my app to build from the poppler lib I
>> rolled myself (although I'd still like to know what the proper procedure
>> is to get it to build in debug, and install the Splash stuff) and I got
>> some valgrind reports that might be more helpful, but are fewer than
>> those I got when I was using the SUSE distro's lib:
>>
>> ==8577== Conditional jump or move depends on uninitialised value(s)
>> ==8577==    at 0x53DACE4: FoFiType1C::parse() (FoFiType1C.cc:1848)
>> ==8577==    by 0x53E10AB: FoFiType1C::make(char*, int) (FoFiType1C.cc:35)
>> ==8577==    by 0x5369A58: Gfx8BitFont::Gfx8BitFont(XRef*, char*, Ref,
>> GooString*, GfxFontType, Dict*) (GfxFont.cc:699)
>> ==8577==    by 0x536D72C: GfxFont::makeFont(XRef*, char*, Ref, Dict*)
>> (GfxFont.cc:143)
>> ==8577==    by 0x536D933: GfxFontDict::GfxFontDict(XRef*, Ref*, Dict*)
>> (GfxFont.cc:2051)
>> ==8577==    by 0x535AD21: GfxResources::GfxResources(XRef*, Dict*,
>> GfxResources*) (Gfx.cc:313)
>> ==8577==    by 0x535DD6B: Gfx::Gfx(XRef*, OutputDev*, int, Dict*,
>> Catalog*, double, double, PDFRectangle*, PDFRectangle*, int, int
>> (*)(void*), void*) (Gfx.cc:502)
>> ==8577==    by 0x539AF12: Page::createGfx(OutputDev*, double, double,
>> int, int, int, int, int, int, int, int, Catalog*, int (*)(void*), void*,
>> int (*)(Annot*, void*), void*) (Page.cc:404)
>> ==8577==    by 0x539B173: Page::displaySlice(OutputDev*, double, double,
>> int, int, int, int, int, int, int, int, Catalog*, int (*)(void*), void*,
>> int (*)(Annot*, void*), void*) (Page.cc:433)
>> ==8577==    by 0x40A756: pdf2jpg::GetSplash(int) (pdf2jpg.cpp:176)
>> ==8577==    by 0x40A9B5: pdf2jpg::TopupJpegThreads(int, astring const&)
>> (pdf2jpg.cpp:156)
>> ==8577==    by 0x40B3B1: pdf2jpg::Execute(int, char const*, char const*,
>> int) (pdf2jpg.cpp:99)
>> ==8577==
>>     
>
> Are you positively sure this doesn't happen with pdftoppm? It doesn't make any
> sense.
>   

    It doesn't seem to. I'll try running valgrind on the debug
version of pdftoppm that I have here, and see what that does...
    Well, it hasn't reported any problems so far. I'll check again
tomorrow morning, and then I guess I'll know for sure.
    Also, I keep forgetting to point out that another problem my app has
is with Splash getting stuck in an infinite loop every so often,
requiring a kill -9.
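    One way to avoid the manual kill -9 would be for the parent to put a
time limit on each worker. What follows is only a minimal sketch, assuming
one forked worker per page as in the sample in this mail. WaitOrKill and its
timeout are hypothetical names of my own, not part of poppler or of the
sample.

```cpp
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

// Hypothetical helper, not part of poppler or of the sample in this mail.
// Polls child `pid` for up to timeoutSeconds. Returns true only if the child
// terminated on its own (normally or by a crash) within that time; otherwise
// it sends SIGKILL, reaps the child, and returns false.
bool WaitOrKill(pid_t pid, int timeoutSeconds) {
	assert(pid > 0);
	const struct timespec pollInterval = {0, 50 * 1000 * 1000};	// 50 ms
	const int nMaxPolls = timeoutSeconds * 20;	// 20 polls per second
	for (int i = 0; i <= nMaxPolls; ++i) {
		pid_t r = waitpid(pid, NULL, WNOHANG);
		if (r == pid) return true;	// child finished and was reaped
		if (r < 0) return false;	// not our child, or already reaped
		nanosleep(&pollInterval, NULL);	// r == 0: still running, poll again
	}
	kill(pid, SIGKILL);	// same effect as a manual kill -9
	waitpid(pid, NULL, 0);	// reap it so no zombie is left behind
	return false;
}
```

    To use something like this, the parent would have to remember each
worker's pid instead of calling wait(NULL), and the timeout is a guess that
depends on the documents being converted. It only contains the hang; it does
not explain why Splash loops.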
    How about this: I'll send you a sample that reproduces the
problems. Compile it and run it through valgrind. Valgrind came across a
few problems in a short time. BTW, for the sake of simplicity, it doesn't
actually output any files. It just gets the raw image data from Splash.
/***************************************************************************
 *   Copyright (C) 2008 by Angus   *
 *   [email protected]   *
 *                                                                         *
 *   This program is free software; you can redistribute it and/or modify  *
 *   it under the terms of the GNU General Public License as published by  *
 *   the Free Software Foundation; either version 2 of the License, or     *
 *   (at your option) any later version.                                   *
 *                                                                         *
 *   This program is distributed in the hope that it will be useful,       *
 *   but WITHOUT ANY WARRANTY; without even the implied warranty of        *
 *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the         *
 *   GNU General Public License for more details.                          *
 *                                                                         *
 *   You should have received a copy of the GNU General Public License     *
 *   along with this program; if not, write to the                         *
 *   Free Software Foundation, Inc.,                                       *
 *   59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.             *
 ***************************************************************************/


#ifdef HAVE_CONFIG_H
#include <config.h>
#endif

#ifdef NDEBUG
#define verify(x) ((void)(x))
#else
#define verify(x) assert(x)
#endif

#include <string.h>
#include <assert.h>
#include <string>
using namespace std;

class SplashOutputDev;
class PDFDoc;
#include <splash/SplashBitmap.h>

/**
\brief A class that converts a pdf to a series of jpegs

	@author Angus March <an...@linux-cgg2>
*/
class pdf2jpg{
public:
	pdf2jpg();
	~pdf2jpg();

	void Execute(const char *pPDFName, const char *pJPGPrefix, int nMaxDimension);

private:
	int m_nMaxThreadCount, m_nRunningProcessCount;
	double m_pg_w, m_pg_h;
	int m_w, m_h;
	int m_nOutputWidth, m_nOutputHeight;
	int m_nInputWidth, m_nInputHeight;
	PDFDoc *m_pdoc;
	SplashColor m_paperColor;
	int m_resolution;

private:
	bool TopupJpegThreads(int pg, const string &sFileName);
	SplashOutputDev *GetSplash(int pg);
public:
	static string m_sCurrentPDF;
};

#include <stdlib.h>


int main(int argc, char *argv[]) {
	assert(argc == 2);
	pdf2jpg thing;
	thing.Execute(argv[1], "page", 1324);

  return EXIT_SUCCESS;
}

class SplashOutputDev;

#include <poppler/Object.h>
#include <splash/SplashTypes.h>

/**
\brief Represents a thread in which images are resized and saved as jpegs

	@author Angus March <an...@linux-cgg2>
*/
class JpegThread {
public:
	JpegThread(int nInputWidth, int nInputHeight, int nOutputWidth, int nOutputHeight);

	void GiveJob(SplashOutputDev *pSplash, const string &sFileName);

protected:
	int m_nInputWidth, m_nInputHeight, m_nOutputWidth, m_nOutputHeight;
	SplashOutputDev *m_pSplash;
	string m_sFileName;
private:
	JpegThread(const JpegThread &);
	inline void Thread();

	typedef float fpinterpolate;

	inline static fpinterpolate R(fpinterpolate x);
	static void Interpolate(char *outImage, const Guchar *pOriginalImage,
										const fpinterpolate dInputX, const fpinterpolate dInputY,
										const int nOutputX, const int nOutputY,
										int nInputStride,
										int nInputRowSize, int nInputColumnSize,
										int nOutputColumnSize);
    void Log(const char *pMessage) const;
};


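#include <boost/scoped_ptr.hpp>
#include <poppler/GlobalParams.h>
#include <poppler/PDFDoc.h>
#include <poppler/SplashOutputDev.h>
#include <math.h>
#include <signal.h>	// signal(), SIGSEGV
#include <stdio.h>	// fprintf()
#include <sys/wait.h>
#include <unistd.h>	// fork(), getpid()

#include <sys/timeb.h>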

string pdf2jpg::m_sCurrentPDF;

static void sig_handler(int) {
	fprintf(stderr, "(%d) seg fault hit while trying to extract an image from %s. This is probably a corrupt pdf\n", (int)getpid(), pdf2jpg::m_sCurrentPDF.c_str());
	exit(EXIT_FAILURE);
}


pdf2jpg::pdf2jpg() : m_nRunningProcessCount(0), m_pdoc(NULL) {
	m_nMaxThreadCount = 4;
}

pdf2jpg::~pdf2jpg(){
	delete m_pdoc;
}

void pdf2jpg::Execute(const char *pPDFName, const char *pJPGPrefix, int nMaxDimension) {
	int &resolution = m_resolution;

	m_sCurrentPDF = pPDFName;
	signal(SIGSEGV, sig_handler);

	boost::scoped_ptr<GlobalParams> bs(new GlobalParams());
	globalParams = bs.get();
	resolution = 150;
	GooString *filename = new GooString(pPDFName);
	m_pdoc = new PDFDoc(filename);
	PDFDoc &doc = *m_pdoc;
	if (!doc.isOk()) {
		fprintf(stderr, "Error opening %s by PDFDoc (Poppler)\n", pPDFName);
		return;
	}
	SplashColor &paperColor = m_paperColor;
	paperColor[0] =255;paperColor[1] = 255;paperColor[2] = 255;

	bool bWorkerProcess = false;

	{
		double &pg_w = m_pg_w, &pg_h = m_pg_h;
		int &w = m_w, &h = m_h;
		pg_w = doc.getPageMediaWidth(1);
		pg_h = doc.getPageMediaHeight(1);
		pg_w = pg_w * (resolution / 72.0);
		pg_h = pg_h * (resolution / 72.0);
		// Only a quarter-turn swaps width and height; a 180 degree rotation does not.
		const int rotate = doc.getPageRotate(1);
		if (rotate == 90 || rotate == 270) {
			double tmp = pg_w;
			pg_w = pg_h;
			pg_h = tmp;
		}
		w = (int)ceil(pg_w);
		h = (int)ceil(pg_h);
		w = (0+w > pg_w ? (int)ceil(pg_w-0) : w);
		h = (0+h > pg_h ? (int)ceil(pg_h-0) : h);
	}
	{
		SplashOutputDev *splashOut = GetSplash(1);
		SplashBitmap &bmp = *splashOut->getBitmap();
		m_nInputWidth = bmp.getWidth();m_nInputHeight = bmp.getHeight();
		if (m_nInputWidth > nMaxDimension || m_nInputHeight > nMaxDimension) {
			double dScale;
			if (m_nInputWidth > m_nInputHeight) {
				dScale = nMaxDimension/double(m_nInputWidth);
				m_nOutputWidth = nMaxDimension;m_nOutputHeight = m_nInputHeight*dScale;
			}
			else {
				dScale = nMaxDimension/double(m_nInputHeight);
				m_nOutputWidth = m_nInputWidth*dScale;m_nOutputHeight = nMaxDimension;
			}
		}
		else {
			m_nOutputWidth = m_nInputWidth, m_nOutputHeight = m_nInputHeight;
		}
		delete splashOut;
	}
	//time_t t0 = time(NULL);
	//struct timeb tStartDoc;tStartDoc.time=0;tStartDoc.millitm = 0;
	for (int pg = 1; pg <= doc.getNumPages() && !bWorkerProcess; ++pg) {
		fprintf(stderr, "Working page: %d of %s\n", pg, pPDFName);
		/*ftime(&t1);
		tStartDoc.time += t1.time - t0.time;
		if (t1.millitm < t0.millitm) {tStartDoc.time--; tStartDoc.millitm += t1.millitm - t0.millitm + 1000;}
		else tStartDoc.millitm += t1.millitm - t0.millitm;*/

		// Large enough for any int, so a document with 10000 or more pages cannot overflow it.
		char pageno[16];
		snprintf(pageno, sizeof pageno, "%04d", pg - 1);
		bWorkerProcess = TopupJpegThreads(pg, (string)pJPGPrefix + pageno + ".jpg");
	}
	if (!bWorkerProcess)
		for (; m_nRunningProcessCount > 0; --m_nRunningProcessCount)
			verify(wait(NULL) > 0);
	/*time_t secs = time(NULL) - t0;
	time_t min = secs/60;
	int othersecs = secs%60;
	int nStartDocSecs = tStartDoc.time + tStartDoc.millitm/1000;
	int nStartDocMillis = tStartDoc.millitm%1000;
	if (nStartDocMillis < 0) nStartDocMillis = -nStartDocMillis;
	int nStartDocMins = nStartDocSecs/60;
	int nStartDocOtherSecs = nStartDocSecs%60;
	char buf[1900];
	assert(nStartDocMillis >= 0 && nStartDocMillis < 1000);
	sprintf(buf, "Start Doc took: %dm%02d.%03ds and if you're interested nStartDocSecs: %d and tStartDoc.time was: %d and tStartDoc.millitm was: %d", nStartDocMins, nStartDocOtherSecs, nStartDocMillis, nStartDocSecs, int(tStartDoc.time), int(tStartDoc.millitm));
g_pLog->LogLine("quitting after " + LtoA(min) + "m" + LtoA(othersecs) + "s");
g_pLog->LogLine(buf);*/
}

/*!
    \fn pdf2jpg::TopupJpegThreads()
 */
bool pdf2jpg::TopupJpegThreads(int pg, const string &sFileName) {
	const int &nInputWidth = m_nInputWidth, &nInputHeight = m_nInputHeight, &nOutputWidth = m_nOutputWidth, &nOutputHeight = m_nOutputHeight;
	if (m_nRunningProcessCount == m_nMaxThreadCount) {
		verify(wait(NULL) > 0);
		--m_nRunningProcessCount;
	}
	else assert(m_nRunningProcessCount < m_nMaxThreadCount);
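	pid_t pid = fork();
	if (pid < 0) {
		// fork() failed: no child exists, so it must not be counted or waited for.
		fprintf(stderr, "fork failed for page %d\n", pg);
		return false;
	}
	bool bWorkerProcess = pid == 0;
	if (!bWorkerProcess) {
		m_nRunningProcessCount++;
		return false;
	}
	fprintf(stderr, "Forked: %d\n", (int)getpid());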
	boost::scoped_ptr<SplashOutputDev> splashOut(GetSplash(pg));
	JpegThread jpeg(nInputWidth, nInputHeight, nOutputWidth, nOutputHeight);
	jpeg.GiveJob(splashOut.get(), sFileName);
fprintf(stderr, "Quitting: %d\n", (int)getpid());
	return true;
}

/*!
    \fn pdf2jpg::GetSplash(int pg) const
 */
SplashOutputDev *pdf2jpg::GetSplash(int pg) {
	SplashOutputDev *splashOut = new SplashOutputDev(splashModeRGB8, 4, gFalse, m_paperColor);
	PDFDoc &doc = *m_pdoc;
	splashOut->startDoc(doc.getXRef());
	//struct timeb t0, t1;ftime(&t0);
	doc.displayPageSlice(splashOut,
		pg, m_resolution, m_resolution,
		0,
		gTrue, gFalse, gFalse,
		0, 0, m_w, m_h
	);
	return splashOut;
}

#include <splash/SplashBitmap.h>
#include <poppler/SplashOutputDev.h>
#include <stdint.h>

JpegThread::JpegThread(int nInputWidth, int nInputHeight, int nOutputWidth, int nOutputHeight) : m_nInputWidth(nInputWidth), m_nInputHeight(nInputHeight), m_nOutputWidth(nOutputWidth), m_nOutputHeight(nOutputHeight) {
}

/*!
    \fn JpegThread::GiveJob(const char *pImage)
 */
void JpegThread::GiveJob(SplashOutputDev *pSplash, const string &sFileName) {
	m_pSplash = pSplash;
	m_sFileName = sFileName;
	Thread();
}

/*!
    \fn JpegThread::Thread()
 */
void JpegThread::Thread() {
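	// Simplification for this sample: force the output size to match the input,
	// so bScaling below is always false and the rows are copied unscaled.
	m_nOutputWidth = m_nInputWidth; m_nOutputHeight = m_nInputHeight;
	char *pImage = (char *)malloc(3*m_nOutputWidth*m_nOutputHeight);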
	bool bScaling = m_nInputWidth > m_nOutputWidth || m_nInputHeight > m_nOutputHeight;//indicates we are downsampling
	double dScale;
	if (bScaling) {
		if (m_nInputWidth > m_nInputHeight)
			dScale = m_nOutputWidth/double(m_nInputWidth);
		else
			dScale = m_nOutputHeight/double(m_nInputHeight);
	}
#ifndef NDEBUG
	else dScale = -1;
#endif
	{
		assert(m_pSplash != NULL);
		SplashBitmap &bmp = *m_pSplash->getBitmap();
		Guchar *pImageOriginal = bmp.getDataPtr();
		int nRowLength = bmp.getRowSize();
		assert(nRowLength < 100000);
		assert(m_nOutputHeight < 100000 && m_nOutputWidth < 100000);
		if (bScaling) {
			assert(dScale > 0);
			for(int nOutputY = 0; nOutputY < m_nOutputHeight; nOutputY++) {
				double dInputY = nOutputY/dScale;
				// for each pixel in a destination row
				for(int nOutputX = 0; nOutputX < m_nOutputWidth; nOutputX++) {
					double dInputX = nOutputX/dScale;

					//Interpolate(pImage, pImageOriginal, dInputX, dInputY, nOutputX, nOutputY, nRowLength, m_nInputHeight, m_nInputWidth, m_nOutputWidth);
				}
			}
		}
		else {
			assert(pImage != NULL);
			char *pOut = pImage;
			Guchar *pIn = pImageOriginal;
			for(int y = 0; y < m_nOutputHeight; y++) {
				memcpy(pOut, pIn, 3*m_nOutputWidth);
				pOut += 3*m_nOutputWidth;
				assert(nRowLength == bmp.getRowSize());
				pIn += nRowLength;
			}
		}
    //verify(write_jpeg(pImage, m_nOutputWidth, m_nOutputHeight, m_sFileName));
	}
	free(pImage);
}
_______________________________________________
poppler mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/poppler
