The problem with the pdf files is that they store the information for 
every one of your points, even the ones that are overplotted by other points.  
The png file is smaller because it only stores which color each pixel should 
be, not how many points contributed to a particular pixel being a given 
color.  But png files convert the text to pixel information as well, which 
doesn't look good if the image is scaled afterwards.

If you want to go the pdf route, then you need to find some way to reduce 
redundant information while still getting the main points of the plot across.  
With so many points, I would suggest looking at the hexbin package (on CRAN; 
it originated in Bioconductor, I think) as one approach.  It will not be an 
identical scatterplot, but it will convey the information (possibly better) 
with much smaller graphics file sizes.  There are other tools, such as 
sunflower plots, but hexbin has worked well for me.
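
A rough sketch of the hexbin approach, in case it helps.  The data here are 
simulated stand-ins for your 2 million points, and the file name, the number 
of bins and the page size are only assumptions to adjust:

```r
library(hexbin)

# Simulated stand-in for the real data (assumption: about 2 million x/y pairs)
set.seed(1)
n <- 2e6
x <- rnorm(n)
y <- x + rnorm(n)

# Count the points falling into each hexagonal cell; the pdf then stores
# one hexagon per occupied cell instead of one symbol per point
bins <- hexbin(x, y, xbins = 100)

pdf("scatter_hexbin.pdf", width = 5, height = 5)
plot(bins)
dev.off()
```

The size of the pdf now depends on the number of occupied cells rather than 
on the number of points, so a larger xbins gives finer detail at the cost of 
a somewhat larger file.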

If you want to go the png route, the problem usually comes from scaling the 
plot after producing it.  So the solution is to create the plot at the exact 
size and resolution at which you will use it in your document, so that no 
scaling needs to be done.  Use the png function, but don't accept the 
defaults; choose the size and resolution yourself.  If you later decide on a 
different size of graph, recreate the file rather than letting LaTeX rescale 
the first one.
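
Something along these lines, as a sketch only.  The 5 x 4 inch size, the 
300 ppi resolution and the file name are assumptions, so substitute the size 
the figure will actually have on the page; the data are again simulated:

```r
# Simulated stand-in for the real data
set.seed(1)
n <- 2e6
x <- rnorm(n)
y <- x + rnorm(n)

# Fix the physical size and the resolution up front:
# 5 x 4 inches at 300 ppi gives a 1500 x 1200 pixel bitmap
png("scatter.png", width = 5, height = 4, units = "in", res = 300)
plot(x, y, pch = ".")
dev.off()

# In the LaTeX document, include the file without a width=, height= or
# scale= option so that it is not resized:
#   \includegraphics{scatter.png}
```

The res value is recorded in the png file, so pdflatex should be able to 
place the image at its intended 5 x 4 inch size without any rescaling.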

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of Lasse Kliemann
> Sent: Thursday, October 22, 2009 1:07 PM
> To: r-help@r-project.org
> Subject: [R] PDF too large, PNG bad quality
> 
> I wish to save a scatter plot comprising approx. 2 million points
> in order to include it in a LaTeX document.
> 
> Using 'pdf(...)' produces a file of size about 20 MB, which is
> useless.
> 
> Using 'cairo_pdf(...)' produces a smaller file, around 3 MB. This
> is still too large. Not only that the document will be too large,
> but also PDF viewers choke on this. Moreover, Cairo has problems
> with text: by default text looks ugly, like scaled bitmaps. After
> hours of trying different settings, I discovered that choosing a
> different font family can help, e.g.: 'par(family="Mono")'. This
> gives good-looking text. Yet, the problem with the file size
> remains.
> 
> There exists the hint to produce EPS instead and then convert to
> PDF using 'epstopdf'. The resulting PDF files are slightly
> smaller, but still too large, and PDF viewers still don't like
> it.
> 
> So I gave PNG a try. PNG files are much smaller and PDF viewers
> have no trouble with them. However, fonts look ugly. The same
> trick that worked for Cairo PDF has no effect for PNG. When I
> view the PNGs with a dedicated viewer like 'qiv', even the fonts
> look good. But not when included in LaTeX; I simply use
> '\includegraphics{...}' and run the document through 'pdflatex'.
> 
> I tried both, creating PNG with 'png(...)' and converting from
> PDF to PNG using 'convert' from ImageMagick.
> 
> So my questions are:
> 
> - Is there a way to produce sufficiently lean PDFs directly in R,
>   even when the plot comprises several million points?
> 
> - How to produce a PNG that still looks nice when included in a
>   LaTeX PDF document?
> 
> Any hints will be greatly appreciated.
> 
> Thank you
> Lasse

