Marc/Eryk,

I have no experience with it, but I believe the hexbin package in BioC was written for exactly this purpose: avoiding heavy over-plotting when there are lots of points. You might want to look into it, if you have not done so yet.
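
For what it is worth, the general binning idea can be sketched in base R. This is not the hexbin API (which I have not used): it swaps in rectangular cells for hexagons, and the 200 x 200 grid, the simulated data and all variable names are arbitrary choices for illustration only.

```r
# Sketch only: rectangular bins in base R stand in for the hexagonal
# bins that the hexbin package would compute.
set.seed(1)
n <- 1e6
x <- rnorm(n)
y <- rnorm(n)

nbins <- 200  # assumed grid resolution; tune to the plot size

# Assign each point to a cell of an nbins x nbins grid.
xbreaks <- seq(min(x), max(x), length.out = nbins + 1)
ybreaks <- seq(min(y), max(y), length.out = nbins + 1)
xi <- findInterval(x, xbreaks, rightmost.closed = TRUE)
yi <- findInterval(y, ybreaks, rightmost.closed = TRUE)
# Guard against floating-point edge cases at the grid boundary.
xi <- pmin(pmax(xi, 1), nbins)
yi <- pmin(pmax(yi, 1), nbins)

# Count points per cell (rows index x, columns index y).
counts <- matrix(tabulate((yi - 1) * nbins + xi, nbins = nbins^2),
                 nrow = nbins, ncol = nbins)
counts[counts == 0] <- NA  # leave empty cells blank in the plot

# Cell midpoints for the axes.
xmid <- (xbreaks[-1] + xbreaks[-(nbins + 1)]) / 2
ymid <- (ybreaks[-1] + ybreaks[-(nbins + 1)]) / 2

# The pdf now holds at most nbins^2 rectangles instead of n points.
outfile <- tempfile(fileext = ".pdf")
pdf(outfile)
image(xmid, ymid, log10(counts), col = rev(heat.colors(32)),
      xlab = "x", ylab = "y")
dev.off()

file.info(outfile)$size  # file size in bytes
```

The file size is then governed by the number of cells rather than the number of points, at the cost of showing counts per cell instead of individual points.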
Best,
Andy

> From: Marc Schwartz
>
> On Wed, 2004-11-24 at 16:34 +0100, Witold Eryk Wolski wrote:
> > Hi,
> >
> > I want to draw a scatter plot with 1M and more points and save it as pdf.
> > This makes the pdf file large.
> > So I tried to save the file first as png and then convert it to pdf.
> > This looks OK if printed, but if viewed e.g. with acrobat as a document
> > figure the quality is bad.
> >
> > Does anyone know a way to reduce the size but keep the quality?
>
> Hi Eryk!
>
> Part of the problem is that in a pdf file, the vector based instructions
> will need to be defined for each of your 10 ^ 6 points in order to draw
> them.
>
> When trying to create a simple example:
>
> pdf()
> plot(rnorm(1000000), rnorm(1000000))
> dev.off()
>
> The pdf file is 55 MB in size.
>
> One immediate thought was to try a ps file and using the above plot, the
> ps file was "only" 23 MB in size. So note that ps can be more efficient.
>
> Going to a bitmap might result in a much smaller file, but as you note,
> the quality does degrade as compared to a vector based image.
>
> I tried the above to a png, then converted to a pdf (using 'convert')
> and as expected, the image both viewed and printed was "pixelated",
> since the pdf instructions are presumably drawing pixels and not vector
> based objects.
>
> Depending upon what you plan to do with the image, you may have to
> choose among several options, resulting in tradeoffs between image
> quality and file size.
>
> If you can create the bitmap file explicitly in the size that you
> require for printing or incorporating in a document, that is one way to
> go and will preserve, to an extent, the overall fixed size image
> quality, while keeping file size small.
>
> Another option to consider for the pdf approach, if it does not
> compromise the integrity of your plot, is to remove any duplicate data
> points if any exist. Thus, you will not need what are in effect
> redundant instructions in the pdf file. This may not be possible,
> depending upon the nature of your data (i.e. doubles), without considering
> some tolerance level for "equivalence".
>
> Perhaps others will have additional ideas.
>
> HTH,
>
> Marc Schwartz
>
> ______________________________________________
> [EMAIL PROTECTED] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
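
A rough sketch of the duplicate-removal idea with a tolerance, in base R. The tolerance value (0.01), the simulated data and the variable names are assumptions for illustration; "equivalence" here simply means that two points round to the same node of a grid with spacing `tol`, which is only one of several possible definitions.

```r
# Sketch only: thin the data by keeping one point per grid node.
set.seed(1)
n <- 1e6
x <- rnorm(n)
y <- rnorm(n)

tol <- 0.01  # assumed tolerance; choose relative to the plotted resolution

# Integer grid coordinates: points that round to the same node
# are treated as equivalent.
ix <- round(x / tol)
iy <- round(y / tol)

# Keep the first point seen at each grid node.
keep <- !duplicated(paste(ix, iy))
xy_thin <- cbind(x = ix[keep] * tol, y = iy[keep] * tol)

nrow(xy_thin)  # number of points that remain to be drawn

# Only the retained points generate drawing instructions in the pdf.
outfile <- tempfile(fileext = ".pdf")
pdf(outfile)
plot(xy_thin, pch = ".")
dev.off()

file.info(outfile)$size  # file size in bytes
```

Note that thinning like this flattens the apparent density in crowded regions, so it is only appropriate when, as Marc says, it does not compromise the integrity of the plot.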
