> On Jul 6, 2017, at 4:12 AM, Ashley Betts <ashley.be...@saltbushsoftware.com> 
> wrote:
> 
> Hi All,
>    I'm quite new to R and recently started investigating the geospatial 
> plotting capabilities of R via ggplot2. I started by using some of the 
> publicly available datasets from the Australian Bureau of Statistics. 
> Plotting the Level 3 Statistical Area boundaries took over 2 hours on my 2012 
> Mac Book Pro. As there were over 3M rows in the fortify’ed data frame I 
> initially thought this was just how long it must take. I then ran the exact 
> same script on my work laptop which is similarly spec’ed and it ran in 
> approximately 30 seconds. This now has me extremely disappointed in the 
> performance on the Mac which is where I use R the most. I changed my BLAS 
> library to the Accelerate library on a whim that this might make a 
> difference. It did not. Whilst I primarily use RStudio I also ran the same 
> script in R.app and if there was any improvement it was not noticeable. I did 
> notice in the Windows run that it seemed to use multiple cores (which is what 
> made me investigate the BLAS change) whilst the Mac seems to stay bound to a 
> single core. My initial thoughts were that it must be something to do with 
> ggplot but after sampling the rsession process a number of times (see 
> attached Sample of rsession.txt) it appears to be spending most of its time 
> in CGContextDrawPath in Apple's CoreGraphics so I assume it is a Graphics 
> related issue. I’m running R 3.4 on my Mac and 3.3.2 on the Windows machine. 
> I’ve attached the script, process sample text and a number of screen shots 
> that I hope will be helpful in analysing the issue. Could someone possibly 
> let me know if this is a PEBKAC issue or an actual problem with R. If the latter, 
> how do I go about getting the issue resolved?
> 
> The SA3 boundary data is available here:
> 
> http://www.abs.gov.au/AUSSTATS/abs@.nsf/DetailsPage/1270.0.55.001July%202016?OpenDocument
> 
> as 'Statistical Area Level 3 (SA3) ASGS Ed 2016 Digital Boundaries in ESRI 
> Shapefile Format'

I tried opening that .R file (which I was surprised made it through the usual 
scrubbing process) and I see this at the top.
=====copied=====
library(rio)
library(ggplot2)
library(rgdal)
library(rgeos)
library(dplyr)

convert("../Data/ABS/14100DS0001_2017-03.xlsx", "absregdata.csv")
======end-copy===

After seeing that I went looking in the linked document which was really not a 
link to a document. I did find the referenced document on that page and 
downloaded the file:

http://www.abs.gov.au/AUSSTATS/subscriber.nsf/log?openagent&1270055001_sa3_2016_aust_shape.zip&1270.0.55.001&Data%20Cubes&43942523105745CBCA257FED0013DB07&0&July%202016&12.07.2016&Latest

That's the shapefile that is referenced later in the code, but I see no way to 
find the CSV file that you are loading. So I see no method of reproducing your 
observations.

You are also several versions behind the current dplyr release. I happen to 
have the same versions of the rgdal, rgeos, and sp packages that you do, but 
they, too, are slightly out of date.
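
For comparing the two machines, a minimal sketch along these lines would report 
the installed version of each package the script loads. The package list is 
taken from the library() calls copied above, plus sp; the variable names are 
only illustrative.

```r
# Packages loaded at the top of the posted script, plus sp
pkgs <- c("rio", "ggplot2", "rgdal", "rgeos", "dplyr", "sp")

# Installed version of each package, or NA if it is not installed.
# system.file() is used so that no package has to be loaded just to check.
installed <- vapply(pkgs, function(p) {
  if (nzchar(system.file(package = p))) {
    as.character(packageVersion(p))
  } else {
    NA_character_
  }
}, character(1))
print(installed)

# R version, platform and attached packages; recent versions of R also
# report which BLAS/LAPACK libraries are in use.
print(sessionInfo())
```

Running this on both the Mac and the Windows laptop and posting the output 
would show whether the two installations differ in more than the R version.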

So I am unable to attempt to reproduce your difficulties. At a minimum, you 
should supply data that will allow this. You should also try starting your Mac 
with a minimum of other loaded applications and running in a clean session. 
Memory fragmentation often prevents execution of large jobs in memory, and long 
run times are possible if you need to page out to disk and do not have an SSD 
as your system disk.

(I'm able to read but not to understand the results of your sampling. It's 
possible that more savvy users of Macs will be able to tell whether my 
hypothesis, that this is caused by paging out to disk, is correct.)
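
One possible way to narrow this down without the ABS files is a small 
self-contained timing test. The sketch below is only an assumption about what 
such a test could look like: it draws synthetic polygons (not the SA3 
boundaries) with base graphics, and the names polys and time_device are made 
up for illustration. It times R's own pdf() device and, where the Quartz 
device is available, a Quartz-backed PDF as well, so the two can be compared.

```r
# Synthetic stand-in for the fortified boundary data (NOT the ABS data):
# n_poly closed polygons with n_vert vertices each, laid out on a grid.
n_poly <- 200
n_vert <- 500
theta <- seq(0, 2 * pi, length.out = n_vert)
polys <- data.frame(
  x = rep(cos(theta), n_poly) + rep(seq_len(n_poly) %% 20, each = n_vert) * 3,
  y = rep(sin(theta), n_poly) + rep(seq_len(n_poly) %/% 20, each = n_vert) * 3,
  group = rep(seq_len(n_poly), each = n_vert)
)

# Open a device, draw every polygon, and return the elapsed drawing time
# in seconds. The device is closed again when the function exits.
time_device <- function(open_device) {
  open_device()
  on.exit(dev.off())
  system.time({
    plot(range(polys$x), range(polys$y), type = "n", xlab = "", ylab = "")
    for (g in split(polys, polys$group)) polygon(g$x, g$y)
  })[["elapsed"]]
}

# R's built-in PDF device, which does not go through CoreGraphics
pdf_time <- time_device(function() pdf(tempfile(fileext = ".pdf")))
cat("pdf() device:    ", pdf_time, "seconds\n")

# Quartz-backed PDF, only attempted where the Quartz device exists
if (isTRUE(capabilities("aqua"))) {
  quartz_time <- time_device(function() {
    quartz(type = "pdf", file = tempfile(fileext = ".pdf"))
  })
  cat("quartz() device: ", quartz_time, "seconds\n")
}
```

If the Quartz timing were far larger than the pdf() timing on the same 
machine, that would point at the device rather than at memory; similar timings 
would leave the paging hypothesis open.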

Hope this helps;
David.



> 
> Regards,
> 
> Ashley
> 


David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'   
-Gehm's Corollary to Clarke's Third Law

_______________________________________________
R-SIG-Mac mailing list
R-SIG-Mac@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-mac
