Re: [Rd] Colour Schemes
I've been thinking hard about generating colour schemes for data. There's quite a bit of existing code scattered in various packages for playing with colours and colour palettes, but I can't find the sort of thing I'm after for applying colours to data...

To my mind a colour scheme is a mapping from data values to colours. There's a multitude of such mappings depending on the nature of the data. For example, for a factor you might want to map levels to unique colours. For numbers that run from -4 to +2 you might want to use a diverging colour palette centred on zero. This might be continuous in some colour space or composed of a small number of discrete colours, each of which covers a range of values. Or it could be piecewise continuous as used in topographic maps - less than zero is blue, zero to 400 goes from sandy yellow to grassy green, 400 to 1000 goes from grassy green to rocky brown, then suddenly you hit the ice and 1000 and upwards is white. Or you could have a multivariate mapping where (x,y,z) -> (r,g,b,a) in complex and non-linear ways.

I see a set of factory functions that return colour scheme mapping functions that map data to colours, so you'd do:

  # unique colour for each factor level
  # data$f is a factor, someColours is a vector of colour values
  scheme1 = exactColours(data$f,someColours)
  plot(data$x,data$y,col=scheme1(data$f))

  # topographic map colouring
  scheme2 = continuousColours(list(-1000,blue,0,sandYellow, 400,grassGreen,1000,rockBrown,1000,white,1)) # or something...
  plot(data$x,data$y,col=scheme2(data$height))

Now just because I can't find existing functions like this doesn't mean they don't exist. There's stuff in plotrix, colorspace, RColorBrewer etc for creating palettes but then the user is left to their own devices to map colours to data values.

Does this kind of thing sound useful? Has it been done? Is it worth doing? Anybody got any better ideas?
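A minimal sketch of what the exactColours factory might look like, using only base R. The name comes from the message above; the body, the error message and the NA behaviour for unknown levels are assumptions made for illustration, not code from any existing package.

```r
# Hypothetical factory: returns a function that maps factor levels to colours.
exactColours <- function(f, colours) {
  lev <- levels(as.factor(f))
  if (length(colours) < length(lev)) {
    stop("need at least one colour per factor level")
  }
  # Named lookup table: level -> colour
  lookup <- setNames(colours[seq_along(lev)], lev)
  # The returned closure carries the lookup table with it
  function(x) unname(lookup[as.character(x)])
}

f <- factor(c("a", "b", "c", "a"))
scheme1 <- exactColours(f, c("red", "green", "blue"))
scheme1(f)      # "red" "green" "blue" "red"
scheme1("zzz")  # NA: level not seen when the scheme was built
```

Because the scheme is an ordinary function, the same object can be passed to several plots and will colour a given level identically in each.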
Most of the plots where colour is typically used to signify a variable already do map colours to data values. Take a look at help pages for levelplot/contourplot/wireframe from the lattice package, and image from base graphics. (The format is typically slightly different to your suggested specification, though the principle is the same. The functions take a vector of cut points, and a vector of colours.)

There may be some utility in creating functions to generate these colour maps outside of the plotting functions, if only so that the code can be recycled for new functions.

Regards,
Richie.

Mathematical Sciences Unit
HSL

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Colour Schemes
On Thu, May 21, 2009 at 2:18 PM, richard.cot...@hsl.gov.uk wrote:

> Most of the plots where colour is typically used to signify a variable
> already do map colours to data values. Take a look at help pages for
> levelplot/contourplot/wireframe from the lattice package, and image from
> base graphics. (The format is typically slightly different to your
> suggested specification, though the principle is the same. The functions
> take a vector of cut points, and a vector of colours.)

The problem here is that the user doesn't have exact control of the mapping from value to colour. For example (using a slightly more safe-for-use-after-lunch version of the levelplot example grid):

  x <- seq(pi/4, 5 * pi, length.out = 100)
  y <- seq(pi/4, 5 * pi, length.out = 100)
  r <- as.vector(sqrt(outer(x^2, y^2, "+")))
  grid <- expand.grid(x=x, y=y)
  grid$z <- r
  grid$z2 = r *0.5

Then I do:

  library(lattice)
  levelplot(z~x*y, grid, cuts = 5, col.regions=rainbow(5))

very nice, but suppose I want to show grid$z2 on the same colour scale, I can't just do:

  levelplot(z2~x*y, grid, cuts = 5, col.regions=rainbow(5))

because that looks the same as the first one since levelplot uses the whole colour range. The base graphics image function has zlim arguments which let you do:

  z = outer(1:10, 1:10, "*")
  image(z)
  image(z/2, zlim=range(z))

but again, not obvious, and complex/impossible when using more sophisticated colour mappings.

> There may be some utility in creating functions to generate these colour
> maps outside of the plotting functions, if only so that the code can be
> recycled for new functions.

Exactly, it would make a new package.

Barry
Re: [Rd] Colour Schemes
I'm going to take your second example first.

> The base graphics image function has zlim arguments which let you do:
>
>   z = outer(1:10, 1:10, "*")
>   image(z)
>   image(z/2, zlim=range(z))
>
> but again, not obvious, and complex/impossible when using more
> sophisticated colour mappings.

The way to do more complex examples, similar to the manner you suggested originally, is to use the breaks and col arguments, e.g.

  breaks <- c(0,25,75,100)
  col <- c("red", "blue", "green")
  image(z, breaks=breaks, col=col)
  image(z/2, breaks=breaks, col=col)

So it is possible here, though perhaps not obvious.

With your first example ...

> The problem here is that the user doesn't have exact control of the
> mapping from value to colour. For example (using a slightly more
> safe-for-use-after-lunch version of the levelplot example grid):
>
>   x <- seq(pi/4, 5 * pi, length.out = 100)
>   y <- seq(pi/4, 5 * pi, length.out = 100)
>   r <- as.vector(sqrt(outer(x^2, y^2, "+")))
>   grid <- expand.grid(x=x, y=y)
>   grid$z <- r
>   grid$z2 = r *0.5
>
> Then I do:
>
>   library(lattice)
>   levelplot(z~x*y, grid, cuts = 5, col.regions=rainbow(5))
>
> very nice, but suppose I want to show grid$z2 on the same colour scale,
> I can't just do:
>
>   levelplot(z2~x*y, grid, cuts = 5, col.regions=rainbow(5))
>
> because that looks the same as the first one since levelplot uses the
> whole colour range.

... I agree that the inability to specify a vector of cut points ruins your chances of doing what you want.

>> There may be some utility in creating functions to generate these colour
>> maps outside of the plotting functions, if only so that the code can be
>> recycled for new functions.
>
> Exactly, it would make a new package.

Excellent. I reckon keeping the cut vector/colour vector input format would be sensible, e.g.

  continuousColours <- function(x, breaks, col) {
    if(length(breaks) != length(col)+1) stop("must have one more break than colour")
    col[as.numeric(cut(x, breaks))]
  }

  x <- runif(10, 0, 5)
  breaks <- 0:5
  col <- c("red", "blue", "green", "cyan", "magenta")
  plot(1:10, x, col=continuousColours(x, breaks, col))

Regards,
Richie.
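One detail of the cut-based version: with its defaults, cut() uses intervals that are open on the left, so a value exactly equal to the lowest break comes back as NA. A possible variant, sketched here with findInterval() from base R and returned as a closure, closes both ends of the range. The name discreteColours and the choice to return NA outside the breaks are assumptions for illustration only.

```r
# Hypothetical variant of the break/colour mapping, returned as a function.
# Intervals are [b1, b2), [b2, b3), ..., with the last one closed on the right.
discreteColours <- function(breaks, col) {
  if (length(breaks) != length(col) + 1) {
    stop("must have one more break than colour")
  }
  function(x) {
    idx <- findInterval(x, breaks, rightmost.closed = TRUE)
    # Values below the first break or above the last get no colour
    idx[which(idx < 1 | idx > length(col))] <- NA
    col[idx]
  }
}

scheme <- discreteColours(0:5, c("red", "blue", "green", "cyan", "magenta"))
scheme(c(0, 0.5, 1, 4.9, 5, 6))
# "red" "red" "blue" "magenta" "magenta" NA
```

The breaks and colours are fixed once when the scheme is built, so two plots that call the same scheme cannot drift onto different scales.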
Re: [Rd] Colour Schemes
> Yes, but these things are all at the wrong conceptual level. What
> you are constructing here is a function that maps value to colour, but
> keeping it as breaks and cut values and colours instead of
> representing it as a function. Wouldn't it be nicer to build a real
> function object and have that to pass around?

This is basically what ggplot2 does behind the scenes, with the slight addition that scales also know how to be trained, so that the domain can be learned from the data:

  sc <- scale_colour_gradient()
  sc$train(1:10)
  sc$map(1:10)

Hadley

--
http://had.co.nz/
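To make the train/map idea concrete, here is a rough base-R sketch of a trainable scale built from closures. It is not ggplot2's implementation; makeScale, its arguments and the NA result for values outside the trained domain are all hypothetical. It uses colorRamp() and rgb() from grDevices.

```r
# Hypothetical trainable colour scale (an illustration, not ggplot2 code).
makeScale <- function(low = "blue", high = "red") {
  domain <- NULL                   # range learned from the data so far
  ramp <- colorRamp(c(low, high))  # maps [0, 1] to rows of RGB values
  train <- function(x) {
    # Widen the stored domain to cover the new data
    domain <<- range(c(domain, x), na.rm = TRUE)
    invisible(domain)
  }
  map <- function(x) {
    if (is.null(domain)) stop("scale has not been trained")
    u <- (x - domain[1]) / diff(domain)  # rescale to [0, 1]
    inside <- !is.na(u) & u >= 0 & u <= 1
    out <- rep(NA_character_, length(x))
    if (any(inside)) {
      out[inside] <- rgb(ramp(u[inside]), maxColorValue = 255)
    }
    out
  }
  list(train = train, map = map)
}

z <- outer(1:10, 1:10, "*")
sc <- makeScale()
sc$train(z)      # domain becomes 1..100
sc$train(z / 2)  # domain widens to 0.5..100
sc$map(c(0.5, 100, 200))
# "#0000FF" "#FF0000" NA
```

Training one scale on both z and z/2 and then mapping each through it puts the two data sets on a common colour scale, which is the case the levelplot example earlier in the thread could not handle directly.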
Re: [Rd] Colour Schemes
On Thu, May 21, 2009 at 10:53 AM, Barry Rowlingson b.rowling...@lancaster.ac.uk wrote:

> On Thu, May 21, 2009 at 5:29 PM, Deepayan Sarkar deepayan.sar...@gmail.com wrote:
>
>> [oops I didn't reply-to-all]
>>
>> But you could specify an explicit 'at' vector specifying the color
>> breakpoints: effectively, you want at = do.breaks(zlim, 5). lattice
>> does have a function called 'level.colors' that factors out the color
>> assignment computation.
>
> Yes, but these things are all at the wrong conceptual level. What
> you are constructing here is a function that maps value to colour, but
> keeping it as breaks and cut values and colours instead of
> representing it as a function. Wouldn't it be nicer to build a real
> function object and have that to pass around?

If that tickles your fancy, it's not a big stretch to get to

  library(lattice)

  continuousColours <- function(at, col.regions, ...) {
      function(x) {
          level.colors(x, at = at, col.regions = col.regions, ...)
      }
  }

  ## caveat: level.colors requires 'at' values to be unique, hence the 999,1001
  scheme2 <- continuousColours(at = list(-1000, 0, 400, 999, 1001, 1),
                               col.regions = c("blue", "sandYellow", "grassGreen", "rockBrown", "white"))
  ## you could do something similar with your list format too, of course.

which gives

  > scheme2(c(-500, -200, 200, 2000))
  [1] "blue"       "blue"       "sandYellow" "white"

But generally speaking, I wouldn't presume to dictate that any given approach is universally nicer than another; I don't expect others to have the same tastes as me, and conversely, don't expect to be told what my tastes should be (if I did, I would probably be using Excel).

-Deepayan
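The piecewise-continuous topographic scheme from the start of the thread, with its jump at 1000, could be sketched in the same closure style using only base R. Everything here is an assumption for illustration: the name piecewiseColours, the from/to/low/high argument layout, and the named R colours standing in for sandYellow, grassGreen and rockBrown.

```r
# Hypothetical sketch: segment i covers [from[i], to[i]) and blends from
# low[i] to high[i]; unbounded segments just use their 'low' colour.
piecewiseColours <- function(from, to, low, high) {
  stopifnot(length(from) == length(to),
            length(from) == length(low),
            length(low) == length(high))
  # One colour ramp per segment, each mapping [0, 1] to RGB rows
  ramps <- Map(function(a, b) colorRamp(c(a, b)), low, high)
  function(x) {
    out <- rep(NA_character_, length(x))
    for (i in seq_along(from)) {
      hit <- !is.na(x) & x >= from[i] & x < to[i]
      if (any(hit)) {
        u <- (x[hit] - from[i]) / (to[i] - from[i])
        u[!is.finite(u)] <- 0  # infinite bounds give NaN; fall back to 'low'
        out[hit] <- rgb(ramps[[i]](u), maxColorValue = 255)
      }
    }
    out
  }
}

topo <- piecewiseColours(
  from = c(-Inf, 0, 400, 1000),
  to   = c(0, 400, 1000, Inf),
  low  = c("blue", "khaki", "forestgreen", "white"),
  high = c("blue", "forestgreen", "saddlebrown", "white"))

topo(c(-500, 0, 200, 1000, 2000))
```

Giving each segment its own pair of end colours allows a discontinuity at a break without the 999/1001 workaround, at the cost of a more verbose specification.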