from:"Hans Ekbrand"

Re: [R] function input as variable name (deparse/quote/paste) ??

2012-03-11 Thread Hans Ekbrand

On Sat, Mar 10, 2012 at 04:01:21PM -0800, casperyc wrote:
 Sorry if I wasn't stating what I really wanted or it was a bit confusing.
 
 Basically, there are MANY datasets to run using the same function.
 
 I have written a function that analyzes a dataset and returns a LIST of
 useful output in the variable 'res' (to the workspace).
 
 I also created another script run.r such as
 
 myname(dat1)
 myname(dat2)
 myname(dat3)
 myname(dat4)
 myname(dat5) 
 
 For now, each time the output in the main workspace 'res' (the list) is over
 written.
 
 I want it to have different suffix to differentiate them. So I can have a
 look later after the batch is run.

I see no advantage in having that information in variable names. Just

- add the name of the data set to the information that is included in
  the returned list.

- run your function with lapply(), or sapply(..., simplify = FALSE), and
  the result will be a list of lists.
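
A minimal sketch of the two suggestions, under stated assumptions: the version of myname() below, its `label` argument and the toy data sets are stand-ins invented for illustration, not the poster's real function or data.

```r
# Stand-in for the real analysis function; `label` is a hypothetical
# extra argument that carries the name of the data set into the result
myname <- function(dat, label, x = 5, y = 6) {
  list(dataset = label, res = x + y - dat)
}

# Toy data sets, kept together in one named list instead of dat1, dat2, ...
datasets <- list(dat1 = c(1, 2, 3), dat2 = c(10, 20), dat3 = 0)

# One call per data set; the result is a list of lists
all.res <- lapply(names(datasets),
                  function(nm) myname(datasets[[nm]], label = nm))
names(all.res) <- names(datasets)

# Inspect a single result after the batch has run
all.res$dat2$res
```

With everything in one list, nothing in the workspace is overwritten, and each result can be looked up by the name of its data set.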

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] hierarchical clustering of large dataset

2012-03-10 Thread Hans Ekbrand

On Fri, Mar 09, 2012 at 08:26:01PM -0500, Massimo Di Stefano wrote:
 my target is to have 'groups of species' based on the similarity of their 
 environmental parameters, and build a dendrogram like [2] 
 
 [2] http://massimo-timecapsule.whoi.edu//data/img/manova_clust_matlab.png

 Il giorno Mar 9, 2012, alle ore 7:18 PM, Peter Langfelder ha scritto:
 
  Well, you didn't say that column e was a label that you wanted to keep
  separate. Any other labels in the data? You may not want to use labels
  in the distance calculation.

If you want to use the results of the cluster-analysis as evidence on
similarities and differences between species, you _must_ not include
numeric variables representing labels in the matrix. Including them
would mean imposing the expected result onto the data.

First do the cluster analysis, then test the distribution of species
in clusters.
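
A rough sketch of that order of operations, assuming the built-in iris data as a stand-in for the species data (the clustering method and the number of clusters are arbitrary choices made only for illustration):

```r
# iris stands in for the real data; Species plays the role of the label
env.vars <- iris[, 1:4]        # numeric measurements only, no label column
species  <- iris$Species       # kept aside, not used in the distances

# Step 1: cluster analysis on the standardised measurements
hc <- hclust(dist(scale(env.vars)), method = "average")
clusters <- cutree(hc, k = 3)  # k = 3 is an arbitrary choice here

# Step 2: only afterwards, compare cluster membership with the labels
tab <- table(cluster = clusters, species = species)
print(tab)
```

plot(hc) would draw the dendrogram; the cross-table is where the labels first enter the analysis.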

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net


Re: [R] Issues in installing rgl in Mac OS 10.6.8

2012-03-10 Thread Hans Ekbrand

On Fri, Mar 09, 2012 at 04:52:31PM -0800, A Ezhil wrote:
 Dear All,
 
 I am trying to install rgl on my mac notebook from the source file. I tried 
 using: /usr/bin/R64 CMD INSTALL rgl_0.92.798.tar.gz and get the following
 error message:
 
 checking for X... no
 configure: error: X11 not found but required, configure aborted.
 ERROR: configuration failed for package ‘rgl’
 * removing
 ‘/Library/Frameworks/R.framework/Versions/2.14/Resources/library/rgl’
 * restoring previous
 ‘/Library/Frameworks/R.framework/Versions/2.14/Resources/library/rgl’
 
 I do see a directory X11 installed under /usr and Sys.getenv("PATH") inside R 
 gives me: [1] "/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin"
 
 Could you please help me to install the rgl package?

Not really, but I can offer a hint: I think your system has the
_runtime_ libraries for X11 (in /usr/X11), but you need the _development_
libraries to compile rgl.

I have no knowledge about Mac OS, but in my system, Debian GNU/Linux,
the needed libraries to build rgl from source are:

libgl1-mesa-dev
libglu1-mesa-dev
mesa-common-dev

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net


Re: [R] function input as variable name (deparse/quote/paste) ??

2012-03-10 Thread Hans Ekbrand

On Sat, Mar 10, 2012 at 01:29:16PM -0800, casperyc wrote:
 Hi all
 
 Say I have a function:
 
 myname=function(dat,x=5,y=6){
   res <- x+y-dat
 }
 
 for various input such as
 
 myname(dat1)
 myname(dat2)
 myname(dat3)
 myname(dat4)
 myname(dat5)
 
 how should I modify the 'res' line, to have new informative variable name
 correspondingly, such as
 
 dat1.res
 dat2.res
 dat3.res
 dat4.res
 dat5.res
 
 stored in the workspace.

Why not keep the information of input values in a list, or vector?
What is gained by storing that info in the variable _name_ ? Your
function could return a list with both the result and the input value.

While you did say that this was part of something complex, I suspect
your post might be a case of "Being overly specific and not stating
your real goal".

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net


[R] How do I force confint() for glm() to be quiet?

2012-03-09 Thread Hans Ekbrand

I need confint() for glm() to suppress the messages 

Waiting for profiling to be done...

because they mess up the caching mechanism of pgfSweave (see
https://github.com/cameronbracken/pgfSweave/issues/40).

I have read the help page of confint(), but I do not know how to get
the help page for the glm() version, if any such help page exists.

Is there a general way of turning off output from functions in R that
would help here?

Below is an example of an intended usage scenario:

x <- 10000
set.seed(42)
a <- rnorm(x)
b <- factor(LETTERS[sample(1:7, x, replace = TRUE)])
c <- factor(LETTERS[sample(1:4, x, replace = TRUE)])
my.fit <- glm(c ~ b + a, family = binomial)
my.results <- confint(my.fit)


Re: [R] How do I force confint() for glm() to be quiet?

2012-03-09 Thread Hans Ekbrand


On 2012-03-09 15:30, David Winsemius wrote:
 On Mar 9, 2012, at 6:14 AM, Hans Ekbrand wrote:
 
  I need confint() for glm() to suppress the messages
 
 I'm wondering if suppressMessages would be helpful? Which in turn
 suggests that you do not know how to use ??, so first you should get
 in the habit of doing a helpSearch before posting.
 
 ??"suppress messages"

OK, noted.

  Waiting for profiling to be done...
  
  because they mess up the caching mechanism of pgfSweave (see
  https://github.com/cameronbracken/pgfSweave/issues/40).
  
  I have read the help page of confint(), but I do not know how to get
  the help page for the glm() version, if any such help page exists.
 
 When I type ?confint.glm at my console I get this help page:

Ah, I tried ?confint.lm without success and didn't go further.

 If suppressMessages is not effective then look at:
 
 ?sink

OK, but since suppressMessages works, I'll stick to that.

 G. A _minimal_ example would have had fewer iterations,

Sorry.

 but this does seem to be effective:
 
 suppressMessages(my.results <- confint(my.fit))

Thanks!
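
For the archive, a small self-contained sketch of why suppressMessages() helps, assuming the progress text is emitted via message(). The function noisy.fit() is a made-up stand-in so the example runs without fitting a model.

```r
# Made-up stand-in for a function that reports progress via message()
noisy.fit <- function() {
  message("Waiting for profiling to be done...")
  42
}

# Record any message that still escapes after suppression
seen <- character(0)
result <- withCallingHandlers(
  suppressMessages(noisy.fit()),
  message = function(m) seen <<- c(seen, conditionMessage(m))
)

result        # the return value is unaffected
length(seen)  # number of messages that got through
```

Only output produced with message() is silenced this way; text written with cat() or print() would need something like capture.output() or sink() instead.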


Re: [R] speed up merge

2012-03-02 Thread Hans Ekbrand

On Fri, Mar 02, 2012 at 03:24:20AM -0700, Ben quant wrote:
 Hello,
 
 I have a nasty loop that I have to do 11877 times. 

Are you completely sure about that? I often find myself avoiding
loops-by-row by constructing a vector of the rows that fulfil a
condition, and then creating new vectors out of that vector. If you
elaborate on the problem, perhaps we could find a way to avoid the
loops altogether?

Mostly as a note to self, I wrote
http://code.cjb.net/vectors-instead-of-loop.html, it might be
understood by others too, but I'm not sure.
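
A tiny illustration of what I mean, with made-up data (the data frame and the condition are invented for the example, not taken from the original problem):

```r
df <- data.frame(id = 1:10,
                 value = c(3, 8, 1, 9, 4, 7, 2, 6, 5, 10))

# Loop version: visit every row and test the condition
doubled.loop <- numeric(0)
for (i in seq_len(nrow(df))) {
  if (df$value[i] > 5) {
    doubled.loop <- c(doubled.loop, df$value[i] * 2)
  }
}

# Vectorised version: first a vector of the rows that fulfil the
# condition, then a new vector built from that vector
rows <- which(df$value > 5)
doubled.vec <- df$value[rows] * 2

identical(doubled.loop, doubled.vec)
```

On a handful of rows the difference does not matter, but the vectorised form avoids growing a vector inside a loop, which is usually what makes row-by-row loops slow.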

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net


Re: [R] data analysis

2012-02-28 Thread Hans Ekbrand

On Mon, Feb 27, 2012 at 11:04:13PM -0800, nontokozo mhlanga wrote:
  Please assist me with  all the tests including risk factor analysis i  can
 use to analyse the enclosed database established from a questionnaire survey
 to test for the prevalence of tuberculosis in humans .

That's quite a general request. I think you should try to formulate a
specific question.

Have you read the posting-guide? http://www.R-project.org/posting-guide.html

Also, I don't think the list accepts attached files.

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net


Re: [R] count how many row i have in a txt file in a directory

2012-02-26 Thread Hans Ekbrand

On Sun, Feb 26, 2012 at 03:03:58PM +0100, gianni lavaredo wrote:
 Dear Researchers,
 
 I have a large TXT (X,Y,MyValue) file in a directory and I wish to import
 the txt row by row in a loop, to save only the data that are inside a buffer
 (using inside.owin of spatstat) and delete the rest. The first step, before
 creating a row-by-row loop, is to know how many rows there are in the txt
 file without loading it into R, to avoid memory problems.
 
 Does anybody know a specific function for this?

If there are so many rows that even three variables per row cause
memory problems, then looping through the file row-by-row will take
a very long time.

I would - instead of looping row-by-row - split the text file into
chunks small enough for a chunk to be read into R, and operated on
within R, without memory problems.

I create a test file of 10,000,000 rows

## LETTERS has 26 elements, so sample indices from 1:26
my.words <- replicate(10000, paste(LETTERS[sample.int(26, 10)], sep = "",
                                   collapse = ""))
my.df <- data.frame(x=rnorm(10000000), y=rnorm(10000000),
                    my.val=rep(my.words, 1000))
write.csv(my.df, file = "testmem.csv")

Split the file into smaller chunks, say 1,000,000 rows each. I use the
split command in GNU coreutils,

$ split -l 1000000 testmem.csv

Loop through the chunks.

for(file.name in c("xaa", "xab")){   # ... and so on for the remaining chunks
  chunk <- read.csv(file = file.name)
  ## match and add all the interesting rows to an object
}

Here's an example that for each chunk prints its third row.

for(file.name in c("xaa", "xab")){
  chunk <- read.csv(file = file.name)
  print(chunk[3,])
}

With a chunk of 1,000,000 rows, R needed about 250 MB RAM to process this loop.
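
The same chunk-wise idea can also be sketched in R alone, reading a fixed number of lines at a time from an open connection instead of splitting the file first. This is only a sketch under assumptions: the small temporary file, the chunk size and the "interesting rows" condition are all made up for the example.

```r
# Small test file as a stand-in for the large one
csv.file <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:25, y = (1:25)^2), file = csv.file,
          row.names = FALSE)

chunk.size <- 10                      # rows per chunk; tune to available RAM
con <- file(csv.file, open = "rt")

# Consume the header line once and strip the quotes write.csv() adds
col.names <- gsub('"', "", strsplit(readLines(con, n = 1), ",")[[1]])

kept <- list()
repeat {
  lines <- readLines(con, n = chunk.size)   # continues where the last read stopped
  if (length(lines) == 0) break             # end of file
  chunk <- read.csv(text = lines, header = FALSE, col.names = col.names)
  kept[[length(kept) + 1]] <- chunk[chunk$x %% 5 == 0, ]  # keep interesting rows
}
close(con)

result <- do.call(rbind, kept)
```

Because the header is read once from the connection, every chunk gets the same column names, which is not automatically the case for the files produced by split.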


Re: [R] count how many row i have in a txt file in a directory

2012-02-26 Thread Hans Ekbrand

On Sun, Feb 26, 2012 at 05:06:42PM +0100, gianni lavaredo wrote:
 thanks Hans.
 
 It's true that your idea improves the speed of the analysis compared with a
 row-by-row loop.
 
 Sorry if I ask these questions, but I want to understand this better and
 improve the performance of my code:
 
 1) split command in GNU coreutils, $ split -l 1000000 testmem.csv
 I have never used this command. Is it possible to code this in R, or is it
 an external command?

external. split is - as I wrote - part of GNU coreutils.

 Do you have some links where I can study this command? Thanks

http://www.gnu.org/software/coreutils/

 2) Is it possible to work with a txt file?

A "txt file" is not a well-defined concept; such a file could very well
be a CSV file, see http://en.wikipedia.org/wiki/Comma-separated_values

?read.csv


Re: [R] count how many row i have in a txt file in a directory

2012-02-26 Thread Hans Ekbrand

On Sun, Feb 26, 2012 at 09:39:46AM -0800, Rui Barradas wrote:
 Hello,
 
  The first step, before creating a row-by-row loop, is to know
  how many rows there are in the txt file without loading it into R, to
  avoid memory problems.
  
  Does anybody know a specific function for this?
  
 
 I don't believe there's a specific function.

As stated, OP does not need to know the number of lines in the file to
solve the problem. However, if you want to know that, I'd suggest the
command wc rather than writing a function in R to accomplish this.

wc is also part of GNU coreutils

$ wc -l foo.csv
1138200 foo.csv

 If you want to know how many rows are there in a txt file, try this
 function.
 
 numTextFileLines <- function(filename, header=FALSE, sep=",", nrows=5000){
   tc <- file(filename, open="rt")
   on.exit(close(tc))
   if(header){
     # cnames: column names (not used)
     cnames <- read.table(file=tc, sep=sep, nrows=1,
                          stringsAsFactors=FALSE)
     # cnames <- as.character(cnames)
   }
   n <- 0
   while(TRUE){
     x <- tryCatch(read.table(file=tc, sep=sep, nrows=nrows),
                   error=function(e) e)
     if (any(grepl("no lines available", unclass(x))))
       break
     if(nrow(x) < nrows){
       n <- n + nrow(x)
       break
     }
     n <- n + nrows
   }
   n
 }

But hey, programming R is fun, so why not?

--
Hans Ekbrand


[R] which is the fastest way to make data.frame out of a three-dimensional array?

2012-02-25 Thread Hans Ekbrand

foo <- rnorm(30*34*12)
dim(foo) <- c(30, 34, 12)

I want to make a data.frame out of this three-dimensional array. Each dimension 
will be a variable (column) in the data.frame.

I know how this can be done in a very slow way using for loops, like this:

x <- rep(seq(from = 1, to = 30), 34)
y <- as.vector(sapply(1:34, function(x) {rep(x, 30)}))
month <- as.vector(sapply(1:12, function(x) {rep(x, 30*34)}))
my.df <- data.frame(month, x=rep(x, 12), y=rep(y, 12), temp=rep(NA, 30*34*12))
my.counter <- 1
for(month in 1:12){
  for(i in 1:34){
    for(j in 1:30){
      my.df$temp[my.counter] <- foo[j,i,month]
      my.counter <- my.counter + 1
    }
  }
}

> str(my.df)
'data.frame':   12240 obs. of  4 variables:
 $ month: int  1 1 1 1 1 1 1 1 1 1 ...
 $ x    : int  1 2 3 4 5 6 7 8 9 10 ...
 $ y    : int  1 1 1 1 1 1 1 1 1 1 ...
 $ temp : num  0.673 -1.178 0.54 0.285 -1.153 ...

(In the real world problem I had, data was monthly measurements of temperature 
and x, y was coordinates).

Does anyone care to share a faster and less ugly solution? 

TIA

-- 
Hans Ekbrand


Re: [R] which is the fastest way to make data.frame out of a three-dimensional array?

2012-02-25 Thread Hans Ekbrand

First, thank you both Bert and Petr for your excellent answers. 

Bert's solution seems somewhat faster, and Petr's is - in my opinion at
least - slightly more elegant.

> foo <- rnorm(36 * 150 * 170)
> dim(foo) <- c(36, 150, 170)
> n <- dim(foo)
> 
> system.time(my.df <- data.frame(dat = as.vector(foo),
+ dim1 = rep(seq_len(n[1]), n[2]*n[3]),
+ dim2 = rep(rep(seq_len(n[2]), e=n[1]), n[3]),
+ dim3 = rep(seq_len(n[3]), e = n[1]*n[2])))
   user  system elapsed 
  0.932   0.156   1.090 
> 
> system.time(my.df <- cbind(temp=c(foo), expand.grid(dim1=1:n[1], dim2=1:n[2], 
+ dim3=1:n[3])))
   user  system elapsed 
  0.980   0.252   1.244 
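
As a sanity check, here is a small sketch, on a toy array whose size is chosen only so the result is easy to inspect, showing that the two approaches build the same index columns and that each row points back to the right cell of the array:

```r
# Toy array; the values are just 1..24 so they are easy to follow
foo <- array(seq_len(2 * 3 * 4), dim = c(2, 3, 4))
n <- dim(foo)

# Approach 1: build the index columns by hand with rep()
df.rep <- data.frame(temp = as.vector(foo),
                     dim1 = rep(seq_len(n[1]), times = n[2] * n[3]),
                     dim2 = rep(rep(seq_len(n[2]), each = n[1]), times = n[3]),
                     dim3 = rep(seq_len(n[3]), each = n[1] * n[2]))

# Approach 2: let expand.grid() generate the index columns
df.grid <- cbind(temp = c(foo),
                 expand.grid(dim1 = seq_len(n[1]),
                             dim2 = seq_len(n[2]),
                             dim3 = seq_len(n[3])))

# Both rely on R storing arrays with the first index varying fastest
same <- all(df.rep == df.grid)
round.trip <- all(foo[cbind(df.rep$dim1, df.rep$dim2, df.rep$dim3)] ==
                  df.rep$temp)
```

Both solutions depend on the same fact: as.vector() and c() unroll an array with the first dimension varying fastest, which is also the order in which expand.grid() varies its arguments.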
 

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net


Re: [R] Behaviour of 'source' with URLs and proxy

2011-10-05 Thread Hans Ekbrand

On Wed, Oct 05, 2011 at 12:44:12PM +0200, Renaud Gaujoux wrote:
 Is source supposed to work through a proxy?

This worked for me:

> Sys.setenv(http_proxy="http://192.168.0.252:8118")
> source("http://pc5.socio.gu.se:84/enkel-kurva.r", echo = TRUE)
> my.vectory = c(1,30,2,3,3,4)
> my.vectorx = c(1,2,3,4,5,6)
> plot(y = my.vectory, x = my.vectorx, type = "l")


Re: [R] Synchronizing R libraries on N machines?

2011-08-26 Thread Hans Ekbrand

On Thu, Aug 25, 2011 at 08:25:02AM -0500, Giovanni Petris wrote:
 Hello!
 
 I am using R on two different machines (under Ubuntu and OS X, but this
 is probably irrelevant) and I would like to keep the two installations
 'synchronized', in particular in terms of installed packages. For
 example, if I install package xxx on my Linux machine, I would like to
 find it installed also on my Mac, and vice versa. 
 
 I imagine this to be a fairly common problem, so I would like to ask if
 anybody has suggestions to share about it. Is there a way to make the
 synchronization automatic? Painless?

I have a number of machines in a home LAN that share /usr/local, where
I keep all R packages except the few that are automatically installed by
the OS package-management system (by installing the meta package
r-recommended).

I have the following snippet in my .Rprofile

lib.loc = "/usr/local/lib/R/site-library/"

so whenever a package is installed, all machines have access to it.

This will of course not work if the machines are running different
operating systems, so that is not irrelevant.
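
One way such a shared directory could be put on the library search path from .Rprofile is sketched below. This is an assumption about how one might do it, not a copy of my actual setup: use.shared.library() is a hypothetical helper, while .libPaths() is the base R function that reads and sets the search path.

```r
# Hypothetical helper: prepend a shared library directory to the
# library search path if the directory exists
use.shared.library <- function(path) {
  if (!dir.exists(path)) {
    warning("shared library directory not found: ", path)
    return(invisible(.libPaths()))
  }
  .libPaths(c(path, .libPaths()))
  invisible(.libPaths())
}

# In ~/.Rprofile one might then call:
# use.shared.library("/usr/local/lib/R/site-library")
```

install.packages() installs into the first element of .libPaths() by default, so putting the shared directory first means newly installed packages land where all machines can see them.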



[R] lavaan: how to analyse residuals of a latent variable

2011-08-09 Thread Hans Ekbrand

Hi r-help,

I use lavaan::sem() for structural equation modelling with latent
variables. Below is a reproducible example (the code requires a
working installation of lavaan) where the latent variable criminality
is in focus. Besides criminality in general, I am specifically
interested in one of the manifest variables that make up the latent
variable criminality, namely fire.setting.

My question is: how can I analyse the part of the variation in
fire.setting that is not included in the latent variable criminality?
Ideally I would want a new variable that captures just this. Then I
could model regressions with this variable as the dependent variable.

As far as I understand the output of the summary() - of which I have
reproduced a few lines - about half (0.499) of the variation in
fire.setting is included in the latent variable criminality.

                    Estimate  Std.err  Z-value  P(>|z|)   Std.lv  Std.all
Latent variables:
  criminality =~
    fire.setting       0.324    0.007   48.223    0.000    0.189    0.499


I would like to analyse the other half of fire.setting, so to speak.
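
A side note on the arithmetic, offered with some caution: if Std.all is the fully standardized loading and fire.setting loads on criminality only, then the share of its variance accounted for by the latent variable would be the squared loading rather than the loading itself. A quick check of what that assumption implies:

```r
std.loading <- 0.499            # Std.all for fire.setting, from the summary output
explained   <- std.loading^2    # share of variance shared with criminality
residual    <- 1 - explained    # share left over for further analysis

round(c(explained = explained, residual = residual), 3)
```

Under that reading, the part of fire.setting that remains to be analysed would be closer to three quarters than to one half.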


my.model <- '
## Measurement model (definitions of the latent variables)
priviledged.parents =~ nr.parents.employed + parental.housing

school.adaption =~ enjoying.school + good.teachers +
                   good.grades.important

school.grades =~ grade.language + grade.english + grade.craft +
                 grade.math + grade.chemistry + grade.arts +
                 grade.sports

criminality =~ vandalism + illegal.grafitti + shop.lifting +
               theft.from.automat + theft.from.school +
               theft.of.bicycle + theft.of.moped + theft.of.car +
               theft.from.car + theft.pick.pocket + burglary +
               buying.stolen.goods + selling.stolen.goods +
               wearing.knife + robbery + fire.setting +
               abuse.unknown.persons + abuse.family.members +
               used.knife + drugs.cannabis + drugs.other +
               drugs.thinner + drugs.steroids + selling.drugs.cannabis +
               selling.drugs.other

## Regressions
priviledged.parents ~ parental.migration + parental.class

school.adaption ~ parental.migration + parental.class + sex.girl +
                  priviledged.parents

school.grades ~ parental.migration + parental.class + sex.girl +
                priviledged.parents

criminality ~ parental.migration + parental.class + sex.girl +
              priviledged.parents + school.adaption + school.grades
'

library(lavaan)
con <- url("http://code.cjb.net/temp/lavaan.temp.RData")
print(load(con))
close(con)
my.fit <- sem(my.model, data = my.crim.set)
summary(my.fit, fit.measures = TRUE, standardized = TRUE)

-- 
Hans Ekbrand
Department of Sociology
University of Gothenburg
Sweden



[R] importing spss-files [ was: Re: need your consult]

2011-08-09 Thread Hans Ekbrand

On Tue, Aug 09, 2011 at 02:28:22AM -0700, Mehrshad Koleini wrote:
 Dear Sir/Madam
 
 
 Hi. I am a general paediatrician, and I have read *some* chapters of the
 following books(1-3). I think SPSS lacks some features that may be important
 in data analysis (for example: interval of correlation coefficient in
 bivariate normal distribution, PRESS, and MSPR in cross-validation). I am
 thinking about changing SPSS to R:
 
 
 1. SPSS is very expensive for me to update.
 2. My colleagues use SPSS, but I think data can be exchanged between
    SPSS and R; is this true?

Yes, but the data must be converted, which is not an entirely seamless
process; there might be quirks to be handled manually.

To import data from an SPSS file to R, read
http://cran.r-project.org/doc/manuals/R-data.html and
http://cran.r-project.org/web/packages/foreign/foreign.pdf

Basically, it can be as simple as

library(foreign)
foo <- read.spss(file = "data_set.sav")

now, your data is in object foo, which can be inspected with the
function str()

str(foo)
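
A slightly fuller sketch, in case a data frame is preferred over the list that read.spss() returns by default. The wrapper import.spss() is a hypothetical name used only for this example; to.data.frame and use.value.labels are arguments of read.spss() in the foreign package.

```r
library(foreign)

# Hypothetical wrapper around read.spss(); fails early if the file is missing
import.spss <- function(path) {
  stopifnot(file.exists(path))
  # to.data.frame = TRUE returns a data frame instead of a list;
  # use.value.labels = TRUE turns labelled SPSS values into factors
  read.spss(file = path, to.data.frame = TRUE, use.value.labels = TRUE)
}

# Usage, assuming the file exists in the working directory:
# foo <- import.spss("data_set.sav")
# str(foo)
```

Value labels are one of the places where quirks tend to show up, so it is worth comparing a few variables against what SPSS displays.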

--
Hans Ekbrand



Re: [R] lavaan: how to analyse residuals of a latent variable

2011-08-09 Thread Hans Ekbrand

On Tue, Aug 09, 2011 at 01:49:13PM +0200, yrosseel wrote:
 
 My question is: how can I analyse the part of the variation in
 fire.setting that is not included in the latent variable criminality?
 Ideally I would want a new variable that captures just this. Then I
 could model regressions with this variable as the dependent variable.
 
 You can add a regression line to your model syntax with
 'fire.setting' as the dependent variable:
 
 fire.setting ~ x1 + x2 + x3
 
 where x1-x3 are additional predictors that might influence the
 variable 'fire.setting'.

Can I include criminality among those and thereby get the common part
of criminality and fire.setting out of the way?

I tried adding the following regression formula:

fire.setting ~ parental.migration + parental.class + sex.girl + 
   priviledged.parents + school.adaption + school.grades +
   criminality

but I got:

Error in solve.default(E) :
Lapack routine dgesv: system is exactly singular 
 
[lavaan message:] could not compute standard errors! 
 
 You can still request a summary of the fit to inspect 
 the current estimates of the parameters.

However, the fit object has regression estimates where criminality
seems to have about the size I would have expected, given the
covariation of fire.setting and criminality.

                    Estimate  Std.err  Z-value  P(>|z|)   Std.lv  Std.all

  fire.setting ~
    parental.migr      0.001                               0.001    0.003
    parental.clas     -0.000                              -0.000   -0.000
    sex.girl          -0.015                              -0.015   -0.019
    priviledged.p      0.066                               0.015    0.039
    school.adapti      0.004                               0.002    0.005
    school.grades     -0.012                              -0.010   -0.026
    criminality        0.327                               0.191    0.505

Are the other estimates reasonable estimates of the part of the variation
in fire.setting that does not covary with criminality?


--
Hans Ekbrand



[R] [Solved] Re: lavaan: how to analyse residuals of a latent variable

2011-08-09 Thread Hans Ekbrand

On Tue, Aug 09, 2011 at 03:30:17PM +0200, yrosseel wrote:
 
 Can I include criminality among those and thereby get the common part
 of criminality and fire.setting out of the way?
 
 No. You already regress fire.setting on criminality since it is an
 indicator in the measurement model of criminality. In other words,
 the 'criminality' part is already regressed out.

So, I get just what I want by simply regressing on fire.setting, that
is awesome!

Maybe this kind of usage of lavaan is not very common, but in order to
help others in my situation, is this documented somewhere? My
understanding of latent variable analysis is indeed limited, but I did
not understand that lavaan worked like this when I read the
documentation.

Kind regards,

Hans Ekbrand



[R] confint.multinom() slow?

2011-05-12 Thread Hans Ekbrand

Dear R-helpers,

I'm doing a bivariate analysis with two factors, both with relatively
many levels:

1. clustering, a factor with 35 levels
2. country, a factor with 24 levels

n = 12,855

my.fit <- multinom(clustering ~ country, maxit=300)
converges after 280 iterations.

I would like to get CIs for the odds ratios, and have tried confint()

my.cis <- confint(my.fit)

I started confint() a few hours ago, but now I'm getting suspicious,
since it hasn't terminated yet. Perhaps I just lack the reasonable
patience, but is such a long computational time for confint() to be
expected here?

Hans Ekbrand




[R] Cluster analysis, factor variables, large data set

2011-03-31 Thread Hans Ekbrand

Dear R helpers,

I have a large data set with 36 variables and about 50,000 cases. The
variables represent labour market status during 36 months; there are 8
different variable values (e.g. Full-time Employment, Student, ...).

Only cases with at least one change in labour market status are
included in the data set.

To analyse sub sets of the data, I have used daisy in the
cluster-package to create a distance matrix and then used pam (or pamk
in the fpc-package), to get a k-medoids cluster-solution. Now I want
to analyse the whole set.

clara is said to cope with large data sets, but the first step in the
cluster analysis, the creation of the distance matrix must be done by
another function since clara only works with numeric data.

Is there an alternative to the daisy - clara route that does not
require as much RAM?

What functions would you recommend for a cluster analysis of this kind
of data on large data set?


regards,

Hans Ekbrand


Re: [R] Cluster analysis, factor variables, large data set

2011-03-31 Thread Hans Ekbrand

On Thu, Mar 31, 2011 at 07:06:31PM +0100, Christian Hennig wrote:
 Dear Hans,
 
 clara doesn't require a distance matrix as input (and therefore
 doesn't require you to run daisy), it will work with the raw data
 matrix using
 Euclidean distances implicitly.
 I can't tell you whether Euclidean distances are appropriate in this
 situation (this depends on the interpretation and variables and
 particularly on how they are scaled), but they may be fine at least
 after some transformation and standardisation of your variables.

The variables are unordered factors, stored as integers 1:9, where 

1 means Full-time employment
2 means Part-time employment
3 means Student
4 means Full-time self-employee
...

Do Euclidean distances make sense on unordered factors coded as
integers?
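
To make the concern concrete, here is a small sketch with four made-up cases, one per status. It compares Euclidean distances on the integer codes with distances on indicator (dummy) columns; the indicator coding is only one possible alternative and is my assumption here, not something suggested in the reply above.

```r
status <- factor(c("Full-time", "Part-time", "Student", "Self-employed"),
                 levels = c("Full-time", "Part-time", "Student", "Self-employed"))

# Distances on the integer codes depend on the arbitrary order of the levels
codes <- as.integer(status)
d.codes <- as.matrix(dist(codes))

# Distances on one indicator column per level treat all categories alike
indicators <- model.matrix(~ status - 1)
d.indicators <- as.matrix(dist(indicators))

d.codes[1, 2]        # Full-time vs Part-time
d.codes[1, 4]        # Full-time vs Self-employed
d.indicators[1, 2]
d.indicators[1, 4]
```

With integer codes, Full-time ends up three times as far from Self-employed as from Part-time, purely because of the coding; with indicator columns every pair of different categories is equally far apart.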


Re: [R] Cluster analysis, factor variables, large data set

2011-03-31 Thread Hans Ekbrand

On Thu, Mar 31, 2011 at 08:48:02PM +0200, Hans Ekbrand wrote:
 On Thu, Mar 31, 2011 at 07:06:31PM +0100, Christian Hennig wrote:
  Dear Hans,
  
  clara doesn't require a distance matrix as input (and therefore
  doesn't require you to run daisy), it will work with the raw data
  matrix using
  Euclidean distances implicitly.
  I can't tell you whether Euclidean distances are appropriate in this
  situation (this depends on the interpretation and variables and
  particularly on how they are scaled), but they may be fine at least
  after some transformation and standardisation of your variables.
 
 The variables are unordered factors, stored as integers 1:9, where 
 
 1 means Full-time employment
 2 means Part-time employment
 3 means Student
 4 means Full-time self-employee
 ...
 
 Do Euclidean distances make sense on unordered factors coded as
 integers?

To be clear, here is an extract

> my.df.full[900:910, 16:19]
    PL210F.first.year PL210G.first.year PL210H.first.year PL210I.first.year
900                 2                 2                 1                 2
901                 1                 1                 1                 1
902                 1                 1                 1                 1
903                 2                 2                 2                 2
904                 1                 1                 1                 1
905                 2                 2                 2                 2
906                 7                 8                 2                 7
907                 5                 5                 5                 5
908                 1                 1                 1                 1
909                 1                 1                 1                 1
910                 1                 1                 1                 1

> class(my.df.full[,16])
[1] "integer"


Re: [R] how to make list() return a list of named elements

2010-10-04 Thread Hans Ekbrand

On Thu, Sep 30, 2010 at 09:10:16AM -0400, Gabor Grothendieck wrote:
 A data frame is a list in which every component (i.e. every column)
 must have the same length (i.e. the same number of rows).
 data.frame() does preserve names:
 
 > data.frame(b, my.c)
      b my.c
 1 22.4    8
 2 12.2    9
 3 10.9   15
 4  8.5    1
 5  9.2   14

Thanks for your suggestion. However, the reason I used list() was that
the different vectors to return usually have different lengths.

Admittedly, I should have used another example that explicated this.

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
GnuPG key: 1024D/7050614E
Fingerprint: 1408 C8D5 1E7D 4C9C C27E 014F 7C2C 872A 7050 614E
Learn about secure email at http://www.gnupg.org



Re: [R] how to make list() return a list of named elements

2010-10-04 Thread Hans Ekbrand

On Thu, Sep 30, 2010 at 09:34:26AM -0300, Henrique Dallazuanna wrote:
 You should try:
 
 eapply(.GlobalEnv, I)[c('b', 'my.c')]

Great!

b <- c(22.4, 12.2, 10.9, 8.5, 9.2)
my.c <- sample.int(round(2*mean(b)), 4)

my.return <- function (vector.of.variable.names) {
  eapply(.GlobalEnv, I)[vector.of.variable.names]
}

str(my.return(c("b","my.c")))
List of 2
 $ b   :Class 'AsIs'  num [1:5] 22.4 12.2 10.9 8.5 9.2
 $ my.c:Class 'AsIs'  int [1:4] 18 22 12 3

much nicer than list(b=b, my.c=my.c), especially in real cases with
longer variable names and a lot of variables to return.

Thanks Henrique!

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
Q. What is that strange attachment in this mail?
A. My digital signature, see www.gnupg.org for info on how you could
   use it to ensure that this mail is from me and has not been
   altered on the way to you.



Re: [R] how to make list() return a list of named elements

2010-10-04 Thread Hans Ekbrand

On Mon, Oct 04, 2010 at 07:45:23PM +0800, Berwin A Turlach wrote:
 G'day Hans,
 
 On Mon, 4 Oct 2010 11:28:15 +0200
 Hans Ekbrand h...@sociologi.cjb.net wrote:
 
  On Thu, Sep 30, 2010 at 09:34:26AM -0300, Henrique Dallazuanna wrote:
   You should try:
   
   eapply(.GlobalEnv, I)[c('b', 'my.c')]
  
  Great!
  
  b <- c(22.4, 12.2, 10.9, 8.5, 9.2)
  my.c <- sample.int(round(2*mean(b)), 4)
  
  my.return <- function (vector.of.variable.names) {
    eapply(.GlobalEnv, I)[vector.of.variable.names]
  }
 
 Well, if you are willing to create a vector with the variable names,
 then simpler solutions should be possible, i.e. solutions that only
 operate on the objects of interest and not on all objects in the global
 environment (which could be a lot depending on your style).  

Actually, what made me want this list-like function was when coding
the return() of the interesting results from a calculation function to
what I imagine is the global environment (I have only a vague
concept of that though). So, in the global environment there are very
few objects, while there are more objects in the function where this
list-like function will be used.

Your solution does look way cleaner than falling back on hidden stuff
such as .GlobalEnv, so I will definitely use it.

In addition, the returned list is not of a strange class, as it is in
Henrique's example.

Thanks,

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
GPG Fingerprint: 1408 C8D5 1E7D 4C9C C27E 014F 7C2C 872A 7050 614E



Re: [R] how to make list() return a list of named elements

2010-10-04 Thread Hans Ekbrand

On Mon, Oct 04, 2010 at 07:51:10PM +0800, Berwin A Turlach wrote:
 R> my.return <- function (vector.of.variable.names) {
      sapply(vector.of.variable.names, function(x) get(x))
    }

Even better :-)

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
GPG Fingerprint: 1408 C8D5 1E7D 4C9C C27E 014F 7C2C 872A 7050 614E



Re: [R] how to make list() return a list of named elements

2010-10-04 Thread Hans Ekbrand

On Mon, Oct 04, 2010 at 10:07:06AM -0400, Gabor Grothendieck wrote:
 Some small tweaks.   If you use simplify=FALSE then it will guarantee
 that a list is returned:
 
sapply(my.names, get, simplify = FALSE)
 
 for example, compare the outputs of:
 
sapply(c(letters, LETTERS), get)
sapply(c(letters, LETTERS), get, simplify = FALSE)
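
A minimal sketch of the difference, using a small made-up environment e instead of letters and LETTERS: with equal-length vectors sapply() simplifies to a matrix, while simplify = FALSE always gives a named list.

```r
# Hypothetical environment holding two equal-length vectors.
e <- list2env(list(x = 1:3, y = 4:6))

# Default: the results are simplified to a 3 x 2 matrix.
simplified <- sapply(c("x", "y"), get, envir = e)

# simplify = FALSE: always a list, named after the input strings.
as.a.list <- sapply(c("x", "y"), get, envir = e, simplify = FALSE)

is.matrix(simplified)
names(as.a.list)
```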

Thanks Gabor, 

But get() fails to find my objects, though get() successfully finds
letters and LETTERS (but they are part of the global environment,
I assume).

a <- c("not", "this")
b <- c(0,0)
my.test.function <- function() {
  a <- 1:3
  b <- c("x", "y")
  sapply(c("a", "b"), get, simplify = FALSE)
}
my.test.function()
$a
[1] "not"  "this"

$b
[1] 0 0

rm(a,b)
my.test.function <- function() {
  a <- 1:3
  b <- c("x", "y")
  sapply(c("a", "b"), get, simplify = FALSE)
}
my.test.function()

Error in FUN(c("a", "b")[[1L]], ...) : object 'a' not found

If get() is what should be used, then how do you get it to find
objects in the environment of the function? I would prefer to write a
special my.return() and for get() to work there, perhaps some deep R
magic is needed. Something like this is what I aim for:

my.return <- function(my.names) {
  sapply(my.names, get, simplify = FALSE)
}

my.function <- function() {
  # ...
  # summarize data
  # ...
  my.return(summary.one, summary.two ...)
}
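
One possible way to get this behaviour, offered as a sketch and not taken from the thread: let my.return() look the names up in the environment of its caller, via mget() and a default argument of parent.frame(). The names summary.one and summary.two and their values are placeholders.

```r
# Look up the named objects in the caller's environment (assumption:
# my.return() is called directly from the function that owns the objects).
my.return <- function(my.names, envir = parent.frame()) {
  mget(my.names, envir = envir)
}

my.function <- function() {
  summary.one <- 1:3            # placeholder results
  summary.two <- c("x", "y")
  my.return(c("summary.one", "summary.two"))
}

res <- my.function()
str(res)
```

Because the default for envir is evaluated inside my.return(), parent.frame() refers to the frame of my.function(), so objects local to that function are found even when nothing of that name exists in the global environment.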


Kind regards, Hans



Re: [R] how to make list() return a list of named elements

2010-10-04 Thread Hans Ekbrand

On Mon, Oct 04, 2010 at 04:10:46PM +0200, Christophe Pallier wrote:
 See llist from Hmisc package:
 
 library(Hmisc)
 a=rnorm(10)
 b=rnorm(5)
 llist(a,b)

Ah, that seems like what I want!

My old, ugly and redundant code looked like this

  list(utdrag=utdrag,
       n.fires.at.sites.with.one.or.more.fires=n.fires.at.sites.with.one.or.more.fires,
       more.than.one.fire.in.these.clusters=more.than.one.fire.in.these.clusters,
       n.fires.in.clusters.with.more.than.one.fire=n.fires.in.clusters.with.more.than.one.fire,
       n.fires.in.top.ten.clusters=n.fires.in.top.ten.clusters,
       m=m,
       översta.fem.procenten=översta.fem.procenten,
       these.cluster.have.alot.of.members=these.cluster.have.alot.of.members,
       n.fires.in.hotspots=n.fires.in.hotspots,
       prop.concentrated.fires=prop.concentrated.fires,
       d=d,
       real.vector=real.vector,
       the.real=the.real,
       censored.real=censored.real,
       half.of.effect.at=half.of.effect.at)

The new, nice-looking code looks like this:

  llist(utdrag,
n.fires.at.sites.with.one.or.more.fires,
more.than.one.fire.in.these.clusters,
n.fires.in.clusters.with.more.than.one.fire,
n.fires.in.top.ten.clusters,
m,
översta.fem.procenten,
these.cluster.have.alot.of.members,
n.fires.in.hotspots,
prop.concentrated.fires,
d,
real.vector,
the.real,
censored.real,
half.of.effect.at)

Thank you all r-helpers!



[R] how to make list() return a list of named elements

2010-09-30 Thread Hans Ekbrand

If I combine elements into a list

b <- c(22.4, 12.2, 10.9, 8.5, 9.2)
my.c <- sample.int(round(2*mean(b)), 5)
my.list <- list(b, my.c)

the names of the elements seem to get lost in the process:

> str(my.list)
List of 2
 $ : num [1:5] 22.4 12.2 10.9 8.5 9.2
 $ : int [1:5] 11 8 6 9 20

If I explicitly name the elements at list-creation, I get what I want:

my.list <- list(b=b, my.c=my.c)

> str(my.list)
List of 2
 $ b   : num [1:5] 22.4 12.2 10.9 8.5 9.2
 $ my.c: int [1:5] 11 8 6 9 20


Now, is there a way to get list() (or some other function) to
automatically name the elements?

I often use list() in return(), and I am getting tired of having to
repeat myself.
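
For reference, a base-R sketch of such a function. The name named.list() is hypothetical and not an existing R function; it names each element after the expression used in the call unless an explicit name is given.

```r
# Hypothetical helper: like list(), but unnamed arguments are named
# after the expression that was passed in.
named.list <- function(...) {
  values <- list(...)
  exprs <- as.list(substitute(list(...)))[-1L]
  auto <- vapply(exprs, function(e) deparse(e)[1L], character(1))
  given <- names(values)
  if (is.null(given)) given <- rep("", length(values))
  names(values) <- ifelse(given == "", auto, given)
  values
}

b <- c(22.4, 12.2, 10.9, 8.5, 9.2)
my.c <- 1:3
res <- named.list(b, my.c)
res2 <- named.list(b, other = my.c)  # an explicit name is kept
str(res)
```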

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
A. Because it breaks the logical sequence of discussion
Q. Why is top posting bad?



Re: [R] getting random integers

2010-04-30 Thread Hans Ekbrand

On Thu, Apr 29, 2010 at 12:43:42PM -0400, Sarah Goslee wrote:
 You can always take a look. If you use a much bigger sample size it will be
 obvious:
 
 hist(round(runif(100, min = 1, max = 10)))

Thanks for this advice; apparently 1 and 10 did not have the same chance
of being selected as the other values.


 I'd use instead:
 
 hist(sample(1:10, 100, replace=TRUE))

sample() is what I want, thank you.
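
The imbalance can also be worked out without simulation. A short sketch: round() maps the interval [1, 1.5) to 1 and [9.5, 10] to 10, each half as wide as the interval for any other integer, so the end points get half the probability.

```r
# Probability of each integer under round(runif(n, min = 1, max = 10)):
# interval widths divided by the total range.
breaks <- c(1, seq(1.5, 9.5, by = 1), 10)
p.round <- diff(breaks) / (10 - 1)
names(p.round) <- 1:10
p.round

# Two ways to draw uniformly from 1..10 instead.
x <- sample.int(10, 100, replace = TRUE)
y <- floor(runif(100, min = 1, max = 11))
```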

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net



[R] getting random integers

2010-04-29 Thread Hans Ekbrand

I want 100 integers. Each integer, x, can be in the range 1 <= x <= 10.

Does the following code give 1 and 10 the same chance of being selected as
2:9?

round(runif(100, min = 1, max = 10))

--
Hans Ekbrand



[R] how to word-wrap text in labels in plots?

2009-04-29 Thread Hans Ekbrand

c <- structure(c(2L, 2L, 1L, 3L, 4L, 2L, 3L, 2L, 3L, 2L, 5L), .Label = c("foo", 
"bar", "a really really long variable label mostly here to show the need of word-wrapping text in labels", 
"a not so important value", "baz"), class = "factor")
plot(c)

Is there a way to get the long variable labels to automatically wrap so that 
all labels can be shown?

Alternatively, is there a way to get the labels truncated, possibly with ".." 
appended?
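
A sketch of both ideas in base R. The helpers wrap_labels() and shorten_labels() are hypothetical names, and this is not necessarily what the replies in the thread proposed. strwrap() breaks the text into lines and the pieces are joined with newlines, so the plotting functions draw multi-line labels.

```r
# Hypothetical helper: insert line breaks so each line is shorter than 'width'.
wrap_labels <- function(x, width = 20) {
  vapply(x, function(s) paste(strwrap(s, width = width), collapse = "\n"),
         character(1), USE.NAMES = FALSE)
}

# Hypothetical helper: cut long labels and append "..".
shorten_labels <- function(x, max.chars = 15) {
  ifelse(nchar(x) > max.chars, paste0(substr(x, 1, max.chars), ".."), x)
}

f <- factor(c("bar", "foo", "a really really long variable label"))
wrapped <- f
levels(wrapped) <- wrap_labels(levels(f), width = 12)
levels(wrapped)
# plot(wrapped) would then draw the long label on several lines.
```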

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
Q. What is that strange attachment in this mail?
A. My digital signature, see www.gnupg.org for info on how you could
   use it to ensure that this mail is from me and has not been
   altered on the way to you.



Re: [R] how to word-wrap text in labels in plots?

2009-04-29 Thread Hans Ekbrand

Thanks to Jim and Eik!

I really appreciate your help, and I think I can use your suggestions
and perhaps write a wrapper for plot that integrates them.

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
Q. What is that strange attachment in this mail?
A. My digital signature, see www.gnupg.org for info on how you could
   use it to ensure that this mail is from me and has not been
   altered on the way to you.



Re: [R] deleting rows provisionally

2009-04-24 Thread Hans Ekbrand

On Fri, Apr 24, 2009 at 04:50:48AM -0700, onyourmark wrote:
 
 Hi. Thanks very much for the reply and the good suggestion. It works well.
 But I don't get why the for loop is not deleting anything or making any
 assignments? Or I should say, doesn't answer3[-i,] delete entries from
 answer3 when the if condition is true?

Your for loop was:

for(i in 1:1537){if(answer2[i,1]==answer2[i,2]){answer3[-i,]}}

No, answer3[-i,] does not remove row i from answer3; it returns an
anonymous temporary object which is identical to (answer3 without row
i). Since that object is not saved, it is discarded when the loop enters
the next iteration. To actually *modify* answer3 you can use:

answer3 <- answer3[-i,]
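
Note that deleting rows inside the loop shifts the positions of the remaining rows, so later values of i no longer point at the intended row. A vectorised sketch avoids the loop altogether; the small data frames below are made up and merely stand in for the poster's answer2 and answer3.

```r
# Hypothetical stand-ins for the poster's objects.
answer2 <- data.frame(first = c(1, 2, 3, 4), second = c(1, 5, 3, 6))
answer3 <- data.frame(id = 1:4, value = c("a", "b", "c", "d"))

# Keep only the rows where the two columns of answer2 differ.
keep <- answer2[, 1] != answer2[, 2]
answer3 <- answer3[keep, ]
answer3
```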

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
Q. What is that strange attachment in this mail?
A. My digital signature, see www.gnupg.org for info on how you could
   use it to ensure that this mail is from me and has not been
   altered on the way to you.



Re: [R] stata == R - error messages

2009-04-24 Thread Hans Ekbrand

On Fri, Apr 24, 2009 at 01:50:04PM +0200, Rob Bakker wrote:
 Dear Peter,
 Also thank you for your quick reply. I did the following with no positive
 result:
 
 library(foreign)
 
 read.dta(choose.file(C:\Rklein))

a) quote the filename
b) include the suffix

rklein <- read.dta("C:/Rklein.dta")
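
A short sketch of the path quoting, assuming the file really is called Rklein.dta and sits in C:\. Inside an R string a backslash must be written as \\ (or replaced by a forward slash), otherwise \R is read as an escape sequence.

```r
# Two equivalent ways to write the Windows path in an R string.
path.escaped <- "C:\\Rklein.dta"   # \\ stands for one literal backslash
path.forward <- "C:/Rklein.dta"    # forward slashes also work on Windows

# Converting one form to the other.
converted <- gsub("\\", "/", path.escaped, fixed = TRUE)
identical(converted, path.forward)

# Read the file only if it exists (read.dta() is in the foreign package).
if (file.exists(path.forward)) {
  rklein <- foreign::read.dta(path.forward)
}
```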

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
GPG Fingerprint: 1408 C8D5 1E7D 4C9C C27E 014F 7C2C 872A 7050 614E



Re: [R] Student

2009-04-08 Thread Hans Ekbrand

On Wed, Apr 08, 2009 at 10:02:10AM +0200, alberto cassese wrote:
 Hi,
 I have  problem. In the function below (test and test2) i want the function
 test not to print the variable data but i want the function test2 to use the
 variable test$data.

 This is the creation of the variable data:
 
  matrice=c(1:10)
  matrice=matrix(matrice,nrow=5,ncol=2)
 
 This is the function test:
 
  test=function(data){
 + return(list(x=5,data=data))
 + }
 
 This is the function test2:
 
  test2=function(list){
 + bodri=list$data
 + bodri[1,2]=bodri[2,2]+1
 + return(bodri)
 + }
 
 Below there are the result:
 
  uno=test(matrice)
  due=test2(uno)
  uno
 $x
 [1] 5
 
 $data
      [,1] [,2]
 [1,]    1    6
 [2,]    2    7
 [3,]    3    8
 [4,]    4    9
 [5,]    5   10
 
  due
      [,1] [,2]
 [1,]    1    8
 [2,]    2    7
 [3,]    3    8
 [4,]    4    9
 [5,]    5   10
 
 
 What i want is:
 
  uno=test(matrice)
  due=test2(uno)
  uno
 $x
 [1] 5

x is a variable, 5 is variable data and you don't want variable data
printed?

  due
      [,1] [,2]
 [1,]    1    8
 [2,]    2    7
 [3,]    3    8
 [4,]    4    9
 [5,]    5   10
 

Use uno[1], either directly or by creating a third variable from uno[1]

> one.and.a.half <- uno[1]
> one.and.a.half
$x
[1] 5

Or, if you *really* want that printed output from test(matrice),
create a class for your list object and add a special print method
that will only print the first item of the list.
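
A minimal sketch of that second suggestion. The class name "quietresult" is made up for illustration; test() and test2() follow the functions quoted above.

```r
# test() as in the question, but the result carries an S3 class.
test <- function(data) {
  structure(list(x = 5, data = data), class = "quietresult")
}

# Print method: show only the element x, keep data hidden.
print.quietresult <- function(x, ...) {
  print(unclass(x)["x"], ...)
  invisible(x)
}

# test2() can still use the data element.
test2 <- function(list) {
  bodri <- list$data
  bodri[1, 2] <- bodri[2, 2] + 1
  bodri
}

matrice <- matrix(1:10, nrow = 5, ncol = 2)
uno <- test(matrice)
due <- test2(uno)
printed <- capture.output(print(uno))
printed
```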


-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
A. Because it breaks the logical sequence of discussion
Q. Why is top posting bad?



[R] read.spss, locale and encodings

2009-04-08 Thread Hans Ekbrand

I must be missing something obvious here:

According to the help page for read.spss, the reencode option is only
active when R is run under a UTF-8 locale.

read.spss can only import the SPSS file when run under a iso88591(5)
locale, under a UTF-8 locale I get:

Error in read.spss("wo.sav") : error reading system-file header
In addition: Warning message:
In read.spss("wo.sav") :
  wo.sav: position 143: Variable name begins with invalid character

This is under Debian GNU/Linux, the stable release.

foreign is version 8.27

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
GPG Fingerprint: 1408 C8D5 1E7D 4C9C C27E 014F 7C2C 872A 7050 614E



Re: [R] read.spss, locale and encodings

2009-04-08 Thread Hans Ekbrand

On Wed, Apr 08, 2009 at 03:03:06PM +0200, Peter Dalgaard wrote:
 Hans Ekbrand wrote:
 I must be missing something obvious here:

 According to the help page for read.spss, the reencode option is only
 active when R is run under a UTF-8 locale.

 Not in my version:

 reencode: logical: should character strings be re-encoded to the
   current locale.  The default, 'NA', means to do so in a UTF-8
   locale, only.  Alternatively character, specifying an
   encoding to assume.

OK, thanks for that correction, but the problem isn't solved, since
read.spss fails, see below. When read.spss succeeds, the option is
not useful, since then the current locale is iso88591(5).

 So, does it help with reencode="Latin1"? Presumably this comes from  
 assuming UTF-8 when it isn't.

> Sys.getlocale()
[1] "LC_CTYPE=sv_SE.UTF-8;LC_NUMERIC=C;LC_TIME=sv_SE.UTF-8;LC_COLLATE=sv_SE.UTF-8;LC_MONETARY=sv_SE.UTF-8;LC_MESSAGES=sv_SE.utf8;LC_PAPER=sv_SE.utf8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=sv_SE.utf8;LC_IDENTIFICATION=C"
> test <- read.spss("wo.sav", to.data.frame=TRUE, reencode="Latin1")
Error in read.spss("wo.sav", to.data.frame = TRUE, reencode = "Latin1") : 
  error reading system-file header
In addition: Warning message:
In read.spss("wo.sav", to.data.frame = TRUE, reencode = "Latin1") :
  wo.sav: position 143: Variable name begins with invalid character

Using another version of the dataset, where I have successfully
encoded the names to UTF-8, here is the problematic variable name:

> names(Workorientation.2005.Swe)[143]
[1] "KÖN1"

 8.34 is used in the current prerelease. AFAIR, some issues with
 encodings were fixed recently.

Someone running foreign 8.34 that is willing to test my SPSS-file?

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
Q. What is that strange attachment in this mail?
A. My digital signature, see www.gnupg.org for info on how you could
 use it to ensure that this mail is from me and has not been
 altered on the way to you.



Re: [R] read.spss, locale and encodings

2009-04-08 Thread Hans Ekbrand

On Wed, Apr 08, 2009 at 04:17:51PM +0200, Peter Dalgaard wrote:
 Hans Ekbrand wrote:
 Someone running foreign 8.34 that is willing to test my SPSS-file?

 Someone with an SPSS file problem willing to help test the prereleases? :-)

http://sociologi.cjb.net/temp/test.sav

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
GPG Fingerprint: 1408 C8D5 1E7D 4C9C C27E 014F 7C2C 872A 7050 614E



Re: [R] read.spss, locale and encodings

2009-04-08 Thread Hans Ekbrand

On Wed, Apr 08, 2009 at 07:12:23PM +0200, Peter Dalgaard wrote:
 Apparently, you can work around it like this

 lc <- Sys.setlocale("LC_CTYPE")
 Sys.setlocale("LC_CTYPE", "da_DK")
 x <- read.spss("~/Desktop/downloads/test.sav", reencode = "latin1")
 Sys.setlocale("LC_CTYPE", lc)

 -- which doesn't strike me as particularly logical, but whatever works

THANKS a lot Peter! This works perfectly! I had been struggling with
this problem way too long...
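
For repeated use, the workaround could be wrapped so that the locale is always restored, even if the import fails. This is only a sketch; with_ctype() is a hypothetical helper name, and the read.spss() call in the comment assumes the file and locale from the quoted message.

```r
# Hypothetical helper: evaluate 'code' under a temporary LC_CTYPE locale.
with_ctype <- function(locale, code) {
  old <- Sys.getlocale("LC_CTYPE")
  on.exit(Sys.setlocale("LC_CTYPE", old), add = TRUE)  # restore on exit
  Sys.setlocale("LC_CTYPE", locale)
  force(code)  # 'code' is evaluated only now, under the new locale
}

# Demonstration with the always-available "C" locale.
before <- Sys.getlocale("LC_CTYPE")
result <- with_ctype("C", 1 + 1)
after <- Sys.getlocale("LC_CTYPE")

# Intended use (not run here):
# x <- with_ctype("da_DK", read.spss("test.sav", reencode = "latin1"))
```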

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
GnuPG key: 1024D/7050614E
Fingerprint: 1408 C8D5 1E7D 4C9C C27E 014F 7C2C 872A 7050 614E
Learn about secure email at http://www.gnupg.org



[R] Converting a whole dataframe (including attributes) from latin1 to UTF-8

2009-04-07 Thread Hans Ekbrand

Hi list!

Short version: How do I convert a whole data.frame from latin1
encoding to utf8?

I get SPSS files with latin1 encoding. My OS is GNU/Linux and the
locale sv_SE.utf8, and I normally interface R with Emacs/ESS. I have
used the following hack to convert a data.frame in latin1 to utf8:

> Sys.setlocale(category = "LC_ALL", locale = "sv_SE.iso88591")
> foo <- read.spss("foo.sav", to.data.frame=TRUE)
> write.table(foo, "foo.data")
$ recode lat1..utf8 foo.data
> Sys.setlocale(category = "LC_ALL", locale = "sv_SE.utf8")
> foo <- read.table("foo.data")

I have now found two problems with this approach: 

a) variable.labels is dropped
b) the order of unordered factors is changed

I had just worked out a hack for a) when I realised b). b) is a
problem when the factors really are ordered, but not recognized as such
by read.spss (and/or not defined as such in SPSS, but since SPSS
respects the numeric values of the factors anyway, users don't need
to).

Rather than hack around b) too, I wonder if anyone on the list knows
how to convert a whole data.frame from latin1 encoding to utf8?
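
One possible in-memory approach, sketched under the assumption that all strings in the data frame really are latin1: apply iconv() to the column names, character columns, factor levels and the variable.labels attribute. The helper to_utf8() is a hypothetical name, not an existing function. Because only the level labels are rewritten, the order of the levels is untouched.

```r
# Hypothetical helper: re-encode every string in a data frame to UTF-8.
to_utf8 <- function(df, from = "latin1") {
  conv <- function(s) iconv(s, from = from, to = "UTF-8")
  for (j in seq_along(df)) {
    if (is.factor(df[[j]])) {
      levels(df[[j]]) <- conv(levels(df[[j]]))  # level order is kept
    } else if (is.character(df[[j]])) {
      df[[j]] <- conv(df[[j]])
    }
  }
  names(df) <- conv(names(df))
  vl <- attr(df, "variable.labels")
  if (!is.null(vl)) {
    vl[] <- conv(vl)
    if (!is.null(names(vl))) names(vl) <- conv(names(vl))
    attr(df, "variable.labels") <- vl
  }
  df
}

# Small made-up example: "KÖN" stored as latin1 bytes.
latin1.label <- "K\xd6N"
Encoding(latin1.label) <- "latin1"
df <- data.frame(v = factor(c("b", latin1.label, "b"),
                            levels = c("b", latin1.label)))
out <- to_utf8(df)
Encoding(levels(out$v))
```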

TIA

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
A. Because it breaks the logical sequence of discussion
Q. Why is top posting bad?



Re: [R] PCA and categorical data

2009-03-06 Thread Hans Ekbrand

On Fri, Mar 06, 2009 at 09:46:17AM -, Ted Harding wrote:
 On 06-Mar-09 09:25:26, Prof Brian Ripley wrote:
  You might want to look into correspondence analysis, which has several 
  variants of PCA designed for categorical data.
 
 In particular, have a look at the results of
 
   RSiteSearch("correspondence")

I can recommend the packages ca and FactoMineR

http://cran.r-project.org/web/packages/ca/index.html
http://cran.r-project.org/web/packages/FactoMineR/index.html

http://www.jstatsoft.org/v20/i03
http://www.jstatsoft.org/v25/i01

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
GPG Fingerprint: 1408 C8D5 1E7D 4C9C C27E 014F 7C2C 872A 7050 614E



[R] frequency table for multiple variables

2009-02-17 Thread Hans Ekbrand

Hi r-help!

Consider the following data-frame:

  var1 var2 var3
1    3    1    4
2    2    2    3
3    2    2    3
4    4    4   NA
5    4    3    5
6    2    2    3
7    3    4    3

How can I get R to convert this into the following?

Value 1  2  3  4  5 
var1  0  3  2  2  0
var2  1  3  1  2  0 
var3  0  0  4  1  1

TIA,

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
Q. What is that strange attachment in this mail?
A. My digital signature, see www.gnupg.org for info on how you could
   use it to ensure that this mail is from me and has not been
   altered on the way to you.



Re: [R] frequency table for multiple variables

2009-02-17 Thread Hans Ekbrand

On Tue, Feb 17, 2009 at 10:00:40AM -0600, Marc Schwartz wrote:
 on 02/17/2009 09:06 AM Hans Ekbrand wrote:
  Hi r-help!
  
  Consider the following data-frame:
  
    var1 var2 var3
  1    3    1    4
  2    2    2    3
  3    2    2    3
  4    4    4   NA
  5    4    3    5
  6    2    2    3
  7    3    4    3
  
  How can I get R to convert this into the following?
  
  Value 1  2  3  4  5 
  var1  0  3  2  2  0
  var2  1  3  1  2  0 
  var3  0  0  4  1  1
 
 
  > t(sapply(DF, function(x) table(factor(x, levels = 1:5))))
       1 2 3 4 5
  var1 0 3 2 2 0
  var2 1 3 1 2 0
  var3 0 0 4 1 1
 
 
 The key is to turn each column into a factor with explicitly defined
 common levels for tabulation. This enables the table result to have a
 consistent format across each column, allowing for a matrix to be
 created, rather than a list.

Thanks a lot, Marc. Neat and efficient, just what I wanted.

BTW, before I saw that you actually included code, I tried on my own,
and wrote this:

my.count <- function(data.frame, levels) {
  result.df <- data.frame(matrix(nrow=length(data.frame),ncol=levels))
  for (i in 1:length(data.frame)) {
    result.df[i,] <- table(factor(data.frame[[i]], levels = c(1:levels)))
  }
  result.df
}

which produces the same result. I take this to be an instructive
example of unnecessary use of for-loops in R.
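
For completeness, a self-contained version of Marc's one-liner, with the data frame from the question typed in as DF:

```r
# The example data from the question.
DF <- data.frame(var1 = c(3, 2, 2, 4, 4, 2, 3),
                 var2 = c(1, 2, 2, 4, 3, 2, 4),
                 var3 = c(4, 3, 3, NA, 5, 3, 3))

# Tabulate each column against the common levels 1:5, one row per variable.
counts <- t(sapply(DF, function(x) table(factor(x, levels = 1:5))))
counts
```

The NA in var3 is simply not counted, since table() drops missing values by default.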

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
Q. What is that strange attachment in this mail?
A. My digital signature, see www.gnupg.org for info on how you could
   use it to ensure that this mail is from me and has not been
   altered on the way to you.



Re: [R] Pros and Cons of R

2008-05-22 Thread Hans Ekbrand

On Thu, May 22, 2008 at 02:07:01PM -0400, R P Herrold wrote:
 On Thu, 22 May 2008, Monica Pisica wrote:

[...]

 When a new R version is in place you 
 cannot up-grade your old R version, you have to do a new 
 installation and re-load all the packages you used to have 
 and delete / un-install the old version
 
 ummm -- this is of course a function of the package manager 
 and operating system being used, and not of R intrinsicly; 
 under an RPM package manager, this issue is not present

Nor under .deb-based OSes such as Ubuntu and Debian.

-- 
Hans Ekbrand (http://sociologi.cjb.net) [EMAIL PROTECTED]
Q. What is that strange attachment in this mail?
A. My digital signature, see www.gnupg.org for info on how you could
   use it to ensure that this mail is from me and has not been
   altered on the way to you.



Re: [R] howto import .xls and .ods

2008-05-11 Thread Hans Ekbrand

On Fri, May 02, 2008 at 07:35:37AM +0100, Prof Brian Ripley wrote:
 There is a *manual* on R Data Import/Export, not just an FAQ.
 
 This is the first request I have seen for .ods (whatever that is -- 

The most well-known application that uses this file format is the Calc
(Spreadsheet) part of the Open Office Suite.

Pasted from http://en.wikipedia.org/wiki/OpenDocument

  OpenDocument Spreadsheet
    File extension:       .ods
    Internet media type:  application/vnd.oasis.opendocument.spreadsheet
    Developed by:         Sun Microsystems, OASIS
    Type of format:       Spreadsheet
    Extended from:        XML

-- 
Hans Ekbrand (http://sociologi.cjb.net) [EMAIL PROTECTED]
GPG Fingerprint: 1408 C8D5 1E7D 4C9C C27E 014F 7C2C 872A 7050 614E



Re: [R] Get inside a function the name of a variable called as argument?

2008-04-29 Thread Hans Ekbrand

On Tue, Apr 29, 2008 at 03:53:02PM +0200, Julien Roux wrote:
 Hi list,
 I created a function to plot my data:
 plot_function(vector)
 I want to write the name of the argument vector in the legend/title of 
 the plot.
 For example if I call:
  plot_function(my_vector)
 I want my_vector to be written in the legend or title and so retrieve 
 the name of this object as a string.
 
 Is it possible to achieve this?

While it might be possible, I think it would be better to use an extra
argument for this:

plot_function(my_vector, title = my_title)

Functions should be general, and relying on the name of the variable
makes your function less general. What if, in the future, you want to
use plot_function with an anonymous vector created dynamically, e.g. by
combining two other vectors:

plot_function(c(foo, bar))
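
Both wishes can be combined, as in this sketch: the title defaults to the deparsed argument expression but can be overridden. The draw argument is an addition made here so that the example can run without opening a graphics device.

```r
# Sketch: use the argument's expression as default title, allow an override.
plot_function <- function(x, title = deparse(substitute(x)), draw = TRUE) {
  if (draw) plot(x, main = title)
  invisible(title)
}

my_vector <- c(3, 1, 4, 1, 5)
t1 <- plot_function(my_vector, draw = FALSE)                  # "my_vector"
t2 <- plot_function(c(1, 2), title = "custom", draw = FALSE)  # "custom"
t3 <- plot_function(c(my_vector, 9), draw = FALSE)            # the whole expression
```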

--
Hans Ekbrand (http://sociologi.cjb.net) [EMAIL PROTECTED]
GPG Fingerprint: 1408 C8D5 1E7D 4C9C C27E 014F 7C2C 872A 7050 614E



Re: [R] problems with rgl in Ubuntu 'gutsy'

2008-03-18 Thread Hans Ekbrand

On Mon, Mar 17, 2008 at 09:11:05PM -, Foadi, J (James) wrote:
 On Mon, Mar 17, 2008 at 05:43:58PM -, Foadi, J (James) wrote:

[...]

  | all rgl windows device do not show the outer frame, so I cannot move or 
  resize the window.
  | In addition to that, the image is like frozen, it can't be either 
  rotated or shifted at all.
  | 
  | In short, rgl seems not to behave according to expectations. No error 
  messages appear at any stage.
  | 
  | Has anyone a clue on what's going wrong here?
  
  Did you try the pre-built version, ie 'sudo apt-get install r-cran-rgl' ?
 
 Does X do gl-rendering OK with other applications? (what does
 
 $ glxgears -info
 
 I've tried what you suggest. This is the output:
 
 [EMAIL PROTECTED]:~/workR$ glxgears -info
 GL_RENDERER   = Mesa DRI Intel(R) 945GM 20061017 x86/MMX/SSE2
 GL_VERSION= 1.3 Mesa 7.0.1

[...]

 3976 frames in 5.0 seconds = 795.106 FPS
 4017 frames in 5.0 seconds = 803.364 FPS
 3988 frames in 5.0 seconds = 797.593 FPS
 
 
 And a windows with moving gears appears, and it has a frame. But it doesn't 
 click and drag easily.

On my system that window can be moved and even resized without
problems (keeping a high rendering rate). Sorry, but my test didn't
give any conclusive new facts about the cause of the problem.

-- 
Hans Ekbrand (http://sociologi.cjb.net) [EMAIL PROTECTED]
Q. What is that strange attachment in this mail?
A. My digital signature, see www.gnupg.org for info on how you could
   use it to ensure that this mail is from me and has not been
   altered on the way to you.



Re: [R] grouped colSums without for loops?

2008-03-18 Thread Hans Ekbrand

On Tue, Mar 18, 2008 at 05:57:59AM -0500, jim holtman wrote:
 Is this what you want?
 
  lapply(split(d, d$foo), function(x) colSums(x[,-1]))

Yes! Thank you! the *apply functions seem very powerful, thanks again
for giving me a hint on how to use them.

-- 
Hans Ekbrand (http://sociologi.cjb.net) [EMAIL PROTECTED]
Q. What is that strange attachment in this mail?
A. My digital signature, see www.gnupg.org for info on how you could
   use it to ensure that this mail is from me and has not been
   altered on the way to you.



Re: [R] grouped colSums without for loops?

2008-03-18 Thread Hans Ekbrand

On Tue, Mar 18, 2008 at 12:05:23PM +0100, Albert Greinoecker wrote:
 try:
 aggregate(d[,2:3], by=list(d$foo), FUN=sum)

Great! Now I can get a data.frame as well as a list, thanks!
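
[Editor's sketch] The aggregate() call above can be tried on a small
made-up data frame. The names d, foo, a and b are assumptions, not
the poster's data; naming the element of the by list (foo = ...) is
an optional addition that gives the grouping column a readable name.

```r
# Hypothetical data: 'foo' is the grouping column, 'a' and 'b' are numeric.
d <- data.frame(foo = c("x", "y", "x", "y"),
                a   = c(1, 2, 3, 4),
                b   = c(10, 20, 30, 40))

# Sum columns 2 and 3 within each value of foo; the result is a data frame
# with one row per group.
agg <- aggregate(d[, 2:3], by = list(foo = d$foo), FUN = sum)
```

Compared with the split()/lapply() approach, the result stays a data
frame, so it can be merged or plotted directly.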

-- 
Hans Ekbrand (http://sociologi.cjb.net) [EMAIL PROTECTED]
Signature generated by Signify v1.14.  For this and more, visit 
http://www.debian.org/



Re: [R] hclust graphics - plotting many points

2008-03-11 Thread Hans Ekbrand

On Mon, Mar 10, 2008 at 10:19:01AM -, michael watson (IAH-C) wrote:
 I'd recommend outputting either as pdf or as a windows metafile 

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 On Behalf Of Karin Lagesen
 Sent: 10 March 2008 09:54
 To: r-help@r-project.org
 Subject: [R] hclust graphics - plotting many points

 Hello.

 I have a distance matrix with lots of distances that I use hclust to
 organise. I then plot the results using the plot method of hclust.

 However, the plot itself takes around 20 mins to make due to there
 being ~700 things in the matrix that I have distances for. I thus
 would like to dump this to some graphics format which will let me
 examine this further.

 I tried dumping it to postscript:

 postscript("myfile.ps", height = 50, pointsize=5)
 plot(my_hc_object)
 dev.off()

 What happens is that since most of the items in the matrix have a
 distance of zero to something everything just becomes a black smear on
 the bottom where I cannot distinguish anything from anything else. I
 thus tried increasing the height and/or width and also downscaling the
 pointsize. None of these improved anything much. 

 So, now I am wondering if any of you have any tips for how I can get
 something like I get in the x11() window which I can also store and
 potentially show other people.

Don't you have the problem of too small distances in the X11() window?

I've had similar problems with a graph in graphviz, where I found it
easier to get what I wanted using a png-driver instead of postscript
driver.

Png doesn't scale well, but it might be worth a try.

png(file="myfile.png", width=3000, height=2250)
plot(my_hc_object)
dev.off()
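
[Editor's sketch] The same open-device, plot, dev.off() pattern works
with a vector device such as pdf(), which was also recommended in
this thread and does not have the scaling limitation of png. The
data below is random and only stands in for the real distance matrix;
the page size, cex and hang values are guesses that would need tuning
for ~700 leaves.

```r
set.seed(1)
m  <- matrix(rnorm(200), nrow = 50)   # 50 made-up observations, 4 variables
hc <- hclust(dist(m))                 # hierarchical clustering on Euclidean distances

out <- file.path(tempdir(), "dendrogram.pdf")

# pdf() sizes are in inches; a wide page leaves room for many leaf labels.
pdf(file = out, width = 40, height = 15)
plot(hc, cex = 0.6, hang = -1)        # small labels, all leaves drawn on one baseline
dev.off()
```

Because the output is vector based, the resulting file can be zoomed
in a PDF viewer without the labels blurring.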

-- 
Hans Ekbrand (http://sociologi.cjb.net) [EMAIL PROTECTED]
GPG Fingerprint: 1408 C8D5 1E7D 4C9C C27E 014F 7C2C 872A 7050 614E


Re: [R] Plot using colors

2008-03-03 Thread Hans Ekbrand

On Mon, Mar 03, 2008 at 02:03:07AM -0800, mysimbaa wrote:
 
 Dear R users,
 I have a problem since I try to plot my datas with different colors.
 
 plot(tvar, var, xlab="zeit [s]",ylab="Variation [%]",  col = ifelse(var <=
 varstability, 'green','red'))
 this works well!
 
 But since I add a type=l to my plot, it will color all the plot with
 green!!!

Please include this too.
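
[Editor's sketch] Some background on the symptom described above: as
far as I know, with type="l" base graphics draws the whole line in
the first colour of the col vector, which would explain the all-green
plot. One way around it is to draw each piece separately with
segments(), which accepts one colour per segment. The data, the
threshold value and the "<=" comparison below are assumptions made
for illustration.

```r
tvar <- 1:10
var  <- c(1, 3, 2, 5, 7, 6, 4, 2, 1, 3)   # made-up measurements
varstability <- 4                          # assumed stability threshold

n <- length(var)

# One colour per segment, chosen from the value at the segment's start point.
seg_col <- ifelse(var[-n] <= varstability, "green", "red")

pdf(NULL)  # null device so the sketch runs unattended; omit when working interactively
plot(tvar, var, type = "n", xlab = "zeit [s]", ylab = "Variation [%]")
segments(tvar[-n], var[-n], tvar[-1], var[-1], col = seg_col)
dev.off()
```

Colouring by the start point of each segment is an arbitrary choice;
colouring by the end point or the segment mean would work the same
way.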

-- 
Hans Ekbrand (http://sociologi.cjb.net) [EMAIL PROTECTED]
GPG Fingerprint: 1408 C8D5 1E7D 4C9C C27E 014F 7C2C 872A 7050 614E



Re: [R] simple usage of for

2008-02-19 Thread Hans Ekbrand

On Tue, Feb 19, 2008 at 04:52:19PM +0200, K. Elo wrote:
 Hi,
 
 Hans Ekbrand wrote (19.2.2008):
  I tried the following small code snippet which I copied from the
 
  Introduction to R:
   for (i in 2:length(meriter)) { table(meriter[[1]], meriter[[i]]) }
 
 Try:
 for (i in 2:length(meriter)) { print(table(meriter[[1]], 
 meriter[[i]])) }

It works, thanks!
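
[Editor's sketch] An alternative to printing inside the loop is to
collect the tables in a list with lapply(), so they can be inspected
or reused later. The small data frame below is a made-up stand-in for
meriter; its column names are hypothetical.

```r
# Hypothetical stand-in for 'meriter': first column is the grouping variable.
meriter <- data.frame(dept = c("A", "A", "B", "B"),
                      q1   = c("ja", "nej", "ja", "ja"),
                      q2   = c("nej", "nej", "ja", "nej"))

# Cross-tabulate the first column against every other column.
# The result is a named list with one table per remaining column.
tabs <- lapply(meriter[-1], function(col) table(meriter[[1]], col))

tabs$q1   # typing a name at top level auto-prints that table
```

The list keeps the column names, so tabs[["q2"]] retrieves a specific
crosstabulation without rerunning the loop.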

--
Hans Ekbrand



Re: [R] simple usage of for

2008-02-19 Thread Hans Ekbrand

On Tue, Feb 19, 2008 at 10:04:13AM -0500, Duncan Murdoch wrote:
 On 2/19/2008 9:24 AM, Hans Ekbrand wrote:

[...]

  I tried the following small code snippet which I copied from the
  Introduction to R:
  
  for (i in 2:length(meriter)) { table(meriter[[1]], meriter[[i]]) }
 
 Where did you find that?  I don't see anything like it.  (If there is 
 something like that, it should be fixed.)

 If you are referring to this snippet:
 
  for (i in 1:length(yc)){
  plot(xc [[i ]],yc [[i ]]);
  abline(lsfit(xc [[i ]],yc [[i ]]))
 }

Yes, I copied the for construct from that snippet.

 then it has the important difference that plot() and abline() both have 
 side effects (they do plotting), whereas table() doesn't.

I see. Thanks for the explanation.
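
[Editor's sketch] The difference explained above can be checked
directly: at top level R auto-prints the value of an expression, but
a for loop returns its value invisibly, so a bare table() call inside
it shows nothing. The vector x is made up; capture.output() is used
here only to record what would have appeared on the console.

```r
x <- c("a", "b", "a")

# Bare call inside the loop: the table is computed and then discarded.
silent <- capture.output(for (i in 1:2) table(x))

# Explicit print(): printing is a side effect, so it happens on every pass.
printed <- capture.output(for (i in 1:2) print(table(x)))

length(silent)    # no lines were produced
length(printed)   # some lines were produced
```

plot() and abline() behave like the print() case because drawing is
their side effect, which is why the snippet from the manual works
unchanged.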

-- 
Hans Ekbrand



[R] simple usage of for

2008-02-19 Thread Hans Ekbrand

Hi list

I have a data frame I would like to loop over. To begin with I would
like crosstabulations using the first variable in the data frame,
which is called meriter.

 table(meriter[[1]], meriter[[3]])
  
                                                         ja nej
  Annan                                                0  2   1
  Avdelningen för teknik- och vetenskapsstudier        0  5   1
  CEFOS                                                0  6   3
  Förvaltningshögskolan                                0 13   6
  Institutionen för globala studier                    0 20  12
  Institutionen för journalistik och masskommunikation 0  5  17
  Institutionen för socialt arbete                     1 19  35
  Psykologiska institutionen                           0 24  21
  Sociologiska institutionen                           0 16  12
  Statsvetenskapliga institutionen                     0 19  12


I tried the following small code snippet which I copied from the
Introduction to R:

 for (i in 2:length(meriter)) { table(meriter[[1]], meriter[[i]]) }
 

And there is no output at all, just a new prompt.

I added a print statement just to check the loop construct, and it
seems to work.

 for (i in 2:length(meriter)) { print(i); table(meriter[[1]], meriter[[i]]) }
[1] 2
[1] 3
[1] 4

But I get no tables :-(

What do I do wrong?

-- 
Hans Ekbrand (http://sociologi.cjb.net) [EMAIL PROTECTED]
GPG Fingerprint: 1408 C8D5 1E7D 4C9C C27E 014F 7C2C 872A 7050 614E



[R] Solution: ploting a comparison of two scores, including the labels in the plot

2007-11-06 Thread Hans Ekbrand

Thanks to Greg Snow and John Kane I now have a working function that
does what I wanted, that is compares two scores in a plot.

Here is the function:

## compare.ratings: plots two lists corresponding to two different
## ratings. For each element, a line connects the position of that
## element in the two lists.

compare.ratings <- function(data.frame=df, vector1="rating1", vector2="rating2",
                            vector3="labels") {
  ## offset added per rank so that near-equal ratings get readable labels
  threshold <- 0.1

  ## sort by the second rating, then shift each row up by threshold * (rank-1)
  data.frame <- data.frame[sort.list(data.frame[[vector2]]),]

  for(i in 2:length(data.frame[,vector2])) {
    data.frame[i,vector2] <- data.frame[i,vector2] + (threshold * (i-1))
  }

  ## same for the first rating (the rows end up ordered by vector1)
  data.frame <- data.frame[sort.list(data.frame[[vector1]]),]

  for(i in 1:length(data.frame[,vector1])) {
    data.frame[i,vector1] <- data.frame[i,vector1] + (threshold * (i-1))
  }

  ## interleave the two ratings with NA so each pair is drawn as its own line
  tmp <- c(rbind( data.frame[[vector1]], data.frame[[vector2]], NA ))
  tmp2 <- rep( c(1,2,NA), nrow(data.frame) )

  plot(tmp2, tmp, type='b', xlim=c(0,3), xlab='', ylab='', lwd=0.5)
  text(0.9, data.frame[[vector1]], data.frame[[vector3]], adj=1, cex=0.75)
  text(2.1, data.frame[[vector2]], data.frame[[vector3]], adj=0, cex=0.75)

}
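
[Editor's sketch] The two for loops in the function above add a
growing offset to the sorted ratings. The same step can be written
without a loop; the helper name spread and the sample values are
hypothetical, and this is an illustration of the idea rather than a
replacement for the posted function.

```r
# Sort the values, then add (rank - 1) * threshold to each, so that
# neighbouring values end up at least 'threshold' apart.
spread <- function(x, threshold = 0.1) {
  x <- sort(x)
  x + threshold * (seq_along(x) - 1)
}

spread(c(8.15, 8.13, 8.14))
```

Note that the offset changes the plotted positions, so after
spreading the y axis no longer shows the true ratings, only their
order and rough spacing.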

-- 
Hans Ekbrand (http://sociologi.cjb.net) [EMAIL PROTECTED]
A. Because it breaks the logical sequence of discussion
Q. Why is top posting bad?



Re: [R] ploting a comparison of two scores, including the labels in the plot

2007-11-05 Thread Hans Ekbrand

On Mon, Nov 05, 2007 at 10:51:08AM -0700, Greg Snow wrote:
 Does the following do what you want (or at least start you in the correct 
 direction)?
 
 mydata <- data.frame( job=c("Ambassadör","Läkare","Domare",
 "Professor","Advokat","Pilot","Verkställande direktör","Forskare",
 "Civilingenjör","Statsråd"), SAMHM= c(8.32, 8.15, 8.14, 8.13, 7.95,
  7.81, 7.78, 7.60, 7.47, 7.41),INDM= c( 7.2771, 8.1029, 7.5965,
  7.5618, 7.1876, 7.4380, 6.8361, 7.6630, 6.8802, 6.3916))
 
  
 tmp <- c(rbind( mydata$SAMHM, mydata$INDM, NA ))
 tmp2 <- rep( c(1,2,NA), nrow(mydata) )
 
 plot(tmp2, tmp, type='b', xlim=c(0,3), xlab='', ylab='rating')
 text(0.9, mydata$SAMHM, mydata$job, adj=1, cex=0.75)
 text(2.1, mydata$INDM, mydata$job, adj=0, cex=0.75)

Yes, definitely! Thanks Greg, now I'll just increase the smallest
differences to a minimum so the labels become readable.
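
[Editor's sketch] The c(rbind(..., NA)) construction in the quoted
code is what lets a single plot() call draw one separate line per
occupation: the NA entries break the line between pairs. A tiny
example with made-up vectors a and b shows the resulting layout.

```r
a <- c(1, 2, 3)      # made-up left-hand values
b <- c(10, 20, 30)   # made-up right-hand values

# rbind() stacks a, b and a row of NA; c() then reads the matrix column by
# column, giving a1, b1, NA, a2, b2, NA, ...
y <- c(rbind(a, b, NA))

# Matching x positions: 1 for the left value, 2 for the right, NA for the gap.
x <- rep(c(1, 2, NA), length(a))
```

Plotting x against y with type='b' or type='l' then draws each
(1, a[i]) to (2, b[i]) pair as its own segment, because lines are not
drawn across NA values.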

-- 
Hans Ekbrand (http://sociologi.cjb.net) [EMAIL PROTECTED]
A. Because it breaks the logical sequence of discussion
Q. Why is top posting bad?



[R] ploting a comparison of two scores, including the labels in the plot

2007-11-01 Thread Hans Ekbrand

Hello r-help!

I have data with two kinds of ratings of the status of 100 occupations.
The first kind of rating is of the perceived objective status that these
occupations have in society at large, and the second kind of rating is
of the status that the respondents think that these occupations *should*
have.

The ratings were originally integer values in the range 1-9, but in the
current data, I use their mean values.

Here is a printout for the first 10 occupations (the occupation
names are in Swedish):

 data.frame(myobj[1:10, c("YRKE", "SAMHM", "INDM")], row.names = "YRKE")
                         SAMHM   INDM
Ambassadör                8.32 7.2771
Läkare (doctor)           8.15 8.1029
Domare (judge)            8.14 7.5965
Professor                 8.13 7.5618
Advokat (lawyer)          7.95 7.1876
Pilot                     7.81 7.4380
Verkställande direktör    7.78 6.8361
Forskare (scientist)      7.60 7.6630
Civilingenjör (engineer)  7.47 6.8802
Statsråd (minister)       7.41 6.3916
 

I would like to make a plot with two lists. The first list should list
the occupations ordered by SAMHM (as in the printout above) and the
values of SAMHM. The line spacing in this list should be increased by
the difference in SAMHM between the occupations (i.e. between
Ambassadör and Läkare (eng. doctor) there should be a larger
line spacing than between Läkare and Domare (eng. judge)).

The second list should be like the first, but based on INDM instead
of SAMHM.

These two lists should ideally be plotted side by side with lines
connecting each occupation.

Here is an ascii-art illustration of what I intend (excluding the
connecting lines, which are hard to draw with ascii :-)

--
Ambassadör

Läkare (doctor)
Domare (judge)
Professor                   Läkare

Advokat (lawyer)

Pilot
Verkställande direktör

Forskare (scientist)        Forskare
                            Domare
Civilingenjör (engineer)    Professor
Statsråd (minister)         Pilot
                            Ambassadör
                            Advokat


                            Civilingenjör
                            Verkställande direktör


                            Statsråd
--

If printing strings (labels) with different line spacing turns out to
be problematic, another solution would be to print a list of the
occupations ordered by SAMHM, points of SAMHM values (with
Y=SAMHM), points of INDM (with Y=INDM) and a list of occupations
ordered by INDM, with a line for each occupation connecting the
labels with the points and the two points that represent the
occupation.

Since there are a lot of functions for plotting and I am new to R, I
would like advice on which packages/functions should be used to
get what I want (if what I want is possible to achieve with R; if it
is not, then please let me know).

Sample code is, of course, also very much appreciated.

kind regards,

-- 
Hans Ekbrand (http://sociologi.cjb.net) [EMAIL PROTECTED]
A. Because it breaks the logical sequence of discussion
Q. Why is top posting bad?



Re: [R] ploting a comparison of two scores, including the labels in the plot

2007-11-01 Thread Hans Ekbrand

On Thu, Nov 01, 2007 at 02:52:08PM -0400, John Kane wrote:
 I gave it a try with conventional plot and it does not
 look easy to get a good result.

Thanks a lot, John Kane!

While I see what you mean, I think your solution does a good job and
provides a basis for me to work on. If someone would recommend another
plotting function (or package) to try, I would still be interested.

 x <- "YRKE SAMHM INDM
 Ambassadör 8.32 7.2771
 Läkare 8.15 8.1029
 Domare 8.14 7.5965
 Professor  8.13 7.5618
 Advokat 7.95 7.1876
 Pilot  7.81 7.4380
 Verkställande.direktör 7.78 6.8361
 Forskare   7.60 7.6630
 Civilingenjör  7.47 6.8802
 Statsråd   7.41 6.3916"
 
 status <- read.table(textConnection(x), header=TRUE)
 xx1 <- c(rep(1,10))
 xx2 <- c(rep(2,10))
 
 
 plot(xx1, status[,2], xaxt='s', yaxt='s',
 xlim=c(.5,2.5),
  ylim=c(min(status[,3]),max(status[,2])),
 type='p', xlab="", ylab="")
 points(xx2,status[,3])
 segments(xx1,status[,2],xx2,status[,3])
 text(xx1-.1,status[,2], labels=status[,1], cex=.6)
 text(xx2+.1, status[,3], labels=status[,1], cex=.6)

-- 
Hans Ekbrand (http://sociologi.cjb.net) [EMAIL PROTECTED]
Q. What is that strange attachment in this mail?
A. My digital signature, see www.gnupg.org for info on how you could
   use it to ensure that this mail is from me and has not been
   altered on the way to you.


