David, Duncan

Thanks for the swift response.

You guys hit the nail on the head. That's exactly what the problem was.

All the best
Steve
----- Original Message -----
From: "David Winsemius" <dwinsem...@comcast.net>
To: "Duncan Murdoch" <murd...@stats.uwo.ca>
Cc: "Steve Sidney" <sbsid...@mweb.co.za>; <r-help@r-project.org>
Sent: Monday, January 11, 2010 3:49 PM
Subject: Re: [R] Help with Order



On Jan 11, 2010, at 7:49 AM, Duncan Murdoch wrote:

On 11/01/2010 7:37 AM, Steve Sidney wrote:
Dear List
As a fairly new R programmer I seem to have run into a strange problem, probably due to my inexperience with R. After reading and merging successive files into a single data frame, I find that order() does not sort the data as expected. I have multiple references in each file, but each file refers to measurement data obtained at a different time.
Here's the code:
library(reshape)
# Enter file name to Read & Save data
FileName=readline("Enter File name:\n")
# Find first occurrence of file
for ( round1 in 1 : 6) {
ReadFile=paste(round1,"C_",FileName,"_Stats.csv", sep="")
if (file.exists(ReadFile))
break
}
x = data.frame(read.csv(ReadFile, header=TRUE),rnd=round1)
for ( round2 in (round1+1) : 6) {
#
ReadFile=paste(round2,"C_",FileName,"_Stats.csv", sep="")
if (file.exists(ReadFile)) {
y = data.frame(read.csv(ReadFile, header=TRUE),rnd = round2)
   if (round2 == (round1 +1))
   z=data.frame(merge(x,y,all=TRUE))
   z=data.frame(merge(y,z,all=TRUE))
}
}
ordered = order(z$lab_id)

Following Duncan's hypothesis, perhaps change this to:
ordered = order(as.character(z$lab_id))

results = z[ordered,]
res = data.frame(lab = results[,"lab_id"], bw = results[,"ZBW"], wi = results[,"ZWI"], pf_zbw = 0, pf_zwi = 0, r = results[,"rnd"])
#
# Establish no of samples recorded
nsmpls = length(res[,c("lab")])
# Evaluate Z_scores for Between Lab Results
for ( i in 1 : nsmpls) {
if (res[i,"bw"] > 3 | res[i,"bw"] < -3)
res[i,"pf_zbw"]=1
}
# Evaluate Z_scores for Within Lab Results
for ( i in 1 : nsmpls) {
if (res[i,"wi"] > 3 | res[i,"wi"] < -3)
res[i,"pf_zwi"]=1
}
dd = melt(res, id=c("lab","r"), "pf_zbw")
b = cast(dd, lab ~ r)
If anyone could see why the ordering only works for about 55 of 70 records, and could steer me in the right direction, I would be obliged.

I can't try out your code, but I'd guess it's due to conversion of strings to factors. Sorting factors will sort them by their numerical value, not by the strings.
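
A minimal sketch of that effect, under stated assumptions: the lab IDs below are made up, and rbind() stands in for the way the per-round files get combined (the merge() calls in the original code are not reproduced here). When two factors with different levels are combined, the new levels are appended rather than re-sorted, so ordering by the factor's integer codes need not match alphabetical order.

```r
# Two small data frames standing in for two of the per-round CSV files.
# The lab IDs are hypothetical; factor() is called explicitly so the
# example does not depend on the stringsAsFactors default.
a <- data.frame(lab_id = factor(c("L05", "L09")))
b <- data.frame(lab_id = factor(c("L01", "L07")))

# Combining them appends the new levels after the existing ones.
z <- rbind(a, b)
levels(z$lab_id)                  # "L05" "L09" "L01" "L07"

# order() on a factor uses the integer codes (level positions) ...
by_code <- order(z$lab_id)
# ... whereas order() on the character labels sorts the strings.
by_label <- order(as.character(z$lab_id))

as.character(z$lab_id[by_code])   # "L05" "L09" "L01" "L07"
as.character(z$lab_id[by_label])  # "L01" "L05" "L07" "L09"
```

Converting with as.character() before calling order() sidesteps the level order entirely.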

So the solution is to set stringsAsFactors=FALSE, either in each read.csv call, or globally with options(stringsAsFactors=FALSE).
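
As a rough illustration of the per-call option, the sketch below reads a tiny inline CSV (hypothetical values, supplied through the text argument instead of a file) with stringsAsFactors=FALSE, so lab_id stays a character column and order() sorts it as strings. Note that from R 4.0.0 onward FALSE is already the default, so passing the argument explicitly mainly matters on older versions.

```r
# Hypothetical stand-in for one "<round>C_<name>_Stats.csv" file.
csv_text <- "lab_id,ZBW\nL10,0.4\nL02,-3.1\nL07,1.2"

# Keep strings as character vectors instead of converting them to factors.
x <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)

is.character(x$lab_id)         # TRUE

# order() now sorts on the strings themselves.
sorted <- x[order(x$lab_id), ]
sorted$lab_id                  # "L02" "L07" "L10"
```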

Duncan Murdoch



David Winsemius, MD
Heritage Laboratories
West Hartford, CT



______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
