[R] SQL Changing Data Type

2011-06-24 Thread GL
Passing in two dates to a SQL statement (sqldf). The Date column is returned
as a factor. Tried setting it back to a Date via as.Date, but get the error:
character string is not in a standard unambiguous format. Any thoughts appreciated. 

Code/Results listed below:

> summary(df.possible.combos)
      Date                 Hour      
 Min.   :2011-03-01   Min.   : 0.00  
 1st Qu.:2011-03-23   1st Qu.: 5.75  
 Median :2011-04-14   Median :11.50  
 Mean   :2011-04-14   Mean   :11.50  
 3rd Qu.:2011-05-06   3rd Qu.:17.25  
 Max.   :2011-05-31   Max.   :23.00  
> summary(df.aggregate)
      Date                 Hour           x         
 Min.   :2011-03-01   16     : 82   Min.   : 1.000  
 1st Qu.:2011-03-22   17     : 82   1st Qu.: 1.000  
 Median :2011-04-13   18     : 82   Median : 2.000  
 Mean   :2011-04-14   19     : 79   Mean   : 4.195  
 3rd Qu.:2011-05-07   20     : 76   3rd Qu.: 7.000  
 Max.   :2011-05-31   7      : 75   Max.   :20.000  
                      (Other):377                   
> #merge raw data and all possible combinations
> df.final <- sqldf('select Date, Hour, x as RoomsInUse from df.aggregate
+ left join df.possible.combos using (Hour, Date)')
> summary(df.final)
      Date          Hour       RoomsInUse    
 15069.0: 16   16     : 82   Min.   : 1.000  
 15114.0: 16   17     : 82   1st Qu.: 1.000  
 15063.0: 15   18     : 82   Median : 2.000  
 15082.0: 15   19     : 79   Mean   : 4.195  
 15125.0: 15   20     : 76   3rd Qu.: 7.000  
 15044.0: 14   7      : 75   Max.   :20.000  
 (Other):762   (Other):377                   
> thedate <- as.Date(df.final$Date)
Error in charToDate(x) : 
  character string is not in a standard unambiguous format
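
The labels 15069.0, 15114.0 and so on in the Date column look like day counts since 1970-01-01 stored as factor labels, which would explain why as.Date cannot parse them as date strings. A minimal sketch of one way to recover the dates, assuming that interpretation holds (the factor `f` below is made-up illustration data, not the poster's data frame):

```r
# Hypothetical factor mimicking the returned column: day counts as labels.
f <- factor(c("15034.0", "15069.0", "15034.0"))

# Convert via character first; as.numeric(f) alone would return level codes.
days <- as.numeric(as.character(f))

# Interpret the numbers as days since 1970-01-01.
thedate <- as.Date(days, origin = "1970-01-01")
print(thedate)
```

Applied to the real data, the same two steps would be run on df.final$Date. Whether sqldf can be made to return the Date class directly depends on the sqldf version and platform, so this is only a workaround.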
 


--
View this message in context: 
http://r.789695.n4.nabble.com/SQL-Changing-Data-Type-tp3623508p3623508.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sqldf hanging on macintosh - works on windows

2010-11-02 Thread GL

Marc: Installing Simon's package worked perfectly. Thanks so much! 
-- 
View this message in context: 
http://r.789695.n4.nabble.com/sqldf-hanging-on-macintosh-works-on-windows-tp3022193p3023736.html


[R] class changed after execution with sqldf

2010-11-02 Thread GL

When I run sqldf to merge two datasets, it's changing the Date column (class
Date) to numeric values stored as a factor. Not sure why. Appreciate any
insight. Console output for two datasets and the merged dataset (via sqldf)
listed below.


> summary(df.aggregate)
      Date                 Hour            x        
 Min.   :2010-07-01   0      :  64   Min.   : 0.00  
 1st Qu.:2010-07-25   1      :  64   1st Qu.: 1.00  
 Median :2010-08-16   2      :  64   Median : 9.00  
 Mean   :2010-08-16   3      :  64   Mean   :11.77  
 3rd Qu.:2010-09-08   4      :  64   3rd Qu.:23.00  
 Max.   :2010-09-30   5      :  64   Max.   :32.00  
                      (Other):1152                  
> class(df.aggregate$Date)
[1] "Date"
> summary(df.possible.combos)
      Date                 Hour      
 Min.   :2010-07-01   Min.   : 0.00  
 1st Qu.:2010-07-25   1st Qu.: 5.75  
 Median :2010-08-16   Median :11.50  
 Mean   :2010-08-16   Mean   :11.50  
 3rd Qu.:2010-09-08   3rd Qu.:17.25  
 Max.   :2010-09-30   Max.   :23.00  
> class(df.possible.combos$Date)
[1] "Date"
> #merge raw data and all possible combinations
> df.final <- sqldf('select Date, Hour, x as RoomsInUse from df.possible.combos
+ left join df.aggregate using (Hour, Date)')
> summary(df.final)
      Date           Hour         RoomsInUse   
 14791.0:  24   Min.   : 0.00   Min.   : 0.00  
 14792.0:  24   1st Qu.: 5.75   1st Qu.: 1.00  
 14796.0:  24   Median :11.50   Median : 9.00  
 14797.0:  24   Mean   :11.50   Mean   :11.77  
 14798.0:  24   3rd Qu.:17.25   3rd Qu.:23.00  
 14799.0:  24   Max.   :23.00   Max.   :32.00  
 (Other):1392                                  
> class(df.final$Date)
[1] "factor"
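
For comparison, the same left join can be written with base R's merge(), which keeps the Date class of the key column. This is a sketch of a different technique, not a fix for sqldf itself; the two small data frames are invented stand-ins for df.possible.combos and df.aggregate, and Hour is assumed to be an integer in both:

```r
# Invented stand-ins for the poster's data frames.
combos <- data.frame(
  Date = rep(as.Date(c("2010-07-01", "2010-07-02")), each = 2),
  Hour = rep(0:1, times = 2)
)
agg <- data.frame(
  Date = as.Date(c("2010-07-01", "2010-07-02")),
  Hour = c(0L, 1L),
  x    = c(3, 5)
)

# all.x = TRUE keeps every row of combos, i.e. a left join.
final <- merge(combos, agg, by = c("Date", "Hour"), all.x = TRUE)
names(final)[names(final) == "x"] <- "RoomsInUse"

print(class(final$Date))
print(final)
```

Combinations with no matching row in agg get NA in RoomsInUse, as with the SQL left join.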

-- 
View this message in context: 
http://r.789695.n4.nabble.com/class-changed-after-execution-with-sqldf-tp3024592p3024592.html


Re: [R] class changed after execution with sqldf

2010-11-02 Thread GL

Forgot to mention. This works in the PC implementation of R. The results I'm
seeing here are in Mac OS X with X11 and tcl/tk installed. 
-- 
View this message in context: 
http://r.789695.n4.nabble.com/class-changed-after-execution-with-sqldf-tp3024592p3024602.html


[R] sqldf hanging on macintosh - works on windows

2010-11-01 Thread GL

Have a long script that runs fine on Windows (32 bit). When I try to run it
on two different Macs (64 bit), however, it hangs with identical behavior.

I start with:
library(sqldf)

This results in messages:
Loading required package: DBI
Loading required package: RSQLite
Loading required package: RSQLite.extfuns
Loading required package: gsubfn
Loading required package: proto
Loading required package: chron

I then read some data, etc.

I execute the following:

#merge raw data and all possible combinations
df.final <- sqldf('select Date, Hour, x as RoomsInUse from df.possible.combos
left join df.aggregate using (Hour, Date)')

I receive the messages:
Loading required package: tcltk
Loading Tcl/Tk interface ... 
+ 

Then I get into some kind of loop. Message at bottom ribbon says:

executing:
try(gsub('\\s+','',paste(capture.output(print(arg(summary))),collapse="")),silent=TRUE)

On the pc implementation it runs flawlessly, and quickly. 

Truly appreciate any ideas.

-- 
View this message in context: 
http://r.789695.n4.nabble.com/sqldf-hanging-on-macintosh-works-on-windows-tp3022193p3022193.html


Re: [R] sqldf hanging on macintosh - works on windows

2010-11-01 Thread GL

added library(RH2)

Still get message:

Loading required package: tcltk
Loading Tcl/Tk interface
+

directly after sqldf statement 

> df.final <- sqldf('select Date, Hour, x as RoomsInUse from df.possible.combos
+ left join df.aggregate using (Hour, Date)')

There is no progress spinner. If I hit enter I get a 

At that point, if I start to enter any command (just summary, for instance), I
get the progress spinner, the
try(gsub('\\s+','',paste(capture.output(print(arg(summary))),collapse="")),silent=TRUE)
message in the bottom ribbon, and the system apparently hangs. 
-- 
View this message in context: 
http://r.789695.n4.nabble.com/sqldf-hanging-on-macintosh-works-on-windows-tp3022193p3022233.html


Re: [R] sqldf hanging on macintosh - works on windows

2010-11-01 Thread GL


 
> library(sqldf)
Loading required package: DBI
Loading required package: RSQLite
Loading required package: RSQLite.extfuns
Loading required package: gsubfn
Loading required package: proto
Loading required package: chron
> debug(sqldf)
> df.final <- sqldf('select Date, Hour, x as RoomsInUse from df.possible.combos
+ left join df.aggregate using (Hour, Date)')
debugging in: sqldf(select Date, Hour, x as RoomsInUse from
\df.possible.combos\\nleft join \df.aggregate\ using (Hour, Date))
debug: {
    as.POSIXct.character <- function(x) structure(as.numeric(x), 
        class = c("POSIXt", "POSIXct"))
    as.Date.character <- function(x) structure(as.numeric(x), 
        class = "Date")
    as.Date.numeric <- function(x, origin = "1970-01-01", ...)
        base::as.Date.numeric(x, 
        origin = origin, ...)
    as.dates.character <- function(x) structure(as.numeric(x), 
        class = c("dates", "times"))
    as.times.character <- function(x) structure(as.numeric(x), 
        class = "times")
    overwrite <- FALSE
    request.open <- missing(x) && is.null(connection)
    request.close <- missing(x) && !is.null(connection)
    request.con <- !missing(x) && !is.null(connection)
    request.nocon <- !missing(x) && is.null(connection)
    dfnames <- fileobjs <- character(0)
    if (request.close || request.nocon) {
        on.exit({
            dbPreExists <- attr(connection, "dbPreExists")
            dbname <- attr(connection, "dbname")
            if (!missing(dbname) && !is.null(dbname) && dbname == 
                ":memory:") {
                dbDisconnect(connection)
            } else if (!dbPreExists && drv == "sqlite") {
                dbDisconnect(connection)
                file.remove(dbname)
            } else {
                for (nam in dfnames) dbRemoveTable(connection, 
                  nam)
                for (fo in fileobjs) dbRemoveTable(connection, 
                  fo)
                dbDisconnect(connection)
            }
        })
        if (request.close) {
            if (identical(connection, getOption("sqldf.connection"))) 
                options(sqldf.connection = NULL)
            return()
        }
    }
    if (request.open || request.nocon) {
        if (is.null(drv)) {
            drv <- if ("package:RpgSQL" %in% search()) {
                "pgSQL"
            }
            else if ("package:RMySQL" %in% search()) {
                "MySQL"
            }
            else if ("package:RH2" %in% search()) {
                "H2"
            }
            else "SQLite"
        }
        drv <- tolower(drv)
        if (drv == "mysql") {
            m <- dbDriver("MySQL")
            connection <- if (missing(dbname) || dbname == ":memory:") {
                dbConnect(m)
            }
            else dbConnect(m, dbname = dbname)
            dbPreExists <- TRUE
        }
        else if (drv == "pgsql") {
            m <- dbDriver("pgSQL")
            if (missing(dbname) || is.null(dbname)) {
                dbname <- getOption("RpgSQL.dbname")
                if (is.null(dbname)) 
                  dbname <- "test"
            }
            connection <- dbConnect(m, dbname = dbname)
            dbPreExists <- TRUE
        }
        else if (drv == "h2") {
            m <- H2()
            if (missing(dbname) || is.null(dbname)) 
                dbname <- ":memory:"
            dbPreExists <- dbname != ":memory:" && file.exists(dbname)
            connection <- if (missing(dbname) || dbname == ":memory:") {
                dbConnect(m, "jdbc:h2:mem:", "sa", "")
            }
            else {
                jdbc.string <- paste("jdbc:h2", dbname, sep = ":")
                dbConnect(m, jdbc.string)
            }
        }
        else {
            m <- dbDriver("SQLite")
            if (missing(dbname)) 
                dbname <- ":memory:"
            dbPreExists <- dbname != ":memory:" && file.exists(dbname)
            if (is.null(getOption("sqldf.dll"))) {
                dll <- Sys.which("libspatialite-1.dll")
                if (dll != "") 
                  options(sqldf.dll = dll)
                else options(sqldf.dll = FALSE)
            }
            dll <- getOption("sqldf.dll")
            if (length(dll) != 1 || identical(dll, FALSE) || 
                nchar(dll) == 0) {
                dll <- FALSE
            }
            else {
                if (dll == basename(dll)) 
                  dll <- Sys.which(dll)
            }
            options(sqldf.dll = dll)
            if (!identical(dll, FALSE)) {
                connection <- dbConnect(m, dbname = dbname,
                  loadable.extensions = TRUE)
                s <- sprintf("select load_extension('%s')", dll)
                dbGetQuery(connection, s)
            }
            else connection <- dbConnect(m, dbname = dbname)
            init_extensions(connection)
        }
        attr(connection, "dbPreExists") <- dbPreExists
        if (missing(dbname) && drv == "sqlite") 
            dbname <- ":memory:"
        attr(connection, "dbname") <- dbname
        if (request.open) {
            options(sqldf.connection = connection)

[R] AHRQ - Creation of Comorbidity Variables

2010-09-07 Thread GL

If there are any other users who use AHRQ's SAS code comoanaly2010 and
comformat2010 to create comorbidity variables, I thought you might be
interested in the following PRELIM code we wrote to mimic its functionality
in R. It seems to yield similar results, but may contain errors. Please feel
free to comment (kindly) or enhance. I'm sure there are better ways to skin
this cat, but we at least took a stab at it. Thought this would be a good
use of the community if there are any other interested users. 



# Function flag 
#
# Intended to provide functionality from AHRQ comformat2010 and comoanaly2010
#
# Input: data frame with
#   id in column 1
#   msdrg in column 2
#   diagnosis in columns 4-53
# Output: numeric list with id and one element per cc
# dimnames = c('ID',
# 'CHF','VALVE','PULMCIRC','PERIVASC',
# 'HTN_C','PARA','NEURO','CHRNLUNG','DM',
# 'DMCX','HYPOTHY','RENLFAIL','LIVER','ULCER',
# 'AIDS','LYMPH','METS','TUMOR','ARTH',
# 'ANEMDEF','ALCOHOL','DRUG','PSYCH','DEPRESS') )

flag = function(data, k) {
    data = data[k, ]
    print(data)
    print(k)
    id = as.matrix(data[1])
    DX = data[4:53]
    DX = as.matrix(DX)
    DRG = as.matrix(data[2])

    ##format
    chf = c(39891, 4280:4289, 42800:42889)
    v1 = paste(0, 9320:9324, sep = "")
    v5 = paste("V422", "", sep = "")
    v6 = paste("V433", "", sep = "")
    valve = c(v1, 3940:3971, 39400:39709, 3979, 4240:4249, 42400:42499, 
        7463:7466, 74630:74659, v5, v6)
    pulmcirc = c(41511:41519, 4160:4169, 41600:41689, 4179)
    p3 = paste(c(4471, 5571, 5579, "V434"), "", sep = "")
    perivasc = c(4400:4409, 44000:44089, 4411:4419, 44100:44189, 
        4420:4429, 44200:44289, 4431:4439, 44310:44389, 44421:44422, 
        p3, 449)
    htn = c(4011, 4019, 64200:64204)
    htncx = c(4010, 4372)

    #  the following are special, temporary formats used in the creation of
    #  the hypertension complicated comorbidity when overlapping with
    #  congestive heart failure or renal failure occurs. These temporary
    #  formats are referenced in the program called comoanaly2009.txt

    htnpreg = c(64220:64224)
    htnwochf = c(40200, 40210, 40290, 40509, 40519, 40599)
    htnwchf = c(40201, 40211, 40291)
    hrenworf = c(40300, 40310, 40390, 40501, 40511, 40591, 64210:64214)
    hrenwrf = c(40301, 40311, 40391)
    hhrwohrf = c(40400, 40410, 40490)
    hhrwchf = c(40401, 40411, 40491)
    hhrwrf = c(40402, 40412, 40492)
    hhrwhrf = c(40403, 40413, 40493)
    ohtnpreg = c(64270:64274, 64290:64294)

    para = c(3420:3449, 34200:34489, 43820:43853, 78072)
    neuro = c(3300:3319, 33000:33189, 3340:3359, 33400:33589, 
        3411:3419, 34110:34189, 3452:3453, 34520:34529, 3320, 
        3334, 3335, 3337, 3380, 7687, 7803, 7843, 340, 33371, 
        33372, 33379, 33385, 33394, 34500:34511, 34540:34591, 
        34700:34701, 34710:34711, 64940:64944, 76870:76873, 78031, 
        78032, 78039, 78097)
    chrnlung = c(490:492, 4900:4928, 49000:49279, 49300:49392, 
        494, 4940:4941, 49400:49409, 496:505, 4950:5049, 49500:50499, 
        5064)
    dm = c(25000:25033, 64800:64804, 24900:24931)
    dmcx = c(25040:25093, 7751, 24940:24991)
    hypothy = c(243:244, 2430:2442, 24300:24419, 2448, 2449)
    renlfail3 = paste(c("V420", "V451", "V568"), "", sep = "")
    renlfail4 = paste("V", c(4511:4512), sep = "")
    renlfail5 = paste("V", c(560:563, 5600:5632), sep = "")
    renlfail = c(5853:5856, 5859, 586, renlfail3, renlfail4, 
        renlfail5)
    liver1 = paste(0, c(7022, 7023, 7032, 7033, 7044, 7054), 
        sep = "")
    liver = c(liver1, 4560, 4561, 45620, 45621, 5710, 5712, 5713, 
        57140:57149, 5715:5716, 5718:5719, 5723, 5728, "V427")
    ulcer1 = paste(531, c(41, 51, 61, 70, 71, 91), sep = "")
    ulcer2 = paste(532, c(41, 51, 61, 70, 71, 91), sep = "")
    ulcer3 = paste(533, c(41, 51, 61, 70, 71, 91), sep = "")
    ulcer4 = paste(534, c(41, 51, 61, 70, 71, 91), sep = "")
    ulcer = c(ulcer1, ulcer2, ulcer3, ulcer4)
    aids = paste(0, c(42:44, 420:449, 4200:4289), sep = "")
    lymph = c(2:20238, 20250:20301, 20302:20382, 2386, 2733)
    mets = c(1960:1991, 19600:19909, 20970:20975, 20979, 78951)
    tumor = c(1400:1729, 1740:1759, 14000:17289, 17400:17589, 
        20900:20924, 20931:20936, 25801:25803, 2093, 20925:20929, 
        179:195, 1790:1958, 17900:19579)
    arth = c(7010, 7100:7109, 7140:7149, 7200:7209, 71000:71089, 
        71400:71489, 72000:72089, 725)
    c1 = paste(c(2860:2869, 2871, 2873:2875), "", sep = "")
    coag = c(2860:2869, 2871, 2873:2875, 28600:28689, 28730:28749, 
        64930:64934, 28984)
    ob3 = paste("V", c(8530:8549, 8554), sep = "")
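
The archived listing stops here, so the part that applies these code vectors is not shown. As a rough sketch only, and not the poster's method, one way a vector such as chf could be tested against each patient's diagnosis columns is with %in%. The matrix `dx`, its row names and the name `chf_flag` are invented for illustration:

```r
# Code list copied from the listing above, compared as character strings.
chf <- as.character(c(39891, 4280:4289, 42800:42889))

# Hypothetical diagnosis matrix: one row per patient, one column per DX slot.
dx <- matrix(c("4280", "25000", "V4511",
               "4019", "25000", NA),
             nrow = 2, byrow = TRUE,
             dimnames = list(c("p1", "p2"), paste("DX", 1:3, sep = "")))

# 1 if any diagnosis on the row is in the CHF list, else 0.
chf_flag <- apply(dx, 1, function(row) as.integer(any(row %in% chf)))
print(chf_flag)
```

Comparing as character avoids problems with codes that carry a leading zero or a V prefix.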

[R] sapply/lapply instead of loop

2010-08-10 Thread GL

Using the input below, can I do something more elegant (and more efficient)
than the loop also listed below to pad strings to a width of 5? The true
matrix is about 300K rows and 31 columns. 

###
#INPUT
###
> temp
    DX1   DX2   DX3
1 13761  8125 49178
2 63371   v75 22237
3 51745 77703 93500
4 64081 32826   v72
5 78477 43828 87645
 

###
#CODE
###

ssize <- c(nrow(temp), ncol(temp))
aa <- c(1:ssize[2])
aa <- paste("DX", aa, sep = "")
record <- matrix("?", nrow = ssize[1], ncol = ssize[2])
colnames(record) <- aa

mm <- 0
#for (j in 1:1) {
for (j in 1:ssize[1]) {
    mm <- j
    a <- as.character(as.matrix(as.data.frame(temp[j,])))
    len2 <- sum(a != "?")
    mi <- 0
    for (k in 1:len2) {
        aa <- a[k]
        a0 <- 5 - nchar(aa)
        if (a0 > 0) {
            for (st in 1:a0) {
                aa <- paste(aa, " ", sep = "")
            }
        }
        record[j, k] <- aa
    }
}

###
#OUTPUT
###

DX1   DX2   DX3
1 13761  8125 49178
2 63371   v75 22237
3 51745 77703 93500
4 64081 32826   v72
5 78477 43828 87645
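
A loop-free sketch of the same right-padding, assuming every column of temp is character (or can be coerced to it). sprintf() is vectorized, and assigning into record[] keeps the matrix dimensions and column names. The data frame below just re-creates the sample input:

```r
# Sample input re-created from the post.
temp <- data.frame(
  DX1 = c("13761", "63371", "51745", "64081", "78477"),
  DX2 = c("8125", "v75", "77703", "32826", "43828"),
  DX3 = c("49178", "22237", "93500", "v72", "87645"),
  stringsAsFactors = FALSE
)

record <- as.matrix(temp)

# "%-5s" left-justifies each string in a field of width 5 (pads on the right).
record[] <- sprintf("%-5s", record)
print(record)
```

Strings already longer than 5 characters are left unchanged rather than truncated.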
-- 
View this message in context: 
http://r.789695.n4.nabble.com/sapply-lapply-instead-of-loop-tp2320265p2320265.html


Re: [R] sapply/lapply instead of loop

2010-08-10 Thread GL

Both of those approaches seem to return (  v75) instead of (v75  ). 
-- 
View this message in context: 
http://r.789695.n4.nabble.com/sapply-lapply-instead-of-loop-tp2320265p2320305.html


Re: [R] sapply/lapply instead of loop

2010-08-10 Thread GL

That works great, and is ever so much simpler. Thanks much!
-- 
View this message in context: 
http://r.789695.n4.nabble.com/sapply-lapply-instead-of-loop-tp2320265p2320317.html


[R] Intersecting list vs rows in matrix

2010-08-10 Thread GL

Know that if I have List_1 and List_2, I can check to see if they
intersect via the code below:

List_1: 
a, b, c, d, e, f, g 
List_2: 
z, y, x, w, v, u, b 
length(intersect(List_1, List_2)) > 0
return = true

If instead I wanted to check a dataframe that is a list of lists, how
would I do that by record without looping?

List_1: 
a, b, c, d, e, f, g 

List_2: 
z, y, x, w, v, u, b 
y, z, w, v, v, u, m
z, y, x, a, b, c
.
.
.

return
true
false
true
,
,
,

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Intersecting-list-vs-rows-in-matrix-tp2320427p2320427.html


Re: [R] Intersecting list vs rows in matrix

2010-08-10 Thread GL

Very cool. Thanks! 
-Original Message-
From: Henrique Dallazuanna [via R] 
ml-node+2320470-1037864429-108...@n4.nabble.com
To: Lipori, Gigi pfl...@shands.ufl.edu

Sent: 08/10/2010 05:18:25 PM
Subject: Re: Intersecting list vs rows in matrix




Try this:

 colSums(apply(List_2, 1, is.element, List_1)) > 0
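
A small worked example of the suggestion above, assuming List_2 is a character matrix with one record per row (the short third record is padded with NA here so the rows have equal length):

```r
List_1 <- c("a", "b", "c", "d", "e", "f", "g")

# One record per row; the values are the ones from the question.
List_2 <- rbind(c("z", "y", "x", "w", "v", "u", "b"),
                c("y", "z", "w", "v", "v", "u", "m"),
                c("z", "y", "x", "a", "b", "c", NA))

# apply() returns one column per record; a column sum above zero means
# at least one element of that record is in List_1.
hits <- colSums(apply(List_2, 1, is.element, List_1)) > 0
print(hits)
```

An equivalent form that some may find easier to read is apply(List_2, 1, function(r) any(r %in% List_1)).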

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Intersecting-list-vs-rows-in-matrix-tp2320427p2320642.html


[R] REmove level with zero observations

2010-08-03 Thread GL

If I have a column with 2 levels, but one level has no remaining
observations, can I remove the level? 

Had intended to do it as listed below, but soon realized that even though
there are no observations, the level is still there. 

For instance

summary(dbs3.train.sans.influential.obs$HAC) 

yields

   0    1 
4685    0 

nlevels(dbs3.train.sans.influential.obs$HAC)

yields
[1] 2

drop.list <- NULL
for (i in 1:ncol(dbs3.train.sans.influential.obs)) {   
 if (nlevels(dbs3.train.sans.influential.obs[,i]) < 2) {drop.list <- cbind(drop.list,i)}}

yields 
nothing because HAC still has two levels, even though there aren't any
observations in one of the levels.

What I want to do is loop through all columns that are factors and create a
list of items to drop because there will subsequently be < 2 levels when I
try to run a linear model.


-- 
View this message in context: 
http://r.789695.n4.nabble.com/REmove-level-with-zero-observations-tp2312553p2312553.html


Re: [R] REmove level with zero observations

2010-08-03 Thread GL

Ended up working as follows:

dbs3.train.sans.influential.obs <-
drop.levels(dbs3.train.sans.influential.obs)

drop.list <- NULL
for (i in 4:ncol(dbs3.train.sans.influential.obs)) {   
 if (nlevels(dbs3.train.sans.influential.obs[,i]) < 2) {drop.list <- cbind(drop.list,i)}}

dbs3.train.sans.influential.obs <-
dbs3.train.sans.influential.obs[-c(drop.list)]
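
The same idea can be sketched without a loop, assuming a version of R whose base package provides droplevels() (used here in place of gdata's drop.levels). The data frame `df` is invented; restricting the test to factor columns avoids having to skip the leading non-factor columns by position:

```r
# Invented data: HAC has an unused level "1".
df <- data.frame(
  HAC = factor(c("0", "0", "0"), levels = c("0", "1")),
  grp = factor(c("a", "b", "a")),
  y   = c(1, 2, 3)
)

# Remove unused levels from every factor column.
df <- droplevels(df)

# Factor columns left with fewer than two levels.
is_fac    <- sapply(df, is.factor)
drop.list <- names(df)[is_fac & sapply(df, nlevels) < 2]

# Keep everything else; logical indexing is safe when drop.list is empty.
df2 <- df[, !(names(df) %in% drop.list), drop = FALSE]
print(drop.list)
print(names(df2))
```

Selecting columns with a logical index rather than -c(drop.list) also avoids the case where an empty drop list removes every column.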
-- 
View this message in context: 
http://r.789695.n4.nabble.com/REmove-level-with-zero-observations-tp2312553p2312821.html


[R] Remove observations deemed influential by influential.measure

2010-06-29 Thread GL

dbs is an existing dataframe. I fit a lm and looked at influential
observations. I want now to delete the influential observations from dbs,
fit another lm, and see how different the results are. What is the syntax to
remove the influential observations from dbs?

fit <- lm(NI ~ PG + log(TG), data=dbs)
fit.influential.observations <- influence.measures(fit)

dbs.without.influential.observations <- ?

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Remove-observations-deemed-influential-by-influential-measure-tp2272474p2272474.html


Re: [R] Remove observations deemed influential by influential.measure

2010-06-29 Thread GL

dbs_influential_obs <- which(apply(fit.influential.observations$is.inf, 1,
any))
dbs_sans_influential_obs <- dbs1[-dbs_influential_obs,]
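
A self-contained sketch of the same approach on the built-in cars data set, since dbs is not available here. It uses a logical index instead of a negative integer index, because x[-integer(0), ] returns zero rows when nothing is flagged:

```r
# Built-in data stands in for dbs; the model formula is only an example.
fit <- lm(dist ~ speed, data = cars)
infl <- influence.measures(fit)

# TRUE for rows flagged as influential by at least one measure.
flagged <- apply(infl$is.inf, 1, any)

# Logical indexing keeps all rows when no observation is flagged.
cars_sans_influential <- cars[!flagged, ]

# Refit on the reduced data to compare coefficients.
fit2 <- lm(dist ~ speed, data = cars_sans_influential)
print(rbind(all_rows = coef(fit), reduced = coef(fit2)))
```

This assumes the model was fitted on all rows of the data frame, so that row i of is.inf corresponds to row i of the data.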

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Remove-observations-deemed-influential-by-influential-measure-tp2272474p2272524.html


[R] Basic question - more efficient method than loop?

2010-06-28 Thread GL

I'm guessing there's a more efficient way to do the following using the index
features of R. Appreciate any thoughts

for (i in 1:nrow(dbs1)){
if(dbs1$Payor[i] %in% Payor.Group.Medicaid) dbs1$Payor.Group[i] = "Medicaid"
if(dbs1$Payor[i] %in% Payor.Group.Medicare) dbs1$Payor.Group[i] = "Medicare"
if(dbs1$Payor[i] %in% Payor.Group.Commercial) dbs1$Payor.Group[i] = "Commercial"
if(dbs1$Payor[i] %in% Payor.Group.Workers.Comp) dbs1$Payor.Group[i] = "Workers Comp"
if(dbs1$Payor[i] %in% Payor.Group.Self.Pay) dbs1$Payor.Group[i] = "Self Pay"
if(dbs1$Adm.Source[i] %in% Adm.Source.Group.Newborn) dbs1$Adm.Source.Group[i] = "Newborn"
if(dbs1$Adm.Source[i] %in% Adm.Source.Group.ED) dbs1$Adm.Source.Group[i] = "ED"
if(dbs1$Adm.Source[i] %in% Adm.Source.Group.Routine) dbs1$Adm.Source.Group[i] = "Routine"
if(dbs1$Adm.Source[i] %in% Adm.Source.Group.Transfer) dbs1$Adm.Source.Group[i] = "Transfer"
}
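
One possible vectorized form, sketched for the Payor groups only: %in% already works on whole columns, so each group can be assigned in a single indexed statement. The payor codes and group vectors below are invented placeholders:

```r
# Invented payor codes and group definitions.
dbs1 <- data.frame(Payor = c("A", "B", "C", "Z"), stringsAsFactors = FALSE)
Payor.Group.Medicaid <- c("A")
Payor.Group.Medicare <- c("B", "C")

# Start with NA so unmatched payors are easy to spot.
dbs1$Payor.Group <- NA_character_

# Each statement updates every matching row at once.
dbs1$Payor.Group[dbs1$Payor %in% Payor.Group.Medicaid] <- "Medicaid"
dbs1$Payor.Group[dbs1$Payor %in% Payor.Group.Medicare] <- "Medicare"
print(dbs1)
```

The Adm.Source groups would follow the same pattern. As in the loop, a later assignment overwrites an earlier one if a code appears in more than one group.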
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Basic-question-more-efficient-method-than-loop-tp2271096p2271096.html


Re: [R] Basic question - more efficient method than loop?

2010-06-28 Thread GL

Perfect. Thanks!
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Basic-question-more-efficient-method-than-loop-tp2271096p2271153.html


[R] Return value associated with a factor

2010-06-21 Thread GL

I am using the code below to extract census tract information.
save.tract$state, save.tract$county and save.tract$tract are returned as
factors. In the last three statements, I need to save the actual value of
the factor, but, instead, the code is yielding the position of the factor.
How do I instead return the value of the factor?

By way of example, for Lon=-82.49574 and Lat=29.71495, the code returns
state = 1, county = 1 and tract = 161. The desired results are state=12,
county = 001 tract = 002201.


#set libraries   
library(UScensus2000tract)
library(gdata)
data(florida.tract)


#read input
dbs.in = read.delim("addresses_coded_new.txt", header = TRUE, sep = "\t", 
 quote = "\"", dec = ".")
 
#instantiate output
more.columns <- data.frame( state=double(0), 
county=double(0), 
tract=double(0)) 

dbs.in <- cbind(dbs.in,more.columns)   

#figure out how many times to loop
j <- nrow(dbs.in)

#loop through each lat/long and assign census tract

for (i in 1:j) {  
 index <- overlay(SpatialPoints(cbind(dbs.in$Lon[i],dbs.in$Lat[i])),florida.tract)
 save.tract <- florida.tract[index,] 
 dbs.in$state[i] <- save.tract$state #this is returning the position in the list instead of the value
 dbs.in$county[i] <- save.tract$county #this is returning the position in the list instead of the value
 dbs.in$tract[i] <- save.tract$tract #this is returning the position in the list instead of the value
}
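
The behaviour described is what happens when a factor is coerced to a number: the result is the internal level code, not the label. A minimal sketch with an invented factor holding the desired values from the example:

```r
# Invented factor with the labels from the example above.
f <- factor(c("12", "001", "002201"))

# Coercing to integer gives level codes, i.e. positions in levels(f).
codes <- as.integer(f)

# as.character() gives the labels, keeping leading zeros.
labels <- as.character(f)

print(codes)
print(labels)
```

In the loop this would mean assigning as.character(save.tract$state) and so on. Converting the labels further with as.numeric() would drop the leading zeros in codes such as 001.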


-- 
View this message in context: 
http://r.789695.n4.nabble.com/Return-value-associated-with-a-factor-tp2262605p2262605.html


Re: [R] Return value associated with a factor

2010-06-21 Thread GL

Works great. Thanks much! 
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Return-value-associated-with-a-factor-tp2262605p2262656.html


[R] Question about user define function

2010-06-15 Thread GL

Have the following function that is called by the statement below. Trying to
return the two dataframes, but instead get one large list including both
tables. 

ReadInputDataFrames <- function() {

  dbs.this = read.delim("this.txt", header = TRUE, sep = "\t", quote = "\"", dec = ".")
  dbs.that = read.delim("that.txt", header = TRUE, sep = "\t", quote = "\"", dec = ".")
  c(this = dbs.this, patdb = dbs.that)
  
}

Called by: 

c <- ReadInputDataFrames()

Results:

str(c) yields a list of 106 items $this.variable1..53, $that$variable1..53
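
The flattening comes from c(): a data frame is a list of columns, so c() concatenates the columns of both tables into one long list. Wrapping the two data frames in list() keeps them separate. A sketch with two small in-memory data frames standing in for the files (the function name and the contents are illustrative only):

```r
read_inputs <- function() {
  # Stand-ins for the two read.delim() calls.
  dbs.this <- data.frame(id = 1:3, a = c(2.5, 3.1, 4.8))
  dbs.that <- data.frame(id = 1:2, b = c("x", "y"))

  # list() returns two elements, each still a data frame.
  list(this = dbs.this, patdb = dbs.that)
}

res <- read_inputs()
str(res, max.level = 1)

# For contrast: c() flattens to one list entry per column.
flat <- c(this = res$this, patdb = res$patdb)
print(names(flat))
```

The individual tables are then available as res$this and res$patdb. Naming the result something other than c also avoids masking the base function.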


-- 
View this message in context: 
http://r.789695.n4.nabble.com/Question-about-user-define-function-tp2256513p2256513.html


[R] Are any values in one list contained within a second list

2010-06-13 Thread GL

Silly question, but, can I test to see if any value of list a is contained in
list b without doing a loop? A loop is easy enough, but wanted to see if
there was a cleaner way. By way of example:

List 1: a, b, c, d, e, f, g

List 2: z, y, x, w, v, u, b

Return true, since both lists contain b

List 1: a, b, c, d, e, f, g

List 2: z, y, x, w, v, u, t

Return false, since the lists have no mutual values
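
One loop-free way to express this test, assuming both lists are plain vectors, is any() combined with %in%. The vector names are arbitrary:

```r
list1  <- c("a", "b", "c", "d", "e", "f", "g")
list2a <- c("z", "y", "x", "w", "v", "u", "b")
list2b <- c("z", "y", "x", "w", "v", "u", "t")

# TRUE if at least one element of the first vector occurs in the second.
shares_a <- any(list1 %in% list2a)
shares_b <- any(list1 %in% list2b)

print(shares_a)
print(shares_b)
```

length(intersect(list1, list2a)) > 0 gives the same answer.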



-- 
View this message in context: 
http://r.789695.n4.nabble.com/Are-any-values-in-one-list-contained-within-a-second-list-tp2253637p2253637.html


Re: [R] bwplot in loop

2010-05-17 Thread GL

Subsequently saw this in FAQs

See FAQ:
http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-do-lattice_002ftrellis-graphics-not-work_003f
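
The FAQ entry linked above explains that lattice functions return a trellis object, which is drawn only when it is printed. That happens automatically at the top level but not inside a loop or function. A minimal sketch with invented data, assuming the lattice package is installed:

```r
library(lattice)

# Invented data: two sites, two groups.
df <- data.frame(
  site = rep(c("s1", "s2"), times = 10),
  g    = factor(rep(c("a", "b"), each = 10)),
  y    = c(1:10, 11:20),
  stringsAsFactors = FALSE
)

plots <- list()
for (s in unique(df$site)) {
  p <- bwplot(y ~ g, data = df[df$site == s, ], main = s)
  print(p)          # without print() nothing is drawn inside the loop
  plots[[s]] <- p
}
```

When run non-interactively, the plots go to the default graphics device.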
-- 
View this message in context: 
http://r.789695.n4.nabble.com/bwplot-in-loop-tp2220020p2220034.html


[R] AHRQ Patient Quality Indicators

2010-03-31 Thread GL

Is anyone aware of R code that mimics AHRQ's SAS code for their Prevention
Quality Indicators (PQI)? Don't see it anywhere, but wanted to see if anyone
else knew of anything. Many thanks
-- 
View this message in context: 
http://n4.nabble.com/AHRQ-Patient-Quality-Indicators-tp1747243p1747243.html


[R] Subscripting

2010-02-10 Thread GL

Dataframe1 contains a list of specific dates. Dataframe2 contains a large
dataset, one element of which is Date. How do I create a subset of
Dataframe2 that excludes the dates from Dataframe1? I know how to do it with
a left outer join vs null in SQL, but I can't figure out how to do it more
directly via the subscripts that already exist.

Dataframe1

Date
1/1/2010
1/18/2010


Dataframe2

Date Attribute Count
1/1/2010 Red 5
1/15/2010 Green 2
1/18/2010 Purple 8
1/19/2010 Yellow 3

ResultingDataframe (Dataframe2 minus the rows that have Dates in Dataframe1)

Date Attribute Count
1/15/2010 Green 2
1/19/2010 Yellow 3
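
A minimal sketch of the subscript-only version in base R, using the example rows above. It assumes both Date columns hold the same type (plain character strings here); if one were a factor or Date object, it would need converting first.

```r
# Data typed in from the example; dates kept as character strings
Dataframe1 <- data.frame(Date = c("1/1/2010", "1/18/2010"),
                         stringsAsFactors = FALSE)
Dataframe2 <- data.frame(
  Date      = c("1/1/2010", "1/15/2010", "1/18/2010", "1/19/2010"),
  Attribute = c("Red", "Green", "Purple", "Yellow"),
  Count     = c(5, 2, 8, 3),
  stringsAsFactors = FALSE
)

# Keep only the rows whose Date does NOT appear in Dataframe1
ResultingDataframe <- Dataframe2[!(Dataframe2$Date %in% Dataframe1$Date), ]
```

The logical index plays the role of the "left outer join where null" filter in SQL.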

-- 
View this message in context: 
http://n4.nabble.com/Subscripting-tp1476330p1476330.html


[R] Print lattice output to table?

2010-01-28 Thread GL

I have beautiful box and whisker charts formatted with lattice, which is
obviously calculating summary statistics internally in order to draw the
charts. Is there a way to dump the associated summary tables that are being
used to generate the charts? Realize I could use tapply or such to get
something similar, but I have all the groupings and such already configured
to generate the charts. Simply want to dump those values to a table so that
I don't have to interpolate where the 75th percentile is on a visual chart.
Appreciate any thoughts.
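
One possible route, sketched under assumptions: lattice's panel.bwplot() takes its numbers from boxplot.stats() by default, so applying the same function per group should reproduce what is drawn. The data frame demo and its columns are made up, and a chart with conditioning variables would need them added to the grouping.

```r
# Made-up data; 'value' and 'group' are placeholder column names
set.seed(42)
demo <- data.frame(
  group = factor(rep(c("A", "B", "C"), each = 30)),
  value = rnorm(90, mean = 10, sd = 2)
)

# boxplot.stats()$stats returns: lower whisker, lower hinge, median,
# upper hinge, upper whisker
box_list <- tapply(demo$value, demo$group,
                   function(v) boxplot.stats(v)$stats)

# Stack into one table: one row per group, one column per statistic
box_table <- do.call(rbind, box_list)
colnames(box_table) <- c("lower_whisker", "lower_hinge", "median",
                         "upper_hinge", "upper_whisker")
```

Note that the hinges can differ slightly from quantile(x, 0.75), which uses a different interpolation rule by default.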
-- 
View this message in context: 
http://n4.nabble.com/Print-lattice-output-to-table-tp1375040p1375040.html


Re: [R] Print lattice output to table?

2010-01-28 Thread GL

That works great. Thanks!
-- 
View this message in context: 
http://n4.nabble.com/Print-lattice-output-to-table-tp1375040p1380862.html


[R] question on sqldf syntax

2010-01-25 Thread GL

Trying to structure SQL to merge two datasets. Structure follows:

dbs.possible.combos (all possible combinations of dates and places)
Date Place
1/1/10 N-01
1/1/10 S-02
1/2/10 N-01
1/2/10 S-02
etc...

dbs.aggregate (the raw data aggregated by date and location)
Date Place Days
1/1/10 N-01 6
1/1/10 S-02 10
1/2/10 S-02 5


Trying to merge so I look up the values for each possible combo:
dbs.final <- sqldf("select dbs.possible.combos$Date,
dbs.possible.combos$Place, dbs.possible.combos$Days FROM dbs.possible.combos
LEFT JOIN dbs.aggregate ON (dbs.possible.combos$Place = dbs.aggregate$Place)
AND (dbs.possible.combos$Date = dbs.aggregate$Date)")

Resulting in: 
Error in sqliteExecStatement(con, statement, bind.data) : 
  RS-DBI driver: (error in statement: near ".": syntax error)

What am I getting wrong in the syntax?
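
Two things stand out, offered as a guess rather than a confirmed diagnosis: `$` is R's column accessor and not SQL syntax (SQL addresses columns as table.column), and the dots inside the table names may themselves be read by SQLite as separators. The Days column also lives in dbs.aggregate, not dbs.possible.combos. As a cross-check that avoids SQL entirely, the same left join can be sketched with base R's merge(); this is a swapped-in technique, not sqldf.

```r
# Data typed in from the example above
dbs.possible.combos <- data.frame(
  Date  = c("1/1/10", "1/1/10", "1/2/10", "1/2/10"),
  Place = c("N-01", "S-02", "N-01", "S-02"),
  stringsAsFactors = FALSE
)
dbs.aggregate <- data.frame(
  Date  = c("1/1/10", "1/1/10", "1/2/10"),
  Place = c("N-01", "S-02", "S-02"),
  Days  = c(6, 10, 5),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every row of the left table, like SQL's LEFT JOIN;
# combinations with no match get NA in Days
dbs.final <- merge(dbs.possible.combos, dbs.aggregate,
                   by = c("Date", "Place"), all.x = TRUE)
```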
  



-- 
View this message in context: 
http://n4.nabble.com/question-on-sqldf-syntax-tp1289707p1289707.html


Re: [R] question on sqldf syntax

2010-01-25 Thread GL

Actually, better sql would likely be: 

dbs.final <- sqldf("select * from dbs.possible.combos left join
dbs.aggregate using (Date,Place)")

but this still doesn't work
-- 
View this message in context: 
http://n4.nabble.com/question-on-sqldf-syntax-tp1289707p1289718.html


[R] Question on Merge/Lookup

2010-01-22 Thread GL

I need to merge three datasets and don't know how. If I were using SQL, I
would use df3, look up the characteristics of each date in df1 and the value
for each observation in df2. 


df1 - unique list of Dates and characteristics of those dates
Date, MM, WW, DOW


df2 - the raw data
Date, Place, Value


df3 - all possible combinations of Date + Place (via
expand.grid(unique(df2$Date),unique(df2$Place)))
Date, Place

I need to end up with:

Date, MM, WW, DOW, Place, Value (plug 0 if combo doesn't exist in
raw data).

Appreciate any help!
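
A minimal sketch of the lookup described above in base R, with made-up stand-ins for df1, df2 and df3 (the values, including the MM/WW/DOW entries, are invented for illustration):

```r
# Made-up stand-ins for the three data frames
df1 <- data.frame(Date = c("2010-01-01", "2010-01-02"),
                  MM = c(1, 1), WW = c(1, 1), DOW = c("Fri", "Sat"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Date  = c("2010-01-01", "2010-01-01", "2010-01-02"),
                  Place = c("N-01", "S-02", "S-02"),
                  Value = c(6, 10, 5),
                  stringsAsFactors = FALSE)
df3 <- expand.grid(Date = unique(df2$Date), Place = unique(df2$Place),
                   stringsAsFactors = FALSE)

# Step 1: look up Value for every Date/Place combination (left join)
combined <- merge(df3, df2, by = c("Date", "Place"), all.x = TRUE)
# Step 2: plug 0 where the combination is absent from the raw data
combined$Value[is.na(combined$Value)] <- 0
# Step 3: attach the per-date characteristics from df1
result <- merge(df1, combined, by = "Date")
result <- result[, c("Date", "MM", "WW", "DOW", "Place", "Value")]
```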


-- 
View this message in context: 
http://n4.nabble.com/Question-on-Merge-Lookup-tp1112384p1112384.html


[R] Limiting number of tickmarks in lattice bwplot

2010-01-11 Thread GL

Have a simple bwplot with 24 ordered factors across the x axis. I would like
to only label every 4th tick mark so that the labels fit. I tried
scales=list(x=list(tick.number=6)), but I still seem to get 24 tickmarks and
24 labels. Full code is below:


bwplot(SumOfIn.Use ~ Hour | Period,
       scales=list(x=list(tick.number=6)), horizontal=FALSE, las=2,
       main="Rooms Running", sub="Timeframe: 8/09 - 12/09",
       xlab="Hour of Day", ylab="Rooms Running",
       ex.main=0.7, cex.axis=0.5, data=dbs.weekday,
       as.table=TRUE, layout=c(2,1))
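
A possible workaround, sketched with made-up data: lattice documents tick.number as ignored for factor axes, but the factor levels sit at positions 1 to 24, so explicit `at` and `labels` entries in scales can pick every 4th one. The data frame demo stands in for dbs.weekday and is an assumption.

```r
library(lattice)

# Made-up stand-in for dbs.weekday: 24 hourly levels, two periods
set.seed(7)
demo <- expand.grid(Hour = 0:23, Period = c("Before", "After"), rep = 1:10)
demo$Hour <- factor(demo$Hour, levels = 0:23)
demo$SumOfIn.Use <- rpois(nrow(demo), lambda = 5)

# Factor levels sit at positions 1..24, so label every 4th position only
tick_at <- seq(1, nlevels(demo$Hour), by = 4)
tick_labels <- levels(demo$Hour)[tick_at]

p <- bwplot(SumOfIn.Use ~ Hour | Period, data = demo,
            horizontal = FALSE, as.table = TRUE, layout = c(2, 1),
            scales = list(x = list(at = tick_at, labels = tick_labels)),
            xlab = "Hour of Day", ylab = "Rooms Running")
```

The plot is drawn when p is printed, for example with print(p).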

-- 
View this message in context: 
http://n4.nabble.com/Limiting-number-of-tickmarks-in-lattice-bwplot-tp1011515p1011515.html


[R] Correcting for missing data combinations

2009-12-11 Thread GL

I can think of many brute-force ways to do this outside of R, but was
wondering if there was a simple/elegant solution within R instead.

I have a table that looks something like the following:

Factor1 Factor2 Value
A   11/11/2009  5
A   11/12/2009  4
B   11/11/2009  7
B   11/13/2009  8

From that I need to generate all combinations of Factor1 and Factor2 and
force a 0 for any combination that doesn’t exist in the actual data table.
By way of example, I’d like the output for above to end up as:

Factor1 Factor2 Value
A   11/11/2009  5
A   11/12/2009  4
A   11/13/2009  0
B   11/11/2009  7
B   11/12/2009  0
B   11/13/2009  8

Truly appreciate any thoughts.
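
One way this could be done within R, as a minimal sketch using the example table above: build every combination with expand.grid(), left-join the data onto it with merge(), and replace the resulting NA values with 0. The names dat, all_combos and filled are illustrative.

```r
# Data typed in from the example table
dat <- data.frame(
  Factor1 = c("A", "A", "B", "B"),
  Factor2 = c("11/11/2009", "11/12/2009", "11/11/2009", "11/13/2009"),
  Value   = c(5, 4, 7, 8),
  stringsAsFactors = FALSE
)

# Every Factor1 x Factor2 combination
all_combos <- expand.grid(Factor1 = unique(dat$Factor1),
                          Factor2 = unique(dat$Factor2),
                          stringsAsFactors = FALSE)

# Left join onto the data, then force 0 where no row matched
filled <- merge(all_combos, dat, by = c("Factor1", "Factor2"), all.x = TRUE)
filled$Value[is.na(filled$Value)] <- 0
filled <- filled[order(filled$Factor1, filled$Factor2), ]
```

The dates are sorted as plain strings here, which happens to work for this example; real data would be safer converted with as.Date() first.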

-- 
View this message in context: 
http://n4.nabble.com/Correcting-for-missing-data-combinations-tp961301p961301.html