hello R ussers,

i have the same problem with my data,
for aal the different variables, i have the same number of cases, but the 
are often out of detectionlimits so they produce "na's" .
so the data looks like this:

case    var1    var2    var3    var4 ... 
1       9       9       13      11
2       15      9       15      13 
3       na      na      12      9
4       8       6       na      na
5       14      10      na      na
6       20      15      17      15      ..
..

What i would like to do for data exploration, is to compare each possible 
pair of variables, get their correlation coefficient, the intercept and 
the slope of regression line. yet for every variable the messurements are 
lnked thruogh theyr case. it is the same sample just a diferent test.

Now  i select a subsets  of variables out of the original dataset, and use 
 : 
          value_x1 = subset(dataset_1,select=lg_value)
          value_y1 =subset(dataset_2,select=lg_value)

Then i to mold an lm model, inorder to get estimates for the slope ans 
intercept
        model_1 <- lm (value_y1[,1]~ value_x1[,1]  )

This is what R tell's me:
                        "Error in model.frame(formula, rownames, 
variables, varnames, extras, extranames,  : 
                                  variable lengths differ (found for 
'value_x1[, 1]')"

Is there perhapes a way of binding the selected subsets together, still 
linked to their case, so that the na's can be discarded by R automaticaly?
I have been trying to use SQLiteDF and the other sql func's of R, but i 
don't realy understand them.
If someone out there knows how to use sql, in R, i d be delited if he or 
she could explain it to me, more understandible then the manuals i find on 
the web.
Here is what io would want sql to do .


My data is in columns, one column holds all the case numbers, one the 
messured values, one all the testtypes and one the timeperiod and then one 
column for the lab's that preformed the test. is is stored in a txt file. 
So it is a long 5 column data table.
Now is it possible to make a cross table holding the case nr's, and 
timeperiod in 2 column's, and then have a different column for every test?
so if there are 4 tests and 4 lab's, it would give 16 columns.
I've tryed it in access, but it gave me andless loops of repeated values. 
and creating new data files is dangerous, 'litle mistakes made while 
copying ' or manipultaions made to one file and not to the other'.
.

kind regards,
Tom



Disclaimer: click here
        [[alternative HTML version deleted]]

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to