Re: [R] Determining a basal correct count

2010-11-24 Thread David Herzberg
Phil, I wanted to thank you for this solution. I was working on other projects 
for the past couple of weeks, but I was finally able to come back to this and 
get it to work. The key was excluding a couple of columns on the right margin 
of the database: the syntax wouldn't process those columns, but once I took 
them out it ran fine.

Best,

David S. Herzberg, Ph.D.
Vice President, Research and Development 
Western Psychological Services
12031 Wilshire Blvd.
Los Angeles, CA 90025-1251
Phone: (310)478-2061 x144
FAX: (310)478-7838
email: dav...@wpspublish.com



-Original Message-
From: Phil Spector [mailto:spec...@stat.berkeley.edu] 
Sent: Thursday, October 28, 2010 4:59 PM
To: David Herzberg
Cc: r-help@r-project.org
Subject: Re: [R] Determining a basal correct count

David -
I think changing

apply(x,1,function(x)rle(x[which(x==1)[1]:length(x)])$lengths[1])

to

apply(x,1,function(x)if(!any(x==1)) 0 else 
rle(x[which(x==1)[1]:length(x)])$lengths[1])

solves the problem.
 - Phil
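
A note on rows with missing values: if the scores are stored as numbers with
NA for missing, any(x==1) returns NA for a row that has NAs but no 1s, and
if() then stops with an error. The sketch below is one way around that. It
assumes NA coding for missing scores; the name basal_run and the toy matrix
are made up for illustration and do not come from the thread.

```r
# Count the run of consecutive 1s that starts at the first 1 in a row.
# Returns 0 when the row has no 1s at all (including all-NA rows).
basal_run <- function(x) {
  first <- which(x == 1)[1]        # which() drops NAs; NA if there is no 1
  if (is.na(first)) return(0L)
  rle(x[first:length(x)])$lengths[1]
}

# Hypothetical scores: 1 = correct, 0 = incorrect, NA = missing
x <- rbind(c(0, 1, 1, 0, 1),
           c(NA, NA, 1, 1, 1),
           c(0, NA, 0, NA, 0),
           c(NA, NA, NA, NA, NA))

counts <- apply(x, 1, basal_run)
print(counts)
```

Testing the position of the first 1 directly, rather than calling any(),
avoids the NA condition inside if().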


On Thu, 28 Oct 2010, David Herzberg wrote:

> Thank you Phil - I'll give this a try. I do have some empty rows, so 
> I'll have to deal with that eventually.
> 
> Dave
> 
> Sent via DROID X
> 
> 
> -Original message-
>   From: Phil Spector 
>   To: David Herzberg 
>   Cc: "r-help@r-project.org" 
>   Sent: Thu, Oct 28, 2010 23:39:34 GMT+00:00
>   Subject: Re: [R] Determining a basal correct count
> 
> David -
>     I *think*
> 
>    apply(x,1,function(x)rle(x[which(x==1)[1]:length(x)])$lengths[1])
> 
> gives you what you want, but without a reproducible example it's hard 
> to say.  It will fail if there are no 1s in a given row.
> 
>  - Phil Spector
>   Statistical Computing Facility
>   Department of Statistics
>   UC Berkeley
>   spec...@stat.berkeley.edu
> 
> 
> On Thu, 28 Oct 2010, David Herzberg wrote:
> 
> > Here's another interesting problem: if you recall I have a data frame
> > (LCvars1) that consists of about 1500 cases (rows) of data from kids
> > who took a test of listening comprehension. The columns are their
> > scores (1 = correct, 0 = incorrect,  . = missing) on 140 test items.
> > The items are numbered sequentially and are ordered by increasing
> > difficulty as you go from left to right across the columns.
> >
> > I used the following (thanks to Peter Ehlers for this solution):
> >
> > First1ItemNo <- as.vector(
> >  apply(
> >  LCvars1, 1, match, x=1
> >  ))
> >
> > to make R go through the columns from left to right and record into
> > a vector the column number of the first '1' response for each case.
> >
> > Now, for each case (row), I want R to START with the column that
> > contains the first '1' response, and continue to the right and count
> > the number of consecutive columns containing '1' responses. At the next
> > '0' or '.', I want R to record the count of consecutive '1's, and then
> > skip to the next row and begin the process anew.
> >
> > Thanks in advance for your help,
> >
> > David S. Herzberg, Ph.D.
> > Vice President, Research and Development
> > Western Psychological Services
> > 12031 Wilshire Blvd.
> > Los Angeles, CA 90025-1251
> > Phone: (310)478-2061 x144
> > FAX: (310)478-7838
> > email: dav...@wpspublish.com
> >
> >
> >
> >    [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> 
> 
>
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Determining a basal correct count

2010-11-01 Thread David Herzberg
Phil, when I run:

apply(x,1,function(x)if(!any(x==1)) 0 else 
rle(x[which(x==1)[1]:length(x)])$lengths[1])

on my data set it returns:

ERROR: length(x)])$lengths[1]

Any thoughts?



Re: [R] Determining a basal correct count

2010-10-28 Thread David Herzberg
Thank you Phil - I'll give this a try. I do have some empty rows, so I'll have 
to deal with that eventually.

Dave

Sent via DROID X


-Original message-
From: Phil Spector 
To: David Herzberg 
Cc: "r-help@r-project.org" 
Sent: Thu, Oct 28, 2010 23:39:34 GMT+00:00
Subject: Re: [R] Determining a basal correct count

David -
I *think*

   apply(x,1,function(x)rle(x[which(x==1)[1]:length(x)])$lengths[1])

gives you what you want, but without a reproducible example it's
hard to say.  It will fail if there are no 1s in a given row.

 - Phil Spector
  Statistical Computing Facility
  Department of Statistics
  UC Berkeley
  spec...@stat.berkeley.edu


On Thu, 28 Oct 2010, David Herzberg wrote:

> Here's another interesting problem: if you recall I have a data frame 
> (LCvars1) that consists of about 1500 cases (rows) of data from kids who took 
> a test of listening comprehension. The columns are their scores (1 = correct, 
> 0 = incorrect,  . = missing) on 140 test items. The items are numbered 
> sequentially and are ordered by increasing difficulty as you go from left to 
> right across the columns.
>
> I used the following (thanks to Peter Ehlers for this solution):
>
> First1ItemNo <- as.vector(
>  apply(
>  LCvars1, 1, match, x=1
>  ))
>
> to make R go through the columns from left to right and record into a vector 
> the column number of the first '1' response for each case.
>
> Now, for each case (row), I want R to START with the column that contains the 
> first '1' response, and continue to the right and count the number of 
> consecutive columns containing '1' responses. At the next '0' or '.', I want 
> R to record the count of consecutive '1's, and then skip to the next row and 
> begin the process anew.
>
> Thanks in advance for your help,
>
> David S. Herzberg, Ph.D.
> Vice President, Research and Development
> Western Psychological Services
> 12031 Wilshire Blvd.
> Los Angeles, CA 90025-1251
> Phone: (310)478-2061 x144
> FAX: (310)478-7838
> email: dav...@wpspublish.com


[R] Determining a basal correct count

2010-10-28 Thread David Herzberg
Here's another interesting problem: if you recall I have a data frame (LCvars1) 
that consists of about 1500 cases (rows) of data from kids who took a test of 
listening comprehension. The columns are their scores (1 = correct, 0 = 
incorrect,  . = missing) on 140 test items. The items are numbered sequentially 
and are ordered by increasing difficulty as you go from left to right across 
the columns.

I used the following (thanks to Peter Ehlers for this solution):

First1ItemNo <- as.vector(
  apply(
  LCvars1, 1, match, x=1
  ))

to make R go through the columns from left to right and record into a vector 
the column number of the first '1' response for each case.

Now, for each case (row), I want R to START with the column that contains the 
first '1' response, and continue to the right and count the number of 
consecutive columns containing '1' responses. At the next '0' or '.', I want R 
to record the count of consecutive '1's, and then skip to the next row and 
begin the process anew.
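
To make the request concrete, here is a small reproducible sketch of the
counting rule described above. It assumes the scores are held as character
values with "." for missing, as in the description; the matrix toy and the
function name run_from_first_one are invented for this example. It uses
cumprod(), which is only one of several possible approaches.

```r
# Count consecutive "1" responses, starting at the first "1" in the row.
# A row with no "1" at all gets a count of 0.
run_from_first_one <- function(x) {
  is_one <- !is.na(x) & x == "1"
  if (!any(is_one)) return(0)
  from_first <- is_one[which(is_one)[1]:length(is_one)]
  # cumprod() stays at 1 until the first FALSE, then drops to 0 for good
  sum(cumprod(from_first))
}

# Hypothetical data: three cases, five items
toy <- rbind(c(".", "0", "1", "1", "."),
             c(".", ".", ".", ".", "."),
             c("1", "1", "1", "0", "1"))

first_one <- apply(toy, 1, match, x = "1")    # column of the first "1"
run_count <- apply(toy, 1, run_from_first_one)
print(cbind(first_one, run_count))
```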

Thanks in advance for your help,

David S. Herzberg, Ph.D.
Vice President, Research and Development
Western Psychological Services
12031 Wilshire Blvd.
Los Angeles, CA 90025-1251
Phone: (310)478-2061 x144
FAX: (310)478-7838
email: dav...@wpspublish.com





Re: [R] Conditional looping over a set of variables in R

2010-10-27 Thread David Herzberg
Peter, thanks for this elegant solution that works well and handles the empty 
cases. However, the vector it returns includes both the row (case) numbers and 
the target result (number of the column of the first "1"). How can I strip out 
the row numbers and leave only the target result?
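
For reference, a minimal sketch of one way to do this. It assumes the "row
numbers" are the names attribute that apply() carries over from the data
frame's row names; the data frame scores and its row names are made up for
illustration.

```r
# Hypothetical data frame with explicit row names
scores <- data.frame(item1 = c(0, 1, NA),
                     item2 = c(1, 1, 0),
                     row.names = c("case1", "case2", "case3"))

first_named <- apply(scores, 1, match, x = 1)  # named by row
first_plain <- unname(first_named)             # same values, names dropped

print(first_named)
print(first_plain)
```

as.vector() has the same effect on a named vector like this one.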

Regards,

David S. Herzberg, Ph.D.
Vice President, Research and Development 
Western Psychological Services
12031 Wilshire Blvd.
Los Angeles, CA 90025-1251
Phone: (310)478-2061 x144
FAX: (310)478-7838
email: dav...@wpspublish.com



-Original Message-
From: Peter Ehlers [mailto:ehl...@ucalgary.ca] 
Sent: Tuesday, October 26, 2010 9:23 AM
To: David Herzberg
Cc: Petr PIKAL; r-help@r-project.org
Subject: Re: [R] Conditional looping over a set of variables in R

I would still recommend

  vector_of_column_number <- apply(yourdata, 1, match, x=1)

as the simplest way if you only want the number of the column that has the 
first 1 or "1" (the call works as is for both numeric and character data). Rows 
which have no 1s will return a value of NA.

Anything wrong with it?

   -Peter Ehlers

On 2010-10-26 07:50, David Herzberg wrote:
>
> Thank you - I will try this solution as well.
>
> Sent via DROID X
>
>
> -Original message-
> From: Petr PIKAL
> To: David Herzberg
> Cc: Adrienne Wootten, 
> "r-help@r-project.org"
> Sent: Tue, Oct 26, 2010 06:43:09 GMT+00:00
> Subject: Re: [R] Conditional looping over a set of variables in R
>
> Hi
>
> r-help-boun...@r-project.org wrote on 25.10.2010 20:41:55:
>
>> Adrienne, there's one glitch when I implement your solution below. 
>> When
> the
>> loop encounters a case with no data at all (that is, all 140 item
> responses
>> are missing), it aborts and prints this error message: " ERROR: 
>> argument
> is
>> of length zero".
>>
>> I wonder if there's a logical condition I could add that would enable 
>> R
> to
>> skip these empty cases and continue executing on the next case that
> contains data.
>>
>> Thanks, Dave
>>
>> David S. Herzberg, Ph.D.
>> Vice President, Research and Development Western Psychological 
>> Services
>> 12031 Wilshire Blvd.
>> Los Angeles, CA 90025-1251
>> Phone: (310)478-2061 x144
>> FAX: (310)478-7838
>> email: dav...@wpspublish.com
>>
>>
>>
>> From: wootten.adrie...@gmail.com [mailto:wootten.adrie...@gmail.com] 
>> On
> Behalf
>> Of Adrienne Wootten
>> Sent: Friday, October 22, 2010 9:09 AM
>> To: David Herzberg
>> Cc: r-help@r-project.org
>> Subject: Re: [R] Conditional looping over a set of variables in R
>>
>> David,
>>
>> here I'm referring to your data as testmat, a matrix of 140 columns 
>> and
> 1500
>> rows, but the same or similar notation can be applied to data frames 
>> in
> R.  If
>> I understand correctly, you are looking for the first response 
>> (column)
> where
>> you got a value of 1.  I'm assuming also that since your missing 
>> values
> are
>> characters then your two numeric values are also characters.  keeping
> all this
>> in mind, try something like this.
>
> If you really only want to know which column in each row has first 
> occurrence of 1 (or any other value)  you can get rid of looping and 
> use other R capabilities.
>
> > set.seed(111)
> > mat<-matrix(sample(1:3, 20, replace=T),5,4)
> > mat
>      [,1] [,2] [,3] [,4]
> [1,]    2    2    2    2
> [2,]    3    1    2    1
> [3,]    2    2    1    3
> [4,]    2    2    1    1
> [5,]    2    1    1    2
> > mat.w<-which(mat==1, arr.ind=T)
> > tapply(mat.w[,2], mat.w[,1], min)
> 2 3 4 5
> 2 3 3 2
> > mat[2, ]<-NA
> > mat
>      [,1] [,2] [,3] [,4]
> [1,]    2    2    2    2
> [2,]   NA   NA   NA   NA
> [3,]    2    2    1    3
> [4,]    2    2    1    1
> [5,]    2    1    1    2
>
> and this approach works smoothly with NA values too
>
> > mat.w<-which(mat==1, arr.ind=T)
> > tapply(mat.w[,2], mat.w[,1], min)
> 3 4 5
> 3 3 2
>
> You can then modify such output, as you have info about columns and
> rows. I am sure there are other, maybe better, options, e.g.
>
> > lll<-as.list(as.data.frame(t(mat)))
> > unlist(lapply(lll, function(x) min(which(x==1))))
>  V1  V2  V3  V4  V5
> Inf Inf   3   3   2
>
> Regards
> Petr
>
>>
>> first = c() # your extra variable which will eventually contain the
> first
>> correct response for each case
>>
>> for(i in 1:nrow(testmat)){
>>
>> c = 1
>>
>> while( c<=ncol(testmat) | testmat[i,c] != "1" ){
>>

Re: [R] Conditional looping over a set of variables in R

2010-10-26 Thread David Herzberg

Thank you - I will try this solution as well.

Sent via DROID X


-Original message-
From: Petr PIKAL 
To: David Herzberg 
Cc: Adrienne Wootten , "r-help@r-project.org" 

Sent: Tue, Oct 26, 2010 06:43:09 GMT+00:00
Subject: Re: [R] Conditional looping over a set of variables in R

Hi

r-help-boun...@r-project.org wrote on 25.10.2010 20:41:55:

> Adrienne, there's one glitch when I implement your solution below. When the
> loop encounters a case with no data at all (that is, all 140 item responses
> are missing), it aborts and prints this error message: " ERROR: argument is
> of length zero".
>
> I wonder if there's a logical condition I could add that would enable R to
> skip these empty cases and continue executing on the next case that
> contains data.
>
> Thanks, Dave
>
> David S. Herzberg, Ph.D.
> Vice President, Research and Development
> Western Psychological Services
> 12031 Wilshire Blvd.
> Los Angeles, CA 90025-1251
> Phone: (310)478-2061 x144
> FAX: (310)478-7838
> email: dav...@wpspublish.com
>
>
>
> From: wootten.adrie...@gmail.com [mailto:wootten.adrie...@gmail.com] On
Behalf
> Of Adrienne Wootten
> Sent: Friday, October 22, 2010 9:09 AM
> To: David Herzberg
> Cc: r-help@r-project.org
> Subject: Re: [R] Conditional looping over a set of variables in R
>
> David,
>
> here I'm referring to your data as testmat, a matrix of 140 columns and 1500
> rows, but the same or similar notation can be applied to data frames in R.  If
> I understand correctly, you are looking for the first response (column) where
> you got a value of 1.  I'm assuming also that since your missing values are
> characters then your two numeric values are also characters.  keeping all this
> in mind, try something like this.

If you really only want to know which column in each row has first
occurrence of 1 (or any other value)  you can get rid of looping and use
other R capabilities.

> set.seed(111)
> mat<-matrix(sample(1:3, 20, replace=T),5,4)
> mat
     [,1] [,2] [,3] [,4]
[1,]    2    2    2    2
[2,]    3    1    2    1
[3,]    2    2    1    3
[4,]    2    2    1    1
[5,]    2    1    1    2
> mat.w<-which(mat==1, arr.ind=T)
> tapply(mat.w[,2], mat.w[,1], min)
2 3 4 5
2 3 3 2
> mat[2, ]<-NA
> mat
     [,1] [,2] [,3] [,4]
[1,]    2    2    2    2
[2,]   NA   NA   NA   NA
[3,]    2    2    1    3
[4,]    2    2    1    1
[5,]    2    1    1    2

and this approach works smoothly with NA values too

> mat.w<-which(mat==1, arr.ind=T)
> tapply(mat.w[,2], mat.w[,1], min)
3 4 5
3 3 2

You can then modify such output, as you have info about columns and
rows. I am sure there are other, maybe better, options, e.g.

> lll<-as.list(as.data.frame(t(mat)))
> unlist(lapply(lll, function(x) min(which(x==1))))
 V1  V2  V3  V4  V5
Inf Inf   3   3   2
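
Since the tapply() result only has entries for rows that contain a 1, it may
help to spread it back over all rows. The following is a sketch of one way to
do that, not part of the original suggestion; the function name
first_one_by_row is invented, and the matrix is retyped by hand from the
printout above.

```r
# Column of the first 1 in every row, NA for rows without a 1
first_one_by_row <- function(mat) {
  first <- rep(NA_integer_, nrow(mat))
  hits <- which(mat == 1, arr.ind = TRUE)   # columns are "row" and "col"
  if (nrow(hits) > 0) {
    firsts <- tapply(hits[, "col"], hits[, "row"], min)
    # names(firsts) holds the row numbers that had at least one 1
    first[as.integer(names(firsts))] <- as.integer(firsts)
  }
  first
}

# The example matrix after row 2 was set to NA
mat <- matrix(c( 2,  2,  2,  2,
                NA, NA, NA, NA,
                 2,  2,  1,  3,
                 2,  2,  1,  1,
                 2,  1,  1,  2), nrow = 5, byrow = TRUE)

res <- first_one_by_row(mat)
print(res)
```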

Regards
Petr

>
> first = c() # your extra variable which will eventually contain the
first
> correct response for each case
>
> for(i in 1:nrow(testmat)){
>
> c = 1
>
> while( c<=ncol(testmat) | testmat[i,c] != "1" ){
>
> if( testmat[i,c] == "1"){
>
> first[i] = c
> break # will exit the while loop once it finds the first correct answer,
and
> then jump to the next case
>
>  } else {
>
> c=c+1 # procede to the next column if not
>
> }
>
> }
>
> }
>
>
> Hope this helps you out a bit.
>
> Adrienne Wootten
> NCSU
>
> On Fri, Oct 22, 2010 at 11:33 AM, David Herzberg <dav...@wpspublish.com> wrote:
> Here's the problem I'm trying to solve in R: I have a data frame that
consists
> of about 1500 cases (rows) of data from kids who took a test of
listening
> comprehension. The columns are their scores (1 = correct, 0 = incorrect,
 . =
> missing) on 140 test items. The items are numbered sequentially and are
> ordered by increasing difficulty as you go from left to right across the

> columns. I want R to go through the data and find the first correct
response
> for each case. Because of basal and ceiling rules, many cases have
missing
> data on many items before the first correct response appears.
>
> For each case, I want R to evaluate the item responses sequentially
starting
> with item 1. If the score is 0 or missing, proceed to the next item and
> evaluate it. If the score is 1, stop the operation for that case, record
the
> item number of that first correct response in a new variable, proceed to
the
> next case, and restart the operation.
>
> In SPSS, this operation would be carried out with LOOP, VECTOR, and DO
IF, as
> follows (assuming the data set is already loaded):
>
> * DECLARE A NEW VARIABLE TO HOLD THE ITEM NUMBER OF THE FIRST CORRECT
> RESPONSE, SET IT EQUAL TO 0.
> numeric LCf

Re: [R] Conditional looping over a set of variables in R

2010-10-25 Thread David Herzberg
Adrienne, there's one glitch when I implement your solution below. When the 
loop encounters a case with no data at all (that is, all 140 item responses are 
missing), it aborts and prints this error message: " ERROR:  argument is of 
length zero".

I wonder if there's a logical condition I could add that would enable R to skip 
these empty cases and continue executing on the next case that contains data.
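
One possible shape for such a condition is sketched below. It is an
assumption-laden illustration, not the code from the thread: it presumes a
character matrix called testmat in which missing responses are either "." or
NA, and it pre-fills the result with NA so that empty cases are simply left
as NA.

```r
# Hypothetical data: "." or NA marks a missing response
testmat <- rbind(c(".", "0", "1", "1"),
                 c(".", ".", ".", "."),
                 c(NA,  NA,  NA,  NA),
                 c("1", "0", "0", "."))

first <- rep(NA_integer_, nrow(testmat))   # NA = no correct response found

for (i in seq_len(nrow(testmat))) {
  for (j in seq_len(ncol(testmat))) {
    # is.na() is checked first so the comparison never yields NA inside if()
    if (!is.na(testmat[i, j]) && testmat[i, j] == "1") {
      first[i] <- j
      break                                # move on to the next case
    }
  }
}

print(first)
```

Looping over seq_len(ncol(testmat)) also keeps the column index from running
past the last column when a row contains no "1".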

Thanks, Dave

David S. Herzberg, Ph.D.
Vice President, Research and Development
Western Psychological Services
12031 Wilshire Blvd.
Los Angeles, CA 90025-1251
Phone: (310)478-2061 x144
FAX: (310)478-7838
email: dav...@wpspublish.com



From: wootten.adrie...@gmail.com [mailto:wootten.adrie...@gmail.com] On Behalf 
Of Adrienne Wootten
Sent: Friday, October 22, 2010 9:09 AM
To: David Herzberg
Cc: r-help@r-project.org
Subject: Re: [R] Conditional looping over a set of variables in R

David,

here I'm referring to your data as testmat, a matrix of 140 columns and 1500 
rows, but the same or similar notation can be applied to data frames in R.  If 
I understand correctly, you are looking for the first response (column) where 
you got a value of 1.  I'm assuming also that since your missing values are 
characters then your two numeric values are also characters.  keeping all this 
in mind, try something like this.

first = c() # your extra variable, which will eventually contain the first
            # correct response for each case

for(i in 1:nrow(testmat)){
  c = 1
  while( c<=ncol(testmat) | testmat[i,c] != "1" ){
    if( testmat[i,c] == "1"){
      first[i] = c
      break # exits the while loop once it finds the first correct answer,
            # then jumps to the next case
    } else {
      c = c+1 # proceed to the next column if not
    }
  }
}


Hope this helps you out a bit.

Adrienne Wootten
NCSU

On Fri, Oct 22, 2010 at 11:33 AM, David Herzberg <dav...@wpspublish.com> wrote:
Here's the problem I'm trying to solve in R: I have a data frame that consists 
of about 1500 cases (rows) of data from kids who took a test of listening 
comprehension. The columns are their scores (1 = correct, 0 = incorrect,  . = 
missing) on 140 test items. The items are numbered sequentially and are ordered 
by increasing difficulty as you go from left to right across the columns. I 
want R to go through the data and find the first correct response for each 
case. Because of basal and ceiling rules, many cases have missing data on many 
items before the first correct response appears.

For each case, I want R to evaluate the item responses sequentially starting 
with item 1. If the score is 0 or missing, proceed to the next item and 
evaluate it. If the score is 1, stop the operation for that case, record the 
item number of that first correct response in a new variable, proceed to the 
next case, and restart the operation.

In SPSS, this operation would be carried out with LOOP, VECTOR, and DO IF, as 
follows (assuming the data set is already loaded):

* DECLARE A NEW VARIABLE TO HOLD THE ITEM NUMBER OF THE FIRST CORRECT RESPONSE, 
SET IT EQUAL TO 0.
numeric LCfirst1.
comp LCfirst1 = 0

* DECLARE A VECTOR TO HOLD THE 140 ITEM RESPONSE VARIABLES.
vector x=LC1a_score to LC140a_score.

* SET UP A LOOP THAT WILL RUN FROM 1 TO 140, AS LONG AS LCfirst1 = 0. "#i" IS 
AN INDEX VARIABLE THAT INCREASES BY 1 EACH TIME THE LOOP RUNS.
loop #i=1 to 140 if (LCfirst1 = 0).

* SET UP A CONDITIONAL TRANSFORMATION THAT IS EVALUATED FOR EACH ELEMENT OF THE 
VECTOR.  THUS, WHEN #i = 1, THE EXPRESSION EVALUATES THE FIRST ELEMENT OF THE 
VECTOR (THAT IS, THE FIRST OF THE 140 ITEM RESPONSES). AS THE LOOP RUNS AND #i 
INCREASES, SUBSEQUENT VECTOR ELEMENTS ARE EVALUATED. THE do if STATEMENT 
RETAINS CONTROL AND KEEPS LOOPING THROUGH THE VECTOR UNTIL A '1' IS ENCOUNTERED.
+ do if x(#i) = 1.

* WHEN A '1' IS ENCOUNTERED, CONTROL PASSES TO THE NEXT STATEMENT, WHICH 
RECODES THE VALUE OF THAT VECTOR ELEMENT TO '99'.
+ comp x(#i) = 99.

* AND THEN CONTROL PASSES TO THE NEXT STATEMENT, WHICH RECODES THE VALUE OF 
LCfirst1 TO THE CURRENT INDEX VALUE, THUS CAPTURING THE ITEM NUMBER OF THE 
FIRST CORRECT RESPONSE FOR THAT CASE. CHANGING THE VALUE OF LCfirst1 ALSO CAUSES 
THE LOOP TO STOP EXECUTING FOR THAT CASE, AND THE PROGRAM MOVES TO THE NEXT 
CASE AND RESTARTS THE LOOP.
+ comp LCfirst1 = #i.
+ end if.
end loop.
exe.

After several hours of trying to translate this procedure to R, I'm stumped. I 
played around with creating a list to hold the item responses variables 
(analogous to 'vector' in SPSS), but when I tried to use the list in an R 
procedure, I kept getting a warning along the lines of  'the list contains > 1 
element, only the first element will be used'. So perhaps a list is not the 
appropriate class to 'hold' these variables?

It seems that some nested arrangement of 'for' 'while' and/or 'lapply' wil

Re: [R] Conditional looping over a set of variables in R

2010-10-23 Thread David Herzberg
Adrienne - this solves the problem nicely. Thanks for your help.


David S. Herzberg, Ph.D.
Vice President, Research and Development
Western Psychological Services
12031 Wilshire Blvd.
Los Angeles, CA 90025-1251
Phone: (310)478-2061 x144
FAX: (310)478-7838
email: dav...@wpspublish.com



From: wootten.adrie...@gmail.com [mailto:wootten.adrie...@gmail.com] On Behalf 
Of Adrienne Wootten
Sent: Friday, October 22, 2010 9:09 AM
To: David Herzberg
Cc: r-help@r-project.org
Subject: Re: [R] Conditional looping over a set of variables in R

David,

here I'm referring to your data as testmat, a matrix of 140 columns and 1500 
rows, but the same or similar notation can be applied to data frames in R.  If 
I understand correctly, you are looking for the first response (column) where 
you got a value of 1.  I'm assuming also that since your missing values are 
characters then your two numeric values are also characters.  keeping all this 
in mind, try something like this.

first = c() # your extra variable, which will eventually contain the first
            # correct response for each case

for(i in 1:nrow(testmat)){
  c = 1
  while( c<=ncol(testmat) | testmat[i,c] != "1" ){
    if( testmat[i,c] == "1"){
      first[i] = c
      break # exits the while loop once it finds the first correct answer,
            # then jumps to the next case
    } else {
      c = c+1 # proceed to the next column if not
    }
  }
}


Hope this helps you out a bit.

Adrienne Wootten
NCSU

On Fri, Oct 22, 2010 at 11:33 AM, David Herzberg <dav...@wpspublish.com> wrote:
Here's the problem I'm trying to solve in R: I have a data frame that consists 
of about 1500 cases (rows) of data from kids who took a test of listening 
comprehension. The columns are their scores (1 = correct, 0 = incorrect,  . = 
missing) on 140 test items. The items are numbered sequentially and are ordered 
by increasing difficulty as you go from left to right across the columns. I 
want R to go through the data and find the first correct response for each 
case. Because of basal and ceiling rules, many cases have missing data on many 
items before the first correct response appears.

For each case, I want R to evaluate the item responses sequentially starting 
with item 1. If the score is 0 or missing, proceed to the next item and 
evaluate it. If the score is 1, stop the operation for that case, record the 
item number of that first correct response in a new variable, proceed to the 
next case, and restart the operation.

In SPSS, this operation would be carried out with LOOP, VECTOR, and DO IF, as 
follows (assuming the data set is already loaded):

* DECLARE A NEW VARIABLE TO HOLD THE ITEM NUMBER OF THE FIRST CORRECT RESPONSE, 
SET IT EQUAL TO 0.
numeric LCfirst1.
comp LCfirst1 = 0

* DECLARE A VECTOR TO HOLD THE 140 ITEM RESPONSE VARIABLES.
vector x=LC1a_score to LC140a_score.

* SET UP A LOOP THAT WILL RUN FROM 1 TO 140, AS LONG AS LCfirst1 = 0. "#i" IS 
AN INDEX VARIABLE THAT INCREASES BY 1 EACH TIME THE LOOP RUNS.
loop #i=1 to 140 if (LCfirst1 = 0).

* SET UP A CONDITIONAL TRANSFORMATION THAT IS EVALUATED FOR EACH ELEMENT OF THE 
VECTOR.  THUS, WHEN #i = 1, THE EXPRESSION EVALUATES THE FIRST ELEMENT OF THE 
VECTOR (THAT IS, THE FIRST OF THE 140 ITEM RESPONSES). AS THE LOOP RUNS AND #i 
INCREASES, SUBSEQUENT VECTOR ELEMENTS ARE EVALUATED. THE do if STATEMENT 
RETAINS CONTROL AND KEEPS LOOPING THROUGH THE VECTOR UNTIL A '1' IS ENCOUNTERED.
+ do if x(#i) = 1.

* WHEN A '1' IS ENCOUNTERED, CONTROL PASSES TO THE NEXT STATEMENT, WHICH 
RECODES THE VALUE OF THAT VECTOR ELEMENT TO '99'.
+ comp x(#i) = 99.

* AND THEN CONTROL PASSES TO THE NEXT STATEMENT, WHICH RECODES THE VALUE OF 
LCfirst1 TO THE CURRENT INDEX VALUE, THUS CAPTURING THE ITEM NUMBER OF THE 
FIRST CORRECT RESPONSE FOR THAT CASE. CHANGING THE VALUE OF LCfirst1 ALSO CAUSES 
THE LOOP TO STOP EXECUTING FOR THAT CASE, AND THE PROGRAM MOVES TO THE NEXT 
CASE AND RESTARTS THE LOOP.
+ comp LCfirst1 = #i.
+ end if.
end loop.
exe.

After several hours of trying to translate this procedure to R, I'm stumped. I 
played around with creating a list to hold the item responses variables 
(analogous to 'vector' in SPSS), but when I tried to use the list in an R 
procedure, I kept getting a warning along the lines of  'the list contains > 1 
element, only the first element will be used'. So perhaps a list is not the 
appropriate class to 'hold' these variables?

It seems that some nested arrangement of 'for' 'while' and/or 'lapply' will 
allow me to recreate the operation described above? How do I set up the 
indexing operation analogous to 'loop #i' in SPSS?

Any help is appreciated, and I'm happy to provide more information if needed.

David S. Herzberg, Ph.D.
Vice President, Research and Development
Western Psychological Services
12031 Wilshire Blvd.
Los

[R] Combining the values of two variables into one

2010-10-22 Thread David Herzberg
I start with:

v1<-c(1,3,5,7)
v2<-c(2,4,6,8)

And I want to end up with:

v3<-c(12,34,56,78)

How do I get there?
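
Two possible routes, sketched under the assumption that every value in v2 is
a single digit (otherwise the arithmetic version would need a different
multiplier). The names v3_arith and v3_paste are only for illustration.

```r
v1 <- c(1, 3, 5, 7)
v2 <- c(2, 4, 6, 8)

# Arithmetic: shift v1 one decimal place to the left, then add v2
v3_arith <- v1 * 10 + v2

# Text: glue the digits together, then convert back to numbers
v3_paste <- as.numeric(paste0(v1, v2))

print(v3_arith)
print(v3_paste)
```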

Thanks,


David S. Herzberg, Ph.D.
Vice President, Research and Development
Western Psychological Services
12031 Wilshire Blvd.
Los Angeles, CA 90025-1251
Phone: (310)478-2061 x144
FAX: (310)478-7838
email: dav...@wpspublish.com





Re: [R] Conditional looping over a set of variables in R

2010-10-22 Thread David Herzberg
Bill, thanks so much for this. I'll get a chance to test it later today, and 
will post the outcome.


David S. Herzberg, Ph.D.
Vice President, Research and Development 
Western Psychological Services
12031 Wilshire Blvd.
Los Angeles, CA 90025-1251
Phone: (310)478-2061 x144
FAX: (310)478-7838
email: dav...@wpspublish.com



-Original Message-
From: William Dunlap [mailto:wdun...@tibco.com] 
Sent: Friday, October 22, 2010 9:52 AM
To: David Herzberg; r-help@r-project.org
Subject: RE: [R] Conditional looping over a set of variables in R

You were a bit vague about the format of your data.
I'm assuming all columns were numeric and the entries are one of 0, 1, and NA 
(missing value).  I made a little function to generate random data of that 
format for testing purposes:

makeData <- function (nrow = 1500, ncol = 140, pMissing = 0.1) {
    # pMissing is the proportion of missing values
    m <- matrix(sample(c(1, 0), size = nrow * ncol, replace = TRUE),
                nrow, ncol)
    m[runif(nrow * ncol) < pMissing] <- NA
    data.frame(m)
}

E.g.,

  > set.seed(168)
  > d <- makeData(15,3)
  > d
      X1 X2 X3
   1   1  1  1
   2   0  0 NA
   3   0  1  0
   4   0  0 NA
   5   0  1  1
   6   0  0 NA
   7   1  0  0
   8   0  1  1
   9   0  0  1
  10   1  1 NA
  11   0  0  1
  12   0  0  0
  13  NA NA NA
  14   0  0  0
  15   1  0  0

I think the following function does what you want.
The algorithm is pretty similar to what you showed.

  columnOfFirstOne <- function(data) {
      # col will be return value, one entry per row of data.
      # Fill it with NA's: NA in output will mean there were no 1's in row
      col <- rep(as.integer(NA), nrow(data))
      for (j in seq_len(ncol(data))) { # loop over columns
          # For each entry in 'col', if it has not been set yet
          # and this entry in the j'th column of data is 1 (and not
          # missing) then set to the column number.
          col[is.na(col) & !is.na(data[, j]) & data[, j] == 1] <- j
      }
      col # return this from function
  }

With the above data we get
  > columnOfFirstOne(d)
   [1]  1 NA  2 NA  2 NA  1  2  3  1  3 NA NA NA  1

It seems quick enough for a dataset of your size
  > dd <- makeData(nrow=1500, ncol=140)
  > system.time(columnOfFirstOne(dd)) # time in seconds
     user  system elapsed 
     0.08    0.00    0.08 
 
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  
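
For comparison, a loop-free variant of columnOfFirstOne above can be sketched with max.col(). This is only an illustration under the same assumption that every entry is 0, 1, or NA; the name firstOneVectorized and the toy data are hypothetical and do not come from the thread.

```r
# firstOneVectorized is a hypothetical name used only for this sketch.
firstOneVectorized <- function(data) {
  m <- as.matrix(data)
  # TRUE only where the entry is a non-missing 1; NA entries become FALSE.
  hit <- !is.na(m) & m == 1
  # max.col() returns, per row, the index of the row maximum; with
  # ties.method = "first" that is the leftmost TRUE when the row has one.
  col <- max.col(hit * 1, ties.method = "first")
  # Rows without any 1 have maximum 0, so mark them as NA explicitly.
  col[rowSums(hit) == 0] <- NA_integer_
  col
}

# Small hand-made example (not the random data shown above).
toy <- data.frame(X1 = c(1, 0, NA, 0),
                  X2 = c(1, NA, NA, 1),
                  X3 = c(0, 1, NA, 1))
firstOneVectorized(toy)
# [1]  1  3 NA  2
```

Like columnOfFirstOne, it returns NA for a row that contains no 1. Whether it is faster in practice would need to be timed on the real data.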


[R] Conditional looping over a set of variables in R

2010-10-22 Thread David Herzberg
Here's the problem I'm trying to solve in R: I have a data frame that consists 
of about 1500 cases (rows) of data from kids who took a test of listening 
comprehension. The columns are their scores (1 = correct, 0 = incorrect,  . = 
missing) on 140 test items. The items are numbered sequentially and are ordered 
by increasing difficulty as you go from left to right across the columns. I 
want R to go through the data and find the first correct response for each 
case. Because of basal and ceiling rules, many cases have missing data on many 
items before the first correct response appears.

For each case, I want R to evaluate the item responses sequentially starting 
with item 1. If the score is 0 or missing, proceed to the next item and 
evaluate it. If the score is 1, stop the operation for that case, record the 
item number of that first correct response in a new variable, proceed to the 
next case, and restart the operation.

In SPSS, this operation would be carried out with LOOP, VECTOR, and DO IF, as 
follows (assuming the data set is already loaded):

* DECLARE A NEW VARIABLE TO HOLD THE ITEM NUMBER OF THE FIRST CORRECT RESPONSE, 
SET IT EQUAL TO 0.
numeric LCfirst1.
comp LCfirst1 = 0.

* DECLARE A VECTOR TO HOLD THE 140 ITEM RESPONSE VARIABLES.
vector x=LC1a_score to LC140a_score.

* SET UP A LOOP THAT WILL RUN FROM 1 TO 140, AS LONG AS LCfirst1 = 0. "#i" IS 
AN INDEX VARIABLE THAT INCREASES BY 1 EACH TIME THE LOOP RUNS.
loop #i=1 to 140 if (LCfirst1 = 0).

* SET UP A CONDITIONAL TRANSFORMATION THAT IS EVALUATED FOR EACH ELEMENT OF THE 
VECTOR.  THUS, WHEN #i = 1, THE EXPRESSION EVALUATES THE FIRST ELEMENT OF THE 
VECTOR (THAT IS, THE FIRST OF THE 140 ITEM RESPONSES). AS THE LOOP RUNS AND #i 
INCREASES, SUBSEQUENT VECTOR ELEMENTS ARE EVALUATED. THE do if STATEMENT 
RETAINS CONTROL AND KEEPS LOOPING THROUGH THE VECTOR UNTIL A '1' IS ENCOUNTERED.
+ do if x(#i) = 1.

* WHEN A '1' IS ENCOUNTERED, CONTROL PASSES TO THE NEXT STATEMENT, WHICH 
RECODES THE VALUE OF THAT VECTOR ELEMENT TO '99'.
+ comp x(#i) = 99.

* AND THEN CONTROL PASSES TO THE NEXT STATEMENT, WHICH RECODES THE VALUE OF 
LCfirst1 TO THE CURRENT INDEX VALUE, THUS CAPTURING THE ITEM NUMBER OF THE 
FIRST CORRECT RESPONSE FOR THAT CASE. CHANGING THE VALUE OF LCfirst1 ALSO 
CAUSES THE LOOP TO STOP EXECUTING FOR THAT CASE, AND THE PROGRAM MOVES TO THE 
NEXT CASE AND RESTARTS THE LOOP.
+ comp LCfirst1 = #i.
+ end if.
end loop.
exe.

After several hours of trying to translate this procedure to R, I'm stumped. I 
played around with creating a list to hold the item response variables 
(analogous to 'vector' in SPSS), but when I tried to use the list in an R 
procedure, I kept getting a warning along the lines of  'the list contains > 1 
element, only the first element will be used'. So perhaps a list is not the 
appropriate class to 'hold' these variables?

Would some nested arrangement of 'for', 'while', and/or 'lapply' allow me to 
recreate the operation described above? How do I set up the indexing 
operation analogous to 'loop #i' in SPSS?
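
A rough, literal translation of the SPSS loop above into R might look like the sketch below. It assumes the item columns have already been read into a data frame with missing responses coded as NA rather than '.', and that the data frame holds only the item columns. The names firstCorrectByLoop, items, and toy are hypothetical. The step that recodes the matched response to 99 is left out.

```r
# firstCorrectByLoop is a hypothetical name; 'items' is assumed to be a data
# frame holding only the item columns, coded 1, 0, or NA (missing).
firstCorrectByLoop <- function(items) {
  first <- rep(0L, nrow(items))          # 0 means "no correct response found"
  for (r in seq_len(nrow(items))) {      # one pass per case (row)
    for (i in seq_len(ncol(items))) {    # plays the role of 'loop #i'
      score <- items[r, i]               # a single value, so if() is safe
      if (!is.na(score) && score == 1) {
        first[r] <- i                    # record the item number
        break                            # stop scanning this case
      }
    }
  }
  first
}

# Small hand-made example with three cases and three items.
toy <- data.frame(LC1 = c(NA, 0, 0), LC2 = c(1, 0, NA), LC3 = c(1, 1, 0))
firstCorrectByLoop(toy)
# [1] 2 3 0
```

R's if() expects a single TRUE or FALSE, which is why the sketch tests one cell at a time. The warning quoted above sounds like what R reports when if() is handed a whole vector or list, although that is a guess without seeing the code that produced it.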

Any help is appreciated, and I'm happy to provide more information if needed.

David S. Herzberg, Ph.D.
Vice President, Research and Development
Western Psychological Services
12031 Wilshire Blvd.
Los Angeles, CA 90025-1251
Phone: (310)478-2061 x144
FAX: (310)478-7838
email: dav...@wpspublish.com


