[R] R : how does %in% operator work?

Moumita Das Tue, 18 Aug 2009 04:34:09 -0700

*Problem-1*



CASE-I---------(works fine)

> var1<-"tom"

> var1

[1] "tom"

>  var1<-as.character(var1)

>  var1

[1] "tom"

>  var2<-c("tom","harry","kate")

> logc<-(var1 %in% var2)

> logc

[1] TRUE

> typeof(var1)

[1] "character"

> typeof(var2)

[1] "character"
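
For reference, a minimal sketch of what `%in%` does in CASE-I. In base R, `x %in% table` is documented as equivalent to `match(x, table, nomatch = 0L) > 0L`, so it returns one logical value per element of the left-hand side. The variable names `by_in`, `by_match` and `several` are only illustrative.

```r
var1 <- "tom"
var2 <- c("tom", "harry", "kate")

# %in% asks, for each element on the left, whether it occurs on the right.
by_in <- var1 %in% var2

# The same test written with match(): a position > 0 means "found".
by_match <- match(var1, var2, nomatch = 0L) > 0L

print(by_in)
print(by_match)

# With a longer left-hand side there is one result per element.
several <- c("kate", "bob") %in% var2
print(several)
```

Both operands here are plain character vectors, which is the situation in which `%in%` behaves as expected.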



*CASE-II---------(doesn't work)*

I have a dynamically generated dataset on which I want to use this %in%
operator, but it's not working.



*predictors_values data frame is shown below:---------------*

       x

2  recmeanC2

3  recmeanC3

4  recmeanC4

5         i1

6         i2

7         i3

8         i4

9         i5

10        i6

11        i7

12        i8

13        i9

14       i10

15       i11

16       i12

17       i13

18       i14

19       i15

*coef_dataframe_rownames data frame is shown below:----*

if (stringsAsFactors) factor(x) else x

1                               recmeanC2

2                               recmeanC3

3                               recmeanC4

4                                      i1

5                                      i2

6                                      i3

7                                      i4

8                                      i5

9                                      i6

10                                     i7

11                                     i8

12                                     i9

13                                    i10

14                                    i12

15                                    i13



*Just pasted a part of my code:--*

predictor<-predictors_values[1,1]

predictor<-as.character(predictor)

predictor<-noquote(predictor)

print("predictor")

print(predictor) ##prints recmeanC1




print("coef_dataframe_rownames")

#coef_dataframe_rownames<-c(coef_dataframe_rownames)

#coef_dataframe_rownames<-c("recmeanC2","recmeanC3"," recmeanC4","i1")
# *only when I hard-coded it this way did I get correct values for logc (you will find logc below)*

names(coef_dataframe_rownames)<-letters[1]

coef_dataframe_rownames<-c(coef_dataframe_rownames)

print(coef_dataframe_rownames)



#prints

[1] "coef_dataframe_rownames"

$a

 [1] recmeanC2 recmeanC3 recmeanC4 i1        i2        i3        i4

 [8] i5        i6        i7        i8        i9        i10       i12

[15] i13

print(typeof(predictor))

print(typeof(coef_dataframe_rownames))

logc<-(predictor %in% coef_dataframe_rownames)

print("logc")

print(logc) # prints FALSE

To make logc<-(predictor %in% coef_dataframe_rownames) work, I have converted
predictor and coef_dataframe_rownames to several different data types, such as
both to vectors, both to data frames, and predictor to character with
coef_dataframe_rownames as a vector. But nothing seems to work.

[ If predictor  is in coef_dataframe_rownames  do  task 1 else task2 ]

Here predictors_values is a data frame of all possible predictors when the
regression for one particular element is to be done, and coef_dataframe_rownames is
the data frame of row names of the coefficients table that was produced as
a result of the regression function.
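
A minimal sketch of the comparison described above, under the assumption that coef_dataframe_rownames is a list holding one factor (which is what the printed `$a` output suggests). The object `coef_names` below is a hypothetical stand-in with made-up contents, not the real data.

```r
# Hypothetical stand-in: a list with a single factor element named "a".
coef_names <- list(a = factor(c("recmeanC2", "recmeanC3", "i1")))
predictor <- "recmeanC2"

# Against the list itself, %in% compares the predictor with each list
# element as a whole, not with the values stored inside that element.
in_list <- predictor %in% coef_names

# Extracting the element and converting it to character compares
# the predictor with the individual names.
in_values <- predictor %in% as.character(coef_names$a)

print(in_list)    # FALSE
print(in_values)  # TRUE
```

If the real object has this shape, extracting the column (for example with `[[1]]` or `$a`) before using `%in%` may be what is missing. Whether this matches the actual structure is an assumption, and `str(coef_dataframe_rownames)` would show it.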

*Problem-2:--*

I wanted something like what is described in Problem-1 because of Problem-2.

If some rows of the coefficients table contain only NAs,
those rows are omitted automatically when I try to access
only the coefficients table like this:



fit<-lm(item_category_table[element_n_predictors_string_to_vector],singular.ok=TRUE)

Coefficients<-summary(fit)$coefficients

Because I am running loops to enter the values of the coefficients table into
database tables, the omission of the all-NA rows is causing
problems. Even if these rows do not have values, I need to populate the
database tables for these particular NA rows of the coefficients table.

*Is there any way to get the full coefficients table without the rows
containing NAs being omitted?*



Print gives this:

[1] "coef_dataframe without intercept"  # I have omitted the intercept, please do not get confused

               Estimate   Std. Error       t value   Pr(>|t|)
recmeanC2  9.275880e-17 6.322780e-17  1.467057e+00 0.14349903
recmeanC3  1.283534e-17 2.080644e-17  6.168929e-01 0.53781390
recmeanC4 -3.079466e-17 2.565499e-17 -1.200338e+00 0.23103743
i1         5.000000e-01 1.036197e-17  4.825338e+16 0.00000000
i2        -5.630739e-18 1.638267e-17 -3.437010e-01 0.73133282
i3         4.291387e-18 1.207522e-17  3.553879e-01 0.72257050
i4         1.472662e-17 1.423051e-17  1.034863e+00 0.30163897
i5         5.000000e-01 1.003323e-17  4.983441e+16 0.00000000
i6         5.147966e-18 1.569095e-17  3.280850e-01 0.74309614
i7         1.096044e-17 1.555829e-17  7.044760e-01 0.48173041
i8        -1.166290e-18 1.287370e-17 -9.059482e-02 0.92788026
i9         1.627371e-17 1.540567e-17  1.056345e+00 0.29173427
i10        4.001692e-18 1.365740e-17  2.930053e-01 0.76973827
i12       -1.052843e-17 1.324484e-17 -7.949081e-01 0.42735000
i13        2.571236e-17 1.357336e-17  1.894325e+00 0.05922715


Whereas summary(fit ) gives:-------------

Coefficients: (3 not defined because of singularities)

              Estimate Std. Error    t value Pr(>|t|)

(Intercept)  2.808e-16  1.579e-17  1.778e+01   <2e-16 ***

recmeanC2    9.276e-17  6.323e-17  1.467e+00   0.1435

recmeanC3    1.283e-17  2.081e-17  6.170e-01   0.5378

recmeanC4   -3.080e-17  2.566e-17 -1.200e+00   0.2310

i1           5.000e-01  1.036e-17  4.825e+16   <2e-16 ***

i2          -5.631e-18  1.638e-17 -3.440e-01   0.7313

i3           4.291e-18  1.207e-17  3.550e-01   0.7226

i4           1.473e-17  1.423e-17  1.035e+00   0.3016

i5           5.000e-01  1.003e-17  4.983e+16   <2e-16 ***

i6           5.148e-18  1.569e-17  3.280e-01   0.7431

i7           1.096e-17  1.556e-17  7.040e-01   0.4817

i8          -1.166e-18  1.287e-17 -9.100e-02   0.9279

i9           1.627e-17  1.541e-17  1.056e+00   0.2917

i10          4.002e-18  1.366e-17  2.930e-01   0.7697

i11                 NA         NA         NA       NA

i12         -1.053e-17  1.325e-17 -7.950e-01   0.4273

i13          2.571e-17  1.357e-17  1.894e+00   0.0592 .

i14                 NA         NA         NA       NA

i15                 NA         NA         NA       NA
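
For Problem-2, one possible approach is sketched here on simulated data rather than on the real item_category_table. It relies on `coef(fit)` keeping an NA entry for every aliased term, while `summary(fit)$coefficients` leaves those rows out. The data frame `d`, the columns `x1`, `x2`, `x3`, `y` and the name `full` are all hypothetical.

```r
set.seed(1)
d <- data.frame(x1 = rnorm(20), x2 = rnorm(20))
d$x3 <- d$x1 + d$x2                      # exactly collinear with x1 and x2
d$y  <- 1 + 2 * d$x1 - d$x2 + rnorm(20)

fit <- lm(y ~ x1 + x2 + x3, data = d, singular.ok = TRUE)
cf  <- summary(fit)$coefficients         # the row for x3 is omitted here

# Build an all-NA matrix with one row per term in coef(fit), then copy in
# the rows that summary() did report; aliased terms stay as NA rows.
full <- matrix(NA_real_,
               nrow = length(coef(fit)), ncol = ncol(cf),
               dimnames = list(names(coef(fit)), colnames(cf)))
full[rownames(cf), ] <- cf

print(full)
```

A loop over `rownames(full)` would then see every predictor, including those whose coefficients are not defined because of singularities.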






I know there are other comparison operators and functions, like
all.equal, identical, compare and setdiff. I do not have the compare function; all.equal
doesn't solve my problem, since it just compares and gives the differences; setdiff also
didn't work, and neither did identical. I know there's a problem with the data in
the dataset coef_dataframe_rownames, because

coef_dataframe_rownames<-c("recmeanC2","recmeanC3"," recmeanC4","i1")    *#only
when I hard-coded it this way did I get correct values for logc*

How should I treat my dataset to get correct values?






-- 
Thanks
Moumita

        [[alternative HTML version deleted]]

______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
