
Re: [R] Remove line from data file

2022-09-19 Thread avi.e.gross

David,

As others have said, there are many possible answers for a vague enough 
question.

For one-time data it is often easiest to simply change the data source as you 
say you did in EXCEL.

Deleting the 18th row can easily be done in R and might make sense if you get 
daily data and decided the 18th reporting station is not reliable and should 
always be excluded. As has been shown, the usual paradigm in R is to filter the 
data through a set of conditions and a very simple one is to specify which 
indices in rows and/or columns to exclude.

If you already have your data in mydata.old, then you can make a mydata.new 
that excludes that 18th row with something as simple as:

mydata.new <- mydata.old[ -18, ]

Since your question was not focused, the answer mentioned that it is common to 
delete based on all kinds of conditions. An example would be if you did not 
want to remove row 18 specifically but any row where a column says the 
collector/reporter of the info was "Smith" which may remove many rows in the 
data, or you wanted only data with a column giving a date in 2021 and not 
before or after.
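
To make the two conditions just mentioned concrete, here is a minimal sketch. The data and the column names (reporter, date, value) are invented for illustration and are not from David's file:

```r
# Hypothetical data; column names and values are assumptions for illustration
mydata.old <- data.frame(
  reporter = c("Smith", "Jones", "Smith", "Lee"),
  date     = as.Date(c("2021-03-01", "2020-12-31", "2021-07-15", "2021-11-02")),
  value    = c(1.2, 3.4, 5.6, 7.8),
  stringsAsFactors = FALSE
)

# Keep every row whose reporter is not "Smith"
no.smith <- mydata.old[mydata.old$reporter != "Smith", ]

# Keep only rows whose date falls in 2021
only.2021 <- mydata.old[format(mydata.old$date, "%Y") == "2021", ]
```

Both results are new data frames; mydata.old itself is left untouched.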

This filter method is not a deletion per se, but a selective retention, and 
often has the same result. If your goal includes making the deletion of 
selected data permanent, of course, it is wise then to save the altered data in 
another file so later use starts with what you want. 

Actually removing a row from an original data.frame is not something people 
normally do. A data.frame is a list of vectors of equal length and generally 
does not allow operations that might produce vectors of unequal length. You can 
set the entire row to be filled with NA if you want but removing the actual row 
in-place is the kind of thing I do not normally see. In a sense, R is wasteful 
that way as you often end up making near-copies of your data and sometimes 
simply re-assigning the result to the same variable and expecting the old 
version to be garbage collected. As I said, the paradigm is selection more than 
alteration/deletion.

It is, of course, possible to create your own data structures where you could 
do something closer to a deletion of a row while leaving the rest in place but 
there likely is no need for your small amount of data. 

Note that columns in R can be deleted easily because each one is a top-level 
entry in a list. mydata$colname <- NULL or similar variants will remove a column 
cleanly from the original data.frame. But as noted, rows in R do not really exist 
other than as a construct that tries to bind the nth elements of each underlying 
vector representing the columns. 
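
A small sketch of that asymmetry, using an invented three-row data.frame (the names id, colname and score are only placeholders):

```r
# Hypothetical data for illustration
mydata <- data.frame(id = 1:3,
                     colname = c("a", "b", "c"),
                     score = c(10, 20, 30),
                     stringsAsFactors = FALSE)

is.list(mydata)          # a data.frame is a list of column vectors

# A column is one list entry, so assigning NULL drops it
mydata$colname <- NULL
remaining <- names(mydata)

# A row is not a stored object; "removing" it builds a new data.frame
# from the selected elements of every column and reassigns the name
mydata <- mydata[-2, ]
```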

Of course we now can have list-columns in things like tibbles which makes me 
wonder ... 





__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Remove line from data file

2022-09-19 Thread Parkhurst, David

Thank you for your reply.  I meant from the dataframe, but that’s one of the 
terms I had forgotten.  I created that from read.csv, the csv file coming from 
Excel.  Last night I went ahead and made the change(s) using Excel.

For future reference, when I look at your solutions below, what do you mean by 
“value to delete”?  Could that just be a row number?  I was wanting to delete 
something like the 18th row in the dataframe?

From: CALUM POLWART 
Date: Sunday, September 18, 2022 at 7:25 AM
To: Parkhurst, David 
Cc: R-help@r-project.org 
Subject: Re: [R] Remove line from data file
From the file? Or the data frame once its loaded?

What format is the file? CSV?

Do you know the line that needs deleted?

mydf <- read.csv("myfile.csv")

mydf2 <- mydf[mydf$columnName != "valuetodelete", ]
# Note the != : this keeps every row whose columnName is not the value to delete

write.csv(mydf2, "mydeletedfile.csv")

On Sun, 18 Sep 2022, 10:33 Parkhurst, David <parkh...@indiana.edu> wrote:
I’ve been retired since ‘06 and have forgotten most of R.  Now I have a use for 
it.  I’ve created a data file and need to delete one row from it.  How do I do 
that?

DFP (iPad)

Re: [R] Remove line from data file

2022-09-18 Thread CALUM POLWART

If you want to delete row 18 you can do

mydf <- mydf[-18,]

This selects all rows other than row 18, and all columns and 'saves' it
back to the original data frame. Many people prefer to allocate to a new
dataframe so that if the -18 is wrong they can simply fix things.

I wasn't sure if you knew the row you wanted to delete. Suppose you had a data
frame with names and ages in it. David Parkhurst might be row 18 and you might
want to delete him. But if the data is ever updated, he may move to a different
row.  If you always want to delete David, it would be better to do:

mydf2 <- mydf[mydf$name != "David Parkhurst", ]
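
As a rough illustration of deleting by position versus by condition, here is a sketch on a made-up three-row frame (the names and ages are invented). It also shows one pitfall of the negative-index style when nothing matches:

```r
# Hypothetical data for illustration
mydf <- data.frame(name = c("Ann", "David Parkhurst", "Cy"),
                   age  = c(30, 70, 50),
                   stringsAsFactors = FALSE)

by.position <- mydf[-2, ]                               # drop the 2nd row
by.name     <- mydf[mydf$name != "David Parkhurst", ]   # drop by condition

# Pitfall: which() returns integer(0) when nothing matches, and
# indexing with -integer(0) selects no rows at all
nobody <- which(mydf$name == "Nobody")
wrong  <- mydf[-nobody, ]                 # zero rows, not the full frame
right  <- mydf[mydf$name != "Nobody", ]   # all three rows kept
```

The logical condition behaves sensibly whether or not anything matches, which is one reason it tends to be the safer habit.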




On Sun, 18 Sep 2022, 13:48 Parkhurst, David,  wrote:

> Thank you for your reply.  I meant from the dataframe, but that’s one of
> the terms I had forgotten.  I created that from read.csv, the csv file
> coming from Excel.  Last night I went ahead and made the change(s) using
> Excel.
>
>
>
> For future reference, when I look at your solutions below, what do you
> mean by “value to delete”?  Could that just be a row number?  I was wanting
> to delete something like the 18th row in the dataframe?
>
>
>
> *From: *CALUM POLWART 
> *Date: *Sunday, September 18, 2022 at 7:25 AM
> *To: *Parkhurst, David 
> *Cc: *R-help@r-project.org 
> *Subject: *Re: [R] Remove line from data file
>
> From the file? Or the data frame once its loaded?
>
>
>
> What format is the file? CSV?
>
>
>
> Do you know the line that needs deleted?
>
>
>
> mydf <- read.csv("myfile.csv")
>
>
>
> mydf2 <- mydf[mydf$columnName != "valuetodelete", ]
>
> # Note the != : this keeps every row whose columnName is not the value to delete
>
>
>
> write.csv(mydf2, "mydeletedfile.csv")
>
>
>
>
>
>
>
>
>
> On Sun, 18 Sep 2022, 10:33 Parkhurst, David,  wrote:
>
> I’ve been retired since ‘06 and have forgotten most of R.  Now I have a
> use for it.  I’ve created a data file and need to delete one row from it.
> How do I do that?
>
> DFP (iPad)

Re: [R] Remove line from data file

2022-09-18 Thread CALUM POLWART

From the file? Or the data frame once it's loaded?

What format is the file? CSV?

Do you know the line that needs deleted?

mydf <- read.csv("myfile.csv")

mydf2 <- mydf[mydf$columnName != "valuetodelete", ]
# Note the != : this keeps every row whose columnName is not the value to delete

write.csv(mydf2, "mydeletedfile.csv")
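
For a version that can be run end to end without any existing file, here is a sketch that uses temporary files and invented data (the name and age columns are assumptions). Passing row.names = FALSE stops write.csv from adding a column of row names to the output:

```r
# Create a small hypothetical CSV so the example is self-contained
csv_in <- tempfile(fileext = ".csv")
write.csv(data.frame(name = c("Ann", "Bob", "Cy"), age = c(30, 40, 50)),
          csv_in, row.names = FALSE)

mydf  <- read.csv(csv_in, stringsAsFactors = FALSE)
mydf2 <- mydf[mydf$name != "Bob", ]      # keep everything except "Bob"

csv_out <- tempfile(fileext = ".csv")
write.csv(mydf2, csv_out, row.names = FALSE)

# Read the result back to confirm what was saved
check <- read.csv(csv_out, stringsAsFactors = FALSE)
```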




On Sun, 18 Sep 2022, 10:33 Parkhurst, David,  wrote:

> I’ve been retired since ‘06 and have forgotten most of R.  Now I have a
> use for it.  I’ve created a data file and need to delete one row from it.
> How do I do that?
>
> DFP (iPad)

[R] Remove line from data file

2022-09-18 Thread Parkhurst, David

I’ve been retired since ‘06 and have forgotten most of R.  Now I have a use for 
it.  I’ve created a data file and need to delete one row from it.  How do I do 
that?

DFP (iPad)

Re: [R] Remove

2022-01-29 Thread Rui Barradas


Hello,

This question is repeated [1].

[1] https://stat.ethz.ch/pipermail/r-help/2022-January/473663.html

Rui Barradas

At 02:20 on 29/01/2022, Val wrote:

Hi All,

I want to remove row(s) that contain a character string in an integer column
or a digit in a character column

Sample data
dat1 <-read.table(text="Name, Age, Weight
  Alex,  20,  13X
  Bob,   25,  142
  Carol, 24,  120
  John,  3BC,  175
  Katy,  35,  160
  Jack3, 34,  140",sep=",",header=TRUE,stringsAsFactors=F)

If the Age/Weight column contains any character(s) then remove
if the Name  column contains any  digit then remove that row

Desired output

    Name Age Weight
1    Bob  25    142
2  Carol  24    120
3   Katy  35    160
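
One possible way to get that output, offered as a sketch rather than as the answer given in the earlier thread: read the sample with strip.white = TRUE (an addition to the original call) and keep only rows where Name has no digit and Age and Weight consist solely of digits.

```r
dat1 <- read.table(text = "Name, Age, Weight
  Alex,  20,  13X
  Bob,   25,  142
  Carol, 24,  120
  John,  3BC,  175
  Katy,  35,  160
  Jack3, 34,  140", sep = ",", header = TRUE,
  stringsAsFactors = FALSE, strip.white = TRUE)

# TRUE for rows that pass all three checks
ok <- !grepl("[0-9]", dat1$Name) &
      grepl("^[0-9]+$", dat1$Age) &
      grepl("^[0-9]+$", dat1$Weight)

dat2 <- dat1[ok, ]
dat2$Age    <- as.integer(dat2$Age)      # safe now: only digits remain
dat2$Weight <- as.integer(dat2$Weight)
rownames(dat2) <- NULL                   # renumber rows 1..n
```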

Thank you,


Re: [R] Remove all factor levels from an R dataframe

2020-11-10 Thread Eric Berger

Hi John,
I was thinking that you created df1 in a way that set the 'year'
column as a factor when this is not what you wanted to do.
The data.frame() function takes an argument stringsAsFactors which
controls this behavior.
For R versions 3.6.3 or earlier, the default setting is
stringsAsFactors=TRUE, which means that string columns automatically
become factors.
You have to specify stringsAsFactors=FALSE to avoid this. (In R 4.0.x
the default was changed to FALSE.)

Example:
df1 <- data.frame( a=letters[1:10], stringsAsFactors=FALSE )
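
A slightly longer sketch of the same point, with made-up year values, showing the column class under each setting and a descending sort once the column is plain character:

```r
# Toy values, invented for illustration
years <- c("2007", "2010", "2015", "2000")

with.factors    <- data.frame(year = years, stringsAsFactors = TRUE)
without.factors <- data.frame(year = years, stringsAsFactors = FALSE)

class.with    <- class(with.factors$year)       # factor
class.without <- class(without.factors$year)    # character

# Sorting a character column in descending order
sorted <- without.factors[order(without.factors$year, decreasing = TRUE), ,
                          drop = FALSE]
```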

HTH,
Eric

On Tue, Nov 10, 2020 at 11:16 AM Jim Lemon  wrote:
>
> Sure John,
>
> df1<-df1[order(as.character(df1$year),decreasing=TRUE),]
>
> Jim
>
> On Tue, Nov 10, 2020 at 8:05 PM John  wrote:
>
> > Thanks Jim. Can we do descending order?
> >
> > Jim Lemon  於 2020年11月10日 週二 下午4:56寫道：
> >
> >> Hi John,
> >>
> > >> df1[]<-lapply(df1,as.character)
> >>
> >> Should do what you ask. The error message probably means that you should
> >> do this:
> >>
> >> df1<-df1[order(as.character(df1$year)),]
> >>
> >> as "year" is the name of the first column in df1, not a separate object.
> >>
> >> Jim
> >>
> >> On Tue, Nov 10, 2020 at 6:57 PM John  wrote:
> >>
> >>> Hi,
> >>>
> >>>I would like to sort the following simple dataframe by "year"
> >>> (characters), but the factor structure prevents me from doing so. How
> >>> can I
> >>> remove the factor structure? Thanks!
> >>>
> >>> > df1
> >>>   year  country
> >>> 4 2007 Asia; survey
> >>> 5 2010 8 countries in E/SE Asia
> >>> 6 2015Ghana
> >>> 7
> >>> 8 2000  US?
> >>> > str(df1)
> >>> 'data.frame': 5 obs. of  2 variables:
> >>>  $ year   : Factor w/ 9 levels "2017","2016",..: 4 5 3 6 7
> >>>  $ country: Factor w/ 9 levels "Euro Area\\newline Testing the MP
> >>> performance of the Euro Area",..: 4 5 6 7 8
> >>> > df1[order(-year), ]
> >>> Error in order(-year) : object 'year' not found
> >>>

Re: [R] Remove all factor levels from an R dataframe

2020-11-10 Thread Jim Lemon

Sure John,

df1<-df1[order(as.character(df1$year),decreasing=TRUE),]

Jim

On Tue, Nov 10, 2020 at 8:05 PM John  wrote:

> Thanks Jim. Can we do descending order?
>
> Jim Lemon  於 2020年11月10日 週二 下午4:56寫道：
>
>> Hi John,
>>
>> df1[]<-lapply(df1,as.character)
>>
>> Should do what you ask. The error message probably means that you should
>> do this:
>>
>> df1<-df1[order(as.character(df1$year)),]
>>
>> as "year" is the name of the first column in df1, not a separate object.
>>
>> Jim
>>
>> On Tue, Nov 10, 2020 at 6:57 PM John  wrote:
>>
>>> Hi,
>>>
>>>I would like to sort the following simple dataframe by "year"
>>> (characters), but the factor structure prevents me from doing so. How
>>> can I
>>> remove the factor structure? Thanks!
>>>
>>> > df1
>>>   year  country
>>> 4 2007 Asia; survey
>>> 5 2010 8 countries in E/SE Asia
>>> 6 2015Ghana
>>> 7
>>> 8 2000  US?
>>> > str(df1)
>>> 'data.frame': 5 obs. of  2 variables:
>>>  $ year   : Factor w/ 9 levels "2017","2016",..: 4 5 3 6 7
>>>  $ country: Factor w/ 9 levels "Euro Area\\newline Testing the MP
>>> performance of the Euro Area",..: 4 5 6 7 8
>>> > df1[order(-year), ]
>>> Error in order(-year) : object 'year' not found
>>>

Re: [R] Remove all factor levels from an R dataframe

2020-11-10 Thread John

Thanks Jim. Can we do descending order?

Jim Lemon  於 2020年11月10日 週二 下午4:56寫道：

> Hi John,
>
> df1[]<-lapply(df1,as.character)
>
> Should do what you ask. The error message probably means that you should
> do this:
>
> df1<-df1[order(as.character(df1$year)),]
>
> as "year" is the name of the first column in df1, not a separate object.
>
> Jim
>
> On Tue, Nov 10, 2020 at 6:57 PM John  wrote:
>
>> Hi,
>>
>>I would like to sort the following simple dataframe by "year"
>> (characters), but the factor structure prevents me from doing so. How can
>> I
>> remove the factor structure? Thanks!
>>
>> > df1
>>   year  country
>> 4 2007 Asia; survey
>> 5 2010 8 countries in E/SE Asia
>> 6 2015Ghana
>> 7
>> 8 2000  US?
>> > str(df1)
>> 'data.frame': 5 obs. of  2 variables:
>>  $ year   : Factor w/ 9 levels "2017","2016",..: 4 5 3 6 7
>>  $ country: Factor w/ 9 levels "Euro Area\\newline Testing the MP
>> performance of the Euro Area",..: 4 5 6 7 8
>> > df1[order(-year), ]
>> Error in order(-year) : object 'year' not found
>>

Re: [R] Remove all factor levels from an R dataframe

2020-11-10 Thread Jim Lemon

Hi John,

df1[]<-lapply(df1,as.character)

Should do what you ask. The error message probably means that you should do
this:

df1<-df1[order(as.character(df1$year)),]

as "year" is the name of the first column in df1, not a separate object.
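
Here is a sketch of converting existing factor columns in place, on a stand-in for df1 built from the rows visible in the post (the empty row is omitted and the factor levels are invented). It also shows why as.integer() applied directly to a factor is a trap: it returns the internal level codes, not the years.

```r
# Stand-in for df1; levels are assumptions for illustration
df1 <- data.frame(
  year    = factor(c("2007", "2010", "2015", "2000")),
  country = factor(c("Asia; survey", "8 countries in E/SE Asia",
                     "Ghana", "US?"))
)

# as.integer() on a factor gives level codes, not the printed values
codes  <- as.integer(df1$year)
values <- as.integer(as.character(df1$year))

# Convert every factor column to character, keeping the data.frame structure
is.fac <- vapply(df1, is.factor, logical(1))
df1[is.fac] <- lapply(df1[is.fac], as.character)

# Now an ordinary sort by year works
df1 <- df1[order(df1$year), ]
```

Assigning into df1[is.fac] keeps df1 a data.frame, so df1$year is still available for the sort.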

Jim

On Tue, Nov 10, 2020 at 6:57 PM John  wrote:

> Hi,
>
>I would like to sort the following simple dataframe by "year"
> (characters), but the factor structure prevents me from doing so. How can I
> remove the factor structure? Thanks!
>
> > df1
>   year  country
> 4 2007 Asia; survey
> 5 2010 8 countries in E/SE Asia
> 6 2015Ghana
> 7
> 8 2000  US?
> > str(df1)
> 'data.frame': 5 obs. of  2 variables:
>  $ year   : Factor w/ 9 levels "2017","2016",..: 4 5 3 6 7
>  $ country: Factor w/ 9 levels "Euro Area\\newline Testing the MP
> performance of the Euro Area",..: 4 5 6 7 8
> > df1[order(-year), ]
> Error in order(-year) : object 'year' not found
>

[R] Remove all factor levels from an R dataframe

2020-11-09 Thread John

Hi,

   I would like to sort the following simple dataframe by "year"
(characters), but the factor structure prevents me from doing so. How can I
remove the factor structure? Thanks!

> df1
  year                  country
4 2007             Asia; survey
5 2010 8 countries in E/SE Asia
6 2015                    Ghana
7
8 2000                      US?
> str(df1)
'data.frame': 5 obs. of  2 variables:
 $ year   : Factor w/ 9 levels "2017","2016",..: 4 5 3 6 7
 $ country: Factor w/ 9 levels "Euro Area\\newline Testing the MP
performance of the Euro Area",..: 4 5 6 7 8
> df1[order(-year), ]
Error in order(-year) : object 'year' not found


Re: [R] remove a row

2019-11-28 Thread Bert Gunter

Of course! Use regexec() and regmatches()

> regmatches(dat$varx,regexec("(^[[:digit:]]{1,3})([[:alpha:]]{1,2})([[:digit:]]{1,5}$)",dat$varx))
[[1]]
[1] "9F209" "9" "F" "209"

[[2]]
character(0)

[[3]]
[1] "2F250" "2" "F" "250"

[[4]]
character(0)

[[5]]
character(0)

[[6]]
character(0)

[[7]]
character(0)

[[8]]
[1] "121FL50" "121" "FL"  "50"

The list components are character(0) for no match, otherwise a character
vector with the whole text entry first, then the 1st, 2nd, and 3rd strings
matching the 1st, 2nd, and 3rd parenthesized subexpressions of the pattern.
These correspond to area code, region code, and your 3rd numeric of course.
I leave it to you to extract what you want from this list, e.g via lapply().
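
One way that extraction might look, as a sketch only: the new column names area, region and number are invented here, and the pattern is the same one as above with the anchors moved outside the groups.

```r
dat <- read.table(text = " rown  varx
1   9F209
2  FL250
3  2F250
4  102250
5  102FL
6   102
7  1212FL250
8  121FL50", header = TRUE, stringsAsFactors = FALSE)

pattern <- "^([[:digit:]]{1,3})([[:alpha:]]{1,2})([[:digit:]]{1,5})$"
m <- regmatches(dat$varx, regexec(pattern, dat$varx))

# Matching entries have 4 elements (whole match + 3 groups); others have 0
keep  <- lengths(m) == 4
parts <- do.call(rbind, m[keep])     # one row per matching entry

result <- data.frame(rown   = dat$rown[keep],
                     varx   = parts[, 1],
                     area   = as.integer(parts[, 2]),
                     region = parts[, 3],
                     number = as.integer(parts[, 4]),
                     stringsAsFactors = FALSE)
```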

For details, see the Help pages for the two functions.

-- Bert


Re: [R] remove a row

2019-11-28 Thread Ashta

Thank you so much Bert.

Is it possible to split varx into three separate variables (area code, region,
and the numeric part)?

On Thu, Nov 28, 2019 at 7:31 PM Bert Gunter  wrote:
>
> Use regular expressions.
>
> See ?regexp  and ?grep
>
> Using your example:
>
> > grep("^[[:digit:]]{1,3}[[:alpha:]]{1,2}[[:digit:]]{1,5}$",dat$varx,value = 
> > TRUE)
> [1] "9F209"   "2F250"   "121FL50"
>
> Cheers,
> Bert
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and 
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Thu, Nov 28, 2019 at 3:17 PM Ashta  wrote:
>>
>> Hi all,  I want to remove a row based on a condition in one of the
>> variables from a data frame.
>> When we split this string it should be composed of 3-2- 5 format (3
>> digits numeric, 2 characters and 5 digits  numeric).  Like
>> area code -region-numeric. The max length of the area code should be
>> 3, the  max length of region be should be 2,  followed by a max length
>> of  5  numeric digits.  The are code  can  be 1 digit, or 2 digits or
>> 3 digits  but not more than three digits.  So  the  max length of this
>> variable is 10.  Anything outside of this pattern should be excluded.
>> As an example
>>
>> dat <-read.table(text=" rown  varx
>> 1   9F209
>> 2  FL250
>> 3  2F250
>> 4  102250
>> 5  102FL
>> 6   102
>> 7  1212FL250
>> 8  121FL50",header=TRUE,stringsAsFactors=F)
>>
>> 1  9F209   # keep
>> 2  FL250   # remove, no area code
>> 3   2F250  # keep
>> 4  102250 # remove , no region code
>> 5  102FL   # remove , no numeric after region code
>> 6   102  # remove ,  no region code and numeric
>> 7  1212FL250  #remove, area code is more than three digits
>> 8  121FL50  # Keep
>>
>> The desired output should be
>> 1   9F209
>> 3   2F250
>> 8  121FL50
>>
>> How do I do this in an efficient way?
>>
>> Thank you in advance
>>

Re: [R] remove a row

2019-11-28 Thread Bert Gunter

Use regular expressions.

See ?regexp  and ?grep

Using your example:

> grep("^[[:digit:]]{1,3}[[:alpha:]]{1,2}[[:digit:]]{1,5}$",dat$varx,value = TRUE)
[1] "9F209"   "2F250"   "121FL50"
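
Since the question asked for rows to be removed from the data frame rather than for the matching strings, the same pattern can be used with grepl() as a row filter. A minimal sketch on the posted sample:

```r
dat <- read.table(text = " rown  varx
1   9F209
2  FL250
3  2F250
4  102250
5  102FL
6   102
7  1212FL250
8  121FL50", header = TRUE, stringsAsFactors = FALSE)

pattern <- "^[[:digit:]]{1,3}[[:alpha:]]{1,2}[[:digit:]]{1,5}$"

# grepl() returns one TRUE/FALSE per row, which selects the rows to keep
dat.kept <- dat[grepl(pattern, dat$varx), ]
```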

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Nov 28, 2019 at 3:17 PM Ashta  wrote:

> Hi all,  I want to remove a row based on a condition in one of the
> variables from a data frame.
> When we split this string it should be composed of 3-2- 5 format (3
> digits numeric, 2 characters and 5 digits  numeric).  Like
> area code -region-numeric. The max length of the area code should be
> 3, the  max length of region be should be 2,  followed by a max length
> of  5  numeric digits.  The are code  can  be 1 digit, or 2 digits or
> 3 digits  but not more than three digits.  So  the  max length of this
> variable is 10.  Anything outside of this pattern should be excluded.
> As an example
>
> dat <-read.table(text=" rown  varx
> 1   9F209
> 2  FL250
> 3  2F250
> 4  102250
> 5  102FL
> 6   102
> 7  1212FL250
> 8  121FL50",header=TRUE,stringsAsFactors=F)
>
> 1  9F209   # keep
> 2  FL250   # remove, no area code
> 3   2F250  # keep
> 4  102250 # remove , no region code
> 5  102FL   # remove , no numeric after region code
> 6   102  # remove ,  no region code and numeric
> 7  1212FL250  #remove, area code is more than three digits
> 8  121FL50  # Keep
>
> The desired output should be
> 1   9F209
> 3   2F250
> 8  121FL50
>
> How do I do this in an efficient way?
>
> Thank you in advance
>

[R] remove a row

2019-11-28 Thread Ashta

Hi all,  I want to remove rows from a data frame based on a condition on one
of its variables.
When we split this string it should follow a 3-2-5 format (up to 3
numeric digits, up to 2 letters, and up to 5 numeric digits), i.e.
area code - region - numeric. The max length of the area code should be
3, the max length of the region should be 2, followed by a max length
of 5 numeric digits.  The area code can be 1, 2, or 3 digits
but not more than three digits.  So the max length of this
variable is 10.  Anything outside of this pattern should be excluded.
As an example

dat <-read.table(text=" rown  varx
1   9F209
2  FL250
3  2F250
4  102250
5  102FL
6   102
7  1212FL250
8  121FL50",header=TRUE,stringsAsFactors=F)

1  9F209   # keep
2  FL250   # remove, no area code
3   2F250  # keep
4  102250 # remove , no region code
5  102FL   # remove , no numeric after region code
6   102  # remove ,  no region code and numeric
7  1212FL250  #remove, area code is more than three digits
8  121FL50  # Keep

The desired output should be
1   9F209
3   2F250
8  121FL50

How do I do this in an efficient way?

Thank you in advance


Re: [R] Remove highly correlated variables from a data frame or matrix

2019-11-16 Thread Ana Marija

Hi Peter,

Thank you so much!!! I will use complete linkage clustering because the
Mendelian Randomization function
(https://cran.r-project.org/web/packages/MendelianRandomization/vignettes/Vignette_MR.pdf)
I plan to use allows for correlations, but not as high as 0.9 or more.
I got 40 SNPs out of 246, so that is an improvement!

Regards,
Ana

On Fri, Nov 15, 2019 at 8:01 PM Peter Langfelder
 wrote:
>
> Try hclust(as.dist(1-calc.rho), method = "average").
>
> Peter
>
> On Fri, Nov 15, 2019 at 10:02 AM Ana Marija  
> wrote:
> >
> > HI Peter,
> >
> > Thank you for getting back to me and shedding light on this. I see
> > your point, doing Jim's method:
> >
> > > keeprows<-apply(calc.rho,1,function(x) return(sum(x>0.8)<3))
> > > ro246.lt.8<-calc.rho[keeprows,keeprows]
> > > ro246.lt.8[ro246.lt.8 == 1] <- NA
> > > (mmax <- max(abs(ro246.lt.8), na.rm=TRUE))
> > [1] 0.566
> >
> > Which is good in general; correlations in my matrix should not
> > exceed 0.8. I need to run Mendelian Randomization on it later on, so
> > I cannot have highly correlated SNPs there. But with Jim's
> > method I am only left with 17 SNPs (out of 246), which means that
> > both members of each highly correlated pair are removed, and it would
> > be good to keep one of those highly correlated ones.
> >
> > I tried to do your code:
> > > tree = hclust(1-calc.rho, method = "average")
> > Error in if (is.na(n) || n > 65536L) stop("size cannot be NA nor
> > exceed 65536") :
> >   missing value where TRUE/FALSE needed
> >
> > Please advise.
> >
> > Thanks
> > Ana
> >
> > On Thu, Nov 14, 2019 at 7:37 PM Peter Langfelder
> >  wrote:
> > >
> > > I suspect that you want to identify which variables are highly
> > > correlated, and then keep only "representative" variables, i.e.,
> > > remove redundant ones. This is a bit of a risky procedure but I have
> > > done such things before as well sometimes to simplify large sets of
> > > highly related variables. If your threshold of 0.8 is approximate, you
> > > could simply use average linkage hierarchical clustering with
> > > dissimilarity = 1-correlation, cut the tree at the appropriate height
> > > (1-0.8=0.2), and from each cluster keep a single representative (e.g.,
> > > the one with the highest mean correlation with other members of the
> > > cluster). Something along these lines (untested)
> > >
> > > tree = hclust(1-calc.rho, method = "average")
> > > clusts = cutree(tree, h = 0.2)
> > > clustLevels = sort(unique(clusts))
> > > representatives = unlist(lapply(clustLevels, function(cl)
> > > {
> > >   inClust = which(clusts==cl);
> > >   rho1 = calc.rho[inClust, inClust, drop = FALSE];
> > >   repr = inClust[ which.max(colSums(rho1)) ]
> > >   repr
> > > }))
> > >
> > > the variable representatives now contains indices of the variables you
> > > want to retain, so you could subset the calc.rho matrix as
> > > rho.retained = calc.rho[representatives, representatives]
> > >
> > > I haven't tested the code and it may contain bugs, but something along
> > > these lines should get you where you want to be.
> > >
> > > Oh, and depending on how strict you want to be with the remaining
> > > correlations, you could use complete linkage clustering (will retain
> > > more variables, some correlations will be above 0.8).
> > >
> > > Peter
> > >
> > > On Thu, Nov 14, 2019 at 10:50 AM Ana Marija  
> > > wrote:
> > > >
> > > > Hello,
> > > >
> > > > I have a data frame like this (a matrix):
> > > > head(calc.rho)
> > > > rs9900318 rs8069906 rs9908521 rs9908336 rs9908870 rs9895995
> > > > rs56192520  0.903 0.268 0.327 0.327 0.327 0.582
> > > > rs3764410   0.928 0.276 0.336 0.336 0.336 0.598
> > > > rs145984817 0.975 0.309 0.371 0.371 0.371 0.638
> > > > rs1807401   0.975 0.309 0.371 0.371 0.371 0.638
> > > > rs1807402   0.975 0.309 0.371 0.371 0.371 0.638
> > > > rs35350506  0.975 0.309 0.371 0.371 0.371 0.638
> > > >
> > > > > dim(calc.rho)
> > > > [1] 246 246
> > > >
> > > > I would like to remove from this data all highly correlated variables,
> > > > with correlation more than 0.8
> > > >
> > > > I tried this:
> > > >
> > > > > data<- calc.rho[,!apply(calc.rho,2,function(x) any(abs(x) > 0.80))]
> > > > > dim(data)
> > > > [1] 246   0
> > > >
> > > > Can you please advise,
> > > >
> > > > Thanks
> > > > Ana
> > > >
> > > > But this removes everything.
> > > >
> > > > __
> > > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > PLEASE do read the posting guide 
> > > > http://www.R-project.org/posting-guide.html
> > > > and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help

Re: [R] Remove highly correlated variables from a data frame or matrix

2019-11-15 Thread Peter Langfelder

Try hclust(as.dist(1-calc.rho), method = "average").
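
A minimal sketch of the whole procedure with that correction applied,
assuming calc.rho is a symmetric correlation matrix with matching row
and column names. The toy matrix and the names rsA to rsD are made up
for illustration; as.dist() tells hclust() to treat the input as
pairwise dissimilarities rather than as raw data.

```r
# Toy symmetric correlation matrix (hypothetical values, not the real data)
snps <- c("rsA", "rsB", "rsC", "rsD")
calc.rho <- matrix(c(1.00, 0.95, 0.10, 0.20,
                     0.95, 1.00, 0.15, 0.25,
                     0.10, 0.15, 1.00, 0.30,
                     0.20, 0.25, 0.30, 1.00),
                   nrow = 4, byrow = TRUE, dimnames = list(snps, snps))

# hclust() needs a "dist" object, hence as.dist()
tree <- hclust(as.dist(1 - calc.rho), method = "average")
clusts <- cutree(tree, h = 0.2)

# From each cluster keep the member with the largest summed correlation
representatives <- unlist(lapply(sort(unique(clusts)), function(cl) {
  inClust <- which(clusts == cl)
  rho1 <- calc.rho[inClust, inClust, drop = FALSE]
  inClust[which.max(colSums(rho1))]
}))

rho.retained <- calc.rho[representatives, representatives]
```

In this toy case rsA and rsB end up in the same cluster, so only one
of them survives in rho.retained.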

Peter

On Fri, Nov 15, 2019 at 10:02 AM Ana Marija  wrote:
>
> Hi Peter,
>
> Thank you for getting back to me and shedding light on this. I see
> your point, doing Jim's method:
>
> > keeprows<-apply(calc.rho,1,function(x) return(sum(x>0.8)<3))
> > ro246.lt.8<-calc.rho[keeprows,keeprows]
> > ro246.lt.8[ro246.lt.8 == 1] <- NA
> > (mmax <- max(abs(ro246.lt.8), na.rm=TRUE))
> [1] 0.566
>
> Which is good in general: correlations in my matrix should not
> exceed 0.8. I need to run Mendelian Randomization on it later on, so
> I cannot have highly correlated SNPs in there. But with Jim's
> method I am left with only 17 SNPs (out of 246), which means that
> both members of each highly correlated pair are removed, and it would
> be good to keep one of them.
>
> I tried to run your code:
> > tree = hclust(1-calc.rho, method = "average")
> Error in if (is.na(n) || n > 65536L) stop("size cannot be NA nor
> exceed 65536") :
>   missing value where TRUE/FALSE needed
>
> Please advise.
>
> Thanks
> Ana
>
> On Thu, Nov 14, 2019 at 7:37 PM Peter Langfelder
>  wrote:
> >
> > I suspect that you want to identify which variables are highly
> > correlated, and then keep only "representative" variables, i.e.,
> > remove redundant ones. This is a bit of a risky procedure but I have
> > done such things before as well sometimes to simplify large sets of
> > highly related variables. If your threshold of 0.8 is approximate, you
> > could simply use average linkage hierarchical clustering with
> > dissimilarity = 1-correlation, cut the tree at the appropriate height
> > (1-0.8=0.2), and from each cluster keep a single representative (e.g.,
> > the one with the highest mean correlation with other members of the
> > cluster). Something along these lines (untested)
> >
> > tree = hclust(1-calc.rho, method = "average")
> > clusts = cutree(tree, h = 0.2)
> > clustLevels = sort(unique(clusts))
> > representatives = unlist(lapply(clustLevels, function(cl)
> > {
> >   inClust = which(clusts==cl);
> >   rho1 = calc.rho[inClust, inClust, drop = FALSE];
> >   repr = inClust[ which.max(colSums(rho1)) ]
> >   repr
> > }))
> >
> > the variable representatives now contains indices of the variables you
> > want to retain, so you could subset the calc.rho matrix as
> > rho.retained = calc.rho[representatives, representatives]
> >
> > I haven't tested the code and it may contain bugs, but something along
> > these lines should get you where you want to be.
> >
> > Oh, and depending on how strict you want to be with the remaining
> > correlations, you could use complete linkage clustering (will retain
> > more variables, some correlations will be above 0.8).
> >
> > Peter
> >
> > On Thu, Nov 14, 2019 at 10:50 AM Ana Marija  
> > wrote:
> > >
> > > Hello,
> > >
> > > I have a data frame like this (a matrix):
> > > head(calc.rho)
> > > rs9900318 rs8069906 rs9908521 rs9908336 rs9908870 rs9895995
> > > rs56192520  0.903 0.268 0.327 0.327 0.327 0.582
> > > rs3764410   0.928 0.276 0.336 0.336 0.336 0.598
> > > rs145984817 0.975 0.309 0.371 0.371 0.371 0.638
> > > rs1807401   0.975 0.309 0.371 0.371 0.371 0.638
> > > rs1807402   0.975 0.309 0.371 0.371 0.371 0.638
> > > rs35350506  0.975 0.309 0.371 0.371 0.371 0.638
> > >
> > > > dim(calc.rho)
> > > [1] 246 246
> > >
> > > I would like to remove from this data all highly correlated variables,
> > > with correlation more than 0.8
> > >
> > > I tried this:
> > >
> > > > data<- calc.rho[,!apply(calc.rho,2,function(x) any(abs(x) > 0.80))]
> > > > dim(data)
> > > [1] 246   0
> > >
> > > Can you please advise,
> > >
> > > Thanks
> > > Ana
> > >
> > > But this removes everything.
> > >
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide 
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Remove highly correlated variables from a data frame or matrix

2019-11-15 Thread Jim Lemon

While the remedy for your dissatisfaction with my previous solution
should be obvious, I will make it explicit.

# that is, rows containing no value > 0.8 apart from the diagonal
# (the diagonal is always 1, so it always counts as one such value)
keeprows<-apply(ro246,1,function(x) return(sum(x>0.8)<2))
ro246.lt.8<-ro246[keeprows,keeprows]
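
To see what this kind of filter does to a correlated pair, here is a
small sketch on a made-up 4 x 4 matrix (the names v1 to v4 and all
values are assumptions for illustration). Because the diagonal is
always 1, every row already contains one value above 0.8, so the test
sum(x>0.8)<2 keeps only rows with no off-diagonal value above 0.8.

```r
vars <- paste0("v", 1:4)
rho <- matrix(c(1.0, 0.9, 0.1, 0.2,
                0.9, 1.0, 0.3, 0.1,
                0.1, 0.3, 1.0, 0.4,
                0.2, 0.1, 0.4, 1.0),
              nrow = 4, byrow = TRUE, dimnames = list(vars, vars))

# Count of values > 0.8 per row, diagonal included
n.high <- apply(rho, 1, function(x) sum(x > 0.8))

# Same rule as above: keep rows with fewer than 2 such values
keeprows <- n.high < 2
rho.kept <- rho[keeprows, keeprows]
```

Both v1 and v2 are dropped here, since each one counts the other as a
high correlation. A row-wise count cannot keep one member of a pair;
that needs a grouping step such as clustering.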

Jim

Re: [R] Remove highly correlated variables from a data frame or matrix

2019-11-15 Thread Ana Marija

If it is of any help, my correlation matrix (calc.rho) was computed here,
under the LDmatrix tab: https://ldlink.nci.nih.gov/?tab=ldmatrix
The dataset of 246 SNPs is below:

rs56192520
rs3764410
rs145984817
rs1807401
rs1807402
rs35350506
rs2089177
rs12325677
rs62064624
rs62064631
rs2349295
rs2174369
rs7218554
rs62064634
rs4360974
rs4527060
rs6502526
rs6502527
rs9900318
rs8069906
rs9908521
rs9908336
rs9908870
rs9895995
rs7211086
rs9905280
rs8073305
rs8072086
rs4312350
rs4313843
rs8069610
rs883504
rs8072394
rs4280293
rs4465638
rs12602378
rs9899059
rs6502530
rs4380085
rs6502532
rs4792798
rs4792799
rs4316813
rs148563931
rs74751226
rs8068857
rs8069441
rs77397878
rs75339756
rs4608391
rs79569548
rs4275914
rs11870422
rs8075751
rs11658904
rs138437542
rs80344434
rs7222311
rs7221842
rs7223686
rs78013597
rs74965036
rs78063986
rs118106233
rs117345712
rs113004656
rs9898995
rs4985718
rs9893911
rs79110942
rs7208929
rs12601453
rs4078062
rs75129280
rs76664572
rs78961289
rs146364798
rs76715413
rs4078534
rs79457460
rs74369938
rs76423171
rs74668400
rs75146120
rs1135237
rs9914671
rs117759512
rs4985696
rs16961340
rs17794159
rs4247118
rs78572469
rs12601193
rs2349646
rs2090018
rs12601424
rs4985701
rs8064550
rs2271521
rs2271520
rs11078374
rs4985702
rs1124961
rs11652674
rs3924340
rs112450164
rs7208973
rs9910857
rs78574480
rs8072184
rs12602196
rs6502563
rs3744135
rs148779543
rs77689691
rs41319048
rs117340532
rs78647096
rs77712968
rs16961396
rs80054920
rs7206981
rs4985740
rs3803762
rs77103270
rs7207485
rs77342773
rs3826304
rs3744126
rs7210879
rs7211576
rs117967362
rs75978745
rs6502564
rs9894565
rs36079048
rs8076621
rs7218795
rs3803761
rs12602675
rs7208065
rs4985705
rs8080386
rs8065832
rs2018781
rs1736221
rs1736220
rs1736217
rs1708620
rs1708619
rs1736216
rs76319098
rs1736215
rs1736214
rs1708617
rs12602831
rs12602871
rs1736213
rs1736212
rs76045368
rs34518797
rs11078378
rs8079562
rs8065774
rs8066090
rs41337846
rs1736209
rs1736208
rs12949822
rs76246042
rs12600635
rs55689224
rs1736207
rs1708626
rs1736206
rs9896078
rs16961474
rs1708627
rs1736205
rs1708628
rs7220577
rs2294155
rs1736204
rs1736203
rs1736202
rs12937908
rs1736200
rs1708623
rs1708624
rs9894884
rs9901894
rs9903294
rs2472689
rs1630656
rs111478970
rs3182911
rs7219012
rs9890657
rs12453455
rs12947291
rs150267386
rs16961493
rs11652745
rs9907107
rs8070574
rs4985759
rs3866959
rs7219248
rs6502568
rs7220275
rs12450037
rs7225876
rs9892352
rs4985760
rs6502569
rs1029830
rs2012954
rs1029832
rs2270180
rs8072402
rs7221553
rs145597919
rs150772017
rs2041393
rs6502578
rs11078382
rs9912109
rs12601631
rs11869054
rs11869079
rs9912599
rs7220057
rs9896970
rs34121330
rs34668117
rs67773570
rs242252
rs955893
rs28583584
rs9944423
rs7217764
rs11651957
rs73978990
rs8071007
rs56044345
rs17804843


Re: [R] Remove highly correlated variables from a data frame or matrix

2019-11-15 Thread Ana Marija

Hi Peter,

Thank you for getting back to me and shedding light on this. I see
your point, doing Jim's method:

> keeprows<-apply(calc.rho,1,function(x) return(sum(x>0.8)<3))
> ro246.lt.8<-calc.rho[keeprows,keeprows]
> ro246.lt.8[ro246.lt.8 == 1] <- NA
> (mmax <- max(abs(ro246.lt.8), na.rm=TRUE))
[1] 0.566

Which is good in general: correlations in my matrix should not
exceed 0.8. I need to run Mendelian Randomization on it later on, so
I cannot have highly correlated SNPs in there. But with Jim's
method I am left with only 17 SNPs (out of 246), which means that
both members of each highly correlated pair are removed, and it would
be good to keep one of them.

I tried to run your code:
> tree = hclust(1-calc.rho, method = "average")
Error in if (is.na(n) || n > 65536L) stop("size cannot be NA nor
exceed 65536") :
  missing value where TRUE/FALSE needed

Please advise.

Thanks
Ana

On Thu, Nov 14, 2019 at 7:37 PM Peter Langfelder
 wrote:
>
> I suspect that you want to identify which variables are highly
> correlated, and then keep only "representative" variables, i.e.,
> remove redundant ones. This is a bit of a risky procedure but I have
> done such things before as well sometimes to simplify large sets of
> highly related variables. If your threshold of 0.8 is approximate, you
> could simply use average linkage hierarchical clustering with
> dissimilarity = 1-correlation, cut the tree at the appropriate height
> (1-0.8=0.2), and from each cluster keep a single representative (e.g.,
> the one with the highest mean correlation with other members of the
> cluster). Something along these lines (untested)
>
> tree = hclust(1-calc.rho, method = "average")
> clusts = cutree(tree, h = 0.2)
> clustLevels = sort(unique(clusts))
> representatives = unlist(lapply(clustLevels, function(cl)
> {
>   inClust = which(clusts==cl);
>   rho1 = calc.rho[inClust, inClust, drop = FALSE];
>   repr = inClust[ which.max(colSums(rho1)) ]
>   repr
> }))
>
> the variable representatives now contains indices of the variables you
> want to retain, so you could subset the calc.rho matrix as
> rho.retained = calc.rho[representatives, representatives]
>
> I haven't tested the code and it may contain bugs, but something along
> these lines should get you where you want to be.
>
> Oh, and depending on how strict you want to be with the remaining
> correlations, you could use complete linkage clustering (will retain
> more variables, some correlations will be above 0.8).
>
> Peter
>
> On Thu, Nov 14, 2019 at 10:50 AM Ana Marija  
> wrote:
> >
> > Hello,
> >
> > I have a data frame like this (a matrix):
> > head(calc.rho)
> > rs9900318 rs8069906 rs9908521 rs9908336 rs9908870 rs9895995
> > rs56192520  0.903 0.268 0.327 0.327 0.327 0.582
> > rs3764410   0.928 0.276 0.336 0.336 0.336 0.598
> > rs145984817 0.975 0.309 0.371 0.371 0.371 0.638
> > rs1807401   0.975 0.309 0.371 0.371 0.371 0.638
> > rs1807402   0.975 0.309 0.371 0.371 0.371 0.638
> > rs35350506  0.975 0.309 0.371 0.371 0.371 0.638
> >
> > > dim(calc.rho)
> > [1] 246 246
> >
> > I would like to remove from this data all highly correlated variables,
> > with correlation more than 0.8
> >
> > I tried this:
> >
> > > data<- calc.rho[,!apply(calc.rho,2,function(x) any(abs(x) > 0.80))]
> > > dim(data)
> > [1] 246   0
> >
> > Can you please advise,
> >
> > Thanks
> > Ana
> >
> > But this removes everything.
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.

Re: [R] Remove highly correlated variables from a data frame or matrix

2019-11-14 Thread Peter Langfelder

I suspect that you want to identify which variables are highly
correlated, and then keep only "representative" variables, i.e.,
remove redundant ones. This is a bit of a risky procedure but I have
done such things before as well sometimes to simplify large sets of
highly related variables. If your threshold of 0.8 is approximate, you
could simply use average linkage hierarchical clustering with
dissimilarity = 1-correlation, cut the tree at the appropriate height
(1-0.8=0.2), and from each cluster keep a single representative (e.g.,
the one with the highest mean correlation with other members of the
cluster). Something along these lines (untested)

tree = hclust(1-calc.rho, method = "average")
clusts = cutree(tree, h = 0.2)
clustLevels = sort(unique(clusts))
representatives = unlist(lapply(clustLevels, function(cl)
{
  inClust = which(clusts==cl);
  rho1 = calc.rho[inClust, inClust, drop = FALSE];
  repr = inClust[ which.max(colSums(rho1)) ]
  repr
}))

the variable representatives now contains indices of the variables you
want to retain, so you could subset the calc.rho matrix as
rho.retained = calc.rho[representatives, representatives]

I haven't tested the code and it may contain bugs, but something along
these lines should get you where you want to be.

Oh, and depending on how strict you want to be with the remaining
correlations, you could use complete linkage clustering (will retain
more variables, some correlations will be above 0.8).
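
A small sketch of how the choice of linkage changes the number of
clusters (and hence retained variables) at the same cut height. The
3 x 3 matrix and the names a, b, c are made up for illustration, and
as.dist() is used because hclust() expects a dist object.

```r
v <- c("a", "b", "c")
rho <- matrix(c(1.00, 0.95, 0.73,
                0.95, 1.00, 0.90,
                0.73, 0.90, 1.00),
              nrow = 3, byrow = TRUE, dimnames = list(v, v))
d <- as.dist(1 - rho)

# Number of clusters after cutting each tree at height 0.2
n.avg <- length(unique(cutree(hclust(d, method = "average"), h = 0.2)))
n.cmp <- length(unique(cutree(hclust(d, method = "complete"), h = 0.2)))
```

Average linkage merges c into the {a, b} cluster because the mean
dissimilarity (0.185) is below 0.2, while complete linkage keeps c
separate because its largest dissimilarity to that cluster (0.27) is
above 0.2.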

Peter

On Thu, Nov 14, 2019 at 10:50 AM Ana Marija  wrote:
>
> Hello,
>
> I have a data frame like this (a matrix):
> head(calc.rho)
> rs9900318 rs8069906 rs9908521 rs9908336 rs9908870 rs9895995
> rs56192520  0.903 0.268 0.327 0.327 0.327 0.582
> rs3764410   0.928 0.276 0.336 0.336 0.336 0.598
> rs145984817 0.975 0.309 0.371 0.371 0.371 0.638
> rs1807401   0.975 0.309 0.371 0.371 0.371 0.638
> rs1807402   0.975 0.309 0.371 0.371 0.371 0.638
> rs35350506  0.975 0.309 0.371 0.371 0.371 0.638
>
> > dim(calc.rho)
> [1] 246 246
>
> I would like to remove from this data all highly correlated variables,
> with correlation more than 0.8
>
> I tried this:
>
> > data<- calc.rho[,!apply(calc.rho,2,function(x) any(abs(x) > 0.80))]
> > dim(data)
> [1] 246   0
>
> Can you please advise,
>
> Thanks
> Ana
>
> But this removes everything.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Re: [R] Remove highly correlated variables from a data frame or matrix

2019-11-14 Thread Ana Marija

Hi Jim,

This:
colnames(calc.jim)[colSums(abs(calc.jim)>0.8)<3]

was the master take!

Thank you so much!!!

On Thu, Nov 14, 2019 at 3:39 PM Jim Lemon  wrote:
>
> I thought you were going to trick us. What I think you are asking now
> is how to get the variable names in the columns that have at most one
> _absolute_ value greater than 0.8. OK:
>
> # I'm not going to try to recreate your correlation matrix
> calc.jim<-matrix(runif(100,min=-1,max=1),nrow=10)
> for(i in 1:10) calc.jim[i,i]<-1
> rownames(calc.jim)<-colnames(calc.jim)<-paste0("rs",1:10)
>
> Now that we have a plausible fake correlation matrix, all we have to
> do is extract the column names:
>
> colnames(calc.jim)[colSums(abs(calc.jim)>0.8)<2]
>
> Of course, what you really meant could have been, "I want the column
> names of the variables with at most one absolute value greater than
> 0.8 ignoring the diagonal values because I don't care about those". If
> so:
>
> colnames(calc.jim)[colSums(abs(calc.jim)>0.8)<3]
>
> Any more tricks?
>
> Jim

Re: [R] Remove highly correlated variables from a data frame or matrix

2019-11-14 Thread Jim Lemon

I thought you were going to trick us. What I think you are asking now
is how to get the variable names in the columns that have at most one
_absolute_ value greater than 0.8. OK:

# I'm not going to try to recreate your correlation matrix
calc.jim<-matrix(runif(100,min=-1,max=1),nrow=10)
for(i in 1:10) calc.jim[i,i]<-1
rownames(calc.jim)<-colnames(calc.jim)<-paste0("rs",1:10)

Now that we have a plausible fake correlation matrix, all we have to
do is extract the column names:

colnames(calc.jim)[colSums(abs(calc.jim)>0.8)<2]

Of course, what you really meant could have been, "I want the column
names of the variables with at most one absolute value greater than
0.8 ignoring the diagonal values because I don't care about those". If
so:

colnames(calc.jim)[colSums(abs(calc.jim)>0.8)<3]
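
If the shifted threshold feels too implicit, the diagonal can be
excluded explicitly. The following is only a sketch of that
alternative; the seed and the names offdiag, keep.explicit and
keep.jim are assumptions added for illustration.

```r
set.seed(1)  # arbitrary seed so the fake matrix is reproducible
calc.jim <- matrix(runif(100, min = -1, max = 1), nrow = 10)
diag(calc.jim) <- 1
rownames(calc.jim) <- colnames(calc.jim) <- paste0("rs", 1:10)

# Flag large absolute values, then switch off the diagonal flags
offdiag <- abs(calc.jim) > 0.8
diag(offdiag) <- FALSE

# At most one large off-diagonal value per column
keep.explicit <- colnames(calc.jim)[colSums(offdiag) <= 1]

# The version above, which counts the diagonal and compares with 3
keep.jim <- colnames(calc.jim)[colSums(abs(calc.jim) > 0.8) < 3]
```

Both selections are the same, because the diagonal always adds exactly
one to each column count.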

Any more tricks?

Jim

Re: [R] Remove highly correlated variables from a data frame or matrix

2019-11-14 Thread Jim Lemon

Hi Ana,
Rather than addressing the question of why you want to do this, let's
make the question easier to answer:

calc.rho<-matrix(c(0.903,0.268,0.327,0.327,0.327,0.582,
0.928,0.276,0.336,0.336,0.336,0.598,
0.975,0.309,0.371,0.371,0.371,0.638,
0.975,0.309,0.371,0.371,0.371,0.638,
0.975,0.309,0.371,0.371,0.371,0.638,
0.975,0.309,0.371,0.371,0.371,0.638),ncol=6,byrow=TRUE)
rnames<-c("rs56192520","rs3764410","rs145984817","rs1807401",
"rs1807402","rs35350506")
rownames(calc.rho)<-rnames
cnames<-c("rs9900318","rs8069906","rs9908521","rs9908336",
"rs9908870","rs9895995")
colnames(calc.rho)<-cnames

Now if you just want a vector of the values less than 0.8, it's trivial:

calc.rho[calc.rho<0.8]

However, based on your previous questions, I suspect you want
something else. Maybe the pairs of row/column names that correspond to
the values less than 0.8. To ensure that you haven't tricked us by not
including columns in which values range around 0.8, I'll do it this
way:

# make the new variable name possible to decode
calc.lt.8<-calc.rho<0.8
varnames.lt.8<-data.frame(var1=NA,var2=NA)
for(row in 1:nrow(calc.rho)) {
 for(col in 1:ncol(calc.rho))
  if(calc.lt.8[row,col])
   varnames.lt.8<-rbind(varnames.lt.8,c(rnames[row],cnames[col]))
}
# now get rid of the first row of NA values
varnames.lt.8<-varnames.lt.8[-1,]

Clunky, but effective. You now have those variable pairs that you may
want. Let us know in the next episode of this soap operation.
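
The nested loop above can also be written without explicit iteration.
This is a sketch of one alternative using which() with arr.ind = TRUE,
shown on a two-row, three-column excerpt of the same values; the name
idx is an assumption for illustration.

```r
calc.rho <- matrix(c(0.903, 0.268, 0.327,
                     0.975, 0.309, 0.371),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(c("rs56192520", "rs145984817"),
                                   c("rs9900318", "rs8069906", "rs9908521")))

# Row/column positions of every value below 0.8
idx <- which(calc.rho < 0.8, arr.ind = TRUE)

# Translate the positions into pairs of variable names
varnames.lt.8 <- data.frame(var1 = rownames(calc.rho)[idx[, "row"]],
                            var2 = colnames(calc.rho)[idx[, "col"]],
                            stringsAsFactors = FALSE)
```

The pairs come out in column-major order rather than the row-by-row
order of the loop, and no placeholder row of NA values is needed.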

Jim

On Fri, Nov 15, 2019 at 5:50 AM Ana Marija  wrote:
>
> Hello,
>
> I have a data frame like this (a matrix):
> head(calc.rho)
> rs9900318 rs8069906 rs9908521 rs9908336 rs9908870 rs9895995
> rs56192520  0.903 0.268 0.327 0.327 0.327 0.582
> rs3764410   0.928 0.276 0.336 0.336 0.336 0.598
> rs145984817 0.975 0.309 0.371 0.371 0.371 0.638
> rs1807401   0.975 0.309 0.371 0.371 0.371 0.638
> rs1807402   0.975 0.309 0.371 0.371 0.371 0.638
> rs35350506  0.975 0.309 0.371 0.371 0.371 0.638
>
> > dim(calc.rho)
> [1] 246 246
>
> I would like to remove from this data all highly correlated variables,
> with correlation more than 0.8
>
> I tried this:
>
> > data<- calc.rho[,!apply(calc.rho,2,function(x) any(abs(x) > 0.80))]
> > dim(data)
> [1] 246   0
>
> Can you please advise,
>
> Thanks
> Ana
>
> But this removes everything.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Re: [R] Remove highly correlated variables from a data frame or matrix

2019-11-14 Thread Ana Marija

What would be the approach to remove a variable that has at least 2
correlation coefficients > 0.8?
This is the whole output of head():

> head(calc.rho)
rs56192520 rs3764410 rs145984817 rs1807401 rs1807402 rs35350506
rs56192520   1.000 0.976   0.927 0.927 0.927  0.927
rs37644100.976 1.000   0.952 0.952 0.952  0.952
rs145984817  0.927 0.952   1.000 1.000 1.000  1.000
rs18074010.927 0.952   1.000 1.000 1.000  1.000
rs18074020.927 0.952   1.000 1.000 1.000  1.000
rs35350506   0.927 0.952   1.000 1.000 1.000  1.000
rs2089177 rs12325677 rs62064624 rs62064631 rs2349295 rs2174369
rs56192520  0.927  0.927  0.927  0.927 0.709 0.903
rs3764410   0.952  0.952  0.952  0.952 0.728 0.928
rs145984817 1.000  1.000  1.000  1.000 0.771 0.975
rs1807401   1.000  1.000  1.000  1.000 0.771 0.975
rs1807402   1.000  1.000  1.000  1.000 0.771 0.975
rs35350506  1.000  1.000  1.000  1.000 0.771 0.975
rs7218554 rs62064634 rs4360974 rs4527060 rs6502526 rs6502527
rs56192520  0.903  0.903 0.903 0.903 0.903 0.903
rs3764410   0.928  0.928 0.928 0.928 0.928 0.928
rs145984817 0.975  0.975 0.975 0.975 0.975 0.975
rs1807401   0.975  0.975 0.975 0.975 0.975 0.975
rs1807402   0.975  0.975 0.975 0.975 0.975 0.975
rs35350506  0.975  0.975 0.975 0.975 0.975 0.975
rs9900318 rs8069906 rs9908521 rs9908336 rs9908870 rs9895995
rs56192520  0.903 0.268 0.327 0.327 0.327 0.582
rs3764410   0.928 0.276 0.336 0.336 0.336 0.598
rs145984817 0.975 0.309 0.371 0.371 0.371 0.638
rs1807401   0.975 0.309 0.371 0.371 0.371 0.638
rs1807402   0.975 0.309 0.371 0.371 0.371 0.638
rs35350506  0.975 0.309 0.371 0.371 0.371 0.638
rs7211086 rs9905280 rs8073305 rs8072086 rs4312350 rs4313843
rs56192520  0.880 0.268 0.327 0.880 0.880 0.880
rs3764410   0.905 0.276 0.336 0.905 0.905 0.905
rs145984817 0.951 0.309 0.371 0.951 0.951 0.951
rs1807401   0.951 0.309 0.371 0.951 0.951 0.951
rs1807402   0.951 0.309 0.371 0.951 0.951 0.951
rs35350506  0.951 0.309 0.371 0.951 0.951 0.951
rs8069610 rs883504 rs8072394 rs4280293 rs4465638 rs12602378
rs56192520  0.5820.903 0.582 0.582 0.811  0.302
rs3764410   0.5980.928 0.598 0.598 0.836  0.311
rs145984817 0.6380.975 0.638 0.638 0.879  0.344
rs1807401   0.6380.975 0.638 0.638 0.879  0.344
rs1807402   0.6380.975 0.638 0.638 0.879  0.344
rs35350506  0.6380.975 0.638 0.638 0.879  0.344
rs9899059 rs6502530 rs4380085 rs6502532 rs4792798 rs4792799
rs56192520  0.302 0.309 0.834 0.251 0.063 0.063
rs3764410   0.311 0.318 0.858 0.259 0.080 0.080
rs145984817 0.344 0.352 0.902 0.291 0.086 0.086
rs1807401   0.344 0.352 0.902 0.291 0.086 0.086
rs1807402   0.344 0.352 0.902 0.291 0.086 0.086
rs35350506  0.344 0.352 0.902 0.291 0.086 0.086
rs4316813 rs148563931 rs74751226 rs8068857 rs8069441 rs77397878
rs56192520  0.006   0.006  0.006 0.006 0.006  0.006
rs3764410   0.006   0.006  0.006 0.006 0.006  0.006
rs145984817 0.006   0.006  0.006 0.006 0.006  0.006
rs1807401   0.006   0.006  0.006 0.006 0.006  0.006
rs1807402   0.006   0.006  0.006 0.006 0.006  0.006
rs35350506  0.006   0.006  0.006 0.006 0.006  0.006
rs75339756 rs4608391 rs79569548 rs4275914 rs11870422 rs8075751
rs56192520   0.006 0.006  0.006 0.044  0.007 0.004
rs3764410    0.006 0.006  0.006 0.042  0.005 0.005
rs145984817  0.006 0.006  0.006 0.047  0.002 0.015
rs1807401    0.006 0.006  0.006 0.047  0.002 0.015
rs1807402    0.006 0.006  0.006 0.047  0.002 0.015
rs35350506   0.006 0.006  0.006 0.047  0.002 0.015
rs11658904 rs138437542 rs80344434 rs7222311 rs7221842 rs7223686
rs56192520   0.003   0.004  0.004 0.033 0.009 0.000
rs3764410    0.004   0.004  0.004

Re: [R] Remove highly correlated variables from a data frame or matrix

2019-11-14 Thread Abby Spurdle

That's assuming your data was returned by head().

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Remove highly correlated variables from a data frame or matrix

2019-11-14 Thread Abby Spurdle

> I basically want to remove all entries for pairs which have value in
> between them (correlation calculated not in R, but it is correlation,
> r2)
> so for example I would not keep: rs883504 because it has r2>0.8 for
> all those rs...

I'm still not sure what "remove all entries" means.
In your example, rs883504 has all correlation coefficients > 0.8 in
the data returned by head().
However, most of its correlation coefficients are < 0.8 if you
include the entire matrix.

If you remove every variable that has at least one correlation
coefficient above 0.8, you would remove all the variables.
However, if you remove only the variables that have all correlation
coefficients above 0.8, you would (probably) remove no variables.
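
To make the any/all distinction concrete, here is a minimal sketch on a
made-up 3 x 3 matrix. The object `m`, its dimnames and its values are
assumptions for illustration only; they are not the poster's calc.rho.

```r
# Hypothetical matrix of r2 values; names and numbers are invented
m <- matrix(c(0.90, 0.20, 0.95,
              0.85, 0.10, 0.30,
              0.99, 0.40, 0.50),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("x", "y", "z")))

# Columns with at least one value above 0.8
has_any <- apply(m, 2, function(col) any(abs(col) > 0.8))
# Columns where every value is above 0.8
has_all <- apply(m, 2, function(col) all(abs(col) > 0.8))

keep_any <- m[, !has_any, drop = FALSE]  # drops x and z, keeps y
keep_all <- m[, !has_all, drop = FALSE]  # drops only x, keeps y and z
```

The "any" rule is far more aggressive than the "all" rule, which is why
the choice between them needs to be stated before filtering.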


Re: [R] Remove highly correlated variables from a data frame or matrix

2019-11-14 Thread Abby Spurdle

Sorry, but I don't understand your question.

When I first looked at this, I thought it was a correlation (or
covariance) matrix.
e.g.

> cor (quakes)
> cov (quakes)

However, your row and column variables are different, implying two
different data sets.
Also, some of the (correlation?) coefficients are the same, implying
that some of the variables are the same, or very close.

Also, note that a matrix is not a data.frame.
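
As a small illustration of both points, using the built-in quakes data:
cor() returns a plain matrix whose row and column names are identical,
which is not what the posted head() output shows. The names `cm` and
`cm_df` exist only in this sketch.

```r
cm <- cor(quakes)

is.matrix(cm)                          # a matrix ...
is.data.frame(cm)                      # ... not a data.frame
identical(rownames(cm), colnames(cm))  # same variables on both margins
isSymmetric(cm)                        # cor(x, y) equals cor(y, x)

# Convert explicitly if a data.frame is really needed
cm_df <- as.data.frame(cm)
```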


> I have a data frame like this (a matrix):
> head(calc.rho)
> rs9900318 rs8069906 rs9908521 rs9908336 rs9908870 rs9895995
> rs56192520  0.903 0.268 0.327 0.327 0.327 0.582
> rs3764410   0.928 0.276 0.336 0.336 0.336 0.598
> rs145984817 0.975 0.309 0.371 0.371 0.371 0.638
> rs1807401   0.975 0.309 0.371 0.371 0.371 0.638
> rs1807402   0.975 0.309 0.371 0.371 0.371 0.638
> rs35350506  0.975 0.309 0.371 0.371 0.371 0.638
> > dim(calc.rho)
> [1] 246 246
> I would like to remove from this data all highly correlated variables,
> with correlation more than 0.8


Re: [R] Remove highly correlated variables from a data frame or matrix

2019-11-14 Thread Ana Marija

I don't understand. I have to keep only pairs of variables with
correlation less than 0.8 in order to proceed with some calculations

On Thu, Nov 14, 2019 at 2:09 PM Bert Gunter  wrote:
>
> Obvious advice:
>
> DON'T DO THIS!
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and 
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Thu, Nov 14, 2019 at 10:50 AM Ana Marija  
> wrote:
>>
>> Hello,
>>
>> I have a data frame like this (a matrix):
>> head(calc.rho)
>> rs9900318 rs8069906 rs9908521 rs9908336 rs9908870 rs9895995
>> rs56192520  0.903 0.268 0.327 0.327 0.327 0.582
>> rs3764410   0.928 0.276 0.336 0.336 0.336 0.598
>> rs145984817 0.975 0.309 0.371 0.371 0.371 0.638
>> rs1807401   0.975 0.309 0.371 0.371 0.371 0.638
>> rs1807402   0.975 0.309 0.371 0.371 0.371 0.638
>> rs35350506  0.975 0.309 0.371 0.371 0.371 0.638
>>
>> > dim(calc.rho)
>> [1] 246 246
>>
>> I would like to remove from this data all highly correlated variables,
>> with correlation more than 0.8
>>
>> I tried this:
>>
>> > data<- calc.rho[,!apply(calc.rho,2,function(x) any(abs(x) > 0.80))]
>> > dim(data)
>> [1] 246   0
>>
>> Can you please advise,
>>
>> Thanks
>> Ana
>>
>> But this removes everything.
>>


Re: [R] Remove highly correlated variables from a data frame or matrix

2019-11-14 Thread Bert Gunter

Obvious advice:

DON'T DO THIS!

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Nov 14, 2019 at 10:50 AM Ana Marija 
wrote:

> Hello,
>
> I have a data frame like this (a matrix):
> head(calc.rho)
> rs9900318 rs8069906 rs9908521 rs9908336 rs9908870 rs9895995
> rs56192520  0.903 0.268 0.327 0.327 0.327 0.582
> rs3764410   0.928 0.276 0.336 0.336 0.336 0.598
> rs145984817 0.975 0.309 0.371 0.371 0.371 0.638
> rs1807401   0.975 0.309 0.371 0.371 0.371 0.638
> rs1807402   0.975 0.309 0.371 0.371 0.371 0.638
> rs35350506  0.975 0.309 0.371 0.371 0.371 0.638
>
> > dim(calc.rho)
> [1] 246 246
>
> I would like to remove from this data all highly correlated variables,
> with correlation more than 0.8
>
> I tried this:
>
> > data<- calc.rho[,!apply(calc.rho,2,function(x) any(abs(x) > 0.80))]
> > dim(data)
> [1] 246   0
>
> Can you please advise,
>
> Thanks
> Ana
>
> But this removes everything.
>

[R] Remove highly correlated variables from a data frame or matrix

2019-11-14 Thread Ana Marija

Hello,

I have a data frame like this (a matrix):
head(calc.rho)
rs9900318 rs8069906 rs9908521 rs9908336 rs9908870 rs9895995
rs56192520  0.903 0.268 0.327 0.327 0.327 0.582
rs3764410   0.928 0.276 0.336 0.336 0.336 0.598
rs145984817 0.975 0.309 0.371 0.371 0.371 0.638
rs1807401   0.975 0.309 0.371 0.371 0.371 0.638
rs1807402   0.975 0.309 0.371 0.371 0.371 0.638
rs35350506  0.975 0.309 0.371 0.371 0.371 0.638

> dim(calc.rho)
[1] 246 246

I would like to remove from this data all highly correlated variables,
with correlation more than 0.8

I tried this:

> data<- calc.rho[,!apply(calc.rho,2,function(x) any(abs(x) > 0.80))]
> dim(data)
[1] 246   0

Can you please advise,

Thanks
Ana

But this removes everything.
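
One possible reading of why every column disappears, sketched under the
assumption that the matrix is a symmetric correlation matrix with 1s on
the diagonal (the thread suggests calc.rho may not be exactly that):
any() then flags every column because of the diagonal alone. The matrix
`r` below is invented, and dropping the later member of each high pair
is only one simple rule, not a recommendation. Whether variables should
be dropped at all is questioned elsewhere in the thread.

```r
# Hypothetical symmetric correlation matrix with a unit diagonal
r <- matrix(c(1.00, 0.95, 0.10,
              0.95, 1.00, 0.20,
              0.10, 0.20, 1.00),
            nrow = 3,
            dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# The diagonal alone puts a value above 0.8 in every column
all_flagged <- all(apply(r, 2, function(col) any(abs(col) > 0.8)))

# Look only at the upper triangle, so each pair is examined once
r_upper <- r
r_upper[lower.tri(r_upper, diag = TRUE)] <- 0

# Drop the later variable of each pair with |r| above 0.8
drop <- apply(r_upper, 2, function(col) any(abs(col) > 0.8))
reduced <- r[!drop, !drop, drop = FALSE]
```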


Re: [R] Remove Even Number from A Vector

2019-03-03 Thread Duncan Murdoch


On 03/03/2019 3:44 a.m., Ivan Krylov wrote:
> Hi Darren,
>
> On Sat, 2 Mar 2019 22:27:55 +
> Darren Danyluk  wrote:
>
>> It sounds like she is working with the very basics of this software,
>> and her task is to write the code which would result in the
>> extraction of "odd" data from a dataset of restaurant sales.
>
> Not a native English speaker here; what exactly do you mean by "odd" in
> this case?

"Odd" numbers have a remainder of 1 when divided by 2; "even" numbers
are multiples of 2.

Duncan Murdoch

> If I ignore the "subject" field, it looks like your daughter should be
> looking for outlier detection methods. For example, https://rseek.org/
> offers some good results for "outlier detection".



Re: [R] Remove Even Number from A Vector

2019-03-03 Thread Ivan Krylov

Hi Darren,

On Sat, 2 Mar 2019 22:27:55 +
Darren Danyluk  wrote:

> It sounds like she is working with the very basics of this software,
> and her task is to write the code which would result in the
> extraction of "odd" data from a dataset of restaurant sales.

Not a native English speaker here; what exactly do you mean by "odd" in
this case?

If I ignore the "subject" field, it looks like your daughter should be
looking for outlier detection methods. For example, https://rseek.org/
offers some good results for "outlier detection".

-- 
Best regards,
Ivan


Re: [R] Remove Even Number from A Vector

2019-03-02 Thread Jim Lemon

Hi Darren,
You're probably looking for the %% (remainder) operator:

x<-1:10
# get odd numbers
x[as.logical(x%%2)]
# get even numbers
x[!(x%%2)]

Jim
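
The same idea can be written with explicit comparisons, which may be
easier for a beginner to read. This is only a sketch: the vector `sales`
and its values are made up, and it assumes whole-number, non-negative
data.

```r
sales <- c(12, 7, 30, 45, 18, 21)  # invented whole-number values

# Remainder 1 after division by 2 means odd, remainder 0 means even
odd_values  <- sales[sales %% 2 == 1]
even_values <- sales[sales %% 2 == 0]
```

Comparing the remainder to 1 or 0 avoids relying on the conversion of
numbers to TRUE/FALSE.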

On Sun, Mar 3, 2019 at 4:10 PM Darren Danyluk  wrote:
>
> Hello,
>
> I found this email when looking for some help with R Studio.  It's actually 
> my daughter who is looking for help.
>
> It sounds like she is working with the very basics of this software, and her 
> task is to write the code which would result in the extraction of "odd" data 
> from a dataset of restaurant sales.
>
> This is a shot in the dark...please ignore if my question makes little or no 
> sense.  I have no working knowledge of R software.
>
> Thanks.
>

Re: [R] Remove Even Number from A Vector

2019-03-02 Thread Darren Danyluk

Hello,

I found this email when looking for some help with R Studio.  It's actually my 
daughter who is looking for help.

It sounds like she is working with the very basics of this software, and her 
task is to write the code which would result in the extraction of "odd" data 
from a dataset of restaurant sales.

This is a shot in the dark...please ignore if my question makes little or no 
sense.  I have no working knowledge of R software.

Thanks.


Re: [R] Remove cases with -Inf from a data frame

2019-02-17 Thread Ek Esawi

This is a similar version of the other answers.

df[apply(apply(df,2,is.finite),1,sum)==4,]
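
The line above hard-codes 4, the number of columns in the example data.
A variant that does not depend on the column count might look like the
sketch below; the data frame here is made up and assumed to be all
numeric.

```r
# Invented three-column example with -Inf in rows 2 and 3
df <- data.frame(w = c(1, -Inf, 3),
                 x = c(2, 5, -Inf),
                 y = c(7, 8, 9))

# Count the finite entries in each row and compare with the column count
finite_counts <- rowSums(is.finite(as.matrix(df)))
clean <- df[finite_counts == ncol(df), ]
```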

BOL---EK

On Sat, Feb 16, 2019 at 10:07 AM AbouEl-Makarim Aboueissa
 wrote:
>
> Dear All: good morning
>
>
> I have a log-transformed data frame with some *-Inf* data values.
>
> *my question: *how to remove all rows with *-Inf* data value from that data
> frame?
>
>
> with many thanks
> abou
> __
>
>
> *AbouEl-Makarim Aboueissa, PhD*
>
> *Professor, Statistics and Data Science*
> *Graduate Coordinator*
>
> *Department of Mathematics and Statistics*
> *University of Southern Maine*
>

Re: [R] Remove cases with -Inf from a data frame

2019-02-17 Thread AbouEl-Makarim Aboueissa

Dear Rui and All:

thank you very much for your very helpful responses.

with many thanks
abou
__


*AbouEl-Makarim Aboueissa, PhD*

*Professor, Statistics and Data Science*
*Graduate Coordinator*

*Department of Mathematics and Statistics*
*University of Southern Maine*



On Sat, Feb 16, 2019 at 11:36 AM Rui Barradas  wrote:

> Hello,
>
> An alternative, same dataset.
>
> df[apply(df, 1, function(x) all(is.finite(x))), ]
>
>
> Hope this helps,
>
> Rui Barradas
>
> At 16:14 on 16/02/2019, Martin Møller Skarbiniks Pedersen wrote:
> > On Sat, 16 Feb 2019 at 16:07, AbouEl-Makarim Aboueissa <
> > abouelmakarim1...@gmail.com> wrote:
> >>
> >> I have a log-transformed data frame with some *-Inf* data values.
> >>
> >> *my question: *how to remove all rows with *-Inf* data value from that
> >> data frame?
> >
> >
> > Hi,
> >Here is a solution which uses apply.
> >
> > First a data-frame as input:
> >
> > set.seed(1)
> > df <- data.frame(w = sample(c(-Inf,1:20), 10),
> >   x = sample(c(-Inf,1:20), 10),
> >   y = sample(c(-Inf,1:20), 10),
> >   z = sample(c(-Inf,1:20), 10))
> >
> > df <- df[-(unlist(apply(df, 2, function(x) which(x == -Inf)))), ]
> >
> > Regards
> > Martin
> >

Re: [R] Remove cases with -Inf from a data frame

2019-02-16 Thread Rui Barradas


Hello,

An alternative, same dataset.

df[apply(df, 1, function(x) all(is.finite(x))), ]


Hope this helps,

Rui Barradas

At 16:14 on 16/02/2019, Martin Møller Skarbiniks Pedersen wrote:
> On Sat, 16 Feb 2019 at 16:07, AbouEl-Makarim Aboueissa <
> abouelmakarim1...@gmail.com> wrote:
>>
>> I have a log-transformed data frame with some *-Inf* data values.
>>
>> *my question: *how to remove all rows with *-Inf* data value from that
>> data frame?
>
> Hi,
>    Here is a solution which uses apply.
>
> First a data-frame as input:
>
> set.seed(1)
> df <- data.frame(w = sample(c(-Inf,1:20), 10),
>                  x = sample(c(-Inf,1:20), 10),
>                  y = sample(c(-Inf,1:20), 10),
>                  z = sample(c(-Inf,1:20), 10))
>
> df <- df[-(unlist(apply(df, 2, function(x) which(x == -Inf)))), ]
>
> Regards
> Martin




Re: [R] Remove cases with -Inf from a data frame

2019-02-16 Thread Bert Gunter

Sorry, that's

function(x)all(is.finite(x) | is.na(x) )

of course.
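
A minimal sketch of the difference the correction makes, on a made-up
data frame (the values in `yourdat` are invented; only the two test
functions come from the thread). Note that is.finite() is also FALSE
for Inf and NaN, so both versions drop more than just -Inf.

```r
yourdat <- data.frame(a = c(1, -Inf, 3, NA),
                      b = c(0.5, 2, -Inf, 4))

# Strict version: a row is kept only if every value is finite
keep_finite <- apply(yourdat, 1, function(x) all(is.finite(x)))

# Corrected lenient version: NA is tolerated, -Inf is not
keep_finite_or_na <- apply(yourdat, 1,
                           function(x) all(is.finite(x) | is.na(x)))

strict  <- yourdat[keep_finite, ]        # row 1 only
lenient <- yourdat[keep_finite_or_na, ]  # rows 1 and 4
```

Without all(), the function returns one logical per cell instead of one
per row, so the result cannot be used directly as a row index.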


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Feb 16, 2019 at 8:25 AM Bert Gunter  wrote:

> Many ways. I assume you know that Inf and -Inf are (special) numeric
> values that can be treated like other numerics. i.e.
>
> > 1 == - Inf
> [1] FALSE
>
> So straightforward indexing (selection) would do it.
> But there is also ?is.infinite and ?is.finite, so
>
> apply(yourdat, 1, function(x)all(is.finite(x)))
>
> would produce the index vector to keep rows with only finite values
> assuming yourdat contains only numeric data. If this is not the case, just
> select the numeric columns to index on, i.e.
>
> apply(yourdat[sapply(yourdat,is.numeric)], 1, function(x)
> all(is.finite(x)))
>
> One possible problem here is handling of NA's:
>
> is.finite(c(-Inf,NA))
> [1] FALSE FALSE
> ... so rows containing NA's but no -Inf's would also get removed. If you
> wish to keep rows with NA's but no -Inf's, then
>
> function(x)(is.finite(x) | is.na(x) )
>
> could be used.
>
> Cheers,
> Bert
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Sat, Feb 16, 2019 at 7:07 AM AbouEl-Makarim Aboueissa <
> abouelmakarim1...@gmail.com> wrote:
>
>> Dear All: good morning
>>
>>
>> I have a log-transformed data frame with some *-Inf* data values.
>>
>> *my question: *how to remove all rows with *-Inf* data value from that
>> data
>> frame?
>>
>>
>> with many thanks
>> abou
>> __
>>
>>
>> *AbouEl-Makarim Aboueissa, PhD*
>>
>> *Professor, Statistics and Data Science*
>> *Graduate Coordinator*
>>
>> *Department of Mathematics and Statistics*
>> *University of Southern Maine*
>>

Re: [R] Remove cases with -Inf from a data frame

2019-02-16 Thread Bert Gunter

Many ways. I assume you know that Inf and -Inf are (special) numeric values
that can be treated like other numerics. i.e.

> 1 == - Inf
[1] FALSE

So straightforward indexing (selection) would do it.
But there is also ?is.infinite and ?is.finite, so

apply(yourdat, 1, function(x)all(is.finite(x)))

would produce the index vector to keep rows with only finite values
assuming yourdat contains only numeric data. If this is not the case, just
select the numeric columns to index on, i.e.

apply(yourdat[sapply(yourdat,is.numeric)], 1, function(x) all(is.finite(x)))

One possible problem here is handling of NA's:

is.finite(c(-Inf,NA))
[1] FALSE FALSE
... so rows containing NA's but no -Inf's would also get removed. If you
wish to keep rows with NA's but no -Inf's, then

function(x)(is.finite(x) | is.na(x) )

could be used.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Sat, Feb 16, 2019 at 7:07 AM AbouEl-Makarim Aboueissa <
abouelmakarim1...@gmail.com> wrote:

> Dear All: good morning
>
>
> I have a log-transformed data frame with some *-Inf* data values.
>
> *my question: *how to remove all rows with *-Inf* data value from that data
> frame?
>
>
> with many thanks
> abou
> __
>
>
> *AbouEl-Makarim Aboueissa, PhD*
>
> *Professor, Statistics and Data Science*
> *Graduate Coordinator*
>
> *Department of Mathematics and Statistics*
> *University of Southern Maine*
>

Re: [R] Remove cases with -Inf from a data frame

2019-02-16 Thread Martin Møller Skarbiniks Pedersen

On Sat, 16 Feb 2019 at 16:07, AbouEl-Makarim Aboueissa <
abouelmakarim1...@gmail.com> wrote:
>
> I have a log-transformed data frame with some *-Inf* data values.
>
> *my question: *how to remove all rows with *-Inf* data value from that
> data frame?

Hi,
  Here is a solution which uses apply.

First a data-frame as input:

set.seed(1)
df <- data.frame(w = sample(c(-Inf,1:20), 10),
 x = sample(c(-Inf,1:20), 10),
 y = sample(c(-Inf,1:20), 10),
 z = sample(c(-Inf,1:20), 10))

df <- df[-(unlist(apply(df, 2, function(x) which(x == -Inf)))), ]
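
One caveat with negative indexing of this kind, shown as a sketch on an
invented data frame that happens to contain no -Inf: the index is then
empty, and an empty negative index selects no rows rather than all of
them. A logical index does not have this problem. The names `idx` and
`keep` are only for this example, which assumes there are no NA values.

```r
df <- data.frame(w = c(1, 2, 3), x = c(4, 5, 6))  # no -Inf anywhere

# which() finds nothing, so idx is empty
idx <- unlist(apply(df, 2, function(x) which(x == -Inf)))
rows_after_negative_index <- nrow(df[-idx, ])  # every row is lost

# Logical indexing keeps rows that contain no -Inf
keep <- !apply(df == -Inf, 1, any)
rows_after_logical_index <- nrow(df[keep, ])   # every row is kept
```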

Regards
Martin


Re: [R] Remove cases with -Inf from a data frame

2019-02-16 Thread Michael Dewey


Dear Abou

Depends on exact details of your variables but

?is.finite

Gives you the basic tool.

On 16/02/2019 15:05, AbouEl-Makarim Aboueissa wrote:
> Dear All: good morning
>
> I have a log-transformed data frame with some *-Inf* data values.
>
> *my question: *how to remove all rows with *-Inf* data value from that data
> frame?
>
> with many thanks
> abou
> __
>
> *AbouEl-Makarim Aboueissa, PhD*
>
> *Professor, Statistics and Data Science*
> *Graduate Coordinator*
>
> *Department of Mathematics and Statistics*
> *University of Southern Maine*



--
Michael
http://www.dewey.myzen.co.uk/home.html


[R] Remove cases with -Inf from a data frame

2019-02-16 Thread AbouEl-Makarim Aboueissa

Dear All: good morning


I have a log-transformed data frame with some *-Inf* data values.

*my question: *how to remove all rows with *-Inf* data value from that data
frame?


with many thanks
abou
__


*AbouEl-Makarim Aboueissa, PhD*

*Professor, Statistics and Data Science*
*Graduate Coordinator*

*Department of Mathematics and Statistics*
*University of Southern Maine*


Re: [R] Remove specific rows from nested list of matrices

2018-11-08 Thread Ek Esawi

Thank you all, Bert, Jeff, Bill and Don. I realized I made a silly
mistake in list indexing. Once I saw Bill’s suggestion and was able to
wrap my head around indexing recursive lists, I resolved the problem.
For future readers, here are the answers, even though the question may
not have been clear. I tried Don’s idea and it worked too.

To keep only the rows whose first entry starts with two digits and a
slash (which filters out the rows that start with an empty string), I
used Bill’s suggestion.
G <- lapply(FF, function(x) lapply(x, function (y) lapply(y,
function(z)   z[grepl("^[0-9][0-9]/",z[,1]),])))
To remove some unwanted entries, I used this pattern and formula.
S1 <-"\\sÂ.*\\s|^[0-9]x.*|.*[P-p]oints.*|.*\\sto\\s.*"
F <- lapply(G, function(x) lapply(x, function (y) lapply(y,
function(z) gsub(S1,"",z))))

Thanks again--EK
On Fri, Nov 2, 2018 at 11:00 AM Ek Esawi  wrote:
>
> Hi All,
>
> I have a list that is made up of nested lists, as shown below. I want
> to remove all rows in each sub-list that start with an empty space,
> that’s the first entry of a row is blank; for example, on
> [[1]][[1]][[1]] Remove row 4,on [[1]][[1]][[3]] remove row 5, on
> [[1]][[2]][[1]] remove row 6, etc.. All rows start with 2 digits/ 2
> digits. My formula works on individual sublist but not the whole
> list.. I know my indexing is wrong, but don’t know how to fix it.
>
>
> > FF
>
> [[1]]
> [[1]][[1]]
> [[1]][[1]][[1]]
> [,1][,2]   [,3][,4] [,5]
> [1,] "30/20"   "" ““   "-89"
> [2,] "02/20"   "” ““   "-98"
> [3,] "02/20"   “AAA” ““   "-84"
> [4,] “  “ “  “   “
> [[1]][[1]][[2]]
> [,1][,2]
> [1,] "02/23" “” : 29" “
> [2,] "02/23" “” ." “
> [3,] "02/23" “” " “
> [4,] "02/23" “” "
> [[1]][[1]][[3]]
> [,1][,2][,3] [,4] [,5] [,6] [,7]
> [1,] "01/09" “"“   “   “   "53"
> [2,] "01/09" “” "   “   “   “   "403"
> [3,] "01/09" “” "   “   “   “   "83"
> [4,] "01/09" “” "   “   “   “   "783"
> [5,] “  “  “”  3042742181"   “   “   “   “
> [[1]][[2]]
> [[1]][[2]][[1]]
> [,1]  [,2] [,3] [,4] [,5]
> [1,] ““   “   “   “” "
> [2,] "Standard Purchases"  “   “   “   "
> [3,] "24/90 "” “   "243"  "
> [4,] "24/90 "” "   "143"  "
> [5,] "24/91 "” " “   "143" “
> [6,] ““   “   “   "792"
> [[1]][[2]][[2]]
> [,1][,2]
> [1,] "02/23" “”: 31" “
> [2,] "02/23" “”." “
> [3,] "02/23" “” " “
> [4,] "02/23" “”
> [5,] "02/23" “”
> [6,] "02/23" “” 20"
> [7,] "02/23" “”  “
> [8,] "02/23" “” "33"
> [[1]][[3]]
> [[1]][[3]][[1]]
> [,1][,2]
> [1,] "02/23" “”: 28" “
> [2,] "02/23" “”." “
> [3,] "02/23" “” " “
> [4,] "02/23" “” "
> [[1]][[3]][[2]]
> [,1][,2][,3][,4] [,5] [,6] [,7][,8][,9]
> [1,] "02/23" “” " “   “   "53" "
> [2,] "02/24" “” " “   “   "
> [3,] “  “  “  “   “   “   “  “  "1,241"
> [4,] "02/24" "”  “   "33”
>
> My Formula,:
>
> G <- lapply(FF, function(x) lapply(x, function (y) lapply(y,
> function(z)  z[grepl("^[0-9][0-9]/",z[,1]),])))
>
> The error: Error in z[, 1] : incorrect number of dimensions
>
>
>
> Thanks in advance--EK


Re: [R] Remove specific rows from nested list of matrices

2018-11-02 Thread MacQueen, Don via R-help

It appears that at the bottom of the nesting, so to speak, you have a
character matrix.
That is, the contents of the [[1]][[1]][[1]] element is a character matrix
that, according to the row and column labels, has 4 rows and 5 columns.
However, the matrix itself, as printed, apparently has 4 columns in row one,
not 5, and five quote marks in row 4, so the number of columns is ambiguous
(quote marks have to be balanced).

Nonetheless, assuming you really do have character matrices that you're
trying to modify, I'd be inclined to take a brute force approach.

I would also use for() loops instead of lapply(), because the code will be 
easier to follow.

Do you know, or can you assume, the maximum depth of nesting? Let's say it's 
three.

Here's an outline. I can't test it without an actual object to work on, and it 
probably has some details wrong. My intent is to present the concept.
(I believe I have the 'next' statements in the right place...)

for (i1 in seq_along(FF)) {
  ## the "1" in "ff1" means first level of nesting, not first element of the list
  ff1 <- FF[[i1]]

  if ( !is.list(ff1) ) {
    ## the current element is not a nested list
    {apply the function that removes the appropriate rows}
    ## this 'next' statement is supposed to move us to the next element of FF
    next
  } else {
    ## the current element (of FF) is a list, therefore, have to loop through its elements

    for (i2 in seq_along(ff1)) {
      ff2 <- ff1[[i2]]

      if ( !is.list(ff2) ) {
        ## the current element is not a nested list
        {apply the removal function}
        next
      } else {
        ## the current element is a nested list
        for (i3 in seq_along(ff2)) {
          ## if I've kept track correctly, we're now looking at the third level down of nesting,
          ## and if that's the max depth, we don't have to go any further

          etc, and close all the loops ---


This brute force approach consists of nested for() loops.
The outer loop is for the top level list.
The next nested loop is for the second level lists, within each element of the
top level.
The next nested loop is for the third level lists, within each second level
element.
Since not all elements are nested to the same depth, the code has to notice
when a non-list element is reached. That element gets modified and that level
is done; move up to the previous level and continue to its next element.
At least, that's an approach that I think can work, but getting all the details
correct will take some work.
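
When the nesting depth is not known in advance, the same walk can be
written as a recursive function instead of nested loops. This is only a
sketch: the function name `keep_dated_rows`, the small list `FF` built
here, and the choice to leave non-matrix leaves untouched are all
assumptions for illustration; only the "^[0-9][0-9]/" pattern comes from
the thread.

```r
# Recurse through lists; filter rows only when a character matrix is reached
keep_dated_rows <- function(x, pattern = "^[0-9][0-9]/") {
  if (is.list(x)) {
    lapply(x, keep_dated_rows, pattern = pattern)
  } else if (is.matrix(x)) {
    # drop = FALSE keeps a matrix even if only one row survives
    x[grepl(pattern, x[, 1]), , drop = FALSE]
  } else {
    x  # not a matrix: return unchanged
  }
}

# Invented nested list with leaves at different depths
FF <- list(list(
  list(rbind(c("30/20", "-89"), c("", "x")),
       rbind(c("02/23", "a"), c("02/23", "b"))),
  c("not", "a matrix")
))

G <- keep_dated_rows(FF)
```

Checking is.matrix() before using z[, 1] is what avoids the
"incorrect number of dimensions" error on leaves that are plain vectors.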

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
Lab cell 925-724-7509
 
 


Re: [R] Remove specific rows from nested list of matrices

2018-11-02 Thread William Dunlap via R-help

Since you cannot show the data, you will have to learn some R debugging
techniques.

Here is some data that looks something like yours; I want to delete rows
of character matrices whose first entry starts with a space.

  FF <- lapply(1:2, function(i) lapply(1:3, function(j) lapply(1:2,
    function(k) if (identical(c(i,j,k), c(2L,3L,1L))) c(" X", "YZ") else
      rbind(c("A", "BC", "DE"), c(" P", "QR", "ST")))))
  G <- lapply(FF, function(x) lapply(x, function (y) lapply(y, function(z)
    z[grepl("^ ", z[,1]),])))
  #Error in z[, 1] : incorrect number of dimensions

If you don't recognize the problem right away, try setting
  options(error=recover)
which lets you look at objects and evaluate expressions at the time of the
error.   Look at ?recover for details.

  G <- lapply(FF, function(x) lapply(x, function (y) lapply(y, function(z)
z[grepl("^ ", z[,1]),])))
  #Error in z[, 1] : incorrect number of dimensions
  #
  #Enter a frame number, or 0 to exit
  #
  #1: lapply(FF, function(x) lapply(x, function(y) lapply(y, function(z) z[grepl(
  #2: FUN(X[[i]], ...)
  #3: #1: lapply(x, function(y) lapply(y, function(z) z[grepl("^ ", z[, 1]), ]))
  #4: FUN(X[[i]], ...)
  #5: #1: lapply(y, function(z) z[grepl("^ ", z[, 1]), ])
  #6: FUN(X[[i]], ...)
  #7: #1: grepl("^ ", z[, 1])
  #
  Selection: 6
  #Called from: eval(substitute(browser(skipCalls = skip), list(skip = 7 - which)),
  #  envir = sys.frame(which))
  #Browse[1]> objects()
  #[1] "z"
  #Browse[1]> str(z)
  # chr [1:2] " X" "YZ"
  #Browse[1]> str(z[,1])
  #Error during wrapup: incorrect number of dimensions
  #Browse[1]>

My guess is that your list includes a mix of matrices and vectors, perhaps
from not using drop=FALSE when you subscripted them earlier.  Add drop=FALSE
to all your calls to "[" when using matrices.
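
As a rough, non-interactive complement to the advice above, the sketch below finds the leaves that are not matrices and then filters rows with drop=FALSE. The toy list FF and the helper keep_rows() are invented for illustration; leaving non-matrix leaves untouched is an assumption, not something the thread settles:

```r
# Toy data: one leaf is a character matrix, the other a plain vector.
FF <- list(list(list(rbind(c("01/02", "BC"), c(" P", "QR")), c(" X", "YZ"))))

# Which leaves are matrices? rapply() visits every non-list element.
leaf_is_matrix <- rapply(FF, is.matrix, how = "unlist")

# Drop rows whose first entry starts with a space; drop = FALSE keeps
# the matrix shape even when a single row survives.
keep_rows <- function(z) {
  if (!is.matrix(z)) return(z)   # assumption: leave non-matrix leaves alone
  z[!grepl("^ ", z[, 1]), , drop = FALSE]
}

G <- rapply(FF, keep_rows, how = "replace")
```

Without drop=FALSE the first leaf would collapse to a plain character vector as soon as only one row is left, and a later z[, 1] on it would fail with the same "incorrect number of dimensions" error.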






Bill Dunlap
TIBCO Software
wdunlap tibco.com


Re: [R] Remove specific rows from nested list of matrices

2018-11-02 Thread Jeff Newmiller

A partial dput is no help at all. A complete dput of part of your data is much 
more likely to be helpful, but only if you see the same problem in it as you do 
in the full data set.

As to private data... if you want data handling help in a public forum then you 
need to create a small set of data that illustrates the problem. If you have to 
manufacture the data by hand we don't care, but it is up to you to communicate 
a clear question somehow.
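
One possible way to manufacture such a data set by hand is sketched below. The object name small and all values in it are invented; the only point is that the nesting and the problem (a leaf that is not a matrix) match the real data:

```r
# Hand-made stand-in for the private data: same nesting, invented values.
small <- list(list(
  rbind(c("02/20", "-89"), c(" ", "")),
  c(" X", "YZ")   # a plain vector, mimicking a leaf that lost its dimensions
))

# dput() prints R code that rebuilds the object; paste that text into the post.
dput(small)

# Round trip: the printed text evaluates back to an identical object.
txt <- paste(capture.output(dput(small)), collapse = "\n")
rebuilt <- eval(parse(text = txt))
```

Anyone on the list can then paste the dput() text into their own session and get the same object to work with.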


-- 
Sent from my phone. Please excuse my brevity.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Remove specific rows from nested list of matrices

2018-11-02 Thread Ek Esawi

Thank you Jeff and Bert. I know I have to use dput and provide a
reproducible example. The problem is that the output is huge, has many
nested lists, and the info is private.

Here is the first line of dput(FF) if it helps:
dput(FF)
list(list(list(structure(c("12/30 12/30", "01/02 01/02", "01/02 01/02",

Thanks again--EK

Re: [R] Remove specific rows from nested list of matrices

2018-11-02 Thread Jeff Newmiller

Can you supply the output of

dput(FF)

?


-- 
Sent from my phone. Please excuse my brevity.


Re: [R] Remove specific rows from nested list of matrices

2018-11-02 Thread Bert Gunter

If you learn to use dput() to provide useful examples in your posts, you
are more likely to receive useful help. It is rather difficult to make much
sense of your messy text, though some brave soul(s) may try to help.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )



[R] Remove specific rows from nested list of matrices

2018-11-02 Thread Ek Esawi

Hi All,

I have a list that is made up of nested lists, as shown below. I want
to remove all rows in each sub-list that start with an empty space,
that is, rows whose first entry is blank; for example, on
[[1]][[1]][[1]] remove row 4, on [[1]][[1]][[3]] remove row 5, on
[[1]][[2]][[1]] remove row 6, etc. All rows I want to keep start with
2 digits/2 digits. My formula works on an individual sublist but not on
the whole list. I know my indexing is wrong, but I don't know how to fix it.


> FF

[[1]]
[[1]][[1]]
[[1]][[1]][[1]]
[,1][,2]   [,3][,4] [,5]
[1,] "30/20"   "" ““   "-89"
[2,] "02/20"   "” ““   "-98"
[3,] "02/20"   “AAA” ““   "-84"
[4,] “  “ “  “   “
[[1]][[1]][[2]]
[,1][,2]
[1,] "02/23" “” : 29" “
[2,] "02/23" “” ." “
[3,] "02/23" “” " “
[4,] "02/23" “” "
[[1]][[1]][[3]]
[,1][,2][,3] [,4] [,5] [,6] [,7]
[1,] "01/09" “"“   “   “   "53"
[2,] "01/09" “” "   “   “   “   "403"
[3,] "01/09" “” "   “   “   “   "83"
[4,] "01/09" “” "   “   “   “   "783"
[5,] “  “  “”  3042742181"   “   “   “   “
[[1]][[2]]
[[1]][[2]][[1]]
[,1]  [,2] [,3] [,4] [,5]
[1,] ““   “   “   “” "
[2,] "Standard Purchases"  “   “   “   "
[3,] "24/90 "” “   "243"  "
[4,] "24/90 "” "   "143"  "
[5,] "24/91 "” " “   "143" “
[6,] ““   “   “   "792"
[[1]][[2]][[2]]
[,1][,2]
[1,] "02/23" “”: 31" “
[2,] "02/23" “”." “
[3,] "02/23" “” " “
[4,] "02/23" “”
[5,] "02/23" “”
[6,] "02/23" “” 20"
[7,] "02/23" “”  “
[8,] "02/23" “” "33"
[[1]][[3]]
[[1]][[3]][[1]]
[,1][,2]
[1,] "02/23" “”: 28" “
[2,] "02/23" “”." “
[3,] "02/23" “” " “
[4,] "02/23" “” "
[[1]][[3]][[2]]
[,1][,2][,3][,4] [,5] [,6] [,7][,8][,9]
[1,] "02/23" “” " “   “   "53" "
[2,] "02/24" “” " “   “   "
[3,] “  “  “  “   “   “   “  “  "1,241"
[4,] "02/24" "”  “   "33”

My formula:

G <- lapply(FF, function(x) lapply(x, function (y) lapply(y,
function(z)  z[grepl("^[0-9][0-9]/",z[,1]),])))

The error: Error in z[, 1] : incorrect number of dimensions



Thanks in advance--EK


Re: [R] remove text from nested list

2018-10-25 Thread Ek Esawi

Thank you Bert and Peter. My apologies for posting poor code. I cannot
create a reproducible example of my data, but I hope the list indices
shown below help you understand my question. The regex pattern in my
previous post works correctly on the few sublists I tested, but it did
not work on the whole list.
I tend to think a set of lapply calls and one apply call will work, but
I am not sure how to do that.

I tried, but obviously I don't understand nested apply functions.

> lapply(mylsit, function(x) gsub(pattern,"",x))
> lapply(mylist, function(x) lapply (x, function(y)  gsub(mypattern,"",y)))
> lappqaly(mylist, function(x) lapply (x, function(y) 
> apply(y,2,gsub(mypattern,"",y


[[1]]

[[1]][[1]]
[[1]][[1]][[1]]
[[1]][[1]][[2]]
[[1]][[1]][[3]]

[[1]][[2]]
[[1]][[2]][[1]]
[[1]][[2]][[2]]

[[1]][[3]]
[[1]][[3]][[1]]
[[1]][[3]][[2]]

[[1]][[4]]
[[1]][[4]][[1]]
[[1]][[4]][[2]]
[[1]][[4]][[3]]
[[1]][[4]][[4]]
[[1]][[4]][[5]]

[[1]][[5]]
[[1]][[5]][[1]]
[[1]][[5]][[2]]
[[1]][[5]][[3]]
[[1]][[5]][[4]]

[[1]][[6]]
[[1]][[6]][[1]]
[[1]][[6]][[2]]
[[1]][[6]][[3]]

[[1]][[7]]
[[1]][[7]][[1]]
[[1]][[7]][[2]]
[[1]][[7]][[3]]

[[1]][[8]]
[[1]][[8]][[1]]
[[1]][[8]][[3]]

Re: [R] remove text from nested list

2018-10-25 Thread William Dunlap via R-help

If your matrices are at various depths in the list, try rapply().  E.g.,

>  L <- list( A = list( a1 = matrix(c("AAA","AB", "AAB","AC"),2,2),
a2=c("AAx")), list(B = c("AAb1AAA","AAb2")))
> str(L)
List of 2
 $ A:List of 2
  ..$ a1: chr [1:2, 1:2] "AAA" "AB" "AAB" "AC"
  ..$ a2: chr "AAx"
 $  :List of 1
  ..$ B: chr [1:2] "AAb1AAA" "AAb2"
> str(rapply(L, function(x)gsub("A+", "-", x), how="replace"))
List of 2
 $ A:List of 2
  ..$ a1: chr [1:2, 1:2] "-" "-B" "-B" "-C"
  ..$ a2: chr "-x"
 $  :List of 1
  ..$ B: chr [1:2] "-b1-" "-b2"
> # only apply f to matrices in the list:
> str(rapply(L, function(x)gsub("A+", "-", x), classes="matrix",
how="replace"))
List of 2
 $ A:List of 2
  ..$ a1: chr [1:2, 1:2] "-" "-B" "-B" "-C"
  ..$ a2: chr "AAx"
 $  :List of 1
  ..$ B: chr [1:2] "AAb1AAA" "AAb2"


Bill Dunlap
TIBCO Software
wdunlap tibco.com


Re: [R] remove text from nested list

2018-10-25 Thread Bert Gunter

1. Please learn how to use dput() to provide examples to responders.
There's not much we can do with a text printout (at least without some work
that I don't care to do).

2. Do you know what mylist[[c(1,2,1)]] means? If not, read ?(Extract) and
note in particular:
"[[ can be applied recursively to lists, so that if the single index i is a
vector of length p, alist[[i]] is equivalent to alist[[i1]]...[[ip]] providing
all but the final indexing results in a list."

As your intent is unclear -- no reproducible example showing the desired
result -- I would suggest just using list indexing to access the matrices
you wish to change. But maybe this does not satisfy your vague request.
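
A small sketch of that recursive indexing, using an invented list (the names m, mylist and target, and the sub() pattern, are only for illustration):

```r
# Invented data: a character matrix stored three levels down.
m <- matrix(c("12/30 12/30", "01/02 01/02", "ABAB", "CACA"), nrow = 2)
mylist <- list(list(list("x"), list(m)))

# A vector index inside [[ ]] walks down the nesting one level per entry.
same <- identical(mylist[[c(1, 2, 1)]], mylist[[1]][[2]][[1]])

# Pull the matrix out, change its first column, and put it back.
target <- mylist[[c(1, 2, 1)]]
target[, 1] <- sub(" .*$", "", target[, 1])   # keep text before the first space
mylist[[c(1, 2, 1)]] <- target
```

This only helps when you already know which components to change; it does not search the list for you.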

Also, something seems screwy in the example you showed: For example, the
[[1]][[2]][[1]] component indicates a 2 x 5 matrix, but I see only 3
columns of text. Am I missing something?

Cheers,
Bert


Re: [R] remove text from nested list

2018-10-25 Thread Peter Langfelder

You should be more specific about what you want to replace and with
what. The pattern you use, namely "[0-9][0-9]/[0-9[0-9].*com", does
not (AFAICS) match any of the strings in your data, so don't be
surprised that your commands do not change anything.

If you have a correct pattern and replacement and all lists have depth
3, using something like

lapply(mylist, lapply, lapply, function(y) gsub(pattern, replacement, y))

should work. If your list has a variable depth, I would use a
recursive function, something like

recursiveGSub = function(x, pattern, replacement)
{
  if (is.atomic(x)) gsub(pattern, replacement, x)
  else lapply(x, recursiveGSub, pattern, replacement)
}

Example:

lst = list("a001", list("b001", list("c001", "d001")))

lst
recursiveGSub(lst, "00", "")
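
The same idea also seems to carry over to character matrices, since gsub() keeps the dimensions of its input. A sketch with invented data follows; recursive_gsub is just a renamed copy of the function above, and the pattern is a guess at the kind of replacement wanted, not the poster's actual pattern:

```r
# Renamed copy of the recursive function above, so this block runs on its own.
recursive_gsub <- function(x, pattern, replacement) {
  if (is.atomic(x)) gsub(pattern, replacement, x)
  else lapply(x, recursive_gsub, pattern, replacement)
}

# Invented data: a matrix two levels down and a bare string at the top level.
nested <- list(
  list(matrix(c("12/30 12/30", "01/02 01/02", "8.00", "99"), nrow = 2)),
  "12/30 12/30"
)

# Hypothetical pattern: collapse a repeated "dd/dd dd/dd" entry to "dd/dd".
cleaned <- recursive_gsub(nested, "^([0-9][0-9]/[0-9][0-9]) .*$", "\\1")
```

Entries that do not match the pattern, such as "8.00", come back unchanged.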


HTH,

Peter

[R] remove text from nested list

2018-10-25 Thread Ek Esawi

Hi All—

I have a list that contains multiple sub-lists and each sub-list
contains multiple  sub(sub-lists), each of the sub(sub-lists) is made
up of matrices of text. I want to replace some of the text in some
parts of the matrices in the list. I tried gsub and stringr's
str_remove, but nothing seems to work.

I tried:

lapply(mylist, function(x) lapply(x, function(y)
gsub("[0-9][0-9]/[0-9[0-9].*com","",y)))
lapply(mylist, function(x) str_remove(x,"[0-9][0-9]/[0-9[0-9].*com"))

Any help is greatly appreciated.



mylist—this is just an example

[[1]]
[[1]][[1]]
[[1]][[1]][[1]]
[,1]  [,2]  [,3]  [,4] [,5]
[1,] "12/30 12/30"  "ABABABABABAB"  "8.00"
[2,] "01/02 01/02"  "”.   “99"
[3,] "01/02 01/02"  "CACACACACACC” "55.97"

[[1]][[1]][[2]]
[,1]  [,2]
[1,] "12/30 12/30" "DDD” “29"
[2,] "12/30 12/30"  :GGG” “333”

[[1]][[2]]
[[1]][[2]][[1]]
[,1]  [,2]  [,3] [,4]  [,5]
[1,]  "01/02 01/02" "ThankYou" “23”
[2,] "01/02 01/02"  "Standard data"  "251"


Re: [R] Remove plot axis values in dotplot graph

2018-09-11 Thread Jim Lemon

Hi Abou,
Surprisingly you can't omit the x axis in dotchart. This hack will work:

sink("dotchart.noax.R")
cat("dotchart.noax\n")
print(dotchart)
sink()

Edit the resulting file by joining the first two lines with the
assignment symbol (<-), delete the two lines at the bottom and comment
out the line "axis(1)".

source("dotchart.noax.R")
dotchart.noax(scores, cex=1.5, pch = 18, col=c(1:3), xaxt = "n",
  main="Dot Plot child’s cough data", xlab="cough Scores")
library(plotrix)
staxlab(1,0:16)

I used "staxlab" so that you could have all of the labels 0:16.
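
If editing a copy of dotchart is more than is needed, a similar picture can be drawn from lower-level calls, where xaxt = "n" is honoured by plot(). This is only a sketch: the scores are a made-up subset, and it gives up dotchart's own labelling and grouping.

```r
# Hypothetical scores; the real vectors from the question would work the same way.
scores <- c(12, 11, 15, 4, 6, 9, 5, 8, 0)

# One dot per observation: value on the x axis, observation index on the y axis.
plot(scores, seq_along(scores), xaxt = "n", yaxt = "n",
     xlim = c(0, 16), pch = 18, cex = 1.5,
     main = "Dot Plot child's cough data",
     xlab = "cough Scores", ylab = "")

# Light horizontal guide lines, similar in spirit to dotchart's.
abline(h = seq_along(scores), lty = "dotted", col = "lightgray")

# Custom x axis with a tick at every integer from 0 to 16.
tick_at <- axis(1, at = 0:16, cex.axis = 1)
```

Depending on the device width, some tick labels may be skipped to avoid overlap, which is the situation staxlab is meant to handle.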

Jim


On Wed, Sep 12, 2018 at 5:57 AM AbouEl-Makarim Aboueissa
 wrote:
>
> Dear All:
>
> One more thing. I want to Remove the plot x-axis values in dotplot graph. I
> am trying to use xaxt = "n", but it seems NOT working. Also after removing
> the x-axis values, I want to use the command axis(1, at=0:16, cex.axis=1)
> to add x-axis values from 0 to 16, but it seems not working as expect.
>
>
>
> Honey.Dosage<-c(12,11,15,11,10,13,10,4,15,16,9,14,10,6,10,8,11,12,12,8,12,9,11,15,10,15,9,13,8,12,10,8,9,5,12)
>
> DM.Dosage<-c(4,6,9,4,7,7,7,9,12,10,11,6,3,4,9,12,7,6,8,12,12,4,12,13,7,10,13,9,4,4,10,15,9)
>
> No.Dosage<-c(5,8,6,1,0,8,12,8,7,7,1,6,7,7,12,7,9,7,9,5,11,9,5,6,8,8,6,7,10,9,4,8,7,3,1,4,3)
>
> scores<-c(Honey.Dosage,DM.Dosage,No.Dosage)
>
> min(scores)
> max(scores)
>
> dotchart(scores,cex=1.5, pch = 18, col=c(1:3), xaxt = "n", main="Dot Plot
> child’s cough data", xlab="cough Scores")
>
> axis(1, at=0:16, cex.axis=1.5)
>
>
>
>
> with many thanks
> abou
> __
>
>
> *AbouEl-Makarim Aboueissa, PhD*
>
> *Professor of Statistics*
> *Graduate Coordinator*
>
> *Department of Mathematics and Statistics*
> *University of Southern Maine*
>


[R] Remove plot axis values in dotplot graph

2018-09-11 Thread AbouEl-Makarim Aboueissa

Dear All:

One more thing. I want to remove the x-axis values in a dotplot graph. I
am trying to use xaxt = "n", but it does not seem to work. Also, after removing
the x-axis values, I want to use the command axis(1, at=0:16, cex.axis=1)
to add x-axis values from 0 to 16, but that does not work as expected.



Honey.Dosage<-c(12,11,15,11,10,13,10,4,15,16,9,14,10,6,10,8,11,12,12,8,12,9,11,15,10,15,9,13,8,12,10,8,9,5,12)

DM.Dosage<-c(4,6,9,4,7,7,7,9,12,10,11,6,3,4,9,12,7,6,8,12,12,4,12,13,7,10,13,9,4,4,10,15,9)

No.Dosage<-c(5,8,6,1,0,8,12,8,7,7,1,6,7,7,12,7,9,7,9,5,11,9,5,6,8,8,6,7,10,9,4,8,7,3,1,4,3)

scores<-c(Honey.Dosage,DM.Dosage,No.Dosage)

min(scores)
max(scores)

dotchart(scores, cex=1.5, pch = 18, col=c(1:3), xaxt = "n",
  main="Dot Plot child’s cough data", xlab="cough Scores")

axis(1, at=0:16, cex.axis=1.5)




with many thanks
abou
__


*AbouEl-Makarim Aboueissa, PhD*

*Professor of Statistics*
*Graduate Coordinator*

*Department of Mathematics and Statistics*
*University of Southern Maine*


Re: [R] remove rows of a matrix by part of its row name

2018-05-22 Thread William Dunlap via R-help

I think it is simpler to use !grepl() instead of -grep() here, since
subscripting with logicals works properly when there are no matches.
Also, since mat is a matrix, add the argument drop=FALSE so the
result is still a matrix when all but one row is omitted.  E.g.,

> mat <- matrix(1:6, nrow=3, ncol=2,
+               dimnames=list(c("One","Two","Three"),c("A","B")))
> # bad, not a matrix if only one row wanted
> str(mat[ -grep("T", rownames(mat)), ])
 Named int [1:2] 1 4
 - attr(*, "names")= chr [1:2] "A" "B"
> # bad, zero-row matrix if no unwanted rows
> str(mat[ -grep("X", rownames(mat)), ])
 int[0 , 1:2]
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:2] "A" "B"
> # good, one-row matrix if only one row wanted
> str(mat[ !grepl("T", rownames(mat)), , drop=FALSE])
 int [1, 1:2] 1 4
 - attr(*, "dimnames")=List of 2
  ..$ : chr "One"
  ..$ : chr [1:2] "A" "B"
> # good, entire matrix if no unwanted rows
> str(mat[ !grepl("X", rownames(mat)), , drop=FALSE])
 int [1:3, 1:2] 1 2 3 4 5 6
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:3] "One" "Two" "Three"
  ..$ : chr [1:2] "A" "B"
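
Applied to row names like those in the original question, the same idea might look as follows. This is a sketch only: the matrix values are invented, and fixed = TRUE is an added choice (not part of the advice above) so that the period is matched literally.

```r
# Row names copied from the question; the matrix contents are made up.
rn <- c("70/556", "71.1/280", "72.1/556", "72.1/343", "73.1/390", "73.1/556")
mat <- matrix(seq_len(12), nrow = 6, dimnames = list(rn, c("A", "B")))

# Logical subscript: TRUE for the rows to keep.
keep <- !grepl("73.1", rownames(mat), fixed = TRUE)
new_mat <- mat[keep, , drop = FALSE]
rownames(new_mat)

# A pattern that matches nothing keeps every row instead of dropping them all.
all_rows <- mat[!grepl("99.9", rownames(mat), fixed = TRUE), , drop = FALSE]
nrow(all_rows)
```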



Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Tue, May 22, 2018 at 4:34 AM, Rui Barradas  wrote:

> Hello,
>
> Use grep to get the row indices and then subset with a *negative* index to
> remove those rows.
>
> rn <- scan(what = character(), text = "
> 70/556
> 71.1/280
> 72.1/556
> 72.1/343
> 73.1/390
> 73.1/556
> ")
>
> mat <- matrix(rnorm(6*6), nrow = 6)
> row.names(mat) <- rn
>
> inx <- grep("73\\.", row.names(mat))
>
> new_mat <- mat[-inx, ]
> new_mat
>
>
> Hope this helps,
>
> Rui Barradas
>
>
> On 5/22/2018 11:48 AM, Ahmed Serag wrote:
>
>> Dear R-experts,
>>
>>
>> How can I remove a certain feature or observation by a part of its name.
>> To be clear, I have a matrix with 766 observations as a rows. The row names
>> are like this
>>
>> 70/556
>> 71.1/280
>> 72.1/556
>> 72.1/343
>> 73.1/390
>> 73.1/556
>> Now I would like to remove all the rows that contain the text 73.1
>>
>> Any ideas or suggestion please ?
>>
>>
>> Regards
>>
>>
>>
>>
>> **
>>
>> Ahmed Serag
>>
>> Analytical Chemistry Department
>>
>> Faculty of Pharmacy
>>
>> Al-Azhar University
>>
>> Cairo
>>
>> Egypt
>>
>


Re: [R] remove rows of a matrix by part of its row name

2018-05-22 Thread Ahmed Serag


Thanks a lot. The code works great.

Regards

Ahmed

**

Ahmed Serag

Analytical Chemistry Department

Faculty of Pharmacy

Al-Azhar University

Cairo

Egypt



From: Rui Barradas <ruipbarra...@sapo.pt>
Sent: Tuesday, May 22, 2018 2:16 PM
To: Ahmed Serag; R-help@r-project.org
Subject: Re: [R] remove rows of a matrix by part of its row name

Hello,

Please always cc the list.

As for the question, yes, it does. If you want to remove just the ones
with exactly 73.1 use the pattern

grep("^73\\.1$", etc)

Explanation:

Beginning of string: ^
End of string: $
Escape special characters: \\ (needed because the period is a special
character.)

Hope this helps,

Rui Barradas

On 5/22/2018 12:50 PM, Ahmed Serag wrote:
> Thank you Mr. Barradas. The code works great. Unfortunately I have also
> some labeles with
>
>
> 173.1
>
> 273.1
>
>
> the grep script remove them also ?
>
> Any ideas Plz, Thanks again
>
>
> 
>
> *Ahmed Serag*
>
> /Analytical Chemistry Department/
>
> /Faculty of Pharmacy/
>
> /Al-Azhar University/
>
> /Cairo/
>
> /Egypt/
>
>
>
> 
> *From:* Rui Barradas <ruipbarra...@sapo.pt>
> *Sent:* Tuesday, May 22, 2018 1:34 PM
> *To:* Ahmed Serag; r-help@r-project.org
> *Subject:* Re: [R] remove rows of a matrix by part of its row name
> Hello,
>
> Use grep to get the row indices and then subset with a *negative* index
> to remove those rows.
>
> rn <- scan(what = character(), text = "
> 70/556
> 71.1/280
> 72.1/556
> 72.1/343
> 73.1/390
> 73.1/556
> ")
>
> mat <- matrix(rnorm(6*6), nrow = 6)
> row.names(mat) <- rn
>
> inx <- grep("73\\.", row.names(mat))
>
> new_mat <- mat[-inx, ]
> new_mat
>
>
> Hope this helps,
>
> Rui Barradas
>
> On 5/22/2018 11:48 AM, Ahmed Serag wrote:
>> Dear R-experts,
>>
>>
>> How can I remove a certain feature or observation by a part of its name. To 
>> be clear, I have a matrix with 766 observations as a rows. The row names are 
>> like this
>>
>> 70/556
>> 71.1/280
>> 72.1/556
>> 72.1/343
>> 73.1/390
>> 73.1/556
>> Now I would like to remove all the rows that contain the text 73.1
>>
>> Any ideas or suggestion please ?
>>
>>
>> Regards
>>
>>
>>
>>
>> **
>>
>> Ahmed Serag
>>
>> Analytical Chemistry Department
>>
>> Faculty of Pharmacy
>>
>> Al-Azhar University
>>
>> Cairo
>>
>> Egypt
>>

Re: [R] remove rows of a matrix by part of its row name

2018-05-22 Thread Rui Barradas


Hello,

Please always cc the list.

As for the question, yes, it does. If you want to remove just the ones 
with exactly 73.1 use the pattern


grep("^73\\.1$", etc)

Explanation:

Beginning of string: ^
End of string: $
Escape special characters: \\ (needed because the period is a special 
character.)
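
How the anchors behave can be checked on a few labels. The vector `labels` below is made up from names mentioned in this thread. One caveat, stated as an assumption about the real data: if the row names carry a suffix after a slash, as in the sample (73.1/390), a pattern ending in $ would match none of them, and anchoring only the start (^73\\.1/) may be closer to what is wanted.

```r
# Hypothetical labels, with and without a "/suffix".
labels <- c("73.1", "173.1", "273.1", "73.1/390", "173.1/556")

# Unanchored: matches anywhere, so 173.1 and 273.1 are caught too.
unanchored <- grepl("73\\.", labels)

# Anchored at both ends: only the label that is exactly "73.1".
exact <- grepl("^73\\.1$", labels)

# Anchored at the start only, followed by a slash: "73.1/..." but not "173.1/...".
prefix <- grepl("^73\\.1/", labels)

data.frame(labels, unanchored, exact, prefix)
```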


Hope this helps,

Rui Barradas

On 5/22/2018 12:50 PM, Ahmed Serag wrote:
Thank you Mr. Barradas. The code works great. Unfortunately I have also 
some labeles with



173.1

273.1


the grep script remove them also ?

Any ideas Plz, Thanks again




*Ahmed Serag*

/Analytical Chemistry Department/

/Faculty of Pharmacy/

/Al-Azhar University/

/Cairo/

/Egypt/




*From:* Rui Barradas <ruipbarra...@sapo.pt>
*Sent:* Tuesday, May 22, 2018 1:34 PM
*To:* Ahmed Serag; r-help@r-project.org
*Subject:* Re: [R] remove rows of a matrix by part of its row name
Hello,

Use grep to get the row indices and then subset with a *negative* index
to remove those rows.

rn <- scan(what = character(), text = "
70/556
71.1/280
72.1/556
72.1/343
73.1/390
73.1/556
")

mat <- matrix(rnorm(6*6), nrow = 6)
row.names(mat) <- rn

inx <- grep("73\\.", row.names(mat))

new_mat <- mat[-inx, ]
new_mat


Hope this helps,

Rui Barradas

On 5/22/2018 11:48 AM, Ahmed Serag wrote:

Dear R-experts,


How can I remove a certain feature or observation by a part of its name. To be 
clear, I have a matrix with 766 observations as a rows. The row names are like 
this

70/556
71.1/280
72.1/556
72.1/343
73.1/390
73.1/556
Now I would like to remove all the rows that contain the text 73.1

Any ideas or suggestion please ?


Regards




**

Ahmed Serag

Analytical Chemistry Department

Faculty of Pharmacy

Al-Azhar University

Cairo

Egypt


Re: [R] remove rows of a matrix by part of its row name

2018-05-22 Thread Rui Barradas


Hello,

Use grep to get the row indices and then subset with a *negative* index 
to remove those rows.


rn <- scan(what = character(), text = "
70/556
71.1/280
72.1/556
72.1/343
73.1/390
73.1/556
")

mat <- matrix(rnorm(6*6), nrow = 6)
row.names(mat) <- rn

inx <- grep("73\\.", row.names(mat))

new_mat <- mat[-inx, ]
new_mat


Hope this helps,

Rui Barradas

On 5/22/2018 11:48 AM, Ahmed Serag wrote:

Dear R-experts,


How can I remove a certain feature or observation by a part of its name. To be 
clear, I have a matrix with 766 observations as a rows. The row names are like 
this

70/556
71.1/280
72.1/556
72.1/343
73.1/390
73.1/556
Now I would like to remove all the rows that contain the text 73.1

Any ideas or suggestion please ?


Regards




**

Ahmed Serag

Analytical Chemistry Department

Faculty of Pharmacy

Al-Azhar University

Cairo

Egypt


[R] remove rows of a matrix by part of its row name

2018-05-22 Thread Ahmed Serag

Dear R-experts,


How can I remove a certain feature or observation by part of its name? To be
clear, I have a matrix with 766 observations as rows. The row names look like
this:

70/556
71.1/280
72.1/556
72.1/343
73.1/390
73.1/556
Now I would like to remove all the rows that contain the text 73.1.

Any ideas or suggestions, please?


Regards




**

Ahmed Serag

Analytical Chemistry Department

Faculty of Pharmacy

Al-Azhar University

Cairo

Egypt


Re: [R] Remove

2017-12-09 Thread Ashta

Thank you All !!

Now I have plenty of options to choose from.


On Sat, Dec 9, 2017 at 1:21 PM, William Dunlap  wrote:
> You could make numeric vectors, named by the group identifier, of the
> contraints
> and subscript it by group name:
>
>> DM <- read.table( text='GR x y
> + A 25 125
> + A 23 135
> + A 14 145
> + A 35 230
> + B 45 321
> + B 47 512
> + B 53 123
> + B 55 451
> + C 61 521
> + C 68 235
> + C 85 258
> + C 80 654',header = TRUE, stringsAsFactors = FALSE)
>>
>> GRmin <- c(A=15, B=40, C=60)
>> GRmax <- c(A=30, B=50, C=75)
>> subset(DM, x>=GRmin[GR] & x <=GRmax[GR])
>GR  x   y
> 1   A 25 125
> 2   A 23 135
> 5   B 45 321
> 6   B 47 512
> 9   C 61 521
> 10  C 68 235
>
> Or, if you want to completely avoid non-standard evaluation:
>> DM[ DM$x >= GRmin[DM$GR] & DM$x <= GRmax[DM$GR], ]
>GR  x   y
> 1   A 25 125
> 2   A 23 135
> 5   B 45 321
> 6   B 47 512
> 9   C 61 521
> 10  C 68 235
>
>
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Sat, Dec 9, 2017 at 9:38 AM, David Winsemius 
> wrote:
>>
>>
>> > On Dec 8, 2017, at 6:16 PM, David Winsemius 
>> > wrote:
>> >
>> >
>> >> On Dec 8, 2017, at 4:48 PM, Ashta  wrote:
>> >>
>> >> Hi David, Ista and all,
>> >>
>> >> I  have one related question  Within one group I want to keep records
>> >> conditionally.
>> >> example within
>> >> group A I want keep rows that have  " x" values  ranged  between 15 and
>> >> 30.
>> >> group B I want keep rows that have  " x" values  ranged  between  40
>> >> and 50.
>> >> group C I want keep rows that have  " x" values  ranged  between  60
>> >> and 75.
>> >
>> > When you have a problem where there are multiple "parallel: parameters,
>> > the function to "reach for" is `mapply`.
>> >
>> >mapply( your_selection_func, group_vec, min_vec, max_vec)
>> >
>> > ... and this will probably return the values as a list (of dataframes if
>> > you build the function correctly,  so you may may need to then do:
>> >
>> >do.call(rbind, ...)
>>
>>  do.call( rbind,
>> mapply( function(dat, grp, minx, maxx) {dat[ dat$GR==grp & dat$x >=
>> minx & dat$x <= maxx, ]},
>> grp=LETTERS[1:3], minx=c(15,40,60), maxx=c(30,50,75) ,
>> MoreArgs=list(dat=DM),
>> SIMPLIFY=FALSE))
>>  GR  x   y
>> A.1   A 25 125
>> A.2   A 23 135
>> B.5   B 45 321
>> B.6   B 47 512
>> C.9   C 61 521
>> C.10  C 68 235
>>
>> >
>> > --
>> > David.
>> >>
>> >>
>> >> DM <- read.table( text='GR x y
>> >> A 25 125
>> >> A 23 135
>> >> A 14 145
>> >> A 35 230
>> >> B 45 321
>> >> B 47 512
>> >> B 53 123
>> >> B 55 451
>> >> C 61 521
>> >> C 68 235
>> >> C 85 258
>> >> C 80 654',header = TRUE, stringsAsFactors = FALSE)
>> >>
>> >>
>> >> The end result will be
>> >> A 25 125
>> >> A 23 135
>> >> B 45 321
>> >> B 47 512
>> >> C 61 521
>> >> C 68 235
>> >>
>> >> Thank you
>> >>
>> >> On Wed, Dec 6, 2017 at 10:34 PM, David Winsemius
>> >>  wrote:
>> >>>
>>  On Dec 6, 2017, at 4:27 PM, Ashta  wrote:
>> 
>>  Thank you Ista! Worked fine.
>> >>>
>> >>> Here's another (possibly more direct in its logic?):
>> >>>
>> >>> DM[ !ave(DM$x, DM$GR, FUN= function(x) {!length(unique(x))==1}), ]
>> >>> GR  x   y
>> >>> 5  B 25 321
>> >>> 6  B 25 512
>> >>> 7  B 25 123
>> >>> 8  B 25 451
>> >>>
>> >>> --
>> >>> David
>> >>>
>>  On Wed, Dec 6, 2017 at 5:59 PM, Ista Zahn  wrote:
>> > Hi Ashta,
>> >
>> > There are many ways to do it. Here is one:
>> >
>> > vars <- sapply(split(DM$x, DM$GR), var)
>> > DM[DM$GR %in% names(vars[vars > 0]), ]
>> >
>> > Best
>> > Ista
>> >
>> > On Wed, Dec 6, 2017 at 6:58 PM, Ashta  wrote:
>> >> Thank you Jeff,
>> >>
>> >> subset( DM, "B" != x ), this works if I know the group only.
>> >> But if I don't know that group in this case "B", how do I identify
>> >> group(s) that  all elements of x have the same value?
>> >>
>> >> On Wed, Dec 6, 2017 at 5:48 PM, Jeff Newmiller
>> >>  wrote:
>> >>> subset( DM, "B" != x )
>> >>>
>> >>> This is covered in the Introduction to R document that comes with
>> >>> R.
>> >>> --
>> >>> Sent from my phone. Please excuse my brevity.
>> >>>
>> >>> On December 6, 2017 3:21:12 PM PST, David Winsemius
>> >>>  wrote:
>> 
>> > On Dec 6, 2017, at 3:15 PM, Ashta  wrote:
>> >
>> > Hi all,
>> > In a data set I have group(GR) and two variables   x and y. I
>> > want to
>> > remove a  group that have  the same record for the x variable in
>> > each
>> > row.
>> >
>> > DM <- read.table( text='GR x y
>> > A 25 125
>> > A 23 135
>> > A 14 145
>> > A 12 230
>> > B 25 321
>> > B 25 512
>> > B

Re: [R] Remove

2017-12-09 Thread William Dunlap via R-help

You could make numeric vectors of the constraints, named by the group
identifier, and subscript them by group name:

> DM <- read.table( text='GR x y
+ A 25 125
+ A 23 135
+ A 14 145
+ A 35 230
+ B 45 321
+ B 47 512
+ B 53 123
+ B 55 451
+ C 61 521
+ C 68 235
+ C 85 258
+ C 80 654',header = TRUE, stringsAsFactors = FALSE)
>
> GRmin <- c(A=15, B=40, C=60)
> GRmax <- c(A=30, B=50, C=75)
> subset(DM, x>=GRmin[GR] & x <=GRmax[GR])
   GR  x   y
1   A 25 125
2   A 23 135
5   B 45 321
6   B 47 512
9   C 61 521
10  C 68 235

Or, if you want to completely avoid non-standard evaluation:
> DM[ DM$x >= GRmin[DM$GR] & DM$x <= GRmax[DM$GR], ]
   GR  x   y
1   A 25 125
2   A 23 135
5   B 45 321
6   B 47 512
9   C 61 521
10  C 68 235
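
The step that does the work here is subscripting a named vector by the group column, which yields one limit per row. A small sketch of that lookup, using a few invented values rather than the full data set (the names `lower`, `upper` and `keep` are only for illustration):

```r
GRmin <- c(A = 15, B = 40, C = 60)
GRmax <- c(A = 30, B = 50, C = 75)

# Hypothetical group labels and x values, one per row.
GR <- c("A", "A", "B", "C")
x  <- c(25, 14, 45, 80)

# Indexing by name repeats each group's limit for every row in that group.
lower <- unname(GRmin[GR])
upper <- unname(GRmax[GR])

# Row-wise test against the row's own limits.
keep <- x >= lower & x <= upper
data.frame(GR, x, lower, upper, keep)
```

This relies on GR being a character vector (hence stringsAsFactors = FALSE); a factor would be used as integer codes when subscripting, so as.character(GR) would be the safer index in that case.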




Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Sat, Dec 9, 2017 at 9:38 AM, David Winsemius 
wrote:

>
> > On Dec 8, 2017, at 6:16 PM, David Winsemius 
> wrote:
> >
> >
> >> On Dec 8, 2017, at 4:48 PM, Ashta  wrote:
> >>
> >> Hi David, Ista and all,
> >>
> >> I  have one related question  Within one group I want to keep records
> >> conditionally.
> >> example within
> >> group A I want keep rows that have  " x" values  ranged  between 15 and
> 30.
> >> group B I want keep rows that have  " x" values  ranged  between  40
> and 50.
> >> group C I want keep rows that have  " x" values  ranged  between  60
> and 75.
> >
> > When you have a problem where there are multiple "parallel: parameters,
> the function to "reach for" is `mapply`.
> >
> >mapply( your_selection_func, group_vec, min_vec, max_vec)
> >
> > ... and this will probably return the values as a list (of dataframes if
> you build the function correctly,  so you may may need to then do:
> >
> >do.call(rbind, ...)
>
>  do.call( rbind,
> mapply( function(dat, grp, minx, maxx) {dat[ dat$GR==grp & dat$x >=
> minx & dat$x <= maxx, ]},
> grp=LETTERS[1:3], minx=c(15,40,60), maxx=c(30,50,75) ,
> MoreArgs=list(dat=DM),
> SIMPLIFY=FALSE))
>  GR  x   y
> A.1   A 25 125
> A.2   A 23 135
> B.5   B 45 321
> B.6   B 47 512
> C.9   C 61 521
> C.10  C 68 235
>
> >
> > --
> > David.
> >>
> >>
> >> DM <- read.table( text='GR x y
> >> A 25 125
> >> A 23 135
> >> A 14 145
> >> A 35 230
> >> B 45 321
> >> B 47 512
> >> B 53 123
> >> B 55 451
> >> C 61 521
> >> C 68 235
> >> C 85 258
> >> C 80 654',header = TRUE, stringsAsFactors = FALSE)
> >>
> >>
> >> The end result will be
> >> A 25 125
> >> A 23 135
> >> B 45 321
> >> B 47 512
> >> C 61 521
> >> C 68 235
> >>
> >> Thank you
> >>
> >> On Wed, Dec 6, 2017 at 10:34 PM, David Winsemius <
> dwinsem...@comcast.net> wrote:
> >>>
>  On Dec 6, 2017, at 4:27 PM, Ashta  wrote:
> 
>  Thank you Ista! Worked fine.
> >>>
> >>> Here's another (possibly more direct in its logic?):
> >>>
> >>> DM[ !ave(DM$x, DM$GR, FUN= function(x) {!length(unique(x))==1}), ]
> >>> GR  x   y
> >>> 5  B 25 321
> >>> 6  B 25 512
> >>> 7  B 25 123
> >>> 8  B 25 451
> >>>
> >>> --
> >>> David
> >>>
>  On Wed, Dec 6, 2017 at 5:59 PM, Ista Zahn  wrote:
> > Hi Ashta,
> >
> > There are many ways to do it. Here is one:
> >
> > vars <- sapply(split(DM$x, DM$GR), var)
> > DM[DM$GR %in% names(vars[vars > 0]), ]
> >
> > Best
> > Ista
> >
> > On Wed, Dec 6, 2017 at 6:58 PM, Ashta  wrote:
> >> Thank you Jeff,
> >>
> >> subset( DM, "B" != x ), this works if I know the group only.
> >> But if I don't know that group in this case "B", how do I identify
> >> group(s) that  all elements of x have the same value?
> >>
> >> On Wed, Dec 6, 2017 at 5:48 PM, Jeff Newmiller <
> jdnew...@dcn.davis.ca.us> wrote:
> >>> subset( DM, "B" != x )
> >>>
> >>> This is covered in the Introduction to R document that comes with
> R.
> >>> --
> >>> Sent from my phone. Please excuse my brevity.
> >>>
> >>> On December 6, 2017 3:21:12 PM PST, David Winsemius <
> dwinsem...@comcast.net> wrote:
> 
> > On Dec 6, 2017, at 3:15 PM, Ashta  wrote:
> >
> > Hi all,
> > In a data set I have group(GR) and two variables   x and y. I
> want to
> > remove a  group that have  the same record for the x variable in
> each
> > row.
> >
> > DM <- read.table( text='GR x y
> > A 25 125
> > A 23 135
> > A 14 145
> > A 12 230
> > B 25 321
> > B 25 512
> > B 25 123
> > B 25 451
> > C 11 521
> > C 14 235
> > C 15 258
> > C 10 654',header = TRUE, stringsAsFactors = FALSE)
> >
> > In this example the output should contain group A and C  as
> group B
> > has   the same record  for the variable x .
> >
> > The result will be
> > A 25 125
> > A 23 135
> > A 14 145
>

Re: [R] Remove

2017-12-09 Thread David Winsemius


> On Dec 8, 2017, at 6:16 PM, David Winsemius  wrote:
> 
> 
>> On Dec 8, 2017, at 4:48 PM, Ashta  wrote:
>> 
>> Hi David, Ista and all,
>> 
>> I  have one related question  Within one group I want to keep records
>> conditionally.
>> example within
>> group A I want keep rows that have  " x" values  ranged  between 15 and 30.
>> group B I want keep rows that have  " x" values  ranged  between  40 and 50.
>> group C I want keep rows that have  " x" values  ranged  between  60 and 75.
> 
> When you have a problem where there are multiple "parallel: parameters, the 
> function to "reach for" is `mapply`. 
> 
>mapply( your_selection_func, group_vec, min_vec, max_vec)
> 
> ... and this will probably return the values as a list (of dataframes if you 
> build the function correctly,  so you may may need to then do:
> 
>do.call(rbind, ...)

 do.call( rbind,
   mapply( function(dat, grp, minx, maxx) {
             dat[ dat$GR==grp & dat$x >= minx & dat$x <= maxx, ] },
           grp=LETTERS[1:3], minx=c(15,40,60), maxx=c(30,50,75),
           MoreArgs=list(dat=DM),
           SIMPLIFY=FALSE))
 GR  x   y
A.1   A 25 125
A.2   A 23 135
B.5   B 45 321
B.6   B 47 512
C.9   C 61 521
C.10  C 68 235

> 
> -- 
> David.
>> 
>> 
>> DM <- read.table( text='GR x y
>> A 25 125
>> A 23 135
>> A 14 145
>> A 35 230
>> B 45 321
>> B 47 512
>> B 53 123
>> B 55 451
>> C 61 521
>> C 68 235
>> C 85 258
>> C 80 654',header = TRUE, stringsAsFactors = FALSE)
>> 
>> 
>> The end result will be
>> A 25 125
>> A 23 135
>> B 45 321
>> B 47 512
>> C 61 521
>> C 68 235
>> 
>> Thank you
>> 
>> On Wed, Dec 6, 2017 at 10:34 PM, David Winsemius  
>> wrote:
>>> 
 On Dec 6, 2017, at 4:27 PM, Ashta  wrote:
 
 Thank you Ista! Worked fine.
>>> 
>>> Here's another (possibly more direct in its logic?):
>>> 
>>> DM[ !ave(DM$x, DM$GR, FUN= function(x) {!length(unique(x))==1}), ]
>>> GR  x   y
>>> 5  B 25 321
>>> 6  B 25 512
>>> 7  B 25 123
>>> 8  B 25 451
>>> 
>>> --
>>> David
>>> 
 On Wed, Dec 6, 2017 at 5:59 PM, Ista Zahn  wrote:
> Hi Ashta,
> 
> There are many ways to do it. Here is one:
> 
> vars <- sapply(split(DM$x, DM$GR), var)
> DM[DM$GR %in% names(vars[vars > 0]), ]
> 
> Best
> Ista
> 
> On Wed, Dec 6, 2017 at 6:58 PM, Ashta  wrote:
>> Thank you Jeff,
>> 
>> subset( DM, "B" != x ), this works if I know the group only.
>> But if I don't know that group in this case "B", how do I identify
>> group(s) that  all elements of x have the same value?
>> 
>> On Wed, Dec 6, 2017 at 5:48 PM, Jeff Newmiller 
>>  wrote:
>>> subset( DM, "B" != x )
>>> 
>>> This is covered in the Introduction to R document that comes with R.
>>> --
>>> Sent from my phone. Please excuse my brevity.
>>> 
>>> On December 6, 2017 3:21:12 PM PST, David Winsemius 
>>>  wrote:
 
> On Dec 6, 2017, at 3:15 PM, Ashta  wrote:
> 
> Hi all,
> In a data set I have group(GR) and two variables   x and y. I want to
> remove a  group that have  the same record for the x variable in each
> row.
> 
> DM <- read.table( text='GR x y
> A 25 125
> A 23 135
> A 14 145
> A 12 230
> B 25 321
> B 25 512
> B 25 123
> B 25 451
> C 11 521
> C 14 235
> C 15 258
> C 10 654',header = TRUE, stringsAsFactors = FALSE)
> 
> In this example the output should contain group A and C  as group B
> has   the same record  for the variable x .
> 
> The result will be
> A 25 125
> A 23 135
> A 14 145
> A 12 230
> C 11 521
> C 14 235
> C 15 258
> C 10 654
 
 Try:
 
 DM[ !duplicated(DM$x) , ]
> 
> How do I do it R?
> Thank you.
> 
 
 David Winsemius
 Alameda, CA, USA
 
 'Any technology distinguishable from magic is insufficiently advanced.'
 -Gehm's Corollary to Clarke's Third Law
 

Re: [R] Remove

2017-12-09 Thread Ek Esawi

HI--

How about this one? It produces the desired result. If you have more
conditions, you can put them in a matrix or data frame and subset as
suggested in one of the previous replies.

DM[(DM$GR=="A" & DM$x>=15 & DM$x<=30) | (DM$GR=="B" & DM$x>=40 & DM$x<=50) |
   (DM$GR=="C" & DM$x>=60 & DM$x<=75), ]
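
The matrix/data frame form mentioned above could be sketched like this. The `limits` table and its column names `lo` and `hi` are hypothetical, and merge() is just one way to attach the bounds to each row.

```r
DM <- read.table(text = "GR x y
A 25 125
A 23 135
A 14 145
A 35 230
B 45 321
B 47 512
B 53 123
B 55 451
C 61 521
C 68 235
C 85 258
C 80 654", header = TRUE, stringsAsFactors = FALSE)

# One row of bounds per group (hypothetical helper table).
limits <- data.frame(GR = c("A", "B", "C"),
                     lo = c(15, 40, 60),
                     hi = c(30, 50, 75),
                     stringsAsFactors = FALSE)

# Attach each group's bounds to its rows, then filter row by row.
merged <- merge(DM, limits, by = "GR")
kept <- merged[merged$x >= merged$lo & merged$x <= merged$hi, c("GR", "x", "y")]
kept
```

Adding a group or changing a range then only means editing `limits`. Note that merge() may reorder rows and renumbers them, so the row names will not match those of DM.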


EK

On Sat, Dec 9, 2017 at 5:00 AM, Rui Barradas  wrote:

> Hello,
>
> Try the following.
>
> keep <- list(A = c(15, 30), B = c(40, 50), C = c(60, 75))
> sp <- split(DM$x, DM$GR)
> inx <- unlist(lapply(seq_along(sp), function(i) keep[[i]][1] <= sp[[i]] &
> sp[[i]] <= keep[[i]][2]))
> DM[inx, ]
> #   GR  x   y
> #1   A 25 125
> #2   A 23 135
> #5   B 45 321
> #6   B 47 512
> #9   C 61 521
> #10  C 68 235
>
> Hope this helps,
>
> Rui Barradas
>
>
> On 12/9/2017 12:48 AM, Ashta wrote:
>
>> Hi David, Ista and all,
>>
>> I  have one related question  Within one group I want to keep records
>> conditionally.
>> example within
>> group A I want keep rows that have  " x" values  ranged  between 15 and
>> 30.
>> group B I want keep rows that have  " x" values  ranged  between  40 and
>> 50.
>> group C I want keep rows that have  " x" values  ranged  between  60 and
>> 75.
>>
>>
>> DM <- read.table( text='GR x y
>> A 25 125
>> A 23 135
>> A 14 145
>> A 35 230
>> B 45 321
>> B 47 512
>> B 53 123
>> B 55 451
>> C 61 521
>> C 68 235
>> C 85 258
>> C 80 654',header = TRUE, stringsAsFactors = FALSE)
>>
>>
>> The end result will be
>> A 25 125
>> A 23 135
>> B 45 321
>> B 47 512
>> C 61 521
>> C 68 235
>>
>> Thank you
>>
>> On Wed, Dec 6, 2017 at 10:34 PM, David Winsemius 
>> wrote:
>>
>>>
>>> On Dec 6, 2017, at 4:27 PM, Ashta  wrote:

 Thank you Ista! Worked fine.

>>>
>>> Here's another (possibly more direct in its logic?):
>>>
>>>   DM[ !ave(DM$x, DM$GR, FUN= function(x) {!length(unique(x))==1}), ]
>>>GR  x   y
>>> 5  B 25 321
>>> 6  B 25 512
>>> 7  B 25 123
>>> 8  B 25 451
>>>
>>> --
>>> David
>>>
>>> On Wed, Dec 6, 2017 at 5:59 PM, Ista Zahn  wrote:

> Hi Ashta,
>
> There are many ways to do it. Here is one:
>
> vars <- sapply(split(DM$x, DM$GR), var)
> DM[DM$GR %in% names(vars[vars > 0]), ]
>
> Best
> Ista
>
> On Wed, Dec 6, 2017 at 6:58 PM, Ashta  wrote:
>
>> Thank you Jeff,
>>
>> subset( DM, "B" != x ), this works if I know the group only.
>> But if I don't know that group in this case "B", how do I identify
>> group(s) that  all elements of x have the same value?
>>
>> On Wed, Dec 6, 2017 at 5:48 PM, Jeff Newmiller <
>> jdnew...@dcn.davis.ca.us> wrote:
>>
>>> subset( DM, "B" != x )
>>>
>>> This is covered in the Introduction to R document that comes with R.
>>> --
>>> Sent from my phone. Please excuse my brevity.
>>>
>>> On December 6, 2017 3:21:12 PM PST, David Winsemius <
>>> dwinsem...@comcast.net> wrote:
>>>

 On Dec 6, 2017, at 3:15 PM, Ashta  wrote:
>
> Hi all,
> In a data set I have group(GR) and two variables   x and y. I want
> to
> remove a  group that have  the same record for the x variable in
> each
> row.
>
> DM <- read.table( text='GR x y
> A 25 125
> A 23 135
> A 14 145
> A 12 230
> B 25 321
> B 25 512
> B 25 123
> B 25 451
> C 11 521
> C 14 235
> C 15 258
> C 10 654',header = TRUE, stringsAsFactors = FALSE)
>
> In this example the output should contain group A and C  as group B
> has   the same record  for the variable x .
>
> The result will be
> A 25 125
> A 23 135
> A 14 145
> A 12 230
> C 11 521
> C 14 235
> C 15 258
> C 10 654
>

 Try:

 DM[ !duplicated(DM$x) , ]

>
> How do I do it R?
> Thank you.
>
>

 David Winsemius
 Alameda, CA, USA

 'Any technology distinguishable from magic is insufficiently
 advanced.'
 -Gehm's Corollary to Clarke's Third Law


Re: [R] Remove

2017-12-09 Thread Rui Barradas


Hello,

Try the following.

keep <- list(A = c(15, 30), B = c(40, 50), C = c(60, 75))
sp <- split(DM$x, DM$GR)
inx <- unlist(lapply(seq_along(sp), function(i) keep[[i]][1] <= sp[[i]] 
& sp[[i]] <= keep[[i]][2]))

DM[inx, ]
#   GR  x   y
#1   A 25 125
#2   A 23 135
#5   B 45 321
#6   B 47 512
#9   C 61 521
#10  C 68 235
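
A possible variation on the above, offered as a sketch rather than as the method posted: the split/unlist index lines up with the rows of DM only when DM is already sorted by GR and `keep` lists the groups in that same order. Looking the limits up by group name avoids that assumption. The names `lo`, `hi`, `g` and `DM_shuffled` are illustrative.

```r
DM <- read.table(text = "GR x y
A 25 125
A 23 135
A 14 145
A 35 230
B 45 321
B 47 512
B 53 123
B 55 451
C 61 521
C 68 235
C 85 258
C 80 654", header = TRUE, stringsAsFactors = FALSE)

keep <- list(A = c(15, 30), B = c(40, 50), C = c(60, 75))

# Named vectors of lower and upper limits, one entry per group
lo <- vapply(keep, function(r) r[1], numeric(1))
hi <- vapply(keep, function(r) r[2], numeric(1))

# Look up each row's limits by its group name, so row order does not matter
g <- as.character(DM$GR)
inx <- unname(lo[g] <= DM$x & DM$x <= hi[g])
DM[inx, ]

# Same test on a reordered copy of the data
DM_shuffled <- DM[c(12, 5, 1, 9, 3, 7, 2, 10, 4, 6, 8, 11), ]
g2 <- as.character(DM_shuffled$GR)
inx2 <- unname(lo[g2] <= DM_shuffled$x & DM_shuffled$x <= hi[g2])
DM_shuffled[inx2, ]
```

The as.character() call is there in case GR is a factor, since indexing a named vector by a factor would use the integer codes rather than the labels.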

Hope this helps,

Rui Barradas

On 12/9/2017 12:48 AM, Ashta wrote:

Hi David, Ista and all,

I  have one related question  Within one group I want to keep records
conditionally.
example within
group A I want keep rows that have  " x" values  ranged  between 15 and 30.
group B I want keep rows that have  " x" values  ranged  between  40 and 50.
group C I want keep rows that have  " x" values  ranged  between  60 and 75.


DM <- read.table( text='GR x y
A 25 125
A 23 135
A 14 145
A 35 230
B 45 321
B 47 512
B 53 123
B 55 451
C 61 521
C 68 235
C 85 258
C 80 654',header = TRUE, stringsAsFactors = FALSE)


The end result will be
A 25 125
A 23 135
B 45 321
B 47 512
C 61 521
C 68 235

Thank you

On Wed, Dec 6, 2017 at 10:34 PM, David Winsemius  wrote:



On Dec 6, 2017, at 4:27 PM, Ashta  wrote:

Thank you Ista! Worked fine.


Here's another (possibly more direct in its logic?):

  DM[ !ave(DM$x, DM$GR, FUN= function(x) {!length(unique(x))==1}), ]
   GR  x   y
5  B 25 321
6  B 25 512
7  B 25 123
8  B 25 451

--
David


On Wed, Dec 6, 2017 at 5:59 PM, Ista Zahn  wrote:

Hi Ashta,

There are many ways to do it. Here is one:

vars <- sapply(split(DM$x, DM$GR), var)
DM[DM$GR %in% names(vars[vars > 0]), ]

Best
Ista

On Wed, Dec 6, 2017 at 6:58 PM, Ashta  wrote:

Thank you Jeff,

subset( DM, "B" != x ), this works if I know the group only.
But if I don't know that group in this case "B", how do I identify
group(s) that  all elements of x have the same value?

On Wed, Dec 6, 2017 at 5:48 PM, Jeff Newmiller  wrote:

subset( DM, "B" != x )

This is covered in the Introduction to R document that comes with R.
--
Sent from my phone. Please excuse my brevity.

On December 6, 2017 3:21:12 PM PST, David Winsemius  
wrote:



On Dec 6, 2017, at 3:15 PM, Ashta  wrote:

Hi all,
In a data set I have group(GR) and two variables   x and y. I want to
remove a  group that have  the same record for the x variable in each
row.

DM <- read.table( text='GR x y
A 25 125
A 23 135
A 14 145
A 12 230
B 25 321
B 25 512
B 25 123
B 25 451
C 11 521
C 14 235
C 15 258
C 10 654',header = TRUE, stringsAsFactors = FALSE)

In this example the output should contain group A and C  as group B
has   the same record  for the variable x .

The result will be
A 25 125
A 23 135
A 14 145
A 12 230
C 11 521
C 14 235
C 15 258
C 10 654


Try:

DM[ !duplicated(DM$x) , ]


How do I do it R?
Thank you.



David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'
-Gehm's Corollary to Clarke's Third Law


Re: [R] Remove

2017-12-08 Thread Jeff Newmiller

In this case I cannot see an advantage to using dplyr over subset, other 
than if dplyr is your hammer then the problem will look like a nail, or if 
this is one step in a larger context where dplyr is more useful.


Nor do I think this is a good use for mapply (or dplyr::group_by) because 
the groups are handled differently... better to introduce a data-driven 
columnar approach than to have three separate algorithms and bind the data 
frames together again.


Here are three ways I came up with. I sometimes use a variation of method 
3 when the logical tests are rather more complicated than this and I want 
to characterize those tests in the final report.


### reprex
DM <- read.table( text =
"GR x y
A 25 125
A 23 135
A 14 145
A 35 230
B 45 321
B 47 512
B 53 123
B 55 451
C 61 521
C 68 235
C 85 258
C 80 654", header = TRUE, stringsAsFactors = FALSE )

# 1 Hardcoded logic
DM1 <- subset( DM
 ,   "A" == GR & 15 <= x & x <= 30
   | "B" == GR & 40 <= x & x <= 50
   | "C" == GR & 60 <= x & x <= 75
 )
DM1
#>    GR  x   y
#> 1   A 25 125
#> 2   A 23 135
#> 5   B 45 321
#> 6   B 47 512
#> 9   C 61 521
#> 10  C 68 235

# 2 relational approach
cond <- read.table( text =
"GR minx maxx
A   15   30
B   40   50
C   60   75
", header = TRUE )
DM2 <- merge( DM, cond, by = "GR" )
DM2 <- subset( DM2, minx <= x & x <= maxx, select = -c( minx, maxx ) )
DM2
#>    GR  x   y
#> 1   A 25 125
#> 2   A 23 135
#> 5   B 45 321
#> 6   B 47 512
#> 9   C 61 521
#> 10  C 68 235

# 3 Construct selection vector
sel <- rep( FALSE, nrow( DM ) )
for ( i in seq.int( nrow( cond ) ) ) {
  sel <- sel | (   cond$GR[ i ] == DM$GR
                 & cond$minx[ i ] <= DM$x
                 & DM$x <= cond$maxx[ i ]
               )
}
DM3 <- DM[ sel, ]
DM3
#>    GR  x   y
#> 1   A 25 125
#> 2   A 23 135
#> 5   B 45 321
#> 6   B 47 512
#> 9   C 61 521
#> 10  C 68 235
###


On Fri, 8 Dec 2017, Michael Hannon wrote:


library(dplyr)

DM <- read.table( text='GR x y
A 25 125
A 23 135
.
.
.
)

DM %>% filter((GR == "A" & (x >= 15) & (x <= 30)) |
   (GR == "B" & (x >= 40) & (x <= 50)) |
   (GR == "C" & (x >= 60) & (x <= 75)))


On Fri, Dec 8, 2017 at 4:48 PM, Ashta  wrote:

Hi David, Ista and all,

I  have one related question  Within one group I want to keep records
conditionally.
example within
group A I want keep rows that have  " x" values  ranged  between 15 and 30.
group B I want keep rows that have  " x" values  ranged  between  40 and 50.
group C I want keep rows that have  " x" values  ranged  between  60 and 75.


DM <- read.table( text='GR x y
A 25 125
A 23 135
A 14 145
A 35 230
B 45 321
B 47 512
B 53 123
B 55 451
C 61 521
C 68 235
C 85 258
C 80 654',header = TRUE, stringsAsFactors = FALSE)


The end result will be
A 25 125
A 23 135
B 45 321
B 47 512
C 61 521
C 68 235

Thank you

On Wed, Dec 6, 2017 at 10:34 PM, David Winsemius  wrote:



On Dec 6, 2017, at 4:27 PM, Ashta  wrote:

Thank you Ista! Worked fine.


Here's another (possibly more direct in its logic?):

 DM[ !ave(DM$x, DM$GR, FUN= function(x) {!length(unique(x))==1}), ]
  GR  x   y
5  B 25 321
6  B 25 512
7  B 25 123
8  B 25 451

--
David


On Wed, Dec 6, 2017 at 5:59 PM, Ista Zahn  wrote:

Hi Ashta,

There are many ways to do it. Here is one:

vars <- sapply(split(DM$x, DM$GR), var)
DM[DM$GR %in% names(vars[vars > 0]), ]

Best
Ista

On Wed, Dec 6, 2017 at 6:58 PM, Ashta  wrote:

Thank you Jeff,

subset( DM, "B" != x ), this works if I know the group only.
But if I don't know that group in this case "B", how do I identify
group(s) that  all elements of x have the same value?

On Wed, Dec 6, 2017 at 5:48 PM, Jeff Newmiller  wrote:

subset( DM, "B" != x )

This is covered in the Introduction to R document that comes with R.
--
Sent from my phone. Please excuse my brevity.

On December 6, 2017 3:21:12 PM PST, David Winsemius  
wrote:



On Dec 6, 2017, at 3:15 PM, Ashta  wrote:

Hi all,
In a data set I have group(GR) and two variables   x and y. I want to
remove a  group that have  the same record for the x variable in each
row.

DM <- read.table( text='GR x y
A 25 125
A 23 135
A 14 145
A 12 230
B 25 321
B 25 512
B 25 123
B 25 451
C 11 521
C 14 235
C 15 258
C 10 654',header = TRUE, stringsAsFactors = FALSE)

In this example the output should contain group A and C  as group B
has   the same record  for the variable x .

The result will be
A 25 125
A 23 135
A 14 145
A 12 230
C 11 521
C 14 235
C 15 258
C 10 654


Try:

DM[ !duplicated(DM$x) , ]


How do I do it R?
Thank you.


Re: [R] Remove

2017-12-08 Thread Michael Hannon

library(dplyr)

DM <- read.table( text='GR x y
A 25 125
A 23 135
.
.
.
)

DM %>% filter((GR == "A" & (x >= 15) & (x <= 30)) |
(GR == "B" & (x >= 40) & (x <= 50)) |
(GR == "C" & (x >= 60) & (x <= 75)))


On Fri, Dec 8, 2017 at 4:48 PM, Ashta  wrote:
> Hi David, Ista and all,
>
> I  have one related question  Within one group I want to keep records
> conditionally.
> example within
> group A I want keep rows that have  " x" values  ranged  between 15 and 30.
> group B I want keep rows that have  " x" values  ranged  between  40 and 50.
> group C I want keep rows that have  " x" values  ranged  between  60 and 75.
>
>
> DM <- read.table( text='GR x y
> A 25 125
> A 23 135
> A 14 145
> A 35 230
> B 45 321
> B 47 512
> B 53 123
> B 55 451
> C 61 521
> C 68 235
> C 85 258
> C 80 654',header = TRUE, stringsAsFactors = FALSE)
>
>
> The end result will be
> A 25 125
> A 23 135
> B 45 321
> B 47 512
> C 61 521
> C 68 235
>
> Thank you
>
> On Wed, Dec 6, 2017 at 10:34 PM, David Winsemius  
> wrote:
>>
>>> On Dec 6, 2017, at 4:27 PM, Ashta  wrote:
>>>
>>> Thank you Ista! Worked fine.
>>
>> Here's another (possibly more direct in its logic?):
>>
>>  DM[ !ave(DM$x, DM$GR, FUN= function(x) {!length(unique(x))==1}), ]
>>   GR  x   y
>> 5  B 25 321
>> 6  B 25 512
>> 7  B 25 123
>> 8  B 25 451
>>
>> --
>> David
>>
>>> On Wed, Dec 6, 2017 at 5:59 PM, Ista Zahn  wrote:
 Hi Ashta,

 There are many ways to do it. Here is one:

 vars <- sapply(split(DM$x, DM$GR), var)
 DM[DM$GR %in% names(vars[vars > 0]), ]

 Best
 Ista

 On Wed, Dec 6, 2017 at 6:58 PM, Ashta  wrote:
> Thank you Jeff,
>
> subset( DM, "B" != x ), this works if I know the group only.
> But if I don't know that group in this case "B", how do I identify
> group(s) that  all elements of x have the same value?
>
> On Wed, Dec 6, 2017 at 5:48 PM, Jeff Newmiller  
> wrote:
>> subset( DM, "B" != x )
>>
>> This is covered in the Introduction to R document that comes with R.
>> --
>> Sent from my phone. Please excuse my brevity.
>>
>> On December 6, 2017 3:21:12 PM PST, David Winsemius 
>>  wrote:
>>>
 On Dec 6, 2017, at 3:15 PM, Ashta  wrote:

 Hi all,
 In a data set I have group(GR) and two variables   x and y. I want to
 remove a  group that have  the same record for the x variable in each
 row.

 DM <- read.table( text='GR x y
 A 25 125
 A 23 135
 A 14 145
 A 12 230
 B 25 321
 B 25 512
 B 25 123
 B 25 451
 C 11 521
 C 14 235
 C 15 258
 C 10 654',header = TRUE, stringsAsFactors = FALSE)

 In this example the output should contain group A and C  as group B
 has   the same record  for the variable x .

 The result will be
 A 25 125
 A 23 135
 A 14 145
 A 12 230
 C 11 521
 C 14 235
 C 15 258
 C 10 654
>>>
>>> Try:
>>>
>>> DM[ !duplicated(DM$x) , ]

 How do I do it R?
 Thank you.

>>>
>>> David Winsemius
>>> Alameda, CA, USA
>>>
>>> 'Any technology distinguishable from magic is insufficiently advanced.'
>>> -Gehm's Corollary to Clarke's Third Law
>>>
>>
>> David

Re: [R] Remove

2017-12-08 Thread David Winsemius


> On Dec 8, 2017, at 4:48 PM, Ashta  wrote:
> 
> Hi David, Ista and all,
> 
> I  have one related question  Within one group I want to keep records
> conditionally.
> example within
> group A I want keep rows that have  " x" values  ranged  between 15 and 30.
> group B I want keep rows that have  " x" values  ranged  between  40 and 50.
> group C I want keep rows that have  " x" values  ranged  between  60 and 75.

When you have a problem where there are multiple "parallel" parameters, the 
function to "reach for" is `mapply`. 

mapply( your_selection_func, group_vec, min_vec, max_vec)

... and this will probably return the values as a list (of data frames, if you 
build the function correctly), so you may then need to do:

do.call(rbind, ...)
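
As a sketch of how those two calls might fit together on the example data (the helper name `select_range` and the three parameter vectors are made up for illustration and are not taken from the post):

```r
DM <- read.table(text = "GR x y
A 25 125
A 23 135
A 14 145
A 35 230
B 45 321
B 47 512
B 53 123
B 55 451
C 61 521
C 68 235
C 85 258
C 80 654", header = TRUE, stringsAsFactors = FALSE)

# Hypothetical helper: rows of DM in group g with lo <= x <= hi
select_range <- function(g, lo, hi) {
  DM[DM$GR == g & DM$x >= lo & DM$x <= hi, ]
}

group_vec <- c("A", "B", "C")
min_vec   <- c(15, 40, 60)
max_vec   <- c(30, 50, 75)

# SIMPLIFY = FALSE keeps the result as a plain list of data frames
pieces <- mapply(select_range, group_vec, min_vec, max_vec,
                 SIMPLIFY = FALSE, USE.NAMES = FALSE)

# Stack the per-group pieces back into one data frame
result <- do.call(rbind, pieces)
result
```

Without SIMPLIFY = FALSE, mapply would try to simplify the per-group data frames into an array, which is not what rbind expects.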

-- 
David.
> 
> 
> DM <- read.table( text='GR x y
> A 25 125
> A 23 135
> A 14 145
> A 35 230
> B 45 321
> B 47 512
> B 53 123
> B 55 451
> C 61 521
> C 68 235
> C 85 258
> C 80 654',header = TRUE, stringsAsFactors = FALSE)
> 
> 
> The end result will be
> A 25 125
> A 23 135
> B 45 321
> B 47 512
> C 61 521
> C 68 235
> 
> Thank you
> 
> On Wed, Dec 6, 2017 at 10:34 PM, David Winsemius  
> wrote:
>> 
>>> On Dec 6, 2017, at 4:27 PM, Ashta  wrote:
>>> 
>>> Thank you Ista! Worked fine.
>> 
>> Here's another (possibly more direct in its logic?):
>> 
>> DM[ !ave(DM$x, DM$GR, FUN= function(x) {!length(unique(x))==1}), ]
>>  GR  x   y
>> 5  B 25 321
>> 6  B 25 512
>> 7  B 25 123
>> 8  B 25 451
>> 
>> --
>> David
>> 
>>> On Wed, Dec 6, 2017 at 5:59 PM, Ista Zahn  wrote:
 Hi Ashta,
 
 There are many ways to do it. Here is one:
 
 vars <- sapply(split(DM$x, DM$GR), var)
 DM[DM$GR %in% names(vars[vars > 0]), ]
 
 Best
 Ista
 
 On Wed, Dec 6, 2017 at 6:58 PM, Ashta  wrote:
> Thank you Jeff,
> 
> subset( DM, "B" != x ), this works if I know the group only.
> But if I don't know that group in this case "B", how do I identify
> group(s) that  all elements of x have the same value?
> 
> On Wed, Dec 6, 2017 at 5:48 PM, Jeff Newmiller  
> wrote:
>> subset( DM, "B" != x )
>> 
>> This is covered in the Introduction to R document that comes with R.
>> --
>> Sent from my phone. Please excuse my brevity.
>> 
>> On December 6, 2017 3:21:12 PM PST, David Winsemius 
>>  wrote:
>>> 
 On Dec 6, 2017, at 3:15 PM, Ashta  wrote:
 
 Hi all,
 In a data set I have group(GR) and two variables   x and y. I want to
 remove a  group that have  the same record for the x variable in each
 row.
 
 DM <- read.table( text='GR x y
 A 25 125
 A 23 135
 A 14 145
 A 12 230
 B 25 321
 B 25 512
 B 25 123
 B 25 451
 C 11 521
 C 14 235
 C 15 258
 C 10 654',header = TRUE, stringsAsFactors = FALSE)
 
 In this example the output should contain group A and C  as group B
 has   the same record  for the variable x .
 
 The result will be
 A 25 125
 A 23 135
 A 14 145
 A 12 230
 C 11 521
 C 14 235
 C 15 258
 C 10 654
>>> 
>>> Try:
>>> 
>>> DM[ !duplicated(DM$x) , ]
 
 How do I do it R?
 Thank you.
 
>>> 
>>> David Winsemius
>>> Alameda, CA, USA
>>> 
>>> 'Any technology distinguishable from magic is insufficiently advanced.'
>>> -Gehm's Corollary to Clarke's Third Law
>>> 

Re: [R] Remove

2017-12-08 Thread Ashta

Hi David, Ista and all,

I have one related question. Within each group I want to keep records
conditionally. For example:
within group A I want to keep rows whose "x" values range between 15 and 30;
within group B I want to keep rows whose "x" values range between 40 and 50;
within group C I want to keep rows whose "x" values range between 60 and 75.


DM <- read.table( text='GR x y
A 25 125
A 23 135
A 14 145
A 35 230
B 45 321
B 47 512
B 53 123
B 55 451
C 61 521
C 68 235
C 85 258
C 80 654',header = TRUE, stringsAsFactors = FALSE)


The end result will be
A 25 125
A 23 135
B 45 321
B 47 512
C 61 521
C 68 235

Thank you

On Wed, Dec 6, 2017 at 10:34 PM, David Winsemius  wrote:
>
>> On Dec 6, 2017, at 4:27 PM, Ashta  wrote:
>>
>> Thank you Ista! Worked fine.
>
> Here's another (possibly more direct in its logic?):
>
>  DM[ !ave(DM$x, DM$GR, FUN= function(x) {!length(unique(x))==1}), ]
>   GR  x   y
> 5  B 25 321
> 6  B 25 512
> 7  B 25 123
> 8  B 25 451
>
> --
> David
>
>> On Wed, Dec 6, 2017 at 5:59 PM, Ista Zahn  wrote:
>>> Hi Ashta,
>>>
>>> There are many ways to do it. Here is one:
>>>
>>> vars <- sapply(split(DM$x, DM$GR), var)
>>> DM[DM$GR %in% names(vars[vars > 0]), ]
>>>
>>> Best
>>> Ista
>>>
>>> On Wed, Dec 6, 2017 at 6:58 PM, Ashta  wrote:
 Thank you Jeff,

 subset( DM, "B" != x ), this works if I know the group only.
 But if I don't know that group in this case "B", how do I identify
 group(s) that  all elements of x have the same value?

 On Wed, Dec 6, 2017 at 5:48 PM, Jeff Newmiller  
 wrote:
> subset( DM, "B" != x )
>
> This is covered in the Introduction to R document that comes with R.
> --
> Sent from my phone. Please excuse my brevity.
>
> On December 6, 2017 3:21:12 PM PST, David Winsemius 
>  wrote:
>>
>>> On Dec 6, 2017, at 3:15 PM, Ashta  wrote:
>>>
>>> Hi all,
>>> In a data set I have group(GR) and two variables   x and y. I want to
>>> remove a  group that have  the same record for the x variable in each
>>> row.
>>>
>>> DM <- read.table( text='GR x y
>>> A 25 125
>>> A 23 135
>>> A 14 145
>>> A 12 230
>>> B 25 321
>>> B 25 512
>>> B 25 123
>>> B 25 451
>>> C 11 521
>>> C 14 235
>>> C 15 258
>>> C 10 654',header = TRUE, stringsAsFactors = FALSE)
>>>
>>> In this example the output should contain group A and C  as group B
>>> has   the same record  for the variable x .
>>>
>>> The result will be
>>> A 25 125
>>> A 23 135
>>> A 14 145
>>> A 12 230
>>> C 11 521
>>> C 14 235
>>> C 15 258
>>> C 10 654
>>
>> Try:
>>
>> DM[ !duplicated(DM$x) , ]
>>>
>>> How do I do it R?
>>> Thank you.
>>>
>>
>> David Winsemius
>> Alameda, CA, USA
>>
>> 'Any technology distinguishable from magic is insufficiently advanced.'
>> -Gehm's Corollary to Clarke's Third Law
>>
>
> David Winsemius
> Alameda, CA, USA
>
> 'Any technology distinguishable from magic is insufficiently advanced.'   
> -Gehm's Corollary to Clarke's Third Law

Re: [R] Remove

2017-12-06 Thread David Winsemius


> On Dec 6, 2017, at 4:27 PM, Ashta  wrote:
> 
> Thank you Ista! Worked fine.

Here's another (possibly more direct in its logic?):

 DM[ !ave(DM$x, DM$GR, FUN= function(x) {!length(unique(x))==1}), ]
  GR  x   y
5  B 25 321
6  B 25 512
7  B 25 123
8  B 25 451
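
Note that the expression as posted returns the constant group (B), as the output above shows. If the aim is the original request, that is, keeping groups A and C, one possible adjustment is to count the distinct x values per group and keep the groups with more than one. This is only a sketch; the name `n_unique` is illustrative.

```r
DM <- read.table(text = "GR x y
A 25 125
A 23 135
A 14 145
A 12 230
B 25 321
B 25 512
B 25 123
B 25 451
C 11 521
C 14 235
C 15 258
C 10 654", header = TRUE, stringsAsFactors = FALSE)

# For every row, the number of distinct x values within that row's group
n_unique <- ave(DM$x, DM$GR, FUN = function(v) length(unique(v)))

# Keep only rows whose group has more than one distinct x value
result <- DM[n_unique > 1, ]
result
```

Comparing the count with 1 also avoids relying on the coercion of ave's numeric result to logical.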

-- 
David

> On Wed, Dec 6, 2017 at 5:59 PM, Ista Zahn  wrote:
>> Hi Ashta,
>> 
>> There are many ways to do it. Here is one:
>> 
>> vars <- sapply(split(DM$x, DM$GR), var)
>> DM[DM$GR %in% names(vars[vars > 0]), ]
>> 
>> Best
>> Ista
>> 
>> On Wed, Dec 6, 2017 at 6:58 PM, Ashta  wrote:
>>> Thank you Jeff,
>>> 
>>> subset( DM, "B" != x ), this works if I know the group only.
>>> But if I don't know that group in this case "B", how do I identify
>>> group(s) that  all elements of x have the same value?
>>> 
>>> On Wed, Dec 6, 2017 at 5:48 PM, Jeff Newmiller  
>>> wrote:
 subset( DM, "B" != x )
 
 This is covered in the Introduction to R document that comes with R.
 --
 Sent from my phone. Please excuse my brevity.
 
 On December 6, 2017 3:21:12 PM PST, David Winsemius 
  wrote:
> 
>> On Dec 6, 2017, at 3:15 PM, Ashta  wrote:
>> 
>> Hi all,
>> In a data set I have group(GR) and two variables   x and y. I want to
>> remove a  group that have  the same record for the x variable in each
>> row.
>> 
>> DM <- read.table( text='GR x y
>> A 25 125
>> A 23 135
>> A 14 145
>> A 12 230
>> B 25 321
>> B 25 512
>> B 25 123
>> B 25 451
>> C 11 521
>> C 14 235
>> C 15 258
>> C 10 654',header = TRUE, stringsAsFactors = FALSE)
>> 
>> In this example the output should contain group A and C  as group B
>> has   the same record  for the variable x .
>> 
>> The result will be
>> A 25 125
>> A 23 135
>> A 14 145
>> A 12 230
>> C 11 521
>> C 14 235
>> C 15 258
>> C 10 654
> 
> Try:
> 
> DM[ !duplicated(DM$x) , ]
>> 
>> How do I do it R?
>> Thank you.
>> 
> 
> David Winsemius
> Alameda, CA, USA
> 
> 'Any technology distinguishable from magic is insufficiently advanced.'
> -Gehm's Corollary to Clarke's Third Law
> 

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'   
-Gehm's Corollary to Clarke's Third Law


Re: [R] Remove

2017-12-06 Thread Ashta

Thank you Ista! Worked fine.

On Wed, Dec 6, 2017 at 5:59 PM, Ista Zahn  wrote:
> Hi Ashta,
>
> There are many ways to do it. Here is one:
>
> vars <- sapply(split(DM$x, DM$GR), var)
> DM[DM$GR %in% names(vars[vars > 0]), ]
>
> Best
> Ista
>
> On Wed, Dec 6, 2017 at 6:58 PM, Ashta  wrote:
>> Thank you Jeff,
>>
>> subset( DM, "B" != x ), this works if I know the group only.
>> But if I don't know that group in this case "B", how do I identify
>> group(s) that  all elements of x have the same value?
>>
>> On Wed, Dec 6, 2017 at 5:48 PM, Jeff Newmiller  
>> wrote:
>>> subset( DM, "B" != x )
>>>
>>> This is covered in the Introduction to R document that comes with R.
>>> --
>>> Sent from my phone. Please excuse my brevity.
>>>
>>> On December 6, 2017 3:21:12 PM PST, David Winsemius 
>>>  wrote:

> On Dec 6, 2017, at 3:15 PM, Ashta  wrote:
>
> Hi all,
> In a data set I have group(GR) and two variables   x and y. I want to
> remove a  group that have  the same record for the x variable in each
> row.
>
> DM <- read.table( text='GR x y
> A 25 125
> A 23 135
> A 14 145
> A 12 230
> B 25 321
> B 25 512
> B 25 123
> B 25 451
> C 11 521
> C 14 235
> C 15 258
> C 10 654',header = TRUE, stringsAsFactors = FALSE)
>
> In this example the output should contain group A and C  as group B
> has   the same record  for the variable x .
>
> The result will be
> A 25 125
> A 23 135
> A 14 145
> A 12 230
> C 11 521
> C 14 235
> C 15 258
> C 10 654

Try:

DM[ !duplicated(DM$x) , ]
>
> How do I do it R?
> Thank you.
>

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'
  -Gehm's Corollary to Clarke's Third Law


Re: [R] Remove

2017-12-06 Thread Ista Zahn

Hi Ashta,

There are many ways to do it. Here is one:

vars <- sapply(split(DM$x, DM$GR), var)
DM[DM$GR %in% names(vars[vars > 0]), ]
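
For reference, a self-contained run of this approach on the sample data might look like the sketch below. One assumption worth flagging: `var` returns NA for a group with a single row, so the sketch uses `which()` to drop such groups explicitly. The name `keep_groups` is illustrative.

```r
DM <- read.table(text = "GR x y
A 25 125
A 23 135
A 14 145
A 12 230
B 25 321
B 25 512
B 25 123
B 25 451
C 11 521
C 14 235
C 15 258
C 10 654", header = TRUE, stringsAsFactors = FALSE)

# Variance of x within each group; a constant group has variance 0
vars <- sapply(split(DM$x, DM$GR), var)

# which() keeps the names of the TRUE entries and silently drops NA
keep_groups <- names(which(vars > 0))

result <- DM[DM$GR %in% keep_groups, ]
result
```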

Best
Ista

On Wed, Dec 6, 2017 at 6:58 PM, Ashta  wrote:
> Thank you Jeff,
>
> subset( DM, "B" != x ), this works if I know the group only.
> But if I don't know that group in this case "B", how do I identify
> group(s) that  all elements of x have the same value?
>
> On Wed, Dec 6, 2017 at 5:48 PM, Jeff Newmiller  
> wrote:
>> subset( DM, "B" != x )
>>
>> This is covered in the Introduction to R document that comes with R.
>> --
>> Sent from my phone. Please excuse my brevity.
>>
>> On December 6, 2017 3:21:12 PM PST, David Winsemius  
>> wrote:
>>>
>>>> On Dec 6, 2017, at 3:15 PM, Ashta  wrote:
>>>>
>>>> Hi all,
>>>> In a data set I have group(GR) and two variables   x and y. I want to
>>>> remove a  group that have  the same record for the x variable in each
>>>> row.
>>>>
>>>> DM <- read.table( text='GR x y
>>>> A 25 125
>>>> A 23 135
>>>> A 14 145
>>>> A 12 230
>>>> B 25 321
>>>> B 25 512
>>>> B 25 123
>>>> B 25 451
>>>> C 11 521
>>>> C 14 235
>>>> C 15 258
>>>> C 10 654',header = TRUE, stringsAsFactors = FALSE)
>>>>
>>>> In this example the output should contain group A and C  as group B
>>>> has   the same record  for the variable x .
>>>>
>>>> The result will be
>>>> A 25 125
>>>> A 23 135
>>>> A 14 145
>>>> A 12 230
>>>> C 11 521
>>>> C 14 235
>>>> C 15 258
>>>> C 10 654
>>>
>>>Try:
>>>
>>>DM[ !duplicated(DM$x) , ]
>>>>
>>>> How do I do it R?
>>>> Thank you.
>>>>
>>>> __
>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>David Winsemius
>>>Alameda, CA, USA
>>>
>>>'Any technology distinguishable from magic is insufficiently advanced.'
>>>  -Gehm's Corollary to Clarke's Third Law
>>>


Re: [R] Remove

2017-12-06 Thread Ashta

Thank you Jeff,

subset( DM, "B" != x ), this works if I know the group only.
But if I don't know that group in this case "B", how do I identify
group(s) that  all elements of x have the same value?

On Wed, Dec 6, 2017 at 5:48 PM, Jeff Newmiller  wrote:
> subset( DM, "B" != x )
>
> This is covered in the Introduction to R document that comes with R.
> --
> Sent from my phone. Please excuse my brevity.
>
> On December 6, 2017 3:21:12 PM PST, David Winsemius  
> wrote:
>>
>>> On Dec 6, 2017, at 3:15 PM, Ashta  wrote:
>>>
>>> Hi all,
>>> In a data set I have group(GR) and two variables   x and y. I want to
>>> remove a  group that have  the same record for the x variable in each
>>> row.
>>>
>>> DM <- read.table( text='GR x y
>>> A 25 125
>>> A 23 135
>>> A 14 145
>>> A 12 230
>>> B 25 321
>>> B 25 512
>>> B 25 123
>>> B 25 451
>>> C 11 521
>>> C 14 235
>>> C 15 258
>>> C 10 654',header = TRUE, stringsAsFactors = FALSE)
>>>
>>> In this example the output should contain group A and C  as group B
>>> has   the same record  for the variable x .
>>>
>>> The result will be
>>> A 25 125
>>> A 23 135
>>> A 14 145
>>> A 12 230
>>> C 11 521
>>> C 14 235
>>> C 15 258
>>> C 10 654
>>
>>Try:
>>
>>DM[ !duplicated(DM$x) , ]
>>>
>>> How do I do it R?
>>> Thank you.
>>>
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>>David Winsemius
>>Alameda, CA, USA
>>
>>'Any technology distinguishable from magic is insufficiently advanced.'
>>  -Gehm's Corollary to Clarke's Third Law
>>


Re: [R] Remove

2017-12-06 Thread Jeff Newmiller

subset( DM, "B" != GR )

This is covered in the Introduction to R document that comes with R.
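
A short sketch of dropping a known group from the example data, for reference. One thing to watch: the condition has to test the grouping column GR. Comparing the string "B" with the numeric column x is TRUE for every row, so it filters nothing. The names all_rows and no_B are made up for this illustration.

```r
# Example data from the original post.
DM <- read.table(text = 'GR x y
A 25 125
A 23 135
A 14 145
A 12 230
B 25 321
B 25 512
B 25 123
B 25 451
C 11 521
C 14 235
C 15 258
C 10 654', header = TRUE, stringsAsFactors = FALSE)

# x is numeric, so it is coerced to character for the comparison and
# never equals "B": every row is kept.
all_rows <- subset(DM, "B" != x)

# Testing the grouping column removes group B as intended.
no_B <- subset(DM, GR != "B")

print(no_B)
```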
-- 
Sent from my phone. Please excuse my brevity.

On December 6, 2017 3:21:12 PM PST, David Winsemius  
wrote:
>
>> On Dec 6, 2017, at 3:15 PM, Ashta  wrote:
>> 
>> Hi all,
>> In a data set I have group(GR) and two variables   x and y. I want to
>> remove a  group that have  the same record for the x variable in each
>> row.
>> 
>> DM <- read.table( text='GR x y
>> A 25 125
>> A 23 135
>> A 14 145
>> A 12 230
>> B 25 321
>> B 25 512
>> B 25 123
>> B 25 451
>> C 11 521
>> C 14 235
>> C 15 258
>> C 10 654',header = TRUE, stringsAsFactors = FALSE)
>> 
>> In this example the output should contain group A and C  as group B
>> has   the same record  for the variable x .
>> 
>> The result will be
>> A 25 125
>> A 23 135
>> A 14 145
>> A 12 230
>> C 11 521
>> C 14 235
>> C 15 258
>> C 10 654
>
>Try:
>
>DM[ !duplicated(DM$x) , ]
>> 
>> How do I do it R?
>> Thank you.
>> 
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>David Winsemius
>Alameda, CA, USA
>
>'Any technology distinguishable from magic is insufficiently advanced.'
>  -Gehm's Corollary to Clarke's Third Law
>


Re: [R] Remove

2017-12-06 Thread Ashta

Thank you David.
This will not work. It removes only duplicate records:
DM[ !duplicated(DM$x) , ]

My goal is to remove the group if all elements of x in that group have
 the same value.
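
To make the difference concrete, here is a small sketch on the example data. duplicated() looks at x across the whole data set, not within groups, so it also drops a row from group C. The group-wise check with tapply() is one possible way to express the goal, offered as an illustration only; dedup, same_x, keep and wanted are names invented for it.

```r
# Example data from the original post.
DM <- read.table(text = 'GR x y
A 25 125
A 23 135
A 14 145
A 12 230
B 25 321
B 25 512
B 25 123
B 25 451
C 11 521
C 14 235
C 15 258
C 10 654', header = TRUE, stringsAsFactors = FALSE)

# Keeps the first occurrence of each x value, ignoring groups.
# Row "C 14 235" is lost because 14 already appeared in group A.
dedup <- DM[!duplicated(DM$x), ]

# Group-wise check: TRUE for groups in which x takes a single value.
same_x <- tapply(DM$x, DM$GR, function(v) length(unique(v)) == 1)

# Look up each row's group and keep rows from groups with varying x.
keep <- as.vector(!same_x[DM$GR])
wanted <- DM[keep, ]

print(wanted)
```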


On Wed, Dec 6, 2017 at 5:21 PM, David Winsemius  wrote:
>
>> On Dec 6, 2017, at 3:15 PM, Ashta  wrote:
>>
>> Hi all,
>> In a data set I have group(GR) and two variables   x and y. I want to
>> remove a  group that have  the same record for the x variable in each
>> row.
>>
>> DM <- read.table( text='GR x y
>> A 25 125
>> A 23 135
>> A 14 145
>> A 12 230
>> B 25 321
>> B 25 512
>> B 25 123
>> B 25 451
>> C 11 521
>> C 14 235
>> C 15 258
>> C 10 654',header = TRUE, stringsAsFactors = FALSE)
>>
>> In this example the output should contain group A and C  as group B
>> has   the same record  for the variable x .
>>
>> The result will be
>> A 25 125
>> A 23 135
>> A 14 145
>> A 12 230
>> C 11 521
>> C 14 235
>> C 15 258
>> C 10 654
>
> Try:
>
> DM[ !duplicated(DM$x) , ]
>>
>> How do I do it R?
>> Thank you.
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius
> Alameda, CA, USA
>
> 'Any technology distinguishable from magic is insufficiently advanced.'   
> -Gehm's Corollary to Clarke's Third Law
>
>
>
>
>


Re: [R] Remove

2017-12-06 Thread David Winsemius


> On Dec 6, 2017, at 3:15 PM, Ashta  wrote:
> 
> Hi all,
> In a data set I have group(GR) and two variables   x and y. I want to
> remove a  group that have  the same record for the x variable in each
> row.
> 
> DM <- read.table( text='GR x y
> A 25 125
> A 23 135
> A 14 145
> A 12 230
> B 25 321
> B 25 512
> B 25 123
> B 25 451
> C 11 521
> C 14 235
> C 15 258
> C 10 654',header = TRUE, stringsAsFactors = FALSE)
> 
> In this example the output should contain group A and C  as group B
> has   the same record  for the variable x .
> 
> The result will be
> A 25 125
> A 23 135
> A 14 145
> A 12 230
> C 11 521
> C 14 235
> C 15 258
> C 10 654

Try:

DM[ !duplicated(DM$x) , ]
> 
> How do I do it R?
> Thank you.
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'   
-Gehm's Corollary to Clarke's Third Law


[R] Remove

2017-12-06 Thread Ashta

Hi all,
In a data set I have a group variable (GR) and two variables, x and y. I want
to remove any group that has the same value of the x variable in every
row.

DM <- read.table( text='GR x y
A 25 125
A 23 135
A 14 145
A 12 230
B 25 321
B 25 512
B 25 123
B 25 451
C 11 521
C 14 235
C 15 258
C 10 654',header = TRUE, stringsAsFactors = FALSE)

In this example the output should contain groups A and C, because group B
has the same value of x in every row.

The result will be
A 25 125
A 23 135
A 14 145
A 12 230
C 11 521
C 14 235
C 15 258
C 10 654

How do I do this in R?
Thank you.


Re: [R] Remove spacing at the top and bottom of a plot

2017-09-24 Thread AbouEl-Makarim Aboueissa

Dear David:

Thank you very much.

with thanks
abou

On Sun, Sep 24, 2017 at 5:28 PM, David L Carlson <dcarl...@tamu.edu> wrote:

> The default margins are set as lines below, left, top, and right using
> mar=c(5.1, 4.1, 4.1, 2.1). Just change the top margin something like 1.1:
>
> par(mfrow=c(1,2), mar=c(5.1, 4.1, 1.1, 2.1))
>
> ---
> David L. Carlson
> Department of Anthropology
> Texas A&M University
>
> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of
> AbouEl-Makarim Aboueissa
> Sent: Sunday, September 24, 2017 3:59 PM
> To: R mailing list <r-help@r-project.org>
> Subject: [R] Remove spacing at the top and bottom of a plot
>
> Dear All:
>
> Is there is away to remove spacing at the top and the bottom of a plot? If
> so, any help will be appreciated.
>
>
> Please use this code as an example:
>
>
> par(mfrow=c(1,2))
>
>
> lizard <- c(6.2, 6.6, 7.1, 7.4, 7.6, 7.9, 8, 8.3, 8.4, 8.5, 8.6,8.8, 8.8,
> 9.1, 9.2, 9.4, 9.4, 9.7, 9.9, 10.2, 10.4, 10.8,11.3, 11.9)
>
> n.draw <- 100
> mu <- 9
> n <- 24
> SD <- sd(lizard)
> draws <- matrix(rnorm(n.draw * n, mu, SD), n)
>
> get.conf.int <- function(x) {
>   t.test(x)$conf.int
> }
>
> conf.int <- apply(draws, 2, get.conf.int)
>
> plot(range(conf.int), c(0, 1 + n.draw), type = "n", xlab = "mean tail
> length", ylab = "sample run")
>
> for (i in 1:n.draw)  {
>   if(conf.int[1,i] <= mu & conf.int[2,i] >= mu ){
> lines(conf.int[, i], rep(i, 2), lwd = 2, col = 'green')
> lines(conf.int[, i], rep(i, 2), lwd = 2)
>   }
>   else {
> lines(conf.int[, i], rep(i, 2), lwd = 2, col = 'red')
>   }
> }
>
> abline(v = 9, lwd = 3, col='blue')  # lty = 2,
>
>
>
>
>
> Thank you very much for your help.
>
>
> with many thanks
> abou
> __
> AbouEl-Makarim Aboueissa, PhD
>
> Professor of Statistics
> Department of Mathematics and Statistics
> University of Southern Maine
>
> [[alternative HTML version deleted]]
>
>



-- 
__
AbouEl-Makarim Aboueissa, PhD
Professor of Statistics
Department of Mathematics and Statistics
University of Southern Maine

[[alternative HTML version deleted]]


Re: [R] Remove spacing at the top and bottom of a plot

2017-09-24 Thread David L Carlson

The default margins are given in lines of text for the bottom, left, top, and
right sides: mar=c(5.1, 4.1, 4.1, 2.1). Just change the top margin to something
like 1.1:

par(mfrow=c(1,2), mar=c(5.1, 4.1, 1.1, 2.1))
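
A minimal sketch of how this might be used, including the bottom margin that the question also asks about. The toy plots, the bottom value of 4.1 and the null PDF device are assumptions made so the example runs on its own; adjust the numbers to taste.

```r
# Null PDF device: an assumption so the sketch runs without a display.
pdf(NULL)

# par() returns the previous settings, so they can be restored afterwards.
old_par <- par(mfrow = c(1, 2), mar = c(5.1, 4.1, 1.1, 2.1))
plot(1:10, xlab = "index", ylab = "value")   # smaller top margin only

# mar is c(bottom, left, top, right), measured in lines of text.
par(mar = c(4.1, 4.1, 1.1, 2.1))
plot(10:1, xlab = "index", ylab = "value")   # smaller bottom margin as well
new_mar <- par("mar")

# Put the original settings back and close the device.
par(old_par)
invisible(dev.off())
```

Shrinking the bottom margin much below about 4 lines leaves little room for the axis title, so it is worth checking the result by eye.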

---
David L. Carlson
Department of Anthropology
Texas A&M University

-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of AbouEl-Makarim 
Aboueissa
Sent: Sunday, September 24, 2017 3:59 PM
To: R mailing list <r-help@r-project.org>
Subject: [R] Remove spacing at the top and bottom of a plot

Dear All:

Is there a way to remove the spacing at the top and the bottom of a plot? If
so, any help will be appreciated.


Please use this code as an example:


par(mfrow=c(1,2))


lizard <- c(6.2, 6.6, 7.1, 7.4, 7.6, 7.9, 8, 8.3, 8.4, 8.5, 8.6,8.8, 8.8,
9.1, 9.2, 9.4, 9.4, 9.7, 9.9, 10.2, 10.4, 10.8,11.3, 11.9)

n.draw <- 100
mu <- 9
n <- 24
SD <- sd(lizard)
draws <- matrix(rnorm(n.draw * n, mu, SD), n)

get.conf.int <- function(x) {
  t.test(x)$conf.int
}

conf.int <- apply(draws, 2, get.conf.int)

plot(range(conf.int), c(0, 1 + n.draw), type = "n", xlab = "mean tail
length", ylab = "sample run")

for (i in 1:n.draw)  {
  if(conf.int[1,i] <= mu & conf.int[2,i] >= mu ){
lines(conf.int[, i], rep(i, 2), lwd = 2, col = 'green')
lines(conf.int[, i], rep(i, 2), lwd = 2)
  }
  else {
lines(conf.int[, i], rep(i, 2), lwd = 2, col = 'red')
  }
}

abline(v = 9, lwd = 3, col='blue')  # lty = 2,





Thank you very much for your help.


with many thanks
abou
__
AbouEl-Makarim Aboueissa, PhD

Professor of Statistics
Department of Mathematics and Statistics
University of Southern Maine

[[alternative HTML version deleted]]


[R] Remove spacing at the top and bottom of a plot

2017-09-24 Thread AbouEl-Makarim Aboueissa

Dear All:

Is there a way to remove the spacing at the top and the bottom of a plot? If
so, any help will be appreciated.


Please use this code as an example:


par(mfrow=c(1,2))


lizard <- c(6.2, 6.6, 7.1, 7.4, 7.6, 7.9, 8, 8.3, 8.4, 8.5, 8.6,8.8, 8.8,
9.1, 9.2, 9.4, 9.4, 9.7, 9.9, 10.2, 10.4, 10.8,11.3, 11.9)

n.draw <- 100
mu <- 9
n <- 24
SD <- sd(lizard)
draws <- matrix(rnorm(n.draw * n, mu, SD), n)

get.conf.int <- function(x) {
  t.test(x)$conf.int
}

conf.int <- apply(draws, 2, get.conf.int)

plot(range(conf.int), c(0, 1 + n.draw), type = "n",
     xlab = "mean tail length", ylab = "sample run")

for (i in 1:n.draw) {
  # Does the i-th confidence interval cover the true mean?
  if (conf.int[1, i] <= mu && conf.int[2, i] >= mu) {
    lines(conf.int[, i], rep(i, 2), lwd = 2, col = 'green')
    # NB: this second call redraws the same segment in the default colour,
    # so the green line drawn above ends up hidden.
    lines(conf.int[, i], rep(i, 2), lwd = 2)
  } else {
    lines(conf.int[, i], rep(i, 2), lwd = 2, col = 'red')
  }
}

abline(v = 9, lwd = 3, col = 'blue')  # lty = 2,





Thank you very much for your help.


with many thanks
abou
__
AbouEl-Makarim Aboueissa, PhD

Professor of Statistics
Department of Mathematics and Statistics
University of Southern Maine

[[alternative HTML version deleted]]


Re: [R] remove quotes from matrix

2017-09-19 Thread greg holly

Hi Bert;

I sincerely appreciate this. When I follow your approach I get:
dimnames(dm)
[[1]]
NULL

I think this is the reason why the matrix is being converted into  a column
vector.

Regards,

Greg

On Tue, Sep 19, 2017 at 11:32 AM, Bert Gunter 
wrote:

> Works fine for me. What do you object to in the following?
>
> Calling the above df "d",
>
> > dm <- as.matrix(d)
> > dm
>   Sub_PathwaysBMI_beta   SAT_beta   VAT_beta
> 1 "Alanine_and_Aspartate" " 0.23820" "-0.02409" " 0.94180"
> 2 "Alanine_and_Aspartate" "-0.31300" "-1.97510" "-2.22040"
> 3 "Alanine_and_Aspartate" " 0.12380" " 0.40950" " 0.68050"
> 4 "Alanine_and_Aspartate" " 0.30350" " 0.48610" " 0.70830"
> 5 "Alanine_and_Aspartate" "-0.00982" " 0.32930" " 0.01597"
>   VSR_beta
> 1 " 0.24690"
> 2 "-0.23540"
> 3 " 0.05539"
> 4 " 0.01337"
> 5 "-0.04353"
> > dimnames(dm)
> [[1]]
> [1] "1" "2" "3" "4" "5"
>
> [[2]]
> [1] "Sub_Pathways" "BMI_beta" "SAT_beta" "VAT_beta"
> [5] "VSR_beta"
>
> > dm <- noquote(dm)
> > dm
>   Sub_Pathways  BMI_beta SAT_beta VAT_beta VSR_beta
> 1 Alanine_and_Aspartate  0.23820 -0.02409  0.94180  0.24690
> 2 Alanine_and_Aspartate -0.31300 -1.97510 -2.22040 -0.23540
> 3 Alanine_and_Aspartate  0.12380  0.40950  0.68050  0.05539
> 4 Alanine_and_Aspartate  0.30350  0.48610  0.70830  0.01337
> 5 Alanine_and_Aspartate -0.00982  0.32930  0.01597 -0.04353
> > dimnames(dm)
> [[1]]
> [1] "1" "2" "3" "4" "5"
>
> [[2]]
> [1] "Sub_Pathways" "BMI_beta" "SAT_beta" "VAT_beta"
> [5] "VSR_beta"
>
>
> Perhaps you need to read ?noquote or ?matrix.
>
> -- Bert
>
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> On Tue, Sep 19, 2017 at 8:20 AM, greg holly  wrote:
>
>> Dear all;
>>
>> Thanks. Here are the dput results as Duncan suggested.
>>
>> Regards,
>>
>> Greg
>>
>> structure(list(Sub_Pathways = structure(c(3L, 3L, 3L, 3L, 3L), .Label =
>> c("Acetylated_Peptides",
>> "Advanced_Glycation_End-product", "Alanine_and_Aspartate", "Aminosugar",
>> "Ascorbate_and_Aldarate", "Carnitine", "Ceramides", "Creatine",
>> "Diacylglycerol", "Dipeptide", "Dipeptide_Derivative",
>> "Disaccharides_and_Oligosaccharides",
>> "Eicosanoid", "Endocannabinoid", "Fatty_Acid(Acyl_Carnitine)",
>> "Fatty_Acid(Acyl_Glycine)", "Fatty_Acid,_Amino", "Fatty_Acid,_Branched",
>> "Fatty_Acid,_Dicarboxylate", "Fatty_Acid,_Dihydroxy",
>> "Fatty_Acid,_Monohydroxy",
>> "Fatty_Acid_(Acyl_Choline)", "Fatty_Acid_(Acyl_Glutamine)",
>> "Fatty_Acid_(also_BCAA)",
>> "Fatty_Acid_Synthesis", "Fibrinogen_Cleavage_Peptide",
>> "Fructose,_Mannose_and_Galactose",
>> "Gamma-glutamyl_Amino_Acid", "Glutamate", "Glutathione", "Glycerolipid",
>> "Glycine,_Serine_and_Threonine", "Glycogen",
>> "Glycolysis,_Gluconeogenesis,_and_Pyruvate",
>> "Guanidino_and_Acetamido", "Hemoglobin_and_Porphyrin", "Histidine",
>> "Inositol", "Ketone_Bodies", "Leucine,_Isoleucine_and_Valine",
>> "Long_Chain_Fatty_Acid", "Lysine", "Lyso-phospho-ether", "Lysolipid",
>> "Lysoplasmalogen", "Medium_Chain_Fatty_Acid",
>> "Methionine,_Cysteine,_SAM_and_Taurine",
>> "Mevalonate", "Monoacylglycerol", "Nicotinate_and_Nicotinamide",
>> "Oxidative_Phosphorylation", "Pantothenate_and_CoA", "Pentose",
>> "Phenylalanine_and_Tyrosine", "Phospholipid", "Plasmalogen",
>> "Polyamine", "Polypeptide", "Polyunsaturated_Fatty_Acid_(n3_and_n6)",
>> "Primary_Bile_Acid", "Purine,_(Hypo)Xanthine/Inosine_containing",
>> "Purine,_Adenine_containing", "Purine,_Guanine_containing",
>> "Pyrimidine,_Cytidine_containing",
>> "Pyrimidine,_Orotate_containing", "Pyrimidine,_Thymine_containing",
>> "Pyrimidine,_Uracil_containing", "Riboflavin", "Secondary_Bile_Acid",
>> "Short_Chain_Fatty_Acid", "Sphingolipid", "Steroid", "Sterol",
>> "TCA_Cycle", "Tocopherol", "Tryptophan",
>> "Urea_cycle;_Arginine_and_Proline",
>> "Vitamin_A", "Vitamin_B6"), class = "factor"), BMI_beta = c(0.2382,
>> -0.313, 0.1238, 0.3035, -0.00982), SAT_beta = c(-0.02409, -1.9751,
>> 0.4095, 0.4861, 0.3293), VAT_beta = c(0.9418, -2.2204, 0.6805,
>> 0.7083, 0.01597), VSR_beta = c(0.2469, -0.2354, 0.05539, 0.01337,
>> -0.04353)), .Names = c("Sub_Pathways", "BMI_beta", "SAT_beta",
>> "VAT_beta", "VSR_beta"), row.names = c(NA, 5L), class = "data.frame")
>>
>> On Tue, Sep 19, 2017 at 10:04 AM, Duncan Murdoch <
>> murdoch.dun...@gmail.com>
>> wrote:
>>
>> > On 19/09/2017 9:47 AM, greg holly wrote:
>> >
>> >> Hi all;
>> >>
>> >> I have data at 734*22 dimensions with rows and columns names are
>> >> non-numeric.When I convert this data into matrix then all values show
>> up
>> >> with quotes. Then when I use
>> >> x1= noquotes(x) to remove the quotes from the matrix then non-numeric
>> row
>> >> names remain all other values in matrix disappear.
>> >>
>> >> Your help is greatly appreciated.
>> >>
>> >>
>> >
>> > Matrices in R can have only one type.  If you start with a

Re: [R] remove quotes from matrix

2017-09-19 Thread David L Carlson

Your description was confusing. You do not have row names that are non-numeric:

> str(dta)
'data.frame':   5 obs. of  5 variables:
 $ Sub_Pathways: Factor w/ 79 levels "Acetylated_Peptides",..: 3 3 3 3 3
 $ BMI_beta    : num  0.2382 -0.313 0.1238 0.3035 -0.00982
 $ SAT_beta    : num  -0.0241 -1.9751 0.4095 0.4861 0.3293
 $ VAT_beta    : num  0.942 -2.22 0.68 0.708 0.016
 $ VSR_beta    : num  0.2469 -0.2354 0.0554 0.0134 -0.0435

You have a column that is a factor with 79 levels. That column cannot serve as 
row names: you indicated that the original data has 734 rows, so with only 79 
levels some values must repeat, and data frame row names cannot contain 
duplicates. If you want numeric values, you need to strip off the first column:

> as.matrix(dta[ , -1])
  BMI_beta SAT_beta VAT_beta VSR_beta
1  0.23820 -0.02409  0.94180  0.24690
2 -0.31300 -1.97510 -2.22040 -0.23540
3  0.12380  0.40950  0.68050  0.05539
4  0.30350  0.48610  0.70830  0.01337
5 -0.00982  0.32930  0.01597 -0.04353

If you just want to print the character values without quotes:

> print(as.matrix(dta), quote=FALSE)
  Sub_Pathways          BMI_beta SAT_beta VAT_beta VSR_beta
1 Alanine_and_Aspartate  0.23820 -0.02409  0.94180  0.24690
2 Alanine_and_Aspartate -0.31300 -1.97510 -2.22040 -0.23540
3 Alanine_and_Aspartate  0.12380  0.40950  0.68050  0.05539
4 Alanine_and_Aspartate  0.30350  0.48610  0.70830  0.01337
5 Alanine_and_Aspartate -0.00982  0.32930  0.01597 -0.04353

But do not forget that they are still character strings.
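
A small sketch of the point about types, using a cut-down version of the posted data (only two of the numeric columns, to keep it short). Selecting the numeric columns with sapply() and is.numeric is shown as one possible generalisation and is an assumption on my part, not something from the original data description; chr, num, is_num and num2 are illustrative names.

```r
# Cut-down version of the posted data frame.
dta <- data.frame(
  Sub_Pathways = factor(rep("Alanine_and_Aspartate", 5)),
  BMI_beta = c(0.2382, -0.313, 0.1238, 0.3035, -0.00982),
  SAT_beta = c(-0.02409, -1.9751, 0.4095, 0.4861, 0.3293)
)

# A matrix holds a single type: with the factor column included,
# everything becomes character (hence the quotes when printed).
chr <- as.matrix(dta)

# Dropping the non-numeric column gives a numeric matrix.
num <- as.matrix(dta[, -1])

# Same result without hard-coding the column position.
is_num <- sapply(dta, is.numeric)
num2 <- as.matrix(dta[, is_num])

str(chr)
str(num)
```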


David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77843-4352



-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of greg holly
Sent: Tuesday, September 19, 2017 10:21 AM
To: Duncan Murdoch <murdoch.dun...@gmail.com>
Cc: r-help mailing list <r-help@r-project.org>
Subject: Re: [R] remove quotes from matrix

Dear all;

Thanks. Here are the dput results as Duncan suggested.

Regards,

Greg

structure(list(Sub_Pathways = structure(c(3L, 3L, 3L, 3L, 3L), .Label = 
c("Acetylated_Peptides", "Advanced_Glycation_End-product", 
"Alanine_and_Aspartate", "Aminosugar", "Ascorbate_and_Aldarate", "Carnitine", 
"Ceramides", "Creatine", "Diacylglycerol", "Dipeptide", "Dipeptide_Derivative", 
"Disaccharides_and_Oligosaccharides",
"Eicosanoid", "Endocannabinoid", "Fatty_Acid(Acyl_Carnitine)", 
"Fatty_Acid(Acyl_Glycine)", "Fatty_Acid,_Amino", "Fatty_Acid,_Branched", 
"Fatty_Acid,_Dicarboxylate", "Fatty_Acid,_Dihydroxy", 
"Fatty_Acid,_Monohydroxy", "Fatty_Acid_(Acyl_Choline)", 
"Fatty_Acid_(Acyl_Glutamine)", "Fatty_Acid_(also_BCAA)", 
"Fatty_Acid_Synthesis", "Fibrinogen_Cleavage_Peptide", 
"Fructose,_Mannose_and_Galactose",
"Gamma-glutamyl_Amino_Acid", "Glutamate", "Glutathione", "Glycerolipid", 
"Glycine,_Serine_and_Threonine", "Glycogen", 
"Glycolysis,_Gluconeogenesis,_and_Pyruvate",
"Guanidino_and_Acetamido", "Hemoglobin_and_Porphyrin", "Histidine", "Inositol", 
"Ketone_Bodies", "Leucine,_Isoleucine_and_Valine", "Long_Chain_Fatty_Acid", 
"Lysine", "Lyso-phospho-ether", "Lysolipid", "Lysoplasmalogen", 
"Medium_Chain_Fatty_Acid", "Methionine,_Cysteine,_SAM_and_Taurine",
"Mevalonate", "Monoacylglycerol", "Nicotinate_and_Nicotinamide", 
"Oxidative_Phosphorylation", "Pantothenate_and_CoA", "Pentose", 
"Phenylalanine_and_Tyrosine", "Phospholipid", "Plasmalogen", "Polyamine", 
"Polypeptide", "Polyunsaturated_Fatty_Acid_(n3_and_n6)",
"Primary_Bile_Acid", "Purine,_(Hypo)Xanthine/Inosine_containing",
"Purine,_Adenine_containing", "Purine,_Guanine_containing", 
"Pyrimidine,_Cytidine_containing",
"Pyrimidine,_Orotate_containing", "Pyrimidine,_Thymine_containing", 
"Pyrimidine,_Uracil_containing", "Riboflavin", "Secondary_Bile_Acid", 
"Short_Chain_Fatty_Acid", "Sphingolipid", "Steroid", "Sterol", "TCA_Cycle", 
"Tocopherol", "Tryptophan", "Urea_cycle;_Arginine_and_Proline",
"Vitamin_A", "Vitamin_B6"), class = "factor"), BMI_beta = c(0.2382, -0.313, 
0.1238, 0.3035, -0.00982), SAT_beta = c(-0.02409, -1.9751, 0.4095, 0.4861, 
0.3293), VAT_beta = c(0.9418, -2.2204, 0.6805, 0.7083, 0.01597), VSR_beta = 
c(0.2469, -0.2354, 0.05539, 0.01337, -0.04353)), .Names = c("Sub_Pathways

Re: [R] remove quotes from matrix

2017-09-19 Thread Bert Gunter

Works fine for me. What do you object to in the following?

Calling the above df "d",

> dm <- as.matrix(d)
> dm
  Sub_Pathways            BMI_beta   SAT_beta   VAT_beta
1 "Alanine_and_Aspartate" " 0.23820" "-0.02409" " 0.94180"
2 "Alanine_and_Aspartate" "-0.31300" "-1.97510" "-2.22040"
3 "Alanine_and_Aspartate" " 0.12380" " 0.40950" " 0.68050"
4 "Alanine_and_Aspartate" " 0.30350" " 0.48610" " 0.70830"
5 "Alanine_and_Aspartate" "-0.00982" " 0.32930" " 0.01597"
  VSR_beta
1 " 0.24690"
2 "-0.23540"
3 " 0.05539"
4 " 0.01337"
5 "-0.04353"
> dimnames(dm)
[[1]]
[1] "1" "2" "3" "4" "5"

[[2]]
[1] "Sub_Pathways" "BMI_beta" "SAT_beta" "VAT_beta"
[5] "VSR_beta"

> dm <- noquote(dm)
> dm
  Sub_Pathways          BMI_beta SAT_beta VAT_beta VSR_beta
1 Alanine_and_Aspartate  0.23820 -0.02409  0.94180  0.24690
2 Alanine_and_Aspartate -0.31300 -1.97510 -2.22040 -0.23540
3 Alanine_and_Aspartate  0.12380  0.40950  0.68050  0.05539
4 Alanine_and_Aspartate  0.30350  0.48610  0.70830  0.01337
5 Alanine_and_Aspartate -0.00982  0.32930  0.01597 -0.04353
> dimnames(dm)
[[1]]
[1] "1" "2" "3" "4" "5"

[[2]]
[1] "Sub_Pathways" "BMI_beta" "SAT_beta" "VAT_beta"
[5] "VSR_beta"


Perhaps you need to read ?noquote or ?matrix.
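
As a quick check of what noquote() does and does not change, here is a toy sketch (the matrix m is made up for illustration). Note that the base R function is noquote(), singular. It only marks the object so that it prints without quotes; the dimensions, the dimnames and the character storage are left as they were.

```r
# Toy character matrix with row and column names (made-up values).
m <- matrix(c("a", "b", " 0.1", "-0.2"), nrow = 2,
            dimnames = list(c("1", "2"), c("name", "beta")))

nq <- noquote(m)

print(m)    # printed with quotes
print(nq)   # printed without quotes

# The object itself is unchanged apart from its class attribute.
class(nq)
dim(nq)
dimnames(nq)
is.character(nq)
```

If dim() or dimnames() come back different on your own object, the change most likely happened in an earlier step rather than in the noquote() call.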

-- Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Tue, Sep 19, 2017 at 8:20 AM, greg holly  wrote:

> Dear all;
>
> Thanks. Here are the dput results as Duncan suggested.
>
> Regards,
>
> Greg
>
> structure(list(Sub_Pathways = structure(c(3L, 3L, 3L, 3L, 3L), .Label =
> c("Acetylated_Peptides",
> "Advanced_Glycation_End-product", "Alanine_and_Aspartate", "Aminosugar",
> "Ascorbate_and_Aldarate", "Carnitine", "Ceramides", "Creatine",
> "Diacylglycerol", "Dipeptide", "Dipeptide_Derivative",
> "Disaccharides_and_Oligosaccharides",
> "Eicosanoid", "Endocannabinoid", "Fatty_Acid(Acyl_Carnitine)",
> "Fatty_Acid(Acyl_Glycine)", "Fatty_Acid,_Amino", "Fatty_Acid,_Branched",
> "Fatty_Acid,_Dicarboxylate", "Fatty_Acid,_Dihydroxy",
> "Fatty_Acid,_Monohydroxy",
> "Fatty_Acid_(Acyl_Choline)", "Fatty_Acid_(Acyl_Glutamine)",
> "Fatty_Acid_(also_BCAA)",
> "Fatty_Acid_Synthesis", "Fibrinogen_Cleavage_Peptide",
> "Fructose,_Mannose_and_Galactose",
> "Gamma-glutamyl_Amino_Acid", "Glutamate", "Glutathione", "Glycerolipid",
> "Glycine,_Serine_and_Threonine", "Glycogen",
> "Glycolysis,_Gluconeogenesis,_and_Pyruvate",
> "Guanidino_and_Acetamido", "Hemoglobin_and_Porphyrin", "Histidine",
> "Inositol", "Ketone_Bodies", "Leucine,_Isoleucine_and_Valine",
> "Long_Chain_Fatty_Acid", "Lysine", "Lyso-phospho-ether", "Lysolipid",
> "Lysoplasmalogen", "Medium_Chain_Fatty_Acid",
> "Methionine,_Cysteine,_SAM_and_Taurine",
> "Mevalonate", "Monoacylglycerol", "Nicotinate_and_Nicotinamide",
> "Oxidative_Phosphorylation", "Pantothenate_and_CoA", "Pentose",
> "Phenylalanine_and_Tyrosine", "Phospholipid", "Plasmalogen",
> "Polyamine", "Polypeptide", "Polyunsaturated_Fatty_Acid_(n3_and_n6)",
> "Primary_Bile_Acid", "Purine,_(Hypo)Xanthine/Inosine_containing",
> "Purine,_Adenine_containing", "Purine,_Guanine_containing",
> "Pyrimidine,_Cytidine_containing",
> "Pyrimidine,_Orotate_containing", "Pyrimidine,_Thymine_containing",
> "Pyrimidine,_Uracil_containing", "Riboflavin", "Secondary_Bile_Acid",
> "Short_Chain_Fatty_Acid", "Sphingolipid", "Steroid", "Sterol",
> "TCA_Cycle", "Tocopherol", "Tryptophan",
> "Urea_cycle;_Arginine_and_Proline",
> "Vitamin_A", "Vitamin_B6"), class = "factor"), BMI_beta = c(0.2382,
> -0.313, 0.1238, 0.3035, -0.00982), SAT_beta = c(-0.02409, -1.9751,
> 0.4095, 0.4861, 0.3293), VAT_beta = c(0.9418, -2.2204, 0.6805,
> 0.7083, 0.01597), VSR_beta = c(0.2469, -0.2354, 0.05539, 0.01337,
> -0.04353)), .Names = c("Sub_Pathways", "BMI_beta", "SAT_beta",
> "VAT_beta", "VSR_beta"), row.names = c(NA, 5L), class = "data.frame")
>
> On Tue, Sep 19, 2017 at 10:04 AM, Duncan Murdoch  >
> wrote:
>
> > On 19/09/2017 9:47 AM, greg holly wrote:
> >
> >> Hi all;
> >>
> >> I have data at 734*22 dimensions with rows and columns names are
> >> non-numeric.When I convert this data into matrix then all values show up
> >> with quotes. Then when I use
> >> x1= noquotes(x) to remove the quotes from the matrix then non-numeric
> row
> >> names remain all other values in matrix disappear.
> >>
> >> Your help is greatly appreciated.
> >>
> >>
> >
> > Matrices in R can have only one type.  If you start with a dataframe and
> > any columns contain character data, all entries will be converted to
> > character, and the matrix will be displayed with quotes.
> >
> > When you say all values disappear, it sounds as though you are displaying
> > strings containing nothing (or just blanks).  Those will be displayed as
> ""
> > normally, but if the matrix is marked to display without quotes, they are
> > displayed as empty strings, so it will appear that nothing is

Re: [R] remove quotes from matrix

2017-09-19 Thread greg holly

Dear all;

Thanks. Here are the dput results as Duncan suggested.

Regards,

Greg

structure(list(Sub_Pathways = structure(c(3L, 3L, 3L, 3L, 3L), .Label =
c("Acetylated_Peptides",
"Advanced_Glycation_End-product", "Alanine_and_Aspartate", "Aminosugar",
"Ascorbate_and_Aldarate", "Carnitine", "Ceramides", "Creatine",
"Diacylglycerol", "Dipeptide", "Dipeptide_Derivative",
"Disaccharides_and_Oligosaccharides",
"Eicosanoid", "Endocannabinoid", "Fatty_Acid(Acyl_Carnitine)",
"Fatty_Acid(Acyl_Glycine)", "Fatty_Acid,_Amino", "Fatty_Acid,_Branched",
"Fatty_Acid,_Dicarboxylate", "Fatty_Acid,_Dihydroxy",
"Fatty_Acid,_Monohydroxy",
"Fatty_Acid_(Acyl_Choline)", "Fatty_Acid_(Acyl_Glutamine)",
"Fatty_Acid_(also_BCAA)",
"Fatty_Acid_Synthesis", "Fibrinogen_Cleavage_Peptide",
"Fructose,_Mannose_and_Galactose",
"Gamma-glutamyl_Amino_Acid", "Glutamate", "Glutathione", "Glycerolipid",
"Glycine,_Serine_and_Threonine", "Glycogen",
"Glycolysis,_Gluconeogenesis,_and_Pyruvate",
"Guanidino_and_Acetamido", "Hemoglobin_and_Porphyrin", "Histidine",
"Inositol", "Ketone_Bodies", "Leucine,_Isoleucine_and_Valine",
"Long_Chain_Fatty_Acid", "Lysine", "Lyso-phospho-ether", "Lysolipid",
"Lysoplasmalogen", "Medium_Chain_Fatty_Acid",
"Methionine,_Cysteine,_SAM_and_Taurine",
"Mevalonate", "Monoacylglycerol", "Nicotinate_and_Nicotinamide",
"Oxidative_Phosphorylation", "Pantothenate_and_CoA", "Pentose",
"Phenylalanine_and_Tyrosine", "Phospholipid", "Plasmalogen",
"Polyamine", "Polypeptide", "Polyunsaturated_Fatty_Acid_(n3_and_n6)",
"Primary_Bile_Acid", "Purine,_(Hypo)Xanthine/Inosine_containing",
"Purine,_Adenine_containing", "Purine,_Guanine_containing",
"Pyrimidine,_Cytidine_containing",
"Pyrimidine,_Orotate_containing", "Pyrimidine,_Thymine_containing",
"Pyrimidine,_Uracil_containing", "Riboflavin", "Secondary_Bile_Acid",
"Short_Chain_Fatty_Acid", "Sphingolipid", "Steroid", "Sterol",
"TCA_Cycle", "Tocopherol", "Tryptophan",
"Urea_cycle;_Arginine_and_Proline",
"Vitamin_A", "Vitamin_B6"), class = "factor"), BMI_beta = c(0.2382,
-0.313, 0.1238, 0.3035, -0.00982), SAT_beta = c(-0.02409, -1.9751,
0.4095, 0.4861, 0.3293), VAT_beta = c(0.9418, -2.2204, 0.6805,
0.7083, 0.01597), VSR_beta = c(0.2469, -0.2354, 0.05539, 0.01337,
-0.04353)), .Names = c("Sub_Pathways", "BMI_beta", "SAT_beta",
"VAT_beta", "VSR_beta"), row.names = c(NA, 5L), class = "data.frame")

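In the structure posted above, a factor column (Sub_Pathways) sits next to numeric columns, so as.matrix() on the whole data frame gives a character matrix. A minimal sketch, assuming the goal is a numeric matrix for analysis; the shortened data frame y below reuses two of the posted columns, and the names m_all, is_num and m_num are made up for illustration:

```r
# Shortened stand-in for the posted dput() output (two of the four beta columns).
y <- data.frame(
  Sub_Pathways = factor(rep("Alanine_and_Aspartate", 5)),
  BMI_beta = c(0.2382, -0.313, 0.1238, 0.3035, -0.00982),
  SAT_beta = c(-0.02409, -1.9751, 0.4095, 0.4861, 0.3293)
)

# Converting everything at once coerces the numbers to character.
m_all <- as.matrix(y)
typeof(m_all)

# Keep only the numeric columns before converting.
is_num <- vapply(y, is.numeric, logical(1))
m_num <- as.matrix(y[, is_num, drop = FALSE])
typeof(m_num)
dim(m_num)
```

Whether dropping the pathway column is acceptable depends on what the matrix is needed for; that is not stated in the thread.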

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] remove quotes from matrix

2017-09-19 Thread Bert Gunter

Your claims are false -- or at least confused.

> d <- data.frame(a = I(letters[1:3]), b = 1:3)
## the I() is to prevent automatic conversion to factor

> d
  a b
1 a 1
2 b 2
3 c 3
> dm <- as.matrix(d)
> dm
     a   b
[1,] "a" "1"
[2,] "b" "2"
[3,] "c" "3"
> dimnames(dm)
[[1]]
NULL

[[2]]
[1] "a" "b"

## Note that there are no rownames, as d had none.
> dm <- noquote(dm)
> dm
     a b
[1,] a 1
[2,] b 2
[3,] c 3

We still need a reprex to resolve the confusion.
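
As a quick check of the claim that the matrix collapses into a vector, a sketch along these lines (the names m and nq are made up here) suggests that noquote() leaves the dimensions alone and only adds a class that changes how the object prints:

```r
m <- matrix(c("a", "b", "c", "1", "2", "3"), nrow = 3,
            dimnames = list(NULL, c("a", "b")))
nq <- noquote(m)

dim(m)     # dimensions before
dim(nq)    # dimensions after noquote()
class(nq)  # the added class that controls printing
```

If the dimensions do change in the real data, the cause is probably somewhere other than noquote() itself.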

-- Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


Re: [R] remove quotes from matrix

2017-09-19 Thread Jeff Newmiller

Greg, I think you should stop using noquote, because it is doing something that 
will not be useful to you for preparing your data for analysis.
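
To illustrate the point, here is a small sketch with made-up values (m, nq and num are hypothetical names): noquote() changes only the printed form, so the values remain character strings until they are converted explicitly.

```r
m <- matrix(c("0.5", "1.5", "2.5", "3.5"), nrow = 2)
nq <- noquote(m)

# The underlying storage is still character, so arithmetic is not possible yet.
typeof(unclass(nq))

# Converting the values themselves, while keeping the shape, gives a numeric matrix.
num <- matrix(as.numeric(m), nrow = nrow(m), dimnames = dimnames(m))
typeof(num)
colMeans(num)
```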

Please follow Duncan's advice and provide us with a sample of your data.  Also, 
please set your email program to send plain text rather than HTML formatted 
text. 
-- 
Sent from my phone. Please excuse my brevity.


Re: [R] remove quotes from matrix

2017-09-19 Thread greg holly

Hi Duncan and Bert;

I do appreciate your replies. I just figured out that after the x1 =
noquote(x) command my 733*22 matrix turns into an n*1 vector. Is there a way
to keep this as a matrix with dimensions 733*22?

Regards,

Greg



Re: [R] remove quotes from matrix

2017-09-19 Thread Duncan Murdoch


On 19/09/2017 9:47 AM, greg holly wrote:

Hi all;

I have data with 734*22 dimensions whose row and column names are
non-numeric. When I convert this data into a matrix, all values show up
with quotes. Then when I use
x1 = noquote(x) to remove the quotes from the matrix, the non-numeric row
names remain but all other values in the matrix disappear.

Your help is greatly appreciated.




Matrices in R can have only one type.  If you start with a dataframe and 
any columns contain character data, all entries will be converted to 
character, and the matrix will be displayed with quotes.
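
A minimal sketch of that coercion, using two made-up data frames (mixed and numeric_only are hypothetical names):

```r
mixed <- data.frame(id = c("s1", "s2"), value = c(1.5, 2.5),
                    stringsAsFactors = FALSE)
numeric_only <- data.frame(u = c(1.5, 2.5), v = c(3, 4))

typeof(as.matrix(mixed))         # one character column forces everything to character
typeof(as.matrix(numeric_only))  # all-numeric columns stay numeric
as.matrix(mixed)                 # printed with quotes
```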


When you say all values disappear, it sounds as though you are 
displaying strings containing nothing (or just blanks).  Those will be 
displayed as "" normally, but if the matrix is marked to display without 
quotes, they are displayed as empty strings, so it will appear that 
nothing is displayed.
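
The effect can be reproduced with a small made-up matrix containing empty strings (a sketch only; m, quoted_out and plain_out are hypothetical names):

```r
m <- matrix(c("x", "", "", "y"), nrow = 2)

print(m)            # empty strings are visible as ""
print(noquote(m))   # the same cells now look blank

# Capture the printed lines to compare them programmatically.
quoted_out <- capture.output(print(m))
plain_out <- capture.output(print(noquote(m)))
```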


You can see the structure of the original data using the str() function, 
e.g. str(x) should display types for each column.
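
For example, with a small made-up data frame standing in for x, the column types can be listed and the non-numeric columns picked out like this (a sketch, not the poster's data):

```r
x <- data.frame(name = c("a", "b", "c"), beta = c(0.1, 0.2, 0.3),
                n = c(10L, 20L, 30L), stringsAsFactors = FALSE)

str(x)  # one line per column, with its type

# First class of each column, as a named character vector.
col_types <- vapply(x, function(col) class(col)[1], character(1))
col_types

# Names of the columns that would force as.matrix() to produce character data.
non_numeric <- names(x)[!vapply(x, is.numeric, logical(1))]
non_numeric
```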


If this isn't enough to explain what's going on, please show us more 
detail.  For example, show us the result of


y <- x[1:5, 1:5]
dput(y)

both before and after converting x to a matrix.
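
The reason dput() output is useful on the list is that it can be pasted back into R to rebuild the object. A sketch with a made-up data frame (x here is hypothetical, not the poster's data):

```r
x <- data.frame(id = c("s1", "s2", "s3"), beta = c(0.5, 1.5, 2.5),
                stringsAsFactors = FALSE)
y <- x[1:2, 1:2]

# Text that would be pasted into an e-mail.
txt <- capture.output(dput(y))
cat(txt, sep = "\n")

# A reader can evaluate that text to get the object back.
y_again <- eval(parse(text = txt))
isTRUE(all.equal(y, y_again))
```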

Duncan Murdoch


[R] remove quotes from matrix

2017-09-19 Thread greg holly

Hi all;

I have data with 734*22 dimensions whose row and column names are
non-numeric. When I convert this data into a matrix, all values show up
with quotes. Then when I use
x1 = noquote(x) to remove the quotes from the matrix, the non-numeric row
names remain but all other values in the matrix disappear.

Your help is greatly appreciated.

Greg


Re: [R] Remove attribute from netcdf4 object

2017-08-02 Thread Marc Girondot

On 02/08/2017 at 12:03, raphael.fel...@agroscope.admin.ch wrote:
> Dear all
>
> For a model I need to combine several netCDF files into one (which works 
> fine). For a better overview I'd like to delete/remove some of the attributes. 
> Is there a simple way of doing this?
>
> I'm using the package ncdf4, which creates an object of class(nc) = 
> "ncdf4". It seems that for earlier versions of netcdf objects, there was the 
> function att.delete.nc{RNetCDF}. But this function returns the following 
> error when applied to ncdf4-class objects:
> Error: class(ncfile) == "NetCDF" is not TRUE
You should use either package ncdf4 or package RNetCDF, but not mix the two.
Marc


-- 
__
Marc Girondot, Pr

Laboratoire Ecologie, Systématique et Evolution
Equipe de Conservation des Populations et des Communautés
CNRS, AgroParisTech et Université Paris-Sud 11 , UMR 8079
Bâtiment 362
91405 Orsay Cedex, France

Tel:  33 1 (0)1.69.15.72.30   Fax: 33 1 (0)1.69.15.73.53
e-mail: marc.giron...@u-psud.fr
Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
Skype: girondot



Re: [R] Remove attribute from netcdf4 object

2017-08-02 Thread raphael.felber

Hi Marc

That's a workaround I can use. Thanks. I'm a newbie regarding netCDF data. Is 
there any information I'm losing when switching between the packages?

Raphael

From: Marc Girondot [mailto:marc.giron...@u-psud.fr]
Sent: Wednesday, 2 August 2017 15:13
To: Felber Raphael Agroscope <raphael.fel...@agroscope.admin.ch>
Subject: Re: AW: [R] Remove attribute from netcdf4 object

OK. Sorry, I did not understand you correctly.
I don't think you can do it with ncdf4 functions. The only solution would be to 
open the file in RNetCDF, delete the attribute, save it, and then open it in ncdf4.
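
That round trip might look roughly like the sketch below. It assumes the attribute to remove is a global one and that the RNetCDF package is installed; delete_global_att, path and att_name are hypothetical names introduced only for this illustration, and "combined.nc" is a placeholder file name.

```r
# Hypothetical helper: remove one global attribute from a netCDF file on disk.
# Requires the RNetCDF package; `path` and `att_name` are placeholders.
delete_global_att <- function(path, att_name) {
  nc <- RNetCDF::open.nc(path, write = TRUE)   # open the file for modification
  on.exit(RNetCDF::close.nc(nc))               # always close the file again
  RNetCDF::att.delete.nc(nc, "NC_GLOBAL", att_name)
  invisible(path)
}

# Afterwards the same file can be opened with ncdf4 as before, e.g.
# delete_global_att("combined.nc", "history")
# nc4 <- ncdf4::nc_open("combined.nc")
```

For an attribute attached to a variable rather than to the file, the variable name would presumably replace "NC_GLOBAL" in the att.delete.nc() call.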

Marc


Re: [R] Remove attribute from netcdf4 object

2017-08-02 Thread raphael.felber

Dear Marc

Thanks for your remark. I don't want to use both packages. I mentioned the 
package RNetCDF to show that there is a similar function I'd like to use.

Raphael


[R] Remove attribute from netcdf4 object

2017-08-02 Thread raphael.felber

Dear all

For a model I need to combine several netCDF files into one (which works fine). 
For a better overview I'd like to delete/remove some of the attributes. Is there 
a simple way of doing this?

I'm using the package ncdf4, which creates an object of class(nc) = "ncdf4". 
It seems that for earlier versions of netcdf objects, there was the function 
att.delete.nc{RNetCDF}. But this function returns the following error when 
applied to ncdf4-class objects:
Error: class(ncfile) == "NetCDF" is not TRUE

Thanks a lot for any help.

Kind regards

Raphael Felber


Raphael Felber, Dr. sc.
Wissenschaftlicher Mitarbeiter, Klima und Lufthygiene

Eidgenössisches Departement für
Wirtschaft, Bildung und Forschung WBF
Agroscope
Forschungsbereich Agrarökologie und Umwelt

Reckenholzstrasse 191, 8046 Zürich
Tel. 058 468 75 11
Fax 058 468 72 01
raphael.fel...@agroscope.admin.ch
www.agroscope.ch



Re: [R] remove

2017-06-11 Thread Jeff Newmiller

The usual way I filter is:

KL$Dt <- as.Date( KL$date, format='%d-%m-%y' )
KL2 <- KL[ !is.na( KL$Dt ), ]
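
Applied to the sample data from the original question, the filter can be sketched as below. The data frame is rebuilt by hand so the snippet runs on its own, and the month-day-year format is an assumption read off values such as 11-28-07; with the day-month-year format used above, a different set of rows would survive.

```r
# Sample values copied from the question; the empty string stands for the blank field.
KL <- data.frame(
  ID   = c(711, 712, 713, 714, 301, 302, 303),
  date = c("Dead", "Uknown", "20-11-08", "11-28-07", "", "09-02-02", "09-21-02"),
  stringsAsFactors = FALSE
)

# Anything that does not parse under the chosen format becomes NA ...
KL$Dt <- as.Date(KL$date, format = "%m-%d-%y")

# ... and the NA rows are then dropped.
KL2 <- KL[!is.na(KL$Dt), ]
KL2
```

Note that 20-11-08 is dropped under this assumption because 20 is not a valid month, so mixed conventions in one column need a decision about which one to trust.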

-- 
Sent from my phone. Please excuse my brevity.

On June 10, 2017 10:17:52 PM PDT, Jeff Newmiller  
wrote:
>You are using a slash in your format string to separate sub-fields but
>your data uses a dash.


Re: [R] remove

2017-06-10 Thread Bert Gunter

Also see ?ifelse rather than if(), I think.
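
To illustrate the difference: if() looks at a single TRUE/FALSE value, whereas ifelse() works element by element. A small sketch with made-up values (the 8-character test is only a rough stand-in for "looks like a date"):

```r
x <- c("Dead", "11-28-07", "", "09-02-02")

# One TRUE/FALSE per element, not a single condition.
looks_like_date <- nchar(x) == 8

# Keep the value where the test holds, otherwise use a missing value.
cleaned <- ifelse(looks_like_date, x, NA_character_)
cleaned
```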

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )



Re: [R] remove

2017-06-10 Thread Jeff Newmiller

You are using a slash in your format string to separate sub-fields but your 
data uses a dash.
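
A short sketch of the mismatch, using one value from the posted sample (bad and good are made-up names). The data also appear to use two-digit years, so %y rather than %Y is assumed in the second call:

```r
d <- "11-28-07"

# Slashes in the format do not match the dashes in the data, so parsing fails.
bad <- as.Date(d, format = "%m/%d/%Y")
is.na(bad)

# Dashes and a two-digit year match the data.
good <- as.Date(d, format = "%m-%d-%y")
format(good)
```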
-- 
Sent from my phone. Please excuse my brevity.


[R] remove

2017-06-10 Thread Val

Hi all,
I have a date issue and would appreciate any help.

I am reading field data, and in one of the columns I am expecting a
date, but it has non-date values such as character strings and empty fields.
Here is a sample of my data.

KL <- read.table(header=TRUE, text='ID date
711 Dead
712 Uknown
713 20-11-08
714 11-28-07
301
302 09-02-02
303 09-21-02', stringsAsFactors = FALSE, fill = TRUE)

str(KL)
'data.frame': 7 obs. of  2 variables:
 $ ID  : int  711 712 713 714 301 302 303
 $ date: chr  "Dead" "Uknown" "20-11-08" "11-28-07" ...

I wanted to convert the date column as follows:
if (max(unique(nchar(as.character(KL$date)))) == 10) {
  KL$date <- as.Date(KL$date, "%m/%d/%Y")
}
but it is not working.


How could I remove the entire rows that do not have
a date format and then do the operation?
Thank you in advance.
