[R] Density or Boxplot with median and mean

2014-01-22 Thread Alaios
Hi there,
I would like to draw a density plot or a box plot in which both the median 
and the mean are visible.

For a density plot I need to put two big marks, one for the median and one 
for the mean, but I do not know how to add marks to a density plot. For that 
I am using plot(density(myVector)).

On the boxplot the median is already visible but the mean is not. To show the 
mean I would have to add one more line on each boxplot, perhaps in a different 
color, but I am not sure whether that is possible in R. I am using 
boxplot(myVector),

where myVector can be something like myVector <- seq(1, 200).

I would like to thank you all in advance 
Regards
Alex
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Density or Boxplot with median and mean

2014-01-22 Thread Jim Lemon

On 01/22/2014 07:37 PM, Alaios wrote:

Hi Alex,
On a density plot you can use abline:

abline(v=mean(myVector),col="red")
abline(v=median(myVector),col="green")

I don't know of any boxplot function that will plot two measures of 
central tendency, but box.heresy in plotrix will plot any one measure 
that you like.
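
A minimal sketch of both ideas in base graphics, for illustration only. The skewed sample data and the use of points() to mark the mean on the boxplot are assumptions, not something taken from the posts above.

```r
# Assumed example data; skewed so that mean and median differ visibly
set.seed(1)
myVector <- rexp(200, rate = 0.1)

# Density plot with vertical lines at the mean and the median
plot(density(myVector), main = "Density with mean and median")
abline(v = mean(myVector), col = "red", lwd = 2)
abline(v = median(myVector), col = "green", lwd = 2)
legend("topright", legend = c("mean", "median"),
       col = c("red", "green"), lwd = 2)

# Boxplot: the median line is drawn by default; overlay the mean as a point
boxplot(myVector, main = "Boxplot with mean added")
points(1, mean(myVector), col = "red", pch = 18, cex = 2)
```

With several boxes in one plot, the same points() call should work with x positions 1, 2, 3, ... matching the order of the boxes.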


Jim



Re: [R] collapsing records

2014-01-22 Thread Bill
Hi
That is great!
Thanks


On Mon, Jan 20, 2014 at 12:10 PM, Jim Lemon j...@bitwrit.com.au wrote:

 On 01/20/2014 11:44 AM, Bill wrote:

 I am trying to read a csv file with a date-time field. There are many rows
 with the same date but different times. I first want to clear the times so
 that rows from the same day have the same date-time field (called Date).
 There is another field called Text, and I want to collapse all the records
 with the same date so that there is only one record per date, with a text
 field that contains all the strings from the corresponding text fields. At
 the same time I want to create a new field that holds the count of how many
 records were collapsed for each date. There is a third field called Tw.ID,
 and I was trying to use tapply on this field to do this. Later I will create
 a DocumentTermMatrix with the tm package on this dataframe. In the code
 below I have not figured out how to collapse the data so that there is only
 one record for each date, and I don't really have a good way to add a count
 field. Can anyone make any suggestions?
 Thanks.

 install.packages(c("tm"))
 library(tm)
 y.df=read.csv("YHOO3000.csv", header=TRUE)
 y.df$Date= as.POSIXlt( y.df$Date)
 ysub14.df=y.df
 ysub14.df$Date=y.df$Date -14*3600 #I pushed the record times back a little here.
 ysub14.df$Date=as.Date(ysub14.df$Date, "%Y-%m-%d")
 # might want to use
 # groups <- unstack(data.frame(ysub14.df$Text,ysub14.df$Date))
 # to put all the tweets for one day into a group. This makes a list
 # I think, with the name of the list being the Date and
 # the tweets for that date being stored in a vector.
 countgroup2=tapply(ysub14.df$Tw.ID,ysub14.df$Date,length)

  Hi Bill,
 Here is one way:

 # get some date-time strings
 # (minutes and seconds are sampled from 0:59 so that every string is valid)
 dates<-paste("2014-01-",10:15," ",sample(0:23,20),
  ":",sample(0:59,20),":",sample(0:59,20),sep="")
 # function to return stupid text
 sillytext<-function(n) {
  return(paste(sample(letters[1:26],n),sep="",collapse=""))
 }
 # get the stupid text
 ttext<-sapply(rep(10,20),sillytext)
 # make the data frame
 y.df<-data.frame(dates,ttext)
 # convert the date-time strings to dates
 y.df$dates<-
  as.Date(format(as.Date(dates,"%Y-%m-%d %H:%M:%S"),
  "%Y-%m-%d"),"%Y-%m-%d")
 library(prettyR)
 # stretch out all the text strings for each day
 y2.df<-stretch_df(y.df,"dates","ttext")
 # get the dimension of the resulting data frame
 ydim<-dim(y2.df)
 # function to count the NAs
 nna<-function(x) return(sum(is.na(x)))
 # add a column with a count of _not_ NAs
 y2.df$nrec<-
  (ydim[2]-1)-apply(as.matrix(y2.df[,2:ydim[2]]),1,nna)

 Jim
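
For comparison, a base-R sketch of the same collapsing step, assuming one pasted text string per day is what is wanted. This is not the prettyR approach shown above; the sample rows are hypothetical, and only the column names Date and Text come from the question.

```r
# Hypothetical rows standing in for the CSV described in the question
y.df <- data.frame(
  Date = c("2014-01-10 03:15:00", "2014-01-10 18:40:10", "2014-01-11 07:05:30"),
  Text = c("first tweet", "second tweet", "third tweet"),
  stringsAsFactors = FALSE
)

# Drop the time of day so rows from the same day share one key
y.df$Day <- as.Date(as.POSIXct(y.df$Date, tz = "UTC"), tz = "UTC")

# One pasted string per day, plus the number of records that went into it
texts  <- tapply(y.df$Text, y.df$Day, paste, collapse = " ")
counts <- tapply(y.df$Text, y.df$Day, length)

collapsed <- data.frame(
  Date = as.Date(names(texts)),
  Text = as.vector(texts),
  nrec = as.vector(counts),
  stringsAsFactors = FALSE
)
collapsed
```

Both tapply() calls group by the same key, so the text and count columns line up by day.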




Re: [R] My problem with R

2014-01-22 Thread PIKAL Petr
Hi

First of all, without some of your data it is really difficult to 
understand what you want. Select only a small part of your 2 data 
frames and post the output from dput,

e.g.

dput(data3[1:10, 1:7])
dput(data4[1:10, 1:7])

Most probably the resulting timestamp is in POSIXlt mode, which has a list 
structure. If you want to work with it as a vector you need to convert it to 
POSIXct

timestamp <- as.POSIXct(timestamp)

and only after that do the rbinding. However, I am not sure that rbind 
preserves the time/date format; see below.

> test[100:110]
 [1] "2014-01-08 15:00:00 CET" "2014-01-08 15:00:00 CET"
 [3] "2014-01-08 15:00:00 CET" "2014-01-08 15:00:00 CET"
 [5] "2014-01-08 15:00:00 CET" "2014-01-08 15:00:00 CET"
 [7] "2014-01-08 15:00:00 CET" "2014-01-08 15:00:00 CET"
 [9] "2014-01-08 15:00:00 CET" "2014-01-08 15:00:00 CET"
[11] "2014-01-08 15:00:00 CET"

> rbind(test[100:110], test[110:120])
           [,1]       [,2]       [,3]       [,4]       [,5]       [,6]
[1,] 1389189600 1389189600 1389189600 1389189600 1389189600 1389189600
[2,] 1389189600 1389189600 1389189600 1389189600 1389189600 1389189600
           [,7]       [,8]       [,9]      [,10]      [,11]
[1,] 1389189600 1389189600 1389189600 1389189600 1389189600
[2,] 1389189600 1389189600 1389189600 1389189600 1389189600

The resulting numbers are seconds since 1970-01-01 UTC. Look at the Examples 
on the POSIX help page.
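
As the output above shows, rbind() turns the date-times into plain numbers. A minimal sketch of one alternative, assuming the goal is a single vector of date-times: convert each POSIXlt object to POSIXct and combine the pieces with c(). The timestamp values below are made up for illustration.

```r
# Made-up stand-ins for timestamp3 and timestamp4 (strptime returns POSIXlt)
fmt <- "%Y-%m-%d %H:%M:%S"
timestamp3 <- strptime(c("2013-10-04 04:42:00", "2013-10-04 04:42:01"),
                       fmt, tz = "UTC")
timestamp4 <- strptime(c("2013-10-04 04:42:02", "2013-10-04 04:42:03"),
                       fmt, tz = "UTC")

# Convert each piece to POSIXct, then concatenate; the date-time class is kept
pieces <- list(timestamp3, timestamp4)
Time <- do.call(c, lapply(pieces, as.POSIXct))

class(Time)
format(Time, fmt, tz = "UTC")
```

Putting the pieces in a list first means the same two lines work for any number of timestamp objects. Depending on the R version, c() may drop the time zone attribute, which is why the time zone is given explicitly when formatting.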

I am still just guessing; without actual data it is difficult to understand 
where the problem is. I still believe that

merge(data1, data2, all=TRUE)

is the most efficient way.
Petr

From: Lee Marine [mailto:marine1...@gmail.com]
Sent: Wednesday, January 22, 2014 9:16 AM
To: PIKAL Petr
Subject: Re: [R] My problem with R

Thanks for your reply.
The merge function worked (I really appreciate it), but I have one problem:
row-binding the time data. I cannot solve it.
I want to see

2013-10-04 04:42:00
2013-10-04 04:42:01
2013-10-04 04:42:02
...

The timestamps (as you know) are fine:

timestamp3 <- strptime(paste(data3[,1],data3[,2], sep=""),format="%Y-%m-%d %H:%M:%S")
timestamp4 <- strptime(paste(data4[,1],data4[,2], sep=""),format="%Y-%m-%d %H:%M:%S")
...

I cannot use merge because I need row binding, not column binding,

but this did not work:
Time <- rbind(timestamp3,timestamp4,timestamp5,timestamp6,timestamp7,
              timestamp8,timestamp9,timestamp10,timestamp11,timestamp12,
              timestamp13,timestamp14,timestamp15,timestamp16)

> str(Time)
'data.frame':  14 obs. of  9 variables:
 $ sec  :List of 14
  ..$ timestamp3 : num  0 1 2 3 4 5 6 7 8 9 ...
  ..$ timestamp4 : num  0 1 2 3 4 5 6 7 8 9 ...
  ..$ timestamp5 : num  0 1 2 3 4 5 6 7 8 9 ...
  ..$ timestamp6 : num  0 1 2 3 4 5 6 7 8 9 ...
  ..$ timestamp7 : num  0 1 2 3 4 5 6 7 8 9 ...
  ..$ timestamp8 : num  0 1 2 3 4 5 6 7 8 9 ...
  ..$ timestamp9 : num  0 1 2 3 4 5 6 7 8 9 ...
  ..$ timestamp10: num  0 1 2 3 4 5 6 7 8 9 ...
  ..$ timestamp11: num  0 1 2 3 4 5 6 7 8 9 ...
  ..$ timestamp12: num  0 1 2 4 5 6 7 8 9 10 ...
  ..$ timestamp13: num  0 1 2 3 4 5 6 7 8 9 ...
  ..$ timestamp14: num  0 1 2 3 4 5 6 7 8 9 ...
  ..$ timestamp15: num  1 2 3 4 5 6 7 8 9 10 ...
  ..$ timestamp16: num  0 1 2 3 4 5 6 7 8 9 ...
 $ min  :List of 14
  ..$ timestamp3 : int  42 42 42 42 42 42 42 42 42 42 ...
  ..$ timestamp4 : int  0 0 0 0 0 0 0 0 0 0 ...

> summary(Time)
 sec.Length   sec.Class   sec.Mode   min.Length   min.Class   min.Mode
 36568        -none-      numeric    36568        -none-      numeric
 85206        -none-      numeric    85206        -none-      numeric
 85207        -none-      numeric    85207        -none-      numeric
 85206        -none-      numeric    85206        -none-      numeric
...
 hour.Length  hour.Class  hour.Mode  mday.Length  mday.Class  mday.Mode
 36568        -none-      numeric    36568        -none-      numeric
 85206        -none-      numeric    85206        -none-      numeric
 85207        -none-      numeric    85207        -none-      numeric
...
 mon.Length   mon.Class   mon.Mode   year.Length  year.Class  year.Mode
 36568        -none-      numeric    36568        -none-      numeric
 85206        -none-      numeric    85206        -none-      numeric
...
 wday.Length  wday.Class  wday.Mode  yday.Length  yday.Class  yday.Mode
 36568        -none-      numeric    36568        -none-      numeric
 85206        -none-      numeric    85206        -none-      numeric
...
 isdst.Length isdst.Class isdst.Mode
 36568        -none-      numeric
 85206        -none-      numeric


I want a result like this:
[1] "2013-10-04 04:42:00" "2013-10-04 04:42:01" "2013-10-04 04:42:02"
[4] "2013-10-04 04:42:03" "2013-10-04 04:42:04" "2013-10-04 04:42:05"
...

I tried many ways:
mid <- rbind(timestamp3=timestamp3, timestamp4=timestamp4)
mid <- merge(timestamp3, timestamp4)
mid <- rbind(as.POSIXct(timestamp3,timestamp4))
...
etc.

All fail... T^T

What should I do?

Best Regards,
Marine

2014/1/22 PIKAL Petr petr.pi...@precheza.cz
Hi

From: Lee Marine [mailto:marine1...@gmail.com]
Sent: Wednesday, January 22, 2014 2:01 AM
To: PIKAL Petr
Subject: Re: [R] My problem with R

ps.


Re: [R] Density or Boxplot with median and mean

2014-01-22 Thread Alaios
Thanks Jim... once again you rock.





On Wednesday, January 22, 2014 9:51 AM, Jim Lemon j...@bitwrit.com.au wrote:
 


[R] complicated IF

2014-01-22 Thread Bill
Hello. I am trying  to work out some complicated if() logic.
I thought of using which() and if() but cannot get it.

I have a dataframe that looks like this:

> head(deleteFridayTest)
        Date nrec
1 2011-07-17  667
2 2011-07-18  266
3 2009-10-29   29
4 2009-10-30  211
5 2009-10-31  237
6 2009-11-01  898

I want to take the values in nrec for consecutive Fridays, Saturdays and
Sundays, average them, and replace Sunday's value with that average.

I came up with this:

deleteFridayTest[dayOfWeek(deleteFridayTest$Date)=="Sun",]$nrec <-
(deleteFridayTest[dayOfWeek(deleteFridayTest$Date)=="Sun",]$nrec +
deleteFridayTest[dayOfWeek(deleteFridayTest$Date)=="Sat",]$nrec +
deleteFridayTest[dayOfWeek(deleteFridayTest$Date)=="Fri",]$nrec)/3

but this won't work for my data because sometimes one or more of the days
of data may be missing. For example Friday's data could be missing, or
Friday and Saturday, or Sunday may be missing, or they all may be missing,
etc.

The rule I want to implement is that
if any of Friday, Saturday, or Sunday is available then I want to have an
entry for Sunday (call it 'X'). If all 3 days are missing then nothing
should be done and there will be no entry for X. If any of the days Fri,
Sat, Sun are available then X should be the average of those values (e.g.
if two days are available then sum and divide by 2, if just one day is
available then just use that value for X).

Can anyone suggest how to go about this?
 Thank you.



Re: [R] Setting up an R server.

2014-01-22 Thread aldi

Hi John,

A server is simply a computer with an operating system on which you can 
run R. For example, a machine running Linux can be connected to your 
Local Area Network at home. You can install R there and work with it in 
batch mode or interactively. The simplest approach is to become root on 
the system and use the package manager yum to install R (yum install R). 
I use Fedora (free software) as my Linux OS, but a number of other Linux 
flavors will do. yum places the R executable in one of the bin 
directories, so it is reachable from any directory on the system. After 
that you need an ssh (secure shell) client: an app on the iPad (Apple), 
or, on Windows, X-Win software. At home, as long as the server is 
connected to your router, you can reach it without problems, either by 
its local address or by a name you define for the system. To reach it 
remotely (away from home) you need a fixed address, which you will 
probably have to buy from your internet company. If you have more than 
one server you can do parallel R computing at home by using software 
such as gridware to distribute jobs.


Hope this gives an idea about what you are planning to do :-) .

Best,

Aldi



On 1/20/2014 10:53 AM, R. Michael Weylandt wrote:

Perhaps http://www.rstudio.com/ide/docs/server/getting_started

Michael

On Mon, Jan 20, 2014 at 9:12 AM, John Sorkin
jsor...@grecc.umaryland.edu wrote:

Can someone provide suggestions about how to best set up an R server? I would 
like to be able to run R on my IPad. It sounds like the only way to do this is 
to have the IPad access an R server. The server will be at my home, connected 
to the internet via my cable company (comcast). I don't yet know if the server 
will be a linux box or a windows box. I would appreciate advice about setting 
up both kinds of servers.
Thank you,
John

John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric 
Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)



[R] R in remote mode

2014-01-22 Thread Michael Haenlein
Dear all,

I have written a simulation in R that has a significant running time
(probably 60-80 hours). While I can run the code on my laptop, it tends to
slow things down to a significant extent and it leads to a very high CPU
temperature overall.

Is there an easy and convenient way to run R remotely on some outside
server or PC? Are there any services that you are aware of? I know that
there is a way to run R on Amazon EC2, but I'm wondering whether there is
something even simpler. Ideally I am looking for remote access to a PC where
R is already installed and where I can simply copy-paste my code and run it.

Please let me know in case you have any ideas,

Thanks in advance,

Michael


Michael Haenlein
Professor of Marketing
ESCP Europe
Paris, France



Re: [R] complicated IF

2014-01-22 Thread jim holtman
Here's one way of doing it.  Does not use complicated IFs; just
splits the data and works on it.

> x <- read.table(text = "Date nrec
+ 1 2011-07-17  667
+ 2 2011-07-18  266
+ 3 2009-10-29   29
+ 4 2009-10-30  211
+ 5 2009-10-31  237
+ 6 2009-11-01  898", header = TRUE, as.is = TRUE)
> # convert to Date
> x$Date <- as.Date(x$Date)
> # add week of year
> x$week <- format(x$Date, "%Y%W")
> # add the day of week
> x$day <- format(x$Date, "%w")
> # process each week, substituting the mean if Sunday exists
> result <- do.call(rbind
+ , lapply(split(x, x$week), function(.week){
+ means <- mean(.week$nrec[.week$day %in% c('0', '5', '6')])
+ .week$nrec[.week$day == '0'] <- means
+ .week
+ })
+ )
>
> result
               Date     nrec   week day
200943.3 2009-10-29  29.0000 200943   4
200943.4 2009-10-30 211.0000 200943   5
200943.5 2009-10-31 237.0000 200943   6
200943.6 2009-11-01 448.6667 200943   0
201128   2011-07-17 667.0000 201128   0
201129   2011-07-18 266.0000 201129   1
> x
        Date nrec   week day
1 2011-07-17  667 201128   0
2 2011-07-18  266 201129   1
3 2009-10-29   29 200943   4
4 2009-10-30  211 200943   5
5 2009-10-31  237 200943   6
6 2009-11-01  898 200943   0
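
The grouping above relies on two format() codes: "%W" numbers weeks starting on Monday, so a Sunday falls in the same week as the Friday and Saturday before it, and "%w" gives the weekday as a digit with 0 for Sunday. A small illustration on four dates chosen for this sketch:

```r
# Friday, Saturday, Sunday, and the following Monday
d <- as.Date(c("2009-10-30", "2009-10-31", "2009-11-01", "2009-11-02"))

keys <- data.frame(
  date = d,
  week = format(d, "%Y%W"),  # year plus Monday-based week number
  day  = format(d, "%w")     # 0 = Sunday, 5 = Friday, 6 = Saturday
)
keys
```

One caveat worth keeping in mind: because the key includes the year, a Friday on 31 December and the Sunday after it would land in different groups.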


Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


On Wed, Jan 22, 2014 at 6:33 AM, Bill william...@gmail.com wrote:


Re: [R] complicated IF

2014-01-22 Thread Bill
Hello Jim,

Thanks for this. I will study it. One thing, you wrote # process each
week, substituting the mean if Sunday exists. Even if Sunday's data does
not exist, I need an entry for Sunday if Friday or Saturday (or both)
exist. I don't yet understand what you wrote so I am not sure if that is
the case.
Bill


On Wed, Jan 22, 2014 at 10:04 PM, jim holtman jholt...@gmail.com wrote:



Re: [R] complicated IF

2014-01-22 Thread jim holtman
Here is the change to create a Sunday in a week if it does not exist.
I took out the Sunday (2009-11-01) for testing and you will notice
that week 201129 did not have a Sunday, so it has NaN as the result.

> x <- read.table(text = "Date nrec
+ 1 2011-07-17  667
+ 2 2011-07-18  266
+ 3 2009-10-29   29
+ 4 2009-10-30  211
+ 5 2009-10-31  237", header = TRUE, as.is = TRUE)
> # convert to Date
> x$Date <- as.Date(x$Date)
> # add week of year
> x$week <- format(x$Date, "%Y%W")
> # add the day of week
> x$day <- format(x$Date, "%w")
> # process each week, substituting the mean if Sunday exists
> result <- do.call(rbind
+ , lapply(split(x, x$week), function(.week){
+ means <- mean(.week$nrec[.week$day %in% c('0', '5', '6')])
+ # check if Sunday exists; if not, create it
+ if (!any(.week$day == '0')){
+ # create a new entry for Sunday
+ .week <- rbind(.week[1, ], .week)  # new entry in row 1
+ # convert date to Sunday by backing off the days of the week
+ .week$Date[1L] <- .week$Date[1L] - as.numeric(.week$day[1L]) + 7
+ .week$day[1L] <- '0'  # make it a Sunday
+ }
+ .week$nrec[.week$day == '0'] <- means
+ .week
+ })
+ )
>
> result
                Date nrec   week day
200943.3  2009-11-01  224 200943   0  # added
200943.31 2009-10-29   29 200943   4
200943.4  2009-10-30  211 200943   5
200943.5  2009-10-31  237 200943   6
201128    2011-07-17  667 201128   0
201129.2  2011-07-24  NaN 201129   0  # no other days to average
201129.21 2011-07-18  266 201129   1
> x
        Date nrec   week day
1 2011-07-17  667 201128   0
2 2011-07-18  266 201129   1
3 2009-10-29   29 200943   4
4 2009-10-30  211 200943   5
5 2009-10-31  237 200943   6
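
The result above still gains a Sunday row holding NaN for week 201129, which has no Friday, Saturday or Sunday data, whereas the rule in the original question asks for no entry in that case. A sketch of one way to handle this, assuming a data frame with just Date and nrec columns; the function name weekend_mean is hypothetical, and the grouping key is the date of the closing Sunday rather than the "%Y%W" key used above.

```r
weekend_mean <- function(x) {
  wday <- as.integer(format(x$Date, "%w"))   # 0 = Sunday, 5 = Friday, 6 = Saturday
  keep <- wday %in% c(0, 5, 6)
  if (!any(keep)) return(x)                  # nothing to average anywhere

  # Date of the Sunday that closes each Friday-Saturday-Sunday block
  sunday <- x$Date[keep] + (7 - wday[keep]) %% 7

  # Mean over whichever of the three days are present
  means <- tapply(x$nrec[keep], format(sunday), mean)
  new_sun <- data.frame(Date = as.Date(names(means)), nrec = as.vector(means))

  # Replace existing Sunday rows with the averaged ones
  out <- rbind(x[wday != 0, c("Date", "nrec")], new_sun)
  out[order(out$Date), ]
}

x <- data.frame(
  Date = as.Date(c("2011-07-17", "2011-07-18", "2009-10-29",
                   "2009-10-30", "2009-10-31")),
  nrec = c(667, 266, 29, 211, 237)
)
res <- weekend_mean(x)
res
```

Because only rows for Friday, Saturday and Sunday take part in the grouping, a week that has none of them never produces a Sunday row, so no NaN appears.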


Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


On Wed, Jan 22, 2014 at 8:25 AM, Bill william...@gmail.com wrote:

[R] ETAS-Help

2014-01-22 Thread katerina stavrianaki
Hello,
My name is Katerina. I am new to R and I am working with the ETAS package.
My goal is to fit the spatiotemporal ETAS model to an aftershock sequence 
(attached file example.csv). I have installed the packages spatstat, SAPP and 
ETAS. Reading the ETAS package manual, I saw that the data must be of class ppx.
Could you please help me convert my data (example.csv) into class ppx?
Thank you in advance,
Katerina


Re: [R] how to get the numbers of factors in a matrix

2014-01-22 Thread William Dunlap
> sapply(X, function(m){nlevels(factor(m$latitudes))})

I think that length(unique(x)) is a more direct and easier to remember
way of determining the number of unique values in the vector x,
rather than nlevels(factor(x)).
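
A short illustration of that suggestion, using a made-up data frame N with the species and latitudes columns described in the thread:

```r
# Made-up data: species "a" occurs at two latitudes, species "b" at one
N <- data.frame(
  species   = c("a", "a", "a", "b", "b"),
  latitudes = factor(c(10, 10, 20, 30, 30))
)

# Number of distinct latitudes per species
n_lat <- tapply(N$latitudes, N$species, function(x) length(unique(x)))
n_lat
```

This counts the values actually present in each group, so unused factor levels do not inflate the result.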

Bill Dunlap
TIBCO Software
wdunlap tibco.com


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of PIKAL Petr
 Sent: Tuesday, January 21, 2014 11:29 PM
 To: 张以春
 Cc: r-help@r-project.org
 Subject: Re: [R] how to get the numbers of factors in a matrix
 
 Hi
 
 Elaborating on the answers you already got,
 
 sapply(X, function(m){nlevels(factor(m$latitudes))})
 tapply(N$latitudes, N$species, function(x) nlevels(factor(x)))
 
 should do the trick.
 
 Petr
 
 From: 张以春 [mailto:yczh...@nigpas.ac.cn]
 Sent: Tuesday, January 21, 2014 2:58 PM
 To: PIKAL Petr
 Cc: r-help@r-project.org
 Subject: Re: RE: [R] how to get the numbers of factors in a matrix
 
 Dear Pikal,
 
 Thank you very much for your answer.
 
 I think your example is just the problem I have.
 
 In the following example you gave to me,
 
   > ff <- factor(letters[1:5])
   > levels(ff[1:2])
  [1] "a" "b" "c" "d" "e"
   > fff <- ff[1:2]
   > nlevels(fff)
  [1] 5
 
   > fff
  [1] a b
  Levels: a b c d e
 
 In my understanding, fff is a subset of ff. Why are fff's levels a, b, c, d, e 
 rather than just a, b?
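
A brief sketch of what is going on, using the same ff and fff as in the example: subsetting a factor keeps the full set of levels, and the unused ones go away only when they are dropped explicitly, either with droplevels() or by calling factor() again.

```r
ff  <- factor(letters[1:5])
fff <- ff[1:2]

nlevels(fff)               # 5: unused levels c, d, e are still attached
nlevels(droplevels(fff))   # 2: unused levels removed
nlevels(factor(fff))       # 2: re-creating the factor has the same effect
```

This is why the earlier replies wrap the column in factor() before calling nlevels() on each subset.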
 
 My problem is quite similar to the example. I just want to split the matrix 
 into many subsets and count the levels of every subset. Can you tell me how 
 to do that? Thank you very much!
 
 Best regards,
 Yichun
 
  -Original Message-
  From: PIKAL Petr petr.pi...@precheza.cz
  Sent: Tuesday, January 21, 2014
  To: 张以春 yczh...@nigpas.ac.cn, r-help@r-project.org
  Cc:
  Subject: RE: [R] how to get the numbers of factors in a matrix
 
  Hi
 
  It is rather difficult to understand what problem you have.
 
  post some data e.g. by
 
  dput(head(bigmatrix))
 
  Maybe your problem is that a factor also preserves empty levels until you
  specifically drop them.
 
   > ff <- factor(letters[1:5])
   > levels(ff[1:2])
   [1] "a" "b" "c" "d" "e"
   > fff <- ff[1:2]
   > nlevels(fff)
   [1] 5
 
   > fff
   [1] a b
   Levels: a b c d e
 
  Regards
  Petr
 
   -Original Message-
   From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
   On Behalf Of ???
   Sent: Tuesday, January 21, 2014 7:36 AM
   To: r-help@r-project.org
   Subject: [R] how to get the numbers of factors in a matrix
  
   Dear friends,
  
   I have a question that I do not know how to resolve.
  
   I have a big matrix composed of different columns (I use N here). One
   column is species and another one is latitudes. Now, I want to know
   how I can get the number of different latitudes for every species.
   I have tried to split the matrix according to species (X <- split(N,
   N$species)) and then use sapply(X, function(m){nlevels(m$latitudes)}) to
   get that. But the result shows the total number of latitude levels,
   not the number of levels for each species I split out. Also, I have
   tried to use tapply(N$latitudes, N$species, nlevels) to do this. The
   result is the same. I am confused about this. Can someone help me with
   that? Thank you very much!
  
  
   Best regards,
   Yichun
  
  
  
  
  
  

[R] a problem with table() and duplicates

2014-01-22 Thread Simone Gabbriellini
Dear List,

I have a data.frame like this:

name     religion neighbor religion.neighbor
pippo    a        minnie   a
pluto    a        mickey   a
paperino b        donald   a
paperino b        minnie   b

When I run table(dataframe$religion) on my data.frame, I get
a b
2 2

Of course, paperino is listed twice but should be counted once. Is
there anything I can do to keep the data.frame the way it is
but tell R to count values once if they are repeated?

The point is that each row represents a relation, thus I cannot simply
remove duplicates...

any help more than welcome!

Best regards,
Simone



Re: [R] a problem with table() and duplicates

2014-01-22 Thread jim holtman
try:

table(dataframe$religion[!duplicated(dataframe$name)])
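
A minimal sketch of this one-liner on the four rows from the question (the data frame is rebuilt by hand here, so the construction code is an assumption; the values are those posted):

```r
# The example data from the question, typed in by hand.
dataframe <- data.frame(
  name              = c("pippo", "pluto", "paperino", "paperino"),
  religion          = c("a", "a", "b", "b"),
  neighbor          = c("minnie", "mickey", "donald", "minnie"),
  religion.neighbor = c("a", "a", "a", "b"),
  stringsAsFactors  = FALSE
)

# duplicated() flags the second and later occurrences of each name;
# negating it keeps the first row per person without altering the data.frame.
counts <- table(dataframe$religion[!duplicated(dataframe$name)])
print(counts)

# The original data.frame still has all four relation rows.
print(nrow(dataframe))
```

This relies on each name always carrying the same religion; if that did not hold, only the first religion listed for a name would be counted.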


Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.




Re: [R] a problem with table() and duplicates

2014-01-22 Thread Simone Gabbriellini
that is awesome, thank you Jim!




-- 
-

Simone Gabbriellini, PhD

Post-doctoral Researcher
ANR founded research project DIFFCERAM
GEMASS, CNRS  Paris-Sorbonne.

mobile: +39 340 39 75 626
email: simone.gabbriell...@cnrs.fr



[R] New version of document on R programming - with videos [French]

2014-01-22 Thread Vincent Goulet
I hereby announce the availability of the fourth edition of my document 
«Introduction à la programmation en R» (in French) in the contributed 
documentation section of CRAN. The document is now accompanied by a set of 
short videos on more challenging topics like creation and indexing of arrays, 
the order() function, etc. In my highly biased opinion, the illustrations are 
pretty good. I might consider making English versions if there is sufficient 
interest. The videos are in the YouTube channel

http://www.youtube.com/user/VincentGouletIntroR

Please note that I don't currently monitor r-help regularly, so do not hesitate 
to write to me directly.

***

[The rest of this message was originally written in French for the target audience]

The fourth edition of my document «Introduction à la programmation en R» is 
now available in the Contributed Documentation section of CRAN:

http://cran.r-project.org/other-docs.html

The book is based on course notes and exercises used at the École d’actuariat 
of Université Laval. The teaching of the R language focuses on exposure to as 
much code as possible (code that we presume to believe is well written) and on 
programming practice. This is why the chapters are written concisely and 
contain few examples within the text. Instead, the reader is asked to read and 
run the code found in the example sections at the end of each chapter. This 
code and its accompanying comments revisit the essential concepts of the 
chapter and often complement them. We consider the "active study" exercise of 
running code and observing its effects to be essential to learning the R 
language.

The code of the example sections is available in electronic format at the same 
location as the document.

This fourth edition differs from the previous one mainly by the addition of 
links to videos made by the author that revisit some of the more delicate 
topics. These videos are available on the YouTube channel

http://www.youtube.com/user/VincentGouletIntroR

The document is published under a Creative Commons license.

Vincent Goulet, Ph.D.
Professor of Actuarial Science
Directeur général de la formation continue
Université Laval



Re: [R] how to get the numbers of factors in a matrix

2014-01-22 Thread peter dalgaard

On 22 Jan 2014, at 15:51 , William Dunlap wdun...@tibco.com wrote:

 sapply(X, function(m){nlevels(factor(m$latitudes))})
 
 I think that length(unique(x)) is a more direct and easier to remember
 way of determining the number of unique values in the vector x,
 rather than nlevels(factor(x)).

However, it may make you forget the possibility of NA:

> length(unique(factor(c(1,2,NA))))
[1] 3
> nlevels(factor(c(1,2,NA)))
[1] 2
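
The difference shown above can be sketched on a plain vector as follows; the variable names are only illustrative, and removing the missing values before counting is one possible way to make the two approaches agree:

```r
x <- c(1, 2, NA)

# unique() treats NA as a value of its own, so it is counted.
n_unique <- length(unique(x))

# factor() excludes NA from the levels by default.
n_levels <- nlevels(factor(x))

# Dropping missing values first makes length(unique()) match nlevels().
n_no_na <- length(unique(x[!is.na(x)]))

print(c(n_unique = n_unique, n_levels = n_levels, n_no_na = n_no_na))
```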

-pd

 

[R] Variance analysis

2014-01-22 Thread Grenier, Cecile (CIAT)
Dear R-helpers...

I've been trying to run a variance analysis to compare means between various 
lines in various treatments.
I have 10 genotypes (GEN), tested in 2 environments (ENV), and in each 
environment there are 3 repetitions (REP). Several traits were recorded (yield, 
flowering, plant height...)

First I checked whether the residuals were normally distributed and then the 
homogeneity of variances.
For those which satisfied the assumptions for ANOVA, I performed aov. I tested 
two models, one simple (GEN and ENV being fixed effects) and the other with 
mixed effects (REP).

aov1 <- aov(Y ~ GEN*ENV, data=mydata)
aov2 <- aov(Y ~ GEN*ENV + Error(REP/ENV), data=mydata)
When I wanted to compare the likelihood of these models, I failed to run 
extractAIC on the mixed model (aov2).
Is there any reason why extractAIC doesn't work for models including a random 
effect?


As the assumption of homoscedasticity was violated for other traits, I ran lmer.

When I ran the following model
lmer1 <- lmer(Y ~ GEN*ENV + (1|REP), data=mydata)

the following error message appeared:
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
  contrasts can be applied only to factors with 2 or more levels
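
This message generally means that some factor in the model has fewer than two levels in the rows that actually reach the fit. A minimal sketch for checking this, assuming a hypothetical data frame built to match the design described above (the helper count_levels and all values are made up for illustration):

```r
# Hypothetical data with the described structure:
# 10 genotypes x 2 environments x 3 repetitions.
mydata <- expand.grid(GEN = paste0("G", 1:10),
                      ENV = c("E1", "E2"),
                      REP = 1:3)
mydata$Y <- seq_len(nrow(mydata))  # placeholder response

# Illustrative helper: number of distinct non-missing values per model term.
count_levels <- function(d) {
  sapply(d[c("GEN", "ENV", "REP")],
         function(v) length(unique(v[!is.na(v)])))
}

full_counts <- count_levels(mydata)
print(full_counts)

# If the data were restricted to one environment, ENV would be left with a
# single level, which is the situation the error message complains about.
one_env <- droplevels(mydata[mydata$ENV == "E1", ])
one_env_counts <- count_levels(one_env)
print(names(one_env_counts)[one_env_counts < 2])
```

Running the same count on the real data, after removing rows with a missing response, would show whether GEN, ENV or REP is the term with a single level.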

Could you please help me with this?
Thanks
Cecile



[R] Problems with a R-packages

2014-01-22 Thread Andreas Rybicki
Hello,

I trust that my request is one that can be posted to this mailing list.

I would like to install

lossDev, version in the repository 3.0.0-4


and I did it from CRAN (sources).

I get as response

versuche URL 'http://cran.at.r-project.org/src/contrib/lossDev_3.0.0-4.tar.gz'
Content type 'application/x-gzip' length 1539914 bytes (1.5 Mb)
URL geöffnet
==
downloaded 1.5 Mb

ERROR: dependencies ‘rjags’, ‘logspline’ are not available for package 
‘lossDev’
* removing 
‘/Library/Frameworks/R.framework/Versions/3.0/Resources/library/lossDev’

Die heruntergeladenen Quellpakete sind in 
   
‘/private/var/folders/1d/7y5t0hy57q5bphrlsrgkq43hgp/T/RtmpipUlzT/downloaded_packages’
 

Question: Was this download successful? In the R package installation 
window, nothing appears in the "installed version" column; it is still 
blank.

Next I tried to install rjags and logspline. Same result regarding download.
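
The log above shows that the download itself completed, but the line starting with ERROR indicates that lossDev was not installed because its dependencies rjags and logspline were not available. A minimal sketch for checking which of these packages are actually installed (the names wanted, installed and missing_pkgs are only illustrative):

```r
# Packages named in the log above.
wanted <- c("lossDev", "rjags", "logspline")

# installed.packages() lists what the current R library actually contains.
installed <- wanted %in% rownames(installed.packages())
names(installed) <- wanted
print(installed)

# Anything still missing would need to be installed before lossDev can load.
missing_pkgs <- wanted[!installed]
print(missing_pkgs)

# Not run here because it needs network access:
# install.packages(missing_pkgs, dependencies = TRUE)
```

Note that rjags is an interface to the separate JAGS program, so JAGS itself normally has to be installed on the system before rjags can be installed.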

Even if this question is simple, any help is highly appreciated.

I am available via Skype too, to share my screen.

Using MacBook Pro.

Regards,

Andreas





[R] reduce space between factors groups in a graph

2014-01-22 Thread Luigi Marongiu
Dear all,
I am preparing a graph in which values derived from two variables are
displayed using the stripchart function.
I have applied the factor function to separate the two variables into two
groups, although I noticed that the graph works even without
factoring the variables.
However, there is a lot of space between the two factors, so most of the
graph is empty.

Is it possible to reduce the space between the factors?

The function I have implemented is more or less like this:

stripchart(
  Y ~ factor(X),
  method = "stack", offset = 1/3, vertical = TRUE,
  las = 1, pch = 19,
  ylim = Y, xlim = c(1, 2),
  par(mai = c(1, 1, 0.5, 0.1))
)
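
One possible way to bring the two groups closer is the at argument of stripchart, which sets where each group is drawn, combined with a matching xlim. The sketch below uses made-up data and positions (X, Y, and the values 1, 1.5, 0.75 and 1.75 are assumptions chosen for illustration), and sets par() separately rather than inside the call:

```r
set.seed(1)
X <- rep(c("A", "B"), each = 20)
Y <- round(c(rnorm(20, mean = 10), rnorm(20, mean = 12)))

pdf(NULL)  # null device, so the sketch runs without opening a window
op <- par(mai = c(1, 1, 0.5, 0.1))

stripchart(Y ~ factor(X),
           method = "stack", offset = 1/3, vertical = TRUE,
           las = 1, pch = 19,
           at = c(1, 1.5),         # draw the two groups closer together
           xlim = c(0.75, 1.75),   # trim the empty space at the sides
           ylim = range(Y))

par(op)
dev.off()
```

Shrinking the distance between the at positions reduces the gap, while xlim controls how much empty space remains around the groups.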

Best regards
Luigi



[R] install ggplot2

2014-01-22 Thread Dai, Jie
Hi Dear helper,


I installed R x64 3.0.2 on my Windows 7 Enterprise computer, and I installed all 
the packages as well. However, when I tried to use the ggplot2 package with the 
command library("ggplot2", 
lib.loc="C:/Users/JXD043/Documents/R/win-library/3.0"), I got the following 
error message: Error in library("ggplot2", lib.loc = 
"C:/Users/JXD043/Documents/R/win-library/3.0") :
  there is no package called 'ggplot2'

Then I tried to reinstall the package with install.packages("ggplot2")

The message I got is : Installing package into 
'C:/Users/JXD043/Documents/R/win-library/3.0'
(as 'lib' is unspecified)
trying URL 'http://cran.rstudio.com/bin/windows/contrib/3.0/ggplot2_0.9.3.1.zip'
Content type 'application/zip' length 2657708 bytes (2.5 Mb)
opened URL
downloaded 2.5 Mb

package 'ggplot2' successfully unpacked and MD5 sums checked
Warning in install.packages :
  cannot remove prior installation of package 'ggplot2'

The downloaded binary packages are in
C:\Users\JXD043\AppData\Local\Temp\RtmpuIVA2s\downloaded_packages
So I extracted the files into the folder; however, I still can't get the 
library to work.

Please help me and let me know what I can do to fix it.

Thanks!

Jie DAI






[R] Multiple corrgrams or joining jpg/png

2014-01-22 Thread Georg Hörmann

Hello world,

I have a database with time series of concentration of nutrients for 
several lakes. I wanted *one* corrgram for each
nutrient in all lakes (correlation of a single nutrient content of all 
lakes in different years). The single corrgram works pretty well,

but I cannot create a page with all nutrients on one page, i.e.
several corrgrams on one page. The usual mfrow and layout
commands do not work (splom has the same problem).
I wonder if anyone already has a solution.

A workaround would be to write a single jpg/png for each corrgram
and join them. Is there a possibility to do this *automatically* or even 
*within* R? (I do not want to align 300 figures manually :-)
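
The first half of this workaround, writing one PNG per nutrient automatically, can be sketched in base R as below. The nutrient names, the random data and the file names are made up, and pairs() is used only as a stand-in for corrgram() so that the sketch does not depend on an extra package:

```r
nutrients <- c("N", "P", "K")            # hypothetical nutrient names
out_dir   <- tempdir()
files     <- file.path(out_dir, paste0("corrgram_", nutrients, ".png"))

if (capabilities("png")) {
  for (i in seq_along(nutrients)) {
    # Made-up data: 10 years (rows) for 4 lakes (columns).
    lakes <- matrix(rnorm(40), ncol = 4,
                    dimnames = list(NULL, paste0("lake", 1:4)))
    png(files[i], width = 600, height = 600)
    pairs(lakes, main = nutrients[i])    # stand-in for corrgram(lakes)
    dev.off()
  }
}

print(basename(files))
```

Joining the resulting files inside R would need an image-reading package; for instance, if the png package is available, its readPNG() function together with rasterImage() from base graphics could place the images on one page. That part is left out here because it is not base R.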


Merci and greetings,
GEorg


--
Georg Hoermann,
Department of Hydrology and Water Resources Management
Kiel University, Germany
+49/431/2190916, mo: +49/176/64335754, icq:348340729, skype: ghoermann



[R] geo_bar x= and y= warnings and error help

2014-01-22 Thread Matthew Henn
Any insight on issues leading to the following error modes would be 
appreciated.

#Version_1 CALL
alphaDivOTU <- ggplot(data=alphaDivOTU_pt1to5,
 aes(y = Num.OTUs, x = Patient, fill = Timepoint)) +
 geom_bar(position = position_dodge) +
 theme(text = element_text(family = 'Helvetica-Narrow',size = 18.0)) +
 scale_fill_manual(guide = guide_legend(),values = 
c(forestgreen,gray44,dodgerblue2,royalblue2,royalblue4,blue3)) +
 scale_y_continuous(breaks = pretty_breaks(n = 10.0,min.n = 5.0))

ggsave(plot=alphaDivOTU, filename='alphaDivOTU.png', scale=1, dpi=300, 
width=10, height=10, units=c(cm))

#Version_1 Error modes
Mapping a variable to y and also using stat="bin".
   With stat="bin", it will attempt to set the y value to the count of
   cases in each group.
   This can result in unexpected behavior and will not be allowed in a
   future version of ggplot2.
   If you want y to represent counts of cases, use stat="bin" and don't
   map a variable to y.
   If you want y to represent values in the data, use stat="identity".
   See ?geom_bar for examples. (Deprecated; last used in version 0.9.2)
Error in .$position$adjust : object of type 'closure' is not subsettable

#Version_2 CALL
alphaDivOTU <- ggplot(data=alphaDivOTU_pt1to5,
 aes(y = Num.OTUs, x = Patient, fill = Timepoint)) +
 geom_bar(position = position_dodge, stat = identity) +
 theme(text = element_text(family = 'Helvetica-Narrow',size = 18.0)) +
 scale_fill_manual(guide = guide_legend(),values = 
c(forestgreen,gray44,dodgerblue2,royalblue2,royalblue4,blue3)) +
 scale_y_continuous(breaks = pretty_breaks(n = 10.0,min.n = 5.0))

ggsave(plot=alphaDivOTU, filename='alphaDivOTU.png', scale=1, dpi=300, 
width=10, height=10, units=c(cm))

#For Version_2 I get the error:
Error in stat$parameters : object of type 'closure' is not subsettable
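
Both errors end in "object of type 'closure' is not subsettable", which is what R reports when `$` is applied to a function. A base R sketch reproducing that message, on the assumption that identity and position_dodge were passed unquoted as they appear in the calls above:

```r
# `identity` is a base R function (a closure); using `$` on it fails with
# the same message as in the ggplot2 errors above.
msg <- tryCatch(identity$parameters,
                error = function(e) conditionMessage(e))
print(msg)

# A character string, by contrast, is an ordinary value that a plotting
# function can look up by name.
stat_name <- "identity"
print(is.function(identity))
print(is.character(stat_name))
```

In ggplot2 these arguments are usually given as strings or as function calls, for example stat = "identity" and position = "dodge" or position = position_dodge(); whether that resolves the calls above has not been verified against the poster's data.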



Re: [R] Multiple corrgrams or joining jpg/png

2014-01-22 Thread Kevin Wright
1. When using a package from CRAN, you usually want to copy the package
author on the question.  (In this case, me.)

2. The corrgram function is basically a wrapper around the pairs()
function.   What you want to do doesn't seem to be possible based on this
discussion:
https://stat.ethz.ch/pipermail/r-help/2004-December/063112.html

Kevin





[R] xyplots in lattice - strange behaviour, possible bug?

2014-01-22 Thread Manlio Calvi
Hello everyone,

I'm very new to R. I'm following a Coursera course about it and hit a
problem when I rewrote the same code the professor uses in the
lecture.
I'm running Win 7 x64, R 3.0.2 x64 and the latest version of the RStudio IDE.

I put together this script:


library(lattice)
x <- rnorm(100)
z <- x + rnorm(100)
f <- gl(2, 50, labels = c("Groups 1", "Groups 2"))
xyplot(z ~ x | f,
       panel = function(x, z, ...) {
         panel.xyplot(x, z, ...)
         panel.abline(h = median(z), lty = 2)
       })


On my machine it doesn't work. It gives no error in the terminal; the plotting
window opens and the graph box is drawn with all the ticks and
titles as intended, but instead of the actual data inside the
graph I get the error 'Error using packet x argument "z" is
missing, with no default', where x is 1 or 2 as the script draws two
graphs.

I reported this behaviour in the lecture forum and someone replicated it.

I replicated this behaviour even with R alone, running the above script
with the same results.

If I call traceback() no value is given; there is no traceback.

Apparently not everyone could replicate this behaviour for some reason.

As you can see, the code should work but doesn't.

A similar thing happens if I change the part after the function to
something like:

... same code as above...
 panel= function(x,y, ...) {
  panel.xyplot(x,z, ...)
  fit <- lm(y~x)
  panel.abline(fit)
})

but it doesn't happen if I call xyplot without passing a function to it.

Any ideas?



Re: [R] xyplots in lattice - strange behaviour, possible bug?

2014-01-22 Thread Bert Gunter
Well, if the professor wrote that, it wouldn't have run for him
either! You need to take better notes.

What's going on: You need to distinguish between formal and actual arguments.
?panel.xyplot
tells you that the formal arguments for this function are x,**y** ,...
 (emphasis added) and NOT x,**z**,...

The **actual** argument for y passed to the function will be z. So
change your z to a y in your function call and it will run:

library(lattice)
x <- rnorm(100)
z <- x + rnorm(100)
f <- gl(2, 50, labels = c("Groups 1", "Groups 2"))
xyplot(z ~ x | f,
       panel = function(x, y, ...) {
         panel.xyplot(x, y, ...)
         panel.abline(h = median(y), lty = 2)
       })
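
The distinction between formal and actual arguments can also be sketched in base R without lattice. Here call_panel is a hypothetical stand-in for the way a plotting function calls a user-supplied panel function by the names x and y; it is not lattice's actual code:

```r
# Stand-in for a plotting function: it always supplies arguments named x and y.
call_panel <- function(panel, xs, ys) panel(x = xs, y = ys)

# Formal arguments x and z: the value supplied as y= is absorbed by `...`,
# so z is never matched and using it raises an error.
panel_xz <- function(x, z, ...) median(z)
res_xz <- tryCatch(call_panel(panel_xz, 1:3, c(4, 5, 9)),
                   error = function(e) conditionMessage(e))
print(res_xz)

# Formal arguments x and y: the value supplied as y= is matched as intended.
panel_xy <- function(x, y, ...) median(y)
res_xy <- call_panel(panel_xy, 1:3, c(4, 5, 9))
print(res_xy)
```

The name used for the variable in the formula (z) does not matter inside the panel function; only the formal argument names do.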


Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
H. Gilbert Welch






Re: [R] xyplots in lattice - strange behaviour, possible bug?

2014-01-22 Thread Don McKenzie

On Jan 22, 2014, at 3:02 PM, Bert Gunter gunter.ber...@gene.com wrote:

 Well, if the professor wrote that, it wouldn't have run for him
 either! 

Fortune?  Or just a great line?


Don McKenzie
Research Ecologist
Pacific Wildland Fire Science Lab
US Forest Service

Affiliate Professor
School of Environmental and Forest Sciences
University of Washington
d...@uw.edu



Re: [R] xyplots in lattice - strange behaviour, possible bug?

2014-01-22 Thread Bert Gunter
... and I should have added (more complexity!) that the formula method
of xyplot parses the formula and passes down what's on the left hand
side of ~ to the y argument of the panel function.

And if all else fails, read the docs! -- in this case for ?xyplot --
where it explicitly says:

"... A panel function appropriate for the functions described here
would usually expect arguments named x and y, which would be provided
by the conditioning process"

And please oh please do not suggest as a newbie that your confusion is
due to bugs in long used and extensively tested R code. That just
seems arrogant to me ("I didn't get it so the software must be
buggy").

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
H. Gilbert Welch




On Wed, Jan 22, 2014 at 3:02 PM, Bert Gunter bgun...@gene.com wrote:
 Well, if the professor wrote that, it wouldn't have run for him
 either! You need to take better notes.

 What's going on: You need to distinguish between formal and actual arguments.
 ?panel.xyplot
 tells you that the formal arguments for this function are x,**y** ,...
  (emphasis added) and NOT x,**z**,...

 The **actual** argument for y passed to the function will be z. So
 change your z to a y in your function call and it will run:

 library(lattice)
 x <- rnorm(100)
 z <- x + rnorm(100)
 f <- gl(2, 50, labels = c("Groups 1", "Groups 2"))
 xyplot(z ~ x | f,
        panel = function(x, y, ...) {
          panel.xyplot(x, y, ...)
          panel.abline(h = median(y), lty = 2)
        })


 Cheers,
 Bert

 Bert Gunter
 Genentech Nonclinical Biostatistics
 (650) 467-7374

 Data is not information. Information is not knowledge. And knowledge
 is certainly not wisdom.
 H. Gilbert Welch




 On Wed, Jan 22, 2014 at 2:10 PM, Manlio Calvi manlio.ca...@gmail.com wrote:
 Hello everyone,

 I'm very new to R. I'm following a Coursera course on it and hit a
 problem when I retyped the code the professor used in the
 lecture.
 I'm running Win 7 x64, R 3.0.2 x64 and the latest version of the RStudio IDE.

  I put up this script:


 library(lattice)
 x <- rnorm(100)
 z <- x + rnorm(100)
 f <- gl(2, 50, labels = c("Groups 1", "Groups 2"))
 xyplot(z ~ x | f,
        panel = function(x, z, ...) {
          panel.xyplot(x, z, ...)
          panel.abline(h = median(z), lty = 2)
        })


 On my machine it does not work. It gives no error in the terminal; the
 plotting window opens and the graph box is drawn with all the ticks and
 titles as intended, but instead of the actual data each panel shows
 the message "Error using packet x: argument "z" is
 missing, with no default", where x is 1 or 2, since the script draws two
 graphs.

 I reported this behaviour in the lecture forum and someone replicated it.

 I replicated this behaviour even with R alone running the above script
 with the same results.

 If I call traceback() no value is given, there is no traceback.

 Apparently not everyone could replicate this behaviour for some reason.

 As you can see, the code should work but doesn't.

 A similar thing happens if I replace the panel function with
 another one, like:

 ... same code as above...
  panel = function(x, y, ...) {
    panel.xyplot(x, z, ...)
    fit <- lm(y ~ x)
    panel.abline(fit)
  })

 but it doesn't happen if I call xyplot without passing a panel function.

 Any ideas?



Re: [R] ETAS-Help

2014-01-22 Thread Jim Lemon

On 01/22/2014 10:03 PM, katerina stavrianaki wrote:

Hello,
My name is Katerina, I am new to R and I am working with the ETAS package.
My goal is to fit the spatiotemporal ETAS model to an aftershock sequence 
(attached file example.csv). I have installed the packages spatstat, SAPP and ETAS. 
By reading the ETAS package manual I saw that the data must be of class ppx.
Could you please help me convert my data (example.csv) into class ppx?
Thank you in advance, Katerina


Hi Katerina,
Your example data didn't make it through to the list, but it seems that 
what you want to do is to read that file into a data frame (see 
read.csv) and then pass that data frame to the ppx function in spatstat. 
Read the help page for ppx to see how this is done. I think that all you 
need is the data frame and a vector of data types in the coord.type 
argument. It looks like you should only pass those columns of the data 
frame that are spatial, temporal or local coordinates, and marks, so 
make sure that you know how to specify a subset of columns (e.g. 
df[,c(2,4,5,6)]) if you have other information in the data frame.


Jim



[R] subset and na.rm not really suppressing NA values

2014-01-22 Thread Jeff Johnson
I have a dataset mydf with a field EMAIL_ADDRESS. When importing, I
specified:
mydf <- read.csv(file = extract, header = TRUE, stringsAsFactors = FALSE,
                 na.strings = c("NA", ""))

I've also tried setting na.strings = c("NA", "", NA) but I don't know if
it's appropriate to put NA there.

I'm running
a <- subset(mydf, VALID_EMAIL == FALSE, na.rm = TRUE, select =
EMAIL_ADDRESS)
dput(head(a, 5))

structure(list(EMAIL_ADDRESS = c(NA_character_, NA_character_,
NA_character_, NA_character_, NA_character_)), .Names = "EMAIL_ADDRESS",
row.names = c(17L, 22L, 23L, 24L, 30L), class = "data.frame")

The results show a lot of NA values on screen and in the dput statement.

I don't quite understand why it is doing that. I would have expected it to
exclude those since I had the na.rm = TRUE statement. Do you have any
suggestions?

Thanks!
-- 
Jeff



Re: [R] subset and na.rm not really suppressing NA values

2014-01-22 Thread Jeff Newmiller
I don't think na.rm is a valid parameter for the subset function. I would 
normally use the is.na function to logically test for NA values. I also don't 
know where your VALID_EMAIL variable is coming from.

a <- subset(mydf, !is.na(EMAIL_ADDRESS))

The na.strings argument to read.csv and friends is used to help recognise 
strings in the input that should be treated as NA. If you don't see "NA" in 
your input file then it will have no effect on the data import.
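
As an illustration of both points, here is a small self-contained sketch.
The inline CSV text and its values are made up to stand in for the poster's
file, and VALID_EMAIL is assumed to be a logical column. An extra na.rm
argument to subset() appears to be absorbed by the "..." of the data frame
method and has no effect, so the NA rows have to be excluded explicitly
with is.na().

```r
# Made-up stand-in for the poster's file: one blank EMAIL_ADDRESS field
# and one literal "NA" string, both of which na.strings turns into NA.
csv_text <- "EMAIL_ADDRESS,VALID_EMAIL
a@example.com,TRUE
,FALSE
NA,FALSE
not-an-address,FALSE"

mydf <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE,
                 na.strings = c("NA", ""))

# na.rm = TRUE is not an argument of subset(); the NA rows are still returned.
with_na <- subset(mydf, VALID_EMAIL == FALSE, na.rm = TRUE,
                  select = EMAIL_ADDRESS)

# Excluding the missing addresses explicitly in the row condition.
no_na <- subset(mydf, VALID_EMAIL == FALSE & !is.na(EMAIL_ADDRESS),
                select = EMAIL_ADDRESS)
```

Comparing nrow(with_na) and nrow(no_na) shows how many rows the is.na()
condition drops.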

---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.



[R] unable to carry through object in a nested function

2014-01-22 Thread Chen, George
Hi There,
I am having trouble carrying through an object listed in the outer function 
into the inner function of a nested pair.
This is my code:

FindGreaterThanProportion <- function(y, SSThreshold) {
    DenomCells <- length(y)
    NumerCells <- subset(y, y > SSThreshold)
    PropCells <- length(NumerCells) / DenomCells
    return(PropCells)
}


GetPropPlot <- function(PrePropData, direction, SSThreshold, cell, stim) {
    # Split up the dataframe with ddply and apply the function
    print(direction)
    print(SSThreshold)

    if (direction == ">") {
        PropData <- ddply(PrePropData, .(SUBJECT, STIM, CELL, SIGNAL, DAY),
                          summarise,
                          PROP = FindGreaterThanProportion(SIMSCORE, SSThreshold))
    }

    PlotSubset <- subset(PropData, PropData$STIM == stim & PropData$CELL == cell)
    PropPlot <- ggplot(data = PlotSubset, aes(x = DAY, y = PROP)) +
        geom_line() +
        geom_point() +
        facet_grid(SIGNAL ~ SUBJECT)
    return(FinalPropPlot)
}

When I run the code, I get this error:
[1] ">"
[1] 2
Error in subset.default(y, y > SSThreshold) :
  object 'SSThreshold' not found

It seems as if SSThreshold is not being passed into FindGreaterThanProportion.

Any help would be appreciated to determine what I am doing incorrectly.
Thanks in advance.

George Chen





Re: [R] unable to carry through object in a nested function

2014-01-22 Thread Chen, George
Hi,
This is a resend of a previous message reproduced below but with sample data to 
run.
Thanks.
George Chen



Hi There,
I am having trouble carrying through an object listed in the outer function 
into the inner function of a nested pair.

sample data below -
library(plyr)

SUBJECT <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
STIM <- c("No","No","No","No","No","No","No","No","No","No",
          "No","No","No","No","No","No","No","No","No","No")
CELL <- c("CD4","CD4","CD4","CD4","CD4","CD4","CD4","CD4","CD4","CD4",
          "CD4","CD4","CD4","CD4","CD4","CD4","CD4","CD4","CD4","CD4")
SIGNAL <- c("ABC","ABC","ABC","ABC","ABC","ABC","ABC","ABC","ABC","ABC",
            "ABC","ABC","ABC","ABC","ABC","ABC","ABC","ABC","ABC","ABC")
DAY <- c(7,7,7,7,7,7,7,7,7,7,1,1,1,1,1,1,1,1,1,1)
SIMSCORE <- c(2,3,4,5,2,3,4,5,2,3,4,5,2,3,4,5,2,3,4,5)
PrePropData <- data.frame(SUBJECT, STIM, CELL, SIGNAL, DAY, SIMSCORE)



This is my code:

FindGreaterThanProportion <- function(y, SSThreshold) {
    DenomCells <- length(y)
    NumerCells <- subset(y, y > SSThreshold)
    PropCells <- length(NumerCells) / DenomCells
    return(PropCells)
}


GetPropPlot <- function(PrePropData, direction, SSThreshold, cell, stim) {
    # Split up the dataframe with ddply and apply the function
    print(direction)
    print(SSThreshold)

    if (direction == ">") {
        PropData <- ddply(PrePropData, .(SUBJECT, STIM, CELL, SIGNAL, DAY),
                          summarise,
                          PROP = FindGreaterThanProportion(SIMSCORE, SSThreshold))
    }

    PlotSubset <- subset(PropData, PropData$STIM == stim & PropData$CELL == cell)
    PropPlot <- ggplot(data = PlotSubset, aes(x = DAY, y = PROP)) +
        geom_line() +
        geom_point() +
        facet_grid(SIGNAL ~ SUBJECT)
    return(FinalPropPlot)
}

GetPropPlot(PrePropData, direction = ">", SSThreshold = 3, "CD4", "No")

When I run the code, I get this error:
[1] ">"
[1] 3
Error in subset.default(y, y > SSThreshold) :
  object 'SSThreshold' not found

It seems as if SSThreshold is not being passed into FindGreaterThanProportion.

Any help would be appreciated to determine what I am doing incorrectly.
Thanks in advance.

George Chen





Re: [R] unable to carry through object in a nested function

2014-01-22 Thread arun
Hi,
Try this:
GetPropPlot <- function(PrePropData, direction, SSThreshold, cell, stim) {
    if (direction == ">") {
        PropData <- ddply(PrePropData, .(SUBJECT, STIM, CELL, SIGNAL, DAY),
                          here(summarise),
                          PROP = FindGreaterThanProportion(SIMSCORE, SSThreshold))
    }
    PlotSubset <- subset(PropData, STIM == stim & CELL == cell)
    FinalPropPlot <- ggplot(data = PlotSubset, aes(x = DAY, y = PROP)) +
        geom_line() +
        geom_point() +
        facet_grid(SIGNAL ~ SUBJECT)
    return(FinalPropPlot)
}


GetPropPlot(PrePropData, direction = ">", SSThreshold = 3, "CD4", "No")
A.K.
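
For context: plyr documents here() as capturing the current context, so that
functions doing special evaluation, such as summarise, can see variables from
the function in which ddply was called. A different way to sidestep the same
scoping problem is sketched below in base R only. This is not the method
suggested above and does not use plyr; prop_greater and prop_by_day are
hypothetical names, and only the DAY and SIMSCORE columns of the sample data
are used.

```r
# Proportion of values in y strictly above threshold (the same idea as
# FindGreaterThanProportion in this thread).
prop_greater <- function(y, threshold) {
  mean(y > threshold)
}

# The anonymous function passed as FUN is a closure, so it can see
# `threshold` from the enclosing call to prop_by_day.
prop_by_day <- function(data, threshold) {
  aggregate(SIMSCORE ~ DAY, data = data,
            FUN = function(s) prop_greater(s, threshold))
}

sample_data <- data.frame(
  DAY = rep(c(7, 1), each = 10),
  SIMSCORE = c(2, 3, 4, 5, 2, 3, 4, 5, 2, 3, 4, 5, 2, 3, 4, 5, 2, 3, 4, 5)
)

# One row per DAY; the SIMSCORE column now holds the proportion above 3.
result <- prop_by_day(sample_data, threshold = 3)
```

Because the threshold is reached through an ordinary closure rather than
through non-standard evaluation, no special environment handling is needed.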




Re: [R] install ggplot2

2014-01-22 Thread Jeff Newmiller
Hard to say. Read the Posting Guide for useful suggestions on getting help, 
including providing the output of sessionInfo() and avoiding HTML formatted 
emails because what you see is not what we see.

I suggest you delete the ggplot2 subdirectory in your win-library directory and 
try the

install.packages("ggplot2")

command again. If you have installed packages while running R "As 
Administrator" then you may have permissions problems on your win-library 
directory and need to delete it (As Administrator) and reinstall R. 
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Dai, Jie j...@amfam.com wrote:
Hi Dear helper,


I installed R x64 3.0.2 on my Windows 7 Enterprise computer, and I
installed all the packages as well. However, when I tried to load the
ggplot2 package with the command library("ggplot2",
lib.loc="C:/Users/JXD043/Documents/R/win-library/3.0"), I got the
following error message: Error in library("ggplot2", lib.loc =
"C:/Users/JXD043/Documents/R/win-library/3.0") :
  there is no package called 'ggplot2'

Then I tried to reinstall the package with install.packages("ggplot2")

The message I got is: Installing package into
'C:/Users/JXD043/Documents/R/win-library/3.0'
(as 'lib' is unspecified)
trying URL
'http://cran.rstudio.com/bin/windows/contrib/3.0/ggplot2_0.9.3.1.zip'
Content type 'application/zip' length 2657708 bytes (2.5 Mb)
opened URL
downloaded 2.5 Mb

package 'ggplot2' successfully unpacked and MD5 sums checked
Warning in install.packages :
  cannot remove prior installation of package 'ggplot2'

The downloaded binary packages are in
  C:\Users\JXD043\AppData\Local\Temp\RtmpuIVA2s\downloaded_packages
So I extracted the files into the folder; however, I still can't get
the library to work.

Please help me and let me know what I can do to fix it.

Thanks!

Jie DAI



