[R] help.search not working - returns readRDS(file) : unknown input format

2013-03-11 Thread Gustaf Rydevik
Hi all,

I'm using R 2.15.1 under RStudio on a WinXP computer.

This morning as I started up RStudio, I noticed that there was
something wrong with the help database.
I entered and received the following:

??'xls'
#Error in readRDS(file) : unknown input format


Writing help.search('xls') returned the same error. A plain help(plot)
worked fine. Some googling led me to close RStudio and remove the .RData
and .Rhistory files I was using,
but the same error remained. Further googling suggested that I use
the rebuild=T option, so I ran

lapply(rownames(installed.packages()), function(x) help.search('', package=x, rebuild=TRUE))

Now there is no error, but help.search cannot find anything - it seems
as if the database is empty.
Does anyone have experience with a similar error?
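
For reference, a minimal sketch of the rebuild step in its simplest form. rebuild = TRUE is a documented argument of help.search(), and a single call without a package argument should rebuild the database for every installed package, so the per-package loop may not be needed. Whether this clears the readRDS error depends on its cause, which this message does not establish.

```r
# Rebuild the help-search database once, then search for the term
# used in the message above. The result is stored rather than printed.
res <- help.search("xls", rebuild = TRUE)

# The matches (possibly none) are in res$matches.
class(res)
nrow(res$matches)
```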

Best regards,
Gustaf



-- 
Gustaf Rydevik, M.Sci.
tel: +44(0)74 253 760 42
address:St John's hill 18/5  EH8 9UQ Edinburgh, UK
skype:gustaf_rydevik

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Remove space from string

2012-01-13 Thread Gustaf Rydevik
gsub(" ", "", a)

/Gustaf
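
A short, self-contained version of the suggestion, using the string from the question. The second call is an assumption about what may also be wanted: it removes tabs and newlines as well as plain spaces.

```r
a <- " Remove space "

# Remove every literal space character.
no_spaces <- gsub(" ", "", a, fixed = TRUE)

# Remove any whitespace (spaces, tabs, newlines) with a regular expression.
no_whitespace <- gsub("[[:space:]]+", "", a)

no_spaces
no_whitespace
```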

On Fri, Jan 13, 2012 at 12:24 PM, Vikram Bahure
economics.vik...@gmail.com wrote:

 Dear R users,

 I have some trivial query.

 I have a string, I want to remove space from the string.

 For eg.

 Input:
 a <- " Remove space "

 Output required:
 Removespace

 I tried using str_trim but only removes end spaces. library(stringr).

 Regards
 Vikram





-- 
Gustaf Rydevik, M.Sci.
tel: +44(0)74 253 760 42
address:St John's hill 18/5  EH8 9UQ Edinburgh, UK
skype:gustaf_rydevik



Re: [R] Fwd: WHO Anthro growth curve macros and R

2011-10-11 Thread Gustaf Rydevik
On Tue, Oct 11, 2011 at 1:21 AM, David Winsemius dwinsem...@comcast.net wrote:


 On Oct 10, 2011, at 4:48 PM, Gustaf Rydevik wrote:

 Hi all,
 some years ago, I sent a question to the mailing list regarding the WHO
 anthro macros. Since I've now received three mails asking how I solved it,
 I thought I'd cc R-help in for future reference. Attaching a zip file
 with the relevant code parts that I used that I'm not sure gets through
 (if anyone has recommendations on how to manage such files for the list,
 I'd be grateful).
 What I ended up doing was importing the data in SPSS format, and
 adapting the Splus function igrowup.standard slightly.
 igrowup.standard2.R is the adapted function, while the ssc files are
 original splus functions. Let me know if anyone gets problems in figuring
 out how to use the files.


 The only files that reach the readership are .pdf and .txt files. I do not
 know how carefully these get inspected, so it is possible that a zip file
 named something.txt might make it through.


  best regards,
 Gustaf

  \
 David Winsemius, MD
 West Hartford, CT



Hi all again,

I noticed (as I suspected, and as David said) that zip files do not get
through.
Here's a Google Docs link for the Anthro example.zip file that won't
change in the foreseeable future:

https://docs.google.com/viewer?a=vpid=explorerchrome=truesrcid=0B77NeAmIHMaQMjJkZTQ0OTQtNTRkYy00ZWMzLThhNTUtMzg1ZDY5MjljOGQxhl=en_US

(If the link is problematic due to its length, try
http://tinyurl.com/625vod6 instead.)
The most interesting files are igrowup.standard2.R (which is a modified
version of igrowup.standard) and anthro-example.R.
Hope this is of use to someone in the future!

Regards,
Gustaf

-- 
Gustaf Rydevik, M.Sci.
tel: +44(0)704 253 760 42
address:St John's hill 18/5  EH8 9UQ Edinburgh, UK
skype:gustaf_rydevik



[R] Fwd: WHO Anthro growth curve macros and R

2011-10-10 Thread Gustaf Rydevik
Hi all,
some years ago, I sent a question to the mailing list regarding the WHO
anthro macros. Since I've now received three mails asking how I solved it, I
thought I'd cc R-help in for future reference. Attaching a zip file
with the relevant code parts that I used that I'm not sure gets through
(if anyone has recommendations on how to manage such files for the list,
I'd be grateful).
What I ended up doing was importing the data in SPSS format, and
adapting the Splus function igrowup.standard slightly.
igrowup.standard2.R is the adapted function, while the ssc files are
original splus functions. Let me know if anyone gets problems in figuring
out how to use the files.

best regards,
Gustaf


-- 
Gustaf Rydevik, M.Sci.
tel: +44(0)704 253 760 42
address:St John's hill 18/5  EH8 9UQ Edinburgh, UK
skype:gustaf_rydevik


Re: [R] [Rd] New errors with difftime()-objects in 2.11.1 (was Re: Request: difftime method for cut())

2010-06-23 Thread Gustaf Rydevik
On Wed, Jun 23, 2010 at 7:13 AM, Peter Dalgaard pda...@gmail.com wrote:
 Gustaf Rydevik wrote:

 Oh, I forgot to mention that the workaround of using as.double (or
 as.numeric) works fine, and I've done that.
 It's just that it can take quite a while (as in several hours) to
 figure out that the reason for the error is that you have to force
 difftime objects to be numeric in 2.11.1, when the code's been working
 fine before and the error messages are obscure.

 I don't think you realize the problems that could occur by assuming that
 difftime objects are numerics ON ANY PARTICULAR SCALE!

 --
 Peter Dalgaard
 Center for Statistics, Copenhagen Business School
 Phone: (+45)38153501
 Email: pd@cbs.dk  Priv: pda...@gmail.com



Ah. Yes, you're right that it would be problematic, to say the least, to
assume that the difftime object is measured in days and not in, say,
seconds. And I suppose that it makes sense to prioritize avoiding
calculations that give misleading results over forcing changes in old
code. I was just caught somewhat unprepared, and I know that my
colleagues who are not quite as R-literate will be even more
unprepared for old stuff no longer working.
Usually, R prepares the user for this kind of thing by throwing
warnings a version or two before the change is actually implemented.
But I guess that's not always practical.

I take it that your argument would also work against implementing
simple difftime methods of functions as well, where you force difftime
objects to be numeric? In that case, people can disregard my
suggestion of adding a difftime method to cut().
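
To make the scale problem concrete, here is a small sketch. The timestamps are made up for illustration; the units argument of as.numeric() for difftime objects is documented in ?difftime.

```r
# Two time points 36 hours apart. Subtracting POSIXct values gives a
# difftime whose unit is chosen automatically.
t1 <- as.POSIXct("2010-06-21 00:00:00", tz = "UTC")
t2 <- as.POSIXct("2010-06-22 12:00:00", tz = "UTC")
d  <- t2 - t1

units(d)                                   # unit picked by R
in_days  <- as.numeric(d, units = "days")  # same interval, in days
in_hours <- as.numeric(d, units = "hours") # same interval, in hours

c(days = in_days, hours = in_hours)
```

The same interval yields different numbers depending on the unit, so stating the unit explicitly at the point of conversion avoids silently mixing scales.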

Anyhow, I'll stop whining now. Thanks for the good work you're doing
in the R Core team.

Regards,
Gustaf

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



[R] New errors with difftime()-objects in 2.11.1 (was Re: Request: difftime method for cut())

2010-06-22 Thread Gustaf Rydevik
On Thu, Jun 10, 2010 at 3:39 PM, Gustaf Rydevik
gustaf.ryde...@gmail.com wrote:
 Hi all,

 The recent change in 2.11 that made is.numeric() return FALSE on
 difftime-objects broke some of my code that calculated age classes of
 individuals using cut(). While this was no big thing to fix for me, it
 might be wise
 to provide a cut.difftime method to  stop other old code from breaking.
 I'm guessing something as simple as

 cut.difftime <- function(x, ...) {
   x <- as.numeric(x)
   cut(x, ...)
 }

 would suffice.

 best regards,
 Gustaf



As a followup, the change in how difftime objects are treated breaks even
more of my old code in a different project, since I'm used to treating
difftime as numeric in regressions and other analyses.
And the error messages become *very* obscure, e.g. "Error: NA/NaN/Inf
in foreign function call (arg 2)" when applying loess to a difftime
object. Tracking down the source of those errors becomes quite a
nuisance.
I suppose there's no chance of reversing the change, but I'd
appreciate it if someone could tell me the reason for introducing it so
abruptly.

I'm cc'ing this to R-help, since there's probably more people than me
that will be bitten by this in the future when looking into old
projects.
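
For anyone hitting this in old code, a minimal sketch of an age-class calculation that converts with an explicit unit before calling cut(). The helper name cut_difftime, the example dates and the age breaks are all made up for illustration; this is a plain helper, not a method shipped with R.

```r
# Hypothetical helper: convert a difftime to numeric in a stated unit,
# then hand the numbers to cut().
cut_difftime <- function(x, breaks, units = "days", ...) {
  stopifnot(inherits(x, "difftime"))
  cut(as.numeric(x, units = units), breaks = breaks, ...)
}

# Ages on a reference date, as a difftime in days.
birth <- as.Date(c("2009-06-22", "2000-06-22", "1980-06-22"))
ages  <- as.Date("2010-06-22") - birth

# Breaks at 0, 5 and 18 years, expressed in days.
age_class <- cut_difftime(ages, breaks = c(0, 5, 18, Inf) * 365.25)
table(age_class)
```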

Regards,
Gustaf Rydevik.




-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] New errors with difftime()-objects in 2.11.1 (was Re: Request: difftime method for cut())

2010-06-22 Thread Gustaf Rydevik
On Tue, Jun 22, 2010 at 7:50 PM, David Winsemius dwinsem...@comcast.net wrote:

 On Jun 22, 2010, at 1:33 PM, Gustaf Rydevik wrote:

 Cannot help you there, but have you looked at the help page for difftime?

 The as.double method returns the numeric value expressed in the specified
 units. Using units = "auto" means the units of the object.


 David Winsemius, MD
 West Hartford, CT



Oh, I forgot to mention that the workaround of using as.double (or
as.numeric) works fine, and I've done that.
It's just that it can take quite a while (as in several hours) to
figure out that the reason for the error is that you have to force
difftime objects to be numeric in 2.11.1, when the code's been working
fine before and the error messages are obscure.

Regards,
Gustaf




-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] remove last char of a text string

2010-06-14 Thread Gustaf Rydevik
On Mon, Jun 14, 2010 at 3:47 PM, glaporta glapo...@freeweb.org wrote:

 Dear R experts,
 is there a simple way to remove the last char of a text string?
 substr() function use as parameter start end only... but my strings are of
 different length...
 01asap05a -> 01asap05
 02ee04b -> 02ee04
 Thank you all,
 Gianandrea
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/remove-last-char-of-a-text-string-tp2254377p2254377.html
 Sent from the R help mailing list archive at Nabble.com.


It's not terribly elegant, but this works:

orig.text <- c("01asap05a", "02ee04b")
substr(orig.text,1,nchar(orig.text)-1)
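
An alternative sketch with a regular expression, shown next to the substr() approach so the two can be compared. The variable names are only for illustration.

```r
orig.text <- c("01asap05a", "02ee04b")

# substr() approach: keep characters 1 .. (length - 1) of each string.
via_substr <- substr(orig.text, 1, nchar(orig.text) - 1)

# Regex approach: ".$" matches the single last character, which is
# replaced by the empty string.
via_sub <- sub(".$", "", orig.text)

via_substr
via_sub
```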

Regards,
Gustaf


-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] moving average on irregular time series

2010-06-04 Thread Gustaf Rydevik
Dear William and Gabor,

Both solutions worked, and my problem is now solved.

Many thanks to both of you!

regards,
Gustaf



 On Thu, Jun 3, 2010 at 10:23 AM, Gustaf Rydevik
 gustaf.ryde...@gmail.com wrote:
 Hi all,


 I wonder if there is any way to calculate a moving average on an
 irregular time series, or use the rollapply function in zoo?
 I have a set of dates where I want to check if there has been an event
 14 days prior to each time point in order to mark these timepoints for
 removal, and can't figure out a good way to do it.

 Many thanks in advance!

 Gustaf


 Example data:

 exData <- structure(list(Datebegin = structure(c(14476, 14569, 14576, 14621,
 14627, 14632, 14661, 14671, 14705, 14715, 14751, 14756, 14495,
 14518, 14523, 14526, 14528, 14529, 14545, 14548), class = "Date"),
    Event = c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,
    FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE,
    TRUE, FALSE, FALSE, FALSE)), .Names = c("Datebegin", "Event"
 ), row.names = c(NA, 20L), class = "data.frame")

 ###In this example, row 18 is a date less than 14 days after an event
 and should be marked for removal.



 --
 Gustaf Rydevik, M.Sci.
 tel: +46(0)703 051 451
 address:Essingetorget 40,112 66 Stockholm, SE
 skype:gustaf_rydevik






-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] ISO 8601 Weeks/Years on Windows with strptime

2010-06-03 Thread Gustaf Rydevik
2010/6/3 Michael Höhle michael.hoe...@stat.uni-muenchen.de:
 Dear R-help,

 I am working on a R package for public health surveillance where the ISO
 8601 representation of dates is of importance. Especially, the ISO Week
 and ISO Year of a date needs to be extracted. I was quite happy to find
 all of this implemented in the Date class with appropriate calls to
 strptime/format (using e.g. %G and %V).

 However, only later I realized that this functionality is currently not
 implemented on Windows (I'm a happy Mac/Linux user). As this seriously
 limits the applicability, I would like to enquire, if there are any plans
 to make this functionality available in Windows as well? Or are there any
 good workarounds to make

 format.Date("2001-12-31", "%G")

 give "2002" instead of "" on Windows?

 Best regards,

 Michael Höhle

 -

Hello,

This seems to be a problem that crops up from time to time.
I wrote a small function  that got the ISO week of a Date object, that
you can find in a bug-fixed version here:
http://tolstoy.newcastle.edu.au/R/e10/help/10/05/5588.html

Hope this is of help. I agree that it would be of interest to
incorporate OS-independent date management in R, but not being part of
the R development team, I'm not sure how to go about implementing
it...
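
For comparison, a small sketch of the format()-based approach from the question. It assumes an R build where %G and %V are supported on output, which was not the case on Windows at the time of this thread; the dates are chosen to include two year-boundary cases.

```r
d <- as.Date(c("2001-12-31", "2010-01-03", "2010-05-19"))

# %G is the ISO 8601 week-based year, %V the ISO 8601 week number.
iso_year <- format(d, "%G")
iso_week <- format(d, "%V")

data.frame(date = d, iso_year = iso_year, iso_week = iso_week)
```

The first two dates fall in an ISO week that belongs to a different year than their calendar year, which is the case a hand-written function has to get right.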

Best regards,
Gustaf

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



[R] moving average on irregular time series

2010-06-03 Thread Gustaf Rydevik
Hi all,


I wonder if there is any way to calculate a moving average on an
irregular time series, or use the rollapply function in zoo?
I have a set of dates where I want to check if there has been an event
14 days prior to each time point in order to mark these timepoints for
removal, and can't figure out a good way to do it.

Many thanks in advance!

Gustaf


Example data:

exData <- structure(list(Datebegin = structure(c(14476, 14569, 14576, 14621,
14627, 14632, 14661, 14671, 14705, 14715, 14751, 14756, 14495,
14518, 14523, 14526, 14528, 14529, 14545, 14548), class = "Date"),
Event = c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,
FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE,
TRUE, FALSE, FALSE, FALSE)), .Names = c("Datebegin", "Event"
), row.names = c(NA, 20L), class = "data.frame")

###In this example, row 18 is a date less than 14 days after an event
and should be marked for removal.
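
One possible base-R sketch of the check described above, without zoo. It is not the solution given in the replies, which are not reproduced in this archive. It assumes "14 days prior" means an event strictly before the date and at most 14 days earlier; the column name recent_event is made up.

```r
exData <- data.frame(
  Datebegin = as.Date(c(14476, 14569, 14576, 14621, 14627, 14632, 14661,
                        14671, 14705, 14715, 14751, 14756, 14495, 14518,
                        14523, 14526, 14528, 14529, 14545, 14548),
                      origin = "1970-01-01"),
  Event = c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,
            FALSE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE,
            FALSE, FALSE)
)

# Work with day numbers so that differences are plain numerics.
day       <- as.numeric(exData$Datebegin)
event_day <- day[exData$Event]

# For each row: is there an event 1 to 14 days before this date?
exData$recent_event <- vapply(day, function(x) {
  gap <- x - event_day
  any(gap > 0 & gap <= 14)
}, logical(1))

which(exData$recent_event)
```

With this definition rows 12 and 18 are flagged; row 12 is itself an event five days after the event in row 11, so whether it should be dropped depends on the analysis.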



-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] R on the iPhone/iPad? Not so much....a GPL violation

2010-06-02 Thread Gustaf Rydevik

 As I noted in my closing comments in my second post, if one has a desire to 
 make R's functionality available on smartphones (iPhone, Android, etc.) or 
 iPad-class devices, then a client/server approach may be the most efficient 
 means to do so. That approach also avails you of more powerful computing 
 platforms than the client side mobile devices have, at least at present, 
 which will also limit aspects of portable functionality.

 Regards,

 Marc


Indeed, the client/server approach is what is used in MatLab Mobile,
which is now on sale in the app store.
See
http://blogs.mathworks.com/desktop/2010/05/24/introducing-matlab-mobile-%E2%80%93-an-iphone-app-to-connect-remotely-to-your-matlab/

If matlab can do it, then surely the R community can as well.

Regards,
Gustaf


-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] p-values < 2.2e-16 not reported

2010-05-19 Thread Gustaf Rydevik
On Wed, May 19, 2010 at 10:53 AM, Will Eagle will.ea...@gmx.net wrote:
 Dear all,

 how can I get the exact p-value of a statistical test like cor.test() if the
 p-value is below the default machine epsilon value of .Machine$double.eps =
  2.220446e-16?

 At the moment smaller p-values are reported as p-value < 2.2e-16.
 .Machine$double.eps <- 1E-100 does not solve this issue, although this value
 should be used by the format.pval() function.

 To know the exact p-values down to 1E-200 is very important since I have
 multiple tests which require an alpha error-threshold below 2.2E-16.

 Thanks in advance,

 Will


I would be interested to hear about what kind of multiple testing
you're doing. Genetics?

Intuitively, requiring such small p-values would seem to throw away
most any interesting results that are not simply errors in your data -
are you sure that there's not a better way of thinking about your
problem?

From a practical standpoint, I would be sceptical about the ability of
most R-algorithms to generate theoretically valid p-values of such a
small order.
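
On the mechanics only: the "< 2.2e-16" is a printing convention, and the stored value can be read from the test object. A hedged sketch with simulated data follows; the log-scale calculation assumes the default Pearson test, where the statistic is a t value, and it says nothing about whether such tail probabilities are trustworthy.

```r
set.seed(1)
x <- rnorm(200)
y <- x + rnorm(200, sd = 0.1)

ct <- cor.test(x, y)

# The stored p-value is an ordinary double; it may be tiny or underflow to 0.
ct$p.value

# Two-sided p-value on the log scale, computed from the t statistic,
# which avoids underflow for very small tail probabilities.
log_p <- log(2) + pt(abs(unname(ct$statistic)),
                     df = unname(ct$parameter),
                     lower.tail = FALSE, log.p = TRUE)
log10_p <- log_p / log(10)
log10_p
```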

Best regards,
Gustaf




-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



[R] A revised function for getting ISO week

2010-05-19 Thread Gustaf Rydevik
Hi All,

Two years back, I posted a small function for getting the ISO 8601
defined week number of a date (such as the week number used in all
Swedish calendars),
in an OS-independent manner.
I've since discovered an inaccuracy in that code, and so I thought I'd
repost the corrected version.
Hopefully this will come in handy for someone searching the mailing
list archives in the future.

Best Regards,
Gustaf

#



## Inputs a date object, posix object, or 3 numbers and gives back the iso week.
## By Gustaf Rydevik, revised 2010


getweek<-function(Y,M=NULL,D=NULL){

  ## inherits() rather than class(Y)[1], so POSIXct input is recognised too
  if(!inherits(Y,c("Date","POSIXt"))) {
    date.posix<-strptime(paste(Y,M,D,sep="-"),"%Y-%m-%d")
  }
  if(inherits(Y,c("POSIXt","Date"))){
    date.posix<-as.POSIXlt(Y)
    Y<-as.numeric(format(date.posix,"%Y"))
    M<-as.numeric(format(date.posix,"%m"))
    D<-as.numeric(format(date.posix,"%d"))
  }


  ## Leap-year flags for the current and the previous year
  LY<- (Y%%4==0 & !(Y%%100==0))|(Y%%400==0)
  LY.prev<- ((Y-1)%%4==0 & !((Y-1)%%100==0))|((Y-1)%%400==0)
  date.yday<-date.posix$yday+1
  jan1.wday<-strptime(paste(Y,"01-01",sep="-"),"%Y-%m-%d")$wday
  jan1.wday<-ifelse(jan1.wday==0,7,jan1.wday)
  date.wday<-date.posix$wday
  date.wday<-ifelse(date.wday==0,7,date.wday)


  ### If the date is in the beginning, or end of the year,
  ### does it fall into a week of the previous or next year?
  Yn<-ifelse(date.yday<=(8-jan1.wday) & jan1.wday>4,Y-1,
    ifelse(((365+LY-date.yday)<(4-date.wday)),Y+1,Y))

  ## Set the week differently if the date is in the beginning, middle or
  ## end of the year

  Wn<-ifelse(
    Yn==Y-1,
    ifelse((jan1.wday==5|(jan1.wday==6 & LY.prev)),53,52),
    ifelse(Yn==Y+1,1,(date.yday+(7-date.wday)+(jan1.wday-1))/7-(jan1.wday>4))
  )

  return(list(Year=Yn,ISOWeek=Wn))
}





-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



[R] Set encoding when load()-ing workspaces?

2010-05-02 Thread Gustaf Rydevik
Hi all,

I hope that there is someone that can help me out here.
I am trying to load() a workspace on os x (R 2.11.0) that was saved in
windows XP (R 2.9). In that workspace, there's a data.frame with names
that contain swedish characters. These characters become garbled,
which is a major problem.
From the R windows FAQ, I read:

Note though that character data in a workspace will be in a
particular encoding that is not recorded in the workspace, so
workspaces containing non-ASCII character data may not be
interchangeable even on the same OS. Since R marks character data when
it knows it to be in UTF-8 or Latin-1 (including its Windows superset,
CP1252), strings in those encodings are likely to be transferred
correctly: fortunately this covers most of the common cases (Mac OS X
normally uses UTF-8, and Linux users are likely to use UTF-8 or
perhaps Latin-1 (which used to be used for English)). 

Apparently, my case is not the most common one, and I don't know why.
I've been trying to dig into the load() function, but since it uses a
lot of .Internal functions, I get stuck there.
I've also tried doing options(encoding="latin1"), which doesn't seem
to change anything.

And now I'm stuck. Any suggestions on where to look?
I've run into this issue twice before. The first time I managed to get
it solved, but can't remember how (perhaps a .Rprofile setting
somewhere?).
The second time, I mailed R-Sig-Mac, got some tips that unfortunately
did not lead anywhere, and subsequently gave up. I hope third time's a
charm!

Many thanks in advance,
Gustaf


-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



[R] Set encoding when load()-ing workspaces?

2010-05-02 Thread Gustaf Rydevik
Many thanks Prof. and Duncan!

iconv() worked like a charm together with CP1252 as the Windows
encoding, and now all the text shows up correctly.

Because the data frame also contained factors with levels that had
Swedish characters, I ended up writing a small function for converting
the encoding of everything inside a data frame in one go. It is a bit
slow, but hopefully someone else will find it useful in the future:

iconv.data.frame<-function(df,...){
     df.names<-iconv(names(df),...)
     df.rownames<-iconv(rownames(df),...)
     names(df)<-df.names
     rownames(df)<-df.rownames
     df.list<-lapply(df,function(x){
             ## is.factor()/is.character() also cope with multi-class columns
             if(is.factor(x)){x<-factor(iconv(as.character(x),...))}else
             if(is.character(x)){x<-iconv(x,...)}else{x}
      })
     df.new<-do.call("data.frame",df.list)
     return(df.new)
}
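
A minimal sketch of the underlying conversion on a single string, for readers who want to see what iconv() does before applying it to a whole data frame. The example word and its bytes are made up for illustration and are built from raw values so that the snippet does not depend on the encoding of the script file.

```r
# "Räksmörgås" as it would be stored in CP1252: one byte per character.
cp1252_bytes <- as.raw(c(0x52, 0xe4, 0x6b, 0x73, 0x6d, 0xf6, 0x72,
                         0x67, 0xe5, 0x73))
garbled <- rawToChar(cp1252_bytes)

# Convert from the Windows encoding to UTF-8.
fixed <- iconv(garbled, from = "CP1252", to = "UTF-8")

Encoding(fixed)    # how R has marked the converted string
charToRaw(fixed)   # each non-ASCII letter is now two bytes
```

charToRaw() is also what Duncan Murdoch suggests in the quoted reply below for guessing an unknown encoding.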


Best regards,
Gustaf


On Sun, May 2, 2010 at 8:36 PM, Prof Brian Ripley rip...@stats.ox.ac.uk wrote:
 On Sun, 2 May 2010, Duncan Murdoch wrote:

 Gustaf Rydevik wrote:

 Hi all,

 I hope that there is someone that can help me out here.
 I am trying to load() a workspace on os x (R 2.11.0) that was saved in
 windows XP (R 2.9). In that workspace, there's a data.frame with names
 that contain swedish characters. These characters become garbled,
 which is a major problem.
 From the R windows FAQ, I read:

 Note though that character data in a workspace will be in a
 particular encoding that is not recorded in the workspace, so
 workspaces containing non-ASCII character data may not be
 interchangeable even on the same OS. Since R marks character data when
 it knows it to be in UTF-8 or Latin-1 (including its Windows superset,
 CP1252), strings in those encodings are likely to be transferred
 correctly: fortunately this covers most of the common cases (Mac OS X
 normally uses UTF-8, and Linux users are likely to use UTF-8 or
 perhaps Latin-1 (which used to be used for English)). 

 Apparently, my case is not the most common one, and I don't know why.
 I've been trying to dig into the load() function, but since it uses a
 lot of .Internal functions, I get stuck there.
 I've also tried doing options(encoding="latin1"), which doesn't seem
 to change anything.


 You can't change the encoding when you load, but you can convert the
 encoding later (using iconv()) if you know what encoding it is.  A good
 guess for a file created on Windows in my locale is latin1, but it's not
 certain, and I don't know what is commonly used on Windows in a Swedish
 locale.

 CP1252 (which is actually what you will get too).


 If you have an example where you know the correct version of the string
 and you can show us what you're getting, together with charToRaw() applied
 to it, someone will probably be able to make a guess at the encoding.

 Duncan Murdoch


 And now I'm stuck. Any suggestions on where to look?
 I've run into this issue twice before. The first time I managed to get
 it solved, but can't remember how (perhaps a .Rprofile setting
 somewhere?).
 The second time, I mailed R-Sig-Mac, got some tips that unfortunately
 did not lead anywhere, and subsequently gave up. I hope third time's a
 charm!

 Many thanks in advance,
 Gustaf






 --
 Brian D. Ripley,                  rip...@stats.ox.ac.uk
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford,             Tel:  +44 1865 272861 (self)
 1 South Parks Road,                     +44 1865 272866 (PA)
 Oxford OX1 3TG, UK                Fax:  +44 1865 272595




--
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik





Re: [R] R loop.

2010-04-23 Thread Gustaf Rydevik
On Thu, Apr 22, 2010 at 7:20 PM, mhalsham mhals...@bradford.ac.uk wrote:

 OK, sorry for the bad explanation on my side.
 What I want: I have a txt file named table3.txt. This file contains 1293
 rows; some of these rows have 1 column and some of them have up
 to maybe 40 columns. For example:

         A               B       C       D       E       F       G       H       I
 1       Deafness        EYA4    MYO7A   TECTA   COL11A2 POU4F3  MYH9    ACTG1   MYO6
 2       Leukemia        TAL1    TAL2    ZNFN1A1 FLT3
 3       Colon_cancer    RAD54B  PTPN12  BCL10

 The rows below show how I want the records to be:
         A               B
 1       Deafness        EYA4
 2       Deafness        MYO7A
 3       Deafness        TECTA
 4       Deafness        COL11A2
 5       Deafness        POU4F3
 6       Deafness        MYH9
 7       Deafness        ACTG1
 8       Deafness        MYO6
 9       Leukemia        TAL1
 10      Leukemia        TAL2
 11      Leukemia        ZNFN1A1
 12      Leukemia        FLT3
 13      Colon_cancer    RAD54B
 14      Colon_cancer    PTPN12
 15      Colon_cancer    BCL10

 Any help would be very kind of everyone, and thanks to those who tried to
 help and couldn't understand me. Thank you




Have you managed to read your table3.txt into R, using read.table etc.?
If so, could you copy/paste the result of using dput() on your object?


After a bit of work, I've gotten your example data into R, but please
post either comma-separated data or dput() results in the
future. Anyhow, here's an example of how to get what you want. Hope it
helps.
Regards,
Gustaf
..

example.data <- structure(list(V1 = structure(c(2L, 3L, 1L), .Label =
c("Colon_cancer",
"Deafness", "Leukemia"), class = "factor"), V2 = structure(c(1L,
3L, 2L), .Label = c("EYA4", "RAD54B", "TAL1"), class = "factor"),
V3 = structure(c(1L, 3L, 2L), .Label = c("MYO7A", "PTPN12",
"TAL2"), class = "factor"), V4 = structure(c(2L, 3L, 1L), .Label =
c("BCL10",
"TECTA", "ZNFN1A1"), class = "factor"), V5 = structure(c(2L,
3L, 1L), .Label = c("", "COL11A2", "FLT3"), class = "factor"),
V6 = structure(c(2L, 1L, 1L), .Label = c("", "POU4F3"), class = "factor"),
V7 = structure(c(2L, 1L, 1L), .Label = c("", "MYH9"), class = "factor"),
V8 = structure(c(2L, 1L, 1L), .Label = c("", "ACTG1"), class = "factor"),
V9 = structure(c(2L, 1L, 1L), .Label = c("", "MYO6"), class =
"factor")), .Names = c("V1",
"V2", "V3", "V4", "V5", "V6", "V7", "V8", "V9"), class = "data.frame",
row.names = c(NA,
-3L))

library(reshape)
example.long <- melt(example.data, id.vars="V1")
example.long
#             V1 variable   value
#1      Deafness       V2    EYA4
#2      Leukemia       V2    TAL1
#3  Colon_cancer       V2  RAD54B
#4      Deafness       V3   MYO7A
#5      Leukemia       V3    TAL2
#6  Colon_cancer       V3  PTPN12
#7      Deafness       V4   TECTA
#8      Leukemia       V4 ZNFN1A1
#9  Colon_cancer       V4   BCL10
#10     Deafness       V5 COL11A2
#11     Leukemia       V5    FLT3
#12 Colon_cancer       V5
#13     Deafness       V6  POU4F3
#14     Leukemia       V6
#15 Colon_cancer       V6
#16     Deafness       V7    MYH9
#17     Leukemia       V7
#18 Colon_cancer       V7
#19     Deafness       V8   ACTG1
#20     Leukemia       V8
#21 Colon_cancer       V8
#22     Deafness       V9    MYO6
#23     Leukemia       V9
#24 Colon_cancer       V9
##Or if you want it in the order of V1
example.long[order(example.long$V1),]
#             V1 variable   value
#3  Colon_cancer       V2  RAD54B
#6  Colon_cancer       V3  PTPN12
#9  Colon_cancer       V4   BCL10
#12 Colon_cancer       V5
#15 Colon_cancer       V6
#18 Colon_cancer       V7
#21 Colon_cancer       V8
#24 Colon_cancer       V9
#1      Deafness       V2    EYA4
#4      Deafness       V3   MYO7A
#7      Deafness       V4   TECTA
#10     Deafness       V5 COL11A2
#13     Deafness       V6  POU4F3
#16     Deafness       V7    MYH9
#19     Deafness       V8   ACTG1
#22     Deafness       V9    MYO6
#2      Leukemia       V2    TAL1
#5      Leukemia       V3    TAL2
#8      Leukemia       V4 ZNFN1A1
#11     Leukemia       V5    FLT3
#14     Leukemia       V6
#17     Leukemia       V7
#20     Leukemia       V8
#23     Leukemia       V9
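
As a variation on the reshape approach above, here is a base-R sketch that builds the long table directly from the raw lines, which avoids the empty padding rows. The three lines are typed in from the poster's example; with the real file one would presumably start from readLines("table3.txt"). The column names disease and gene are made up for illustration.

```r
lines <- c(
  "Deafness EYA4 MYO7A TECTA COL11A2 POU4F3 MYH9 ACTG1 MYO6",
  "Leukemia TAL1 TAL2 ZNFN1A1 FLT3",
  "Colon_cancer RAD54B PTPN12 BCL10"
)

# Split each line on whitespace: first field is the disease,
# the remaining fields are its genes.
fields <- strsplit(lines, "[[:space:]]+")

long <- do.call(rbind, lapply(fields, function(f) {
  data.frame(disease = f[1], gene = f[-1], stringsAsFactors = FALSE)
}))

long
```

This gives one row per disease/gene pair, in the same order as in the file.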


-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R loop.

2010-04-23 Thread Gustaf Rydevik
On Fri, Apr 23, 2010 at 11:14 AM, mhalsham mhals...@bradford.ac.uk wrote:

 Hi
 Yes I have managed to read the file (Table2.txt)
 The command I have used
 a <- read.table("table3.txt", fill=TRUE, header=FALSE)
 If I read the first row the result output will be like that.
 a[1,]

 Result would be

        V1   V2     V3    V4    V5      V6     V7   V8    V9  V10  V11   V12
 1 Deafness EYA4 DIAPH1 MYO7A TECTA COL11A2 POU4F3 MYH9 ACTG1 MYO6 GJB3 KCNQ4
    V13  V14  V15  V16  V17  V18   V19   V20  V21   V22     V23   V24    V25
 1 GRHL2 GJB2 GJB6 TMC1 DSPP CRYM MYH14 DFNA5 COCH MYO1A TMPRSS3 CDH23 ATP2B2
   V26   V27  V28    V29    V30   V31 V32
 1 STRC USH1C OTOA PCDH15 CLDN14 MYO3A


Did you try my code in that case? I think that does what you wanted.
/Gustaf



Re: [R] Remove duplicated rows

2010-04-23 Thread Gustaf Rydevik
On Fri, Apr 23, 2010 at 4:05 AM, chrisli1223
chri...@austwaterenv.com.au wrote:

 Hi all,

 I have a dataset similar to the following

 Name    Date    Value
 A       1/01/2000       4
 A       2/01/2000       4
 A       3/01/2000       5
 A       4/01/2000       4
 A       5/01/2000       1
 B       6/01/2000       2
 B       7/01/2000       1
 B       8/01/2000       1

 I would like R to remove duplicates based on column 1 and 3 only. In
 addition, I would like R to remove duplicates based on the underlying and
 overlying row only. For example, for A, I would like to remove row 2 only
 and keep row 1, 3 and 4.

 I have tried: unique() and replicated(), but I do not have much success. I
 have also tried: dataset <- c(1,diff(dataset)!=0), but I don't know how to
 apply it to this multi-column situation.

 Any help would be greatly appreciated.

 Thanks in advance,
 Chris
 --



Hi,

This code is a bit ugly, but it works. Hope it helps.
/Gustaf

library(zoo)
test <- read.table("clipboard",header=T)
test$code <- paste(test$Name,test$Value,sep="")

drop.ndx <- rollapply(zoo(test$code),3,function(x)(x[2]%in%c(x[1],x[3])))

drop.ndx <- c(FALSE,drop.ndx,FALSE)
test[!drop.ndx,]
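
For readers without zoo, the same neighbour comparison can be sketched in base R. This is only an illustration of the rule used above (drop a row when its Name/Value pair equals that of the row directly before or after it, and always keep the first and last rows). The data frame is typed in by hand from the question, without the Date column, and same.prev and same.next are names made up here.

```r
# Data from the question, typed in by hand (Date column left out)
test <- data.frame(
  Name  = c("A", "A", "A", "A", "A", "B", "B", "B"),
  Value = c(4, 4, 5, 4, 1, 2, 1, 1)
)

n    <- nrow(test)
code <- paste(test$Name, test$Value)

# Compare every row with its neighbour above and below
same.prev <- c(FALSE, code[-1] == code[-n])
same.next <- c(code[-1] == code[-n], FALSE)

drop.ndx <- same.prev | same.next
drop.ndx[c(1, n)] <- FALSE   # first and last row are always kept, as above

kept <- test[!drop.ndx, ]
kept
```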





Re: [R] Assigning week numbers

2010-04-22 Thread Gustaf Rydevik
On Wed, Apr 21, 2010 at 6:50 PM, Michael Hosack mhosa...@hotmail.com wrote:


 I provided a minimized version of my dataframe at the bottom of this message 
 containing the results of David's code in variable ('wkoffset') and Jeff 
 Hallman's code in ('WEEK'). Jeff's code produced the correct results (thank 
 you Jeff) though I have been unable to understand it. David, as you can see 
 your code begins week 2 for year 2011 on a Wednesday, rather than on a 
 Saturday, as it should. Your adjustment seems not to correct the problem, but 
 I concede I may be using it incorrectly. If you are obtaining the correct 
 results please let me know what I am doing wrong.

 Thanks,

 Mike



Hello again,

Just for fun, I implemented the gist of your original code in R. It's
much longer and not as elegant as the other solutions, but perhaps
someone can learn something from it.
Regards,
Gustaf



Daterange <- range(SCHEDULE3$DATE.)
Daterange[1] <- paste(as.numeric(
substr(as.character(Daterange[1]),1,4))-1,
"-05-01",sep="")
Daterange[2] <- paste(as.numeric(
substr(as.character(Daterange[2]),1,4))+1,
"-05-01",sep="")

alldates <- seq(from=Daterange[1],to=Daterange[2],by=1)


My.locale <- Sys.getlocale("LC_TIME")
Sys.setlocale("LC_TIME","English_USA.1252")
Week <- 1
allweeks <- vector(length=length(alldates))


for(i in seq_along(alldates)){
  if(weekdays(alldates[i])=="Saturday"){
    Week <- Week+1
  }
  if(substr(as.character(alldates[i]),6,10)=="05-01"){
    Week <- 1
  }
  allweeks[i] <- Week
}

SCHEDULE3$Week <- allweeks[match(SCHEDULE3$DATE.,alldates)]

Sys.setlocale("LC_TIME",My.locale)
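
The loop above can probably also be written without a loop and without touching the locale. The following is only a sketch under the same rule (week 1 starts on May 1, and every later Saturday starts a new week). The function name season_week is made up, and the arithmetic relies on 1970-01-03, day number 2 in R's Date representation, being a Saturday.

```r
season_week <- function(dates) {
  dates <- as.Date(dates)
  yr    <- as.integer(format(dates, "%Y"))

  # Most recent May 1 on or before each date
  may1   <- as.Date(paste0(yr, "-05-01"))
  before <- dates < may1
  may1[before] <- as.Date(paste0(yr[before] - 1L, "-05-01"))

  # Dates are stored as days since 1970-01-01; Saturdays satisfy day %% 7 == 2,
  # so (day + 5) %/% 7 counts the Saturdays up to and including that day.
  d <- as.integer(dates)
  s <- as.integer(may1)
  1L + (d + 5L) %/% 7L - (s + 5L) %/% 7L
}

season_week(as.Date(c("2010-05-01", "2010-05-07", "2010-05-08",
                      "2011-05-01", "2011-05-06", "2011-05-07")))
```

May 1, 2010 was itself a Saturday and still counts as week 1, which matches the order of the two if-statements in the loop.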





Re: [R] RNG

2010-04-21 Thread Gustaf Rydevik
On Wed, Apr 21, 2010 at 4:37 PM, tamas barjak tamas.bar...@gmail.com wrote:
 Hi all!

 I would like to generate random numbers between 0 and 1. How can I do this?
 I downloaded it single RNG but it generates ones between only 1 and
 1...:(

 Thank you for the help!

 Tamas


Hi tamas,

I am not sure what you mean by "downloaded". There are a lot of
random number generators built into R.

To generate 10 random numbers between 0 and 1, try

runif(10)
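
A slightly longer sketch of the same idea, in case it helps: set.seed() makes the draws reproducible, and the min and max arguments of runif() change the interval. The seed value and the variable names are arbitrary.

```r
set.seed(123)                      # any fixed seed makes the draws repeatable

u <- runif(10)                     # ten uniform draws between 0 and 1
v <- runif(5, min = -1, max = 1)   # five uniform draws between -1 and 1

u
v
range(u)                           # always inside the interval [0, 1]
```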

Regards,
Gustaf





Re: [R] Assigning Week Numbers

2010-04-21 Thread Gustaf Rydevik
On Tue, Apr 20, 2010 at 7:59 PM, Michael Hosack mhosa...@hotmail.com wrote:

 R experts,

 How could I extract the week number from a date vector (in Date class)
 such that week numbering (week 1...2...) begins (May 01) and ends
 (October 31) on the same specific dates each year? Week numbering
 must conform to the following day numbering format 
 (Sat=1,Sun=2,Mon=3.Fri=7).
 This means that new weeks must begin on Saturdays, and end on Fridays
 (except for the first date of May 01, which always begins week 1; week 2
 begins on the proceeding Saturday). This needs to be applicable across years
 to work effectively. I have tried using both vectorized and loop approaches 
 with
 no success.

 I am including a bit of old Systat code that does the trick simply and 
 concisely.
 If anyone knows an analogous method in R please let me know. My R dataframe 
 contains
 all the variables and data in the Systat temp file.

 Use sched3.t
 Save sched4.t
 Hold
 By mm dd
 If bof then let week=1
 Else if bog and DOW$=SAT then let week = week + 1
 Run


 Thank you,

 Mike



From your code, it seems as if you're assuming that SCHEDULE3 contains
all consecutive Saturdays, without skipping any. Is that correct?

/Gustaf



Re: [R] how can I plot the histogram like this using R?

2010-04-16 Thread Gustaf Rydevik
On Fri, Apr 16, 2010 at 10:13 AM, bbslover dlu...@yeah.net wrote:

  Thanks for your reply, I just want to get the figure like y1.jpg using the
 data from y1.txt.
  Through the figure  I want to obtain the split point like y1.jpg, and
 consider 2.5 as the plit point.  This figure is drawn by other people, I
 just want to draw it using R, but I can not, so I hope, friends can help me.

 Best wishes!
 kevin http://n4.nabble.com/file/n1965378/y1.jpg
 http://n4.nabble.com/file/n1965378/y1.txt y1.txt
 --
 View this message in context: 
 http://n4.nabble.com/how-can-I-plot-the-histogram-like-this-using-R-tp1839303p1965378.html
 Sent from the R help mailing list archive at Nabble.com.



Hi,

Does this do what you want?

temp <- read.table(url("http://n4.nabble.com/file/n1965378/y1.txt"))
hist(temp$V1,breaks=seq(0,5.1,by=0.1))
abline(v=2.5,lty=2,lwd=2,col="red")
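
Since the file on Nabble may not stay available, here is the same plot as a self-contained sketch on simulated data. The values in y1 are made up (uniform between 0 and 5) and only stand in for the real column; the breaks and the dashed line at 2.5 are the ones used above.

```r
set.seed(42)
y1 <- runif(200, min = 0, max = 5)   # simulated stand-in for the y1.txt column

# Histogram with bins of width 0.1, as in the code above
h <- hist(y1, breaks = seq(0, 5.1, by = 0.1), main = "y1", xlab = "value")

# Dashed red line marking the split point
abline(v = 2.5, lty = 2, lwd = 2, col = "red")

# How many observations fall on each side of the split point
table(y1 > 2.5)
```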


Regards,
Gustaf




Re: [R] Simplifying particular piece of code

2010-03-31 Thread Gustaf Rydevik
How about this (not tested, since you did not provide example data nor
function code):

---

SRnames <- paste(colnames.mrets, ".SR", sep="")
AVnames <- paste(colnames.mrets, ".AV120", sep="")
SDnames <- paste(colnames.mrets, ".SD120", sep="")
names.matrix <- cbind(SRnames,AVnames,SDnames)

mrets.list <- apply(names.matrix,1,function(.names){
apply(mrets,1,MyFunc,ret=.names[2],stdev=.names[3])}
)
names(mrets.list) <- names.matrix[,1]
mrets <- do.call(merge,mrets.list)

-
?
/Gustaf

On Wed, Mar 31, 2010 at 12:10 PM, Sergey Goriatchev serg...@gmail.com wrote:
 Hello, everyone

 I have a piece of code that looks like this:

 mrets <- merge(mrets, BMM.SR=apply(mrets, 1, MyFunc, ret="BMM.AV120",
 stdev="BMM.SD120"))
 mrets <- merge(mrets, GM1.SR=apply(mrets, 1, MyFunc, ret="GM1.AV120",
 stdev="GM1.SD120"))
 mrets <- merge(mrets, IYC.SR=apply(mrets, 1, MyFunc, ret="IYC.AV120",
 stdev="IYC.SD120"))
 mrets <- merge(mrets, FCA.SR=apply(mrets, 1, MyFunc, ret="FCA.AV120",
 stdev="FCA.SD120"))
 mrets <- merge(mrets, IMM.SR=apply(mrets, 1, MyFunc, ret="IMM.AV120",
 stdev="IMM.SD120"))
 mrets <- merge(mrets, BME.SR=apply(mrets, 1, MyFunc, ret="BME.AV120",
 stdev="BME.SD120"))
 mrets <- merge(mrets, CRT.SR=apply(mrets, 1, MyFunc, ret="CRT.AV120",
 stdev="CRT.SD120"))
 mrets <- merge(mrets, GTF.SR=apply(mrets, 1, MyFunc, ret="GTF.AV120",
 stdev="GTF.SD120"))
 mrets <- merge(mrets, ERU.SR=apply(mrets, 1, MyFunc, ret="ERU.AV120",
 stdev="ERU.SD120"))
 mrets <- merge(mrets, ERE.SR=apply(mrets, 1, MyFunc, ret="ERE.AV120",
 stdev="ERE.SD120"))
 mrets <- merge(mrets, EPT.SR=apply(mrets, 1, MyFunc, ret="EPT.AV120",
 stdev="EPT.SD120"))
 mrets <- merge(mrets, EVA.SR=apply(mrets, 1, MyFunc, ret="EVA.AV120",
 stdev="EVA.SD120"))
 mrets <- merge(mrets, EMT.SR=apply(mrets, 1, MyFunc, ret="EMT.AV120",
 stdev="EMT.SD120"))
 mrets <- merge(mrets, EMM.SR=apply(mrets, 1, MyFunc, ret="EMM.AV120",
 stdev="EMM.SD120"))
 mrets <- merge(mrets, EMV.SR=apply(mrets, 1, MyFunc, ret="EMV.AV120",
 stdev="EMV.SD120"))
 mrets <- merge(mrets, ETM.SR=apply(mrets, 1, MyFunc, ret="ETM.AV120",
 stdev="ETM.SD120"))

 Is there a way to simplify this, some sort of loop?
 mrets is a zoo object.
 .AV120 and .SD120 are columns in this object.
 I need the exact .SR column names.

 This does not work:
 SRnames <- paste(colnames.mrets, ".SR", sep="")
 AVnames <- paste(colnames.mrets, ".AV120", sep="")
 SDnames <- paste(colnames.mrets, ".SD120", sep="")

 for(i in seq(SRnames)){
 mrets <- merge(mrets, SRnames[i]=apply(mrets, 1, MyFunc,
 ret=AVnames[i], stdev=SDnames[i]))
 }


 Help much appreciated.

 Regards,
 Sergey


 --
 Simplicity is the last step of art./Bruce Lee
 The more you know, the more you know you don't know. /Myself

 I'm not young enough to know everything. /Oscar Wilde
 Experience is one thing you can't get for nothing. /Oscar Wilde
 When you are finished changing, you're finished. /Benjamin Franklin
 Luck is where preparation meets opportunity. /George Patten

 Kniven skärpes bara mot stenen.



Re: [R] Simplifying particular piece of code

2010-03-31 Thread Gustaf Rydevik
On Wed, Mar 31, 2010 at 5:11 PM, Sergey Goriatchev serg...@gmail.com wrote:
 but

 data - merge(data,data.list)

 works.

 Neither data or data.list is a list, so do.call does not work.
 I am very weak on lists, never used them before

 Best,
 Sergey

Hej Sergey,

Ok; I was wondering if the apply thing would work. Cool that merge
would be clever enough to append a matrix. I'm guessing that you've
got what you needed then? For reference (and for the general list), I
had changed the code before Sergey's response, replacing apply() with
lapply(). That code follows below.
Best regards,
Gustaf


-
library(zoo)
cnames <- c("BMM", "GM1", "IYC", "FCA", "IMM", "BME", "CRT", "GTF",
"ERU", "ERE", "EPT", "EVA", "EMT", "EMM", "EMV", "ETM")
SRnames <- paste(cnames, ".SR", sep="")
AVnames <- paste(cnames, ".AV120", sep="")
SDnames <- paste(cnames, ".SD120", sep="")
a <- zoo(matrix(rep(seq(from=160, to=10, by=-10), 1000), ncol=16, byrow=TRUE))
colnames(a) <- AVnames
b <- zoo(matrix(rep(2, 16000), ncol=16))
colnames(b) <- SDnames
data <- merge(a, b)
MyFunc <- function(x, ret, stdev){
   if(any(is.na(c(x[ret], x[stdev])))){
   return(NA)
   }else{
   return(x[ret]/x[stdev])
   }
}
## one column per instrument, holding its SR, AV and SD column names
names.df <- data.frame(rbind(SRnames,AVnames,SDnames), stringsAsFactors=FALSE)
func <- function(.names){
   apply(data, 1, MyFunc, ret=.names[2], stdev=.names[3])
}
data.list <- lapply(names.df, func)
names(data.list) <- SRnames  ## label the results with the .SR names
mrets <- do.call(merge,c(list(data),data.list))
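
If the ratio is all that MyFunc computes, the whole thing can perhaps be done in one vectorised step. The sketch below uses a plain matrix instead of a zoo object (an assumption made to keep it self-contained) and only three of the sixteen names; dat and sr are names made up here. Division already gives NA wherever either input is NA, so no explicit check is needed.

```r
cnames  <- c("BMM", "GM1", "IYC")
SRnames <- paste(cnames, ".SR", sep = "")
AVnames <- paste(cnames, ".AV120", sep = "")
SDnames <- paste(cnames, ".SD120", sep = "")

# Toy data: four rows, averages 30/20/10 and standard deviations of 2
dat <- matrix(c(30, 20, 10, 2, 2, 2), nrow = 4, ncol = 6, byrow = TRUE,
              dimnames = list(NULL, c(AVnames, SDnames)))

# Column-wise ratio of each .AV120 column to its matching .SD120 column
sr <- dat[, AVnames, drop = FALSE] / dat[, SDnames, drop = FALSE]
colnames(sr) <- SRnames            # exact .SR names, set after the computation

dat <- cbind(dat, sr)
colnames(dat)
```

The point of the sketch is that column names taken from a character vector are assigned with colnames() after the fact, rather than being used as argument names inside merge().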










Re: [R] Adding minutes to 24 hour time

2010-03-17 Thread Gustaf Rydevik
On Wed, Mar 17, 2010 at 2:57 PM, Hosack, Michael mhos...@state.pa.us wrote:
 Hi,

 Does anyone know how to add minutes (up to 100 min) to a 24 hour time, to 
 create a new 24 hour time? I can't seem to find any documentation or examples 
 explaining how to do this. The variables of interest are 'ARRIVE','WAIT', and 
 'DEPART' in the attached partial dataframe. I want 'DEPART' to be the sum 
 of 'ARRIVE' and 'WAIT' in 24 hour format. Also, can anyone direct me to some 
 relevant documentation?

 Thank you,

 Mike


If you convert all data to a date-and-time ?POSIXlt object, you can
just convert the minutes to seconds and add them together with +.
Another way would be something like this:

addTime <- function(timeTxt,mins){
  start.time <- strsplit(timeTxt,":")
  start.time <- do.call(rbind,start.time)
  storage.mode(start.time) <- "numeric"
  hours <- mins%/%60
  mins.left <- mins%%60
  end.mins <- (start.time[,2]+mins.left)%%60
  end.hours <- (start.time[,1]+hours+(start.time[,2]+mins.left)%/%60)%%24
  end.time <- paste(end.hours,end.mins,sep=":")
  return(end.time)
}
addTime(c("15:23","7:00"),c(70,100))

or this:

addTime2 <- function(timeTxt,mins){
  orig.date <- as.POSIXct(paste("2001-01-01",timeTxt))
  new.Date <- orig.date+mins*60
  new.Date <- strsplit(as.character(new.Date)," ")
  new.Time <- (sapply(new.Date,"[",2))
  return(new.Time)
}
addTime2(c("15:23","7:00"),c(70,100))
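
A third variant, as a sketch only: let format() write the result back as zero-padded "HH:MM", and pin the time zone to UTC so that daylight saving changes cannot shift the result. The function name add_minutes and the dummy date are arbitrary choices for the illustration.

```r
add_minutes <- function(time_txt, mins) {
  # Attach a dummy date so the clock time can be parsed as a date-time
  t0 <- as.POSIXct(paste("2001-01-01", time_txt),
                   format = "%Y-%m-%d %H:%M", tz = "UTC")
  # POSIXct counts seconds, so minutes are multiplied by 60
  format(t0 + mins * 60, "%H:%M")
}

add_minutes(c("15:23", "7:00", "23:30"), c(70, 100, 45))
# → "16:33" "08:40" "00:15"
```

Times that pass midnight wrap around to the next day, which the "%H:%M" format simply shows as a small hour.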


Regards,
Gustaf


Re: [R] two questions for R beginners

2010-03-01 Thread Gustaf Rydevik
On Mon, Mar 1, 2010 at 4:02 PM, Karl Ove Hufthammer k...@huftis.org wrote:
 On Mon, 01 Mar 2010 09:09:11 -0500 Duncan Murdoch murd...@stats.uwo.ca
 wrote:
  The reason for the difference is that data.frames are lists organized
  into columns (so the $ handling comes from the list, where it means
  extract the component) whereas a matrix is a single vector displayed
  in columns.
 
  Sure, I know that. But is there are reason why the '$' can't be
  overloaded to handle the extraction, as a *convenience* to the user?

 See the second paragraph of my response.

 OK. So I take it that there are no *technical* reasons can't be made to
 work for matrices and named vectors? I tried redefining it for matrices
 with

 `$.matrix`=function(x, name) ... something ...

 but I still get an error message when trying to use it.

 Of course I agree that 'the idea of a list is so fundamental to R that
 it needs to be something learned pretty early', but is there any harm in
 slightly 'blur[ing] the distinction between dataframes and matrices', as
 a convenience to the user? Or, in other words, what does one *gain* by
 having '$' on named matrices and vectors give a confusing error message
 instead of the expected results? Dinstinction for dinstinction's own
 sake is of little use.

 In case anyone is wondering about the vector case (of which matrices is
 of course only a special case), here is an example:

 d=iris[,1:4]
 d1=head(d,1)
 d2=mean(d)

 d1
  Sepal.Length Sepal.Width Petal.Length Petal.Width
 1          5.1         3.5          1.4         0.2
 d2
 Sepal.Length  Sepal.Width Petal.Length  Petal.Width
     5.843333     3.057333     3.758000     1.199333

 d1$Sepal.Width
 [1] 3.5
 d2$Sepal.Width
 Error in d2$Sepal.Width : $ operator is invalid for atomic vectors

 --
 Karl Ove Hufthammer


As a technical exercise, I wrote the following function:

'%W%' <- function(e1,e2) e1[,which(colnames(e1)%in%e2)]

temp <- matrix(1:6,nrow=2,dimnames=list(a=1:2,b=c("a","b","c")))
temp%W%"b"


I assume that the reason you can't use $.matrix is that $ is a
primitive function and doesn't use the UseMethod function.
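
A possible refinement of that guess, offered as a sketch rather than a definitive answer: as far as I understand ?InternalMethods, $ does dispatch internally, but only for objects that carry an explicit class attribute, which a plain matrix does not. Giving the matrix a class of its own makes a $ method reachable. The class name named_matrix is made up for this example.

```r
m <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("a", "b", "c")))

# Give the matrix an explicit class attribute (hypothetical class name)
class(m) <- c("named_matrix", class(m))

# $ method for that class: look the name up among the column names
`$.named_matrix` <- function(x, name) unclass(x)[, name]

m$b
# → 3 4
```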

/Gustaf


[R] WHO Anthro growth curve macros and R

2009-12-28 Thread Gustaf Rydevik
Hi all,

I've got a project where I have to calculate weight-for-age Z-scores,
preferably using the WHO standards.

WHO have been very nice to publish macros for doing this in
STATA, SPSS, SAS and Splus formats
(see http://www.who.int/childgrowth/software/en/), but for some reason
have chosen not to use the free R alternative to Splus.

In the Splus zipfile there are nine datafiles with an .sdd file
ending, presumably data dumps from Splus 7.x. I've tried using
data.restore from the foreign package, but that does not work
(probably because the data is saved in the newer format).

I'm considering trying to read in spss files and massaging them to fit
to the format that the splus macro is expecting, but I'd prefer to be
able to use the Splus files directly.

Has anyone on the list tried using the WHO anthro macros with R, and
can tell me how they did it?
Alternatively, could some, very kind, person try and open the Splus
files, and save them in a R-readable format?
I would be extremely grateful for any help on this.

Best regards,

Gustaf Rydevik




Re: [R] String question

2009-12-23 Thread Gustaf Rydevik
On Wed, Dec 23, 2009 at 11:21 AM, Knut Krueger r...@krueger-family.de wrote:
 Hi to all

 I need a string like
 temp <- paste(m1,m2,m3,sep=",")
 But i must know how many items are in the string,afterwards
 the other option would be to use a vector
 temp <- c(m1,m2,m3)
 No problem to get the count of items but I must get afterwards the string
  m1,m2,m3
 No problem to build the string with a loop, but it should be more easy but
 it seems that I am looking to the wrong functions.

 Kind regards Knut


Just thought I'd show you a solution from the other direction, in
addition to those that all other have posted:


temp <- paste("m1","m2","m3",sep=",")##Generate string
nchar(gsub("([^,])","",temp))+1## Count commas in the string and add 1.
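
Going the other way round may be simpler in practice: keep the items in a vector, build the string with collapse, and split it again when the count is needed. A minimal sketch, with items as a made-up variable name:

```r
# Build the comma-separated string from a vector
temp <- paste(c("m1", "m2", "m3"), collapse = ",")

# Split it back into its items and count them
items <- strsplit(temp, ",", fixed = TRUE)[[1]]

temp
length(items)
```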


Regards,
Gustaf



Re: [R] Question About Repeat Random Sampling from a Data Frame

2009-12-21 Thread Gustaf Rydevik
On Mon, Dec 21, 2009 at 4:12 PM, Adam Carr adamlc...@yahoo.com wrote:
 Good Morning:

 I've read many, many posts on the r-help system and I feel compelled to 
 quickly admit that I am relatively new to R, I do have several reference 
 books around me, but I cannot count myself among the fortunate who seem to 
 strong programming intuition.

 I have a data set consisting of 1637 observations of five variables: tensile 
 strength, yield strength, elongation, hardness and a character indicator with 
 three levels: (Y)es, (N)o, and (F)ail.

 My objective is to randomly sample various subsets from this data set and 
 then evaluate these subsets using simple parameters among them tests for 
 normality, shape and skewness. The data set is ordered by the character 
 variable prior to sampling, and the samples are weighted to mirror 
 representation in an overall, physical process.

 I am sampling the data set using this code:

 sample <- dataset[sample(1:1637, 500, 
 prob=c(rep(163.7/1637,513),rep(245.5/1637,197),rep(1227.8/1637,927)),replace 
 = TRUE),]

 What I would like to do is iterate this process to create many (say 500 or 
 more) sampled sets of n=500 and then evaluate each set for the parameters of 
 interest. I would actually be evaluating each variable within each subset for 
 my characteristic of interest. I am familiar with sampling and saving single 
 columns of data to do this sort of thing, but I am not sure how to accomplish 
 this with a multiple-variable data set.

 For example, I am currently iterating this using a clunky process:

 mysamples <- list()
 for (i in 1:10){
 mysamples[[i]] <- dataset[ 
 sample(1:1637,100,prob=c(rep(163.7/1637,513),rep(245.5/1637,197),rep(1227.8/1637,927)),replace
  = TRUE), ]
 }

 But this leaves me with the additional task of defining each mysample[i] 
 iteration and converting it to a form on which I can apply a standard 
 statistical test like mean() or skewness() to the variable columns within 
 each subset. I have attempted to iteratively convert these lists using this 
 code:

 mat <- matrix(nrow=100,ncol=5)
 for (i in 1:length(mysamples))
 {mat[i] <- do.call('rbind',mysamples[i])}

 but running the code generates the error message: number of items to replace 
 is not a multiple of replacement length. I have tried unsuccessfully, by 
 reading many, many helpful r-help emails on this error, to understand my 
 probably obvious mistake.

 Based on the small amount that I think I know about R it seems to me that 
 sampling the data frame and containing the samples in a list is likely a 
 pretty inefficient way to do this task. Any help that any of you could 
 provide to assist me in iteratively sampling the data frame, and storing the 
 samples in a form on which I can apply other statistical tests would be 
 greatly appreciated.

 Thank you very much for taking the time to consider my questions.

 Adam




That's pretty much how I tend to do those things. What you seem to be
missing is the ?apply family:

mysamples.means <- lapply(mysamples,function(x)mean(x[,1]))
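
To make that a bit more concrete, here is a sketch of the whole round trip on simulated data: draw the weighted samples into a list with lapply(), then summarise every numeric column of every sample with sapply(). The data frame, its column names and the distribution parameters are invented for the illustration; only the group sizes and sampling weights are taken from the question.

```r
set.seed(1)

# Simulated stand-in for the real data: 1637 rows, ordered by the indicator
dataset <- data.frame(
  tensile = rnorm(1637, mean = 60, sd = 5),
  yield   = rnorm(1637, mean = 40, sd = 4),
  flag    = rep(c("Y", "N", "F"), times = c(513, 197, 927))
)

# One sampling weight per row, following the question
w <- rep(c(163.7, 245.5, 1227.8) / 1637, times = c(513, 197, 927))

# Ten weighted samples of 100 rows each, kept in a list
mysamples <- lapply(1:10, function(i) {
  dataset[sample(nrow(dataset), 100, replace = TRUE, prob = w), ]
})

# One row per sample, one column per numeric variable
num.cols <- c("tensile", "yield")
means <- t(sapply(mysamples, function(s) colMeans(s[, num.cols])))
means
```

Replacing colMeans() with another summary, for example a function that returns a skewness per column, follows the same pattern.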


Hope that gets you on your way. If you want more help, I'd suggest
including an example data set in your follow-up messages.

/Gustaf



Re: [R] What is the fastest way to see what are in an RData file?

2009-12-18 Thread Gustaf Rydevik
On Thu, Dec 17, 2009 at 4:33 PM, Peng Yu pengyu...@gmail.com wrote:
 On Thu, Dec 17, 2009 at 5:33 AM, Gustaf Rydevik
 gustaf.ryde...@gmail.com wrote:
 On Wed, Dec 16, 2009 at 10:13 PM, Peng Yu pengyu...@gmail.com wrote:

 Currently, I load the RData file then ls() and str(). But loading the file
 takes too long if the file is big. Most of the time, I only interested what
 the variables are in the the file and the attributes of the variables (like
 if it is a data.frame, matrix, what are the colnames/rownames, etc.)

 I'm wondering if there is any facility in R to help me avoid loading the
 whole file.


 I thought this was interesting as well, so i did a bit of searching
 through the R-help list archives and found this answer by Simon
 Urbanek:
 https://stat.ethz.ch/pipermail/r-devel/2007-August/046724.html
 The link to a c-routine that does what you want still works, but for
 future reference I'm pasting the code below.

 It doesn't work for the RData file that I saved by save(list='test',
 file='test.RData').

 $ rdcopy test.RData
 Format version 3ec, R version = 23813.88.84, release = f9db1dba
 Sorry, this tool supported RXDR version 2 format only



What happens if you remove the version check?
I.e. this one:
  if (ver != 2) {
    XdrInTerm(d);
    error(_("Sorry, this tool supported RXDR version 2 format only\n"));
  }


From what I can read on the help page for ?save, there hasn't been a
change in the file format since 1.4.0.
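
As a fallback inside R itself, the file can be loaded into a separate environment and inspected there. Note that this sketch does not solve the speed problem, since load() still reads the whole file; it only keeps the objects out of the workspace. The temporary file and the objects x and y exist just for the illustration.

```r
# Create a small .RData file to inspect
x <- 1:3
y <- data.frame(a = 1:2, b = c("u", "v"))
f <- tempfile(fileext = ".RData")
save(x, y, file = f)

# Load into a fresh environment instead of the global workspace
e <- new.env()
loaded <- load(f, envir = e)   # load() returns the names of the restored objects

loaded
ls.str(e)                      # compact structure of every object in the file
```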


/Gustaf


Re: [R] What is the fastest way to see what are in an RData file?

2009-12-17 Thread Gustaf Rydevik
On Wed, Dec 16, 2009 at 10:13 PM, Peng Yu pengyu...@gmail.com wrote:

 Currently, I load the RData file then ls() and str(). But loading the file
 takes too long if the file is big. Most of the time, I only interested what
 the variables are in the the file and the attributes of the variables (like
 if it is a data.frame, matrix, what are the colnames/rownames, etc.)

 I'm wondering if there is any facility in R to help me avoid loading the
 whole file.


I thought this was interesting as well, so I did a bit of searching
through the R-help list archives and found this answer by Simon
Urbanek:
https://stat.ethz.ch/pipermail/r-devel/2007-August/046724.html
The link to a C routine that does what you want still works, but for
future reference I'm pasting the code below.

Regards,
Gustaf


/*  rdcopy v0.1-0 - extract objects or display contents of RData RDX2 files
 *
 *  Copyright (C) 2007    Simon Urbanek
 *  based in part on src/main/serialize.c and src/main/saveload.c from R:
 *  Copyright (C) 1995, 1996  Robert Gentleman and Ross Ihaka
 *  Copyright (C) 1997--2007  Robert Gentleman, Ross Ihaka and the
 *    R Development Core Team
 *  License: GPL v2
 *
 *  Although R includes are needed to compile this (for constants),
 *  libR does NOT have to be linked.
 */

#include <stdio.h>
#include <rpc/types.h>
#include <rpc/xdr.h>
#include <R.h>
#include <Rinternals.h>

#ifndef _
#define _(X) X
#endif

#undef error
void error(char *fmt, ...) {
  va_list(ap);

  va_start(ap, fmt);
  vprintf(fmt, ap);
  va_end(ap);
  exit(1);
}

/* .RData:

 byte 0..4  XDR2. - file magic (XDR2\n=XDR ver2)
 byte 5..6  X.    - format (A\n=ASCII, B\n=binary, X\n=XDR)
 byte 7...  RXDR2 stream.

 Note: RXDR2 format in NOT a valid XDR format! Strings and
   raw bytes are not padded and thus cannot be read
   using XDR alone.
*/

/* we need to override this so that we don't have to really use libR */
SEXP R_NilValue = 0;

/*  those are directly from serialize.c */
#define REFSXP    255
#define NILVALUE_SXP  254
#define GLOBALENV_SXP 253
#define UNBOUNDVALUE_SXP  252
#define MISSINGARG_SXP    251
#define BASENAMESPACE_SXP 250
#define NAMESPACESXP  249
#define PACKAGESXP    248
#define PERSISTSXP    247
#define CLASSREFSXP   246
#define GENERICREFSXP 245
#define BCREPDEF  244
#define BCREPREF  243
#define EMPTYENV_SXP  242
#define BASEENV_SXP   241

/* map type to a name */
static const char *nameSEXP(int type) {
  switch (type) {
  case REFSXP: return "REF";
  case NILVALUE_SXP: return "NULL";
  case GLOBALENV_SXP: return ".GlobalEnv";
  case UNBOUNDVALUE_SXP: return "unbound";
  case MISSINGARG_SXP: return "missing";
  case BASENAMESPACE_SXP: return "base";
  case NAMESPACESXP: return "NAMESPACE";
  case PACKAGESXP: return "PACKAGE";
  case PERSISTSXP: return "PERSIST";
  case CLASSREFSXP: return "CLASSREF";
  case GENERICREFSXP: return "GENERICREF";
  case BCREPDEF: return "BC-REP-DEF";
  case BCREPREF: return "BC-REP-REF";
  case EMPTYENV_SXP: return "empty-env";
  case BASEENV_SXP: return "base-env";
  case NILSXP: return "NIL";
  case SYMSXP: return "SYM";
  case LISTSXP: return "LIST";
  case CLOSXP: return "CLO";
  case ENVSXP: return "ENV";
  case PROMSXP: return "PROM";
  case LANGSXP: return "LANG";
  case SPECIALSXP: return "SPECIAL";
  case BUILTINSXP: return "BUILTIN";
  case CHARSXP: return "CHAR";
  case LGLSXP: return "LGL";
  case INTSXP: return "INT";
  case REALSXP: return "REAL";
  case CPLXSXP: return "CPLX";
  case STRSXP: return "STR";
  case DOTSXP: return "...";
  case ANYSXP: return "ANY";
  case VECSXP: return "VEC";
  case EXPRSXP: return "EXPR";
  case BCODESXP: return "BCODE";
  case EXTPTRSXP: return "EXTPTR";
  case WEAKREFSXP: return "WEAKREF";
  case RAWSXP: return "RAW";
  case S4SXP: return "S4";
  }
  return "?";
}

/* again from serialize.c */

#define IS_OBJECT_BIT_MASK (1 << 8)
#define HAS_ATTR_BIT_MASK (1 << 9)
#define HAS_TAG_BIT_MASK (1 << 10)
#define ENCODE_LEVELS(v) (v << 12)
#define DECODE_LEVELS(v) (v >> 12)
#define DECODE_TYPE(v) (v & 255)

/* this structure is passed across all functions. it encapsulates both
the reading and book-keeping */

typedef struct {
  XDR xdrs;
  char *buf;
  long bs;
  FILE *f;
  int lev;
  char *flag;
  int refs;
  long *ref; /* reference offsets */
  int maxrefs; /* length of the refs vector */
  int verb;
  int mode;
  int flags;
  long target;
  FILE *copyf;
} SaveLoadData;

#define M_Read 0
#define M_NonRefCopy   1
#define M_Copy 2
#define M_NonRefSelect 3
#define F_NOREF  1

/* the following is partially based on src/main/saveload.c from R */

static void XdrInInit(FILE *fp, SaveLoadData *d, long sbsize)
{
  xdrstdio_create(&d->xdrs, fp, XDR_DECODE);
  d->buf = (char*) malloc(sbsize);
  if (!(d->buf))
    error(_("cannot allocate memory for a string buffer"));
  d->bs = sbsize;
  d->f = fp;
  d->lev = 0;
  d->flag = 0;
  d->flags = 0;
  d->refs = 0;
  d->maxrefs = 2048;
  d->ref = (long*) malloc(sizeof(long)*d->maxrefs);
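
The header layout described in the ".RData:" comment of the C code can also be checked from R itself, without loading any objects. This is only a minimal sketch and not part of rdcopy: the temporary file and the object x are made up for illustration, and it assumes the file is written with compress = FALSE and version = 2 so that the header is stored as plain bytes.

```r
# Write a tiny uncompressed, version-2 workspace file (hypothetical content).
tmp <- tempfile(fileext = ".RData")
x <- 1:3
save(x, file = tmp, compress = FALSE, version = 2)

# Read only the first 7 bytes instead of loading the whole file.
header <- readBin(tmp, what = "raw", n = 7)
magic  <- rawToChar(header[1:5])   # file magic, including the newline
fmt    <- rawToChar(header[6:7])   # serialization format marker
```

A compressed file (the default for save()) would have to be read through gzfile() or similar first, so this only helps for inspecting the header, not for listing the stored variables.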
  

Re: [R] How to find the significant digits of a number?

2009-12-16 Thread Gustaf Rydevik
On Wed, Dec 16, 2009 at 10:26 AM, Xiang Wu xiang@gmail.com wrote:

 Is there a function in R that could find the significant digit of a
 specific
 number? Such as for 3.1415, return '5'?

 Thanks in advance.




Hi,

x <- pi
substr(as.character(x),6,6)
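
The substr() call hard-codes the position of the digit. A slightly more general sketch, assuming the number prints exactly within 15 significant digits and is not shown in scientific notation, reads the last printed digit instead. The helper name last_digit is made up for this example.

```r
# Hypothetical helper: last digit of the shortest printed form of x.
last_digit <- function(x) {
  s <- format(x, digits = 15)        # e.g. "3.1415"
  substr(s, nchar(s), nchar(s))      # final character of that string
}

d <- last_digit(3.1415)
```

Because of floating point representation this is a convenience for display purposes rather than a rigorous definition of significant digits.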

Regards,

Gustaf
-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] write.csv and header

2009-12-15 Thread Gustaf Rydevik
On Mon, Dec 14, 2009 at 4:37 PM, Walther, Alexander 
awaltherm...@googlemail.com wrote:

 Dear list,

 I would like to export a matrix to a TXT-File by using write.csv (not
 necessarily). Is there a way to add a header (with additional
 informations concerning the project) spanning multiple lines to this
 file before the actual data are listed up? Should look like this:



 date:
 filename:
 number of permutations:

 

 data (as a matrix)



 Any suggestions? Thnx in advance.


 cheers

 Alex




Hi,

?write.table and the argument append should be of help.
example:

 sink("test.csv")
 cat("-")
 cat("\n")
 cat("This is \n a test of header")
 cat("\n")
 cat("-")
 cat("\n")
 sink()

write.table(matrix(rnorm(100),nrow=10),file="test.csv",append=TRUE,sep=",")
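
The same idea can be sketched without sink(), writing the header lines with writeLines() and then appending the matrix. The file name and the header values below are placeholders, not taken from the original question.

```r
# Sketch: header lines first, then the matrix appended below them.
out <- tempfile(fileext = ".csv")            # hypothetical output file
header <- c(paste("date:", Sys.Date()),
            "filename: example",
            "number of permutations: 100")   # placeholder values
writeLines(header, out)

m <- matrix(1:6, nrow = 2)
# append = TRUE together with column names triggers a warning, silenced here.
suppressWarnings(
  write.table(m, file = out, append = TRUE, sep = ",", row.names = FALSE)
)

# Reading back: skip the header lines.
back <- read.csv(out, skip = length(header))
```

Reading the file back with skip = length(header) is what makes the multi-line header harmless for later processing.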


regards,
Gustaf


-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] Literature analysis

2009-12-11 Thread Gustaf Rydevik
On Fri, Dec 11, 2009 at 3:04 PM, Schwan s.s.hosse...@utwente.nl wrote:

 Thanks, but how should I put the citation inside a data frame?

 data.frame(first txt file, second txt file...)
 plot (what should I insert here) type=p

 And how should I load the txt files anyway inside the frame?


 Can you give an example of a couple of text files? Are they in a
standardised format (i.e. bibTEX or similar)?

/Gustaf

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] grep() exclude certain patterns?

2009-12-09 Thread Gustaf Rydevik
Hi,
Just a quick note regarding google and R: I use www.rseek.org almost
exclusively, and it tends to give me the results I need. It is based on
google, but uses a number of smart tricks to ferret out R-relevant
information.

/Gustaf

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] savePlot for Mac and / or Linux?

2009-12-08 Thread Gustaf Rydevik
On Mon, Dec 7, 2009 at 9:53 AM, Christophe Genolini
cgeno...@u-paris10.fr wrote:

 Hi all,

 In the package rtlu, I use the function savePlot. It is convenient since it
 let the user decide in which graphic format he wants his graph to be export.
 But when I run R CMD check, I get the following message :

   rtlu(V1,fileOutput="First.tex",textBefore="\\section{Variable 1 to 3}",graphName="V1")
 Error in savePlot(filename = nomBarplot, type = type) : can only copy from
 'windows' devices
 Calls: rtlu ... r2lUniv -> r2lUniv.factor -> r2lBarplot -> savePlot
 Execution halted

 I guess this is a compatibility problem with Linux/Mac? Is there something
 close to savePlot for Mac / Linux?

 Christophe



I'm not sure I understand exactly what you want, but for easy changing of
the output file type,
I've written this small function. Perhaps it can be of help.

Regards,
Gustaf

-


### Function by Gustaf Rydevik, 2009-12-03 gustaf.ryde...@gmail.com
## Created to facilitate easy changes in the file format of generated graphs.
## Gen.device() generates a device function that is a copy of an existing
## function, but with (possibly) new defaults.
## Wanted.device can be the name of any device you choose:
## png(), jpeg(), postscript(), etc.
## If fileEnding is missing, the function uses the Wanted.device name as
## file ending.
## I then use My.device in the rest of the script file, meaning that I only
## have to change file format in one location (in the argument of
## Gen.device()) to do so for all subsequent graphs.

Gen.device <- function(Wanted.device="png", fileEnding=NULL, ...){
  dots <- list(...)
  ending <- Wanted.device
  Wanted.device <- get(Wanted.device)
  if(!is.null(fileEnding)) ending <- fileEnding
  generated.device <- function(File, ...){
    dots2 <- list(...)
    File <- paste(File, ending, sep=".")
    ## arguments given at call time override the stored defaults
    dots[which(names(dots) %in% names(dots2))] <- NULL
    do.call(Wanted.device, c(filename=File, dots, dots2))
  }
  return(generated.device)
}


## example
My.device <- Gen.device("png", width=7, height=7, units="in", res=Res<-200)
My.device(File="test")
plot(rnorm(1999))
dev.off()
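
For the original savePlot() question, a platform-independent pattern is to open a file device, draw, and close it again. The sketch below is one possible way to do that; save_plot and its arguments are made-up names, and it assumes the chosen device takes the file name as its first argument, which holds for pdf(), png() and jpeg().

```r
# Hypothetical helper: choose the graphics device by name at call time.
save_plot <- function(draw, file, type = c("pdf", "png", "jpeg")) {
  type <- match.arg(type)
  path <- paste(file, type, sep = ".")
  device <- get(type, mode = "function")   # pdf(), png() or jpeg() from grDevices
  device(path)                             # first argument is the file name
  on.exit(dev.off())                       # always close the device
  draw()
  invisible(path)
}

p <- save_plot(function() plot(rnorm(100)), file.path(tempdir(), "test"), "pdf")
```

Unlike savePlot(), this does not copy from a screen device, so it does not depend on a 'windows' device being open.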




-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] R help - IGARCH estimation

2009-10-19 Thread Gustaf Rydevik
On Mon, Oct 19, 2009 at 6:37 AM, Xu, Ke-Li k...@bus.ualberta.ca wrote:

  Hi there,

 Thanks for your previous help on R. Do you know how to estimate an IGARCH
 (integrated GARCH) model in R? I need it when I estimate the Value at Risk
 following RiskMetrics methodology.

 regards,
 Keli


Hi Keli,

I would like to point you to the website:
www.rseek.org

I don't know anything at all about IGARCH, but a quick search pointed me to:
http://rgarch.r-forge.r-project.org/
which seems to include the mentioned model.

I would also recommend that you subscribe to R-help at
https://stat.ethz.ch/mailman/listinfo/r-help
and send questions there instead of directly to me (who is not much of an R
expert...)

best regards, and good luck

Gustaf



-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] Function to find prime numbers

2009-10-13 Thread Gustaf Rydevik
library(gmp)
?isprime
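
If installing gmp is not an option, a base-R sketch using trial division is enough for moderately sized integers. The helper is_prime below is a made-up name and not part of gmp.

```r
# Hypothetical base-R helper: trial division, fine for moderately sized integers.
is_prime <- function(n) {
  vapply(n, function(k) {
    if (k < 2) return(FALSE)
    if (k < 4) return(TRUE)            # 2 and 3
    if (k %% 2 == 0) return(FALSE)     # even numbers above 2
    if (k < 9) return(TRUE)            # 5 and 7
    d <- seq(3, floor(sqrt(k)), by = 2)
    all(k %% d != 0)
  }, logical(1))
}

x <- c(1, 2, 3, 4, 5, 9, 11, 25, 97)
primes <- x[is_prime(x)]
```

Indexing the array with the logical result returns all primes it contains.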


/Gustaf



On Tue, Oct 13, 2009 at 9:59 AM, AJ83 aljense...@gmail.com wrote:


 I need to create a function to find all the prime numbers in an array. Can
 anyone point me in the right direction?
 Thank you.
 AJ




-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] Matching in R

2009-04-27 Thread Gustaf Rydevik
On Sun, Apr 26, 2009 at 6:22 PM,  dirk...@gmx.de wrote:
 Dear R users,

 I am trying to do exact matching on a large dataset (500.000 obs), about 
 equal size of treatment and controll group, with replacement: As for the 
 moment I use the Match function of the Matching library. I match on 2 
 covariates and all observations in the treatment group have at least one 
 exact counterpart in the controllgroup. Now I want to introduce observation 
 weights. I set ties=FALSE, as I want exactly one by one matching: Is there a 
 way which makes that I draw randomly from the individuals in the 
 controllgroup which have the same values of covariates as the individual in 
 the treatmentgroup, setting the probabilities to be drawn proportional to the 
 weights of the individual in the CT? E.g. I have three individuals which all 
 have the same value for the covariates as the one observation I want to find 
 a partner for, and the first of the three individuals has a very large 
 weight: Now when drawing randomly among those three I want the probability 
 that the first one is drawn to be very large.

 I'd really appreciate any suggestions: the weights option does not do the 
 job, this seems to work only if setting ties=TRUE

 Thanks
 Dirk
 --


Hi Dirk,

You don't give a sample dataset, and I've not used the Matching
library, so take my comments with a scoop of salt.
Looking at the help page for Match, it seems as if the option
Weight.matrix is what you're looking for. creating a weight column
in the treatment group with a constant, high value, including weight
in the matching, and giving that covariate a high importance might
work, no?

/Gustaf
-

Quote:

Weight.matrix
This matrix denotes the weights the matching algorithm uses when
weighting each of the covariates in X—see the Weight option. This
square matrix should have as many columns as the number of columns of
the X matrix. This matrix is usually provided by a call to the
GenMatch function which finds the optimal weight each variable should
be given so as to achieve balance on the covariates.

For most uses, this matrix has zeros in the off-diagonal cells. This
matrix can be used to weight some variables more than others. For
example, if X contains three variables and we want to match as best as
we can on the first, the following would work well:
 Weight.matrix <- diag(3)
 Weight.matrix[1,1] <- 1000/var(X[,1])
 Weight.matrix[2,2] <- 1/var(X[,2])
 Weight.matrix[3,3] <- 1/var(X[,3])
This code changes the weights implied by the inverse of the variances
by multiplying the first variable by a 1000 so that it is highly
weighted. In order to enforce exact matching see the exact and caliper
options. 
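
Independently of the Matching package, the drawing scheme Dirk describes (one exact-match control per treated unit, chosen with probability proportional to its weight) can be sketched in base R with sample.int() and its prob argument. All data and names below are made up for illustration, and the loop over treated units would need a faster grouping step for 500,000 observations.

```r
set.seed(1)
# Made-up data: two treated units, five controls with observation weights w.
treated  <- data.frame(id = 1:2, x1 = c(1, 2), x2 = c(0, 1))
controls <- data.frame(id = 101:105,
                       x1 = c(1, 1, 1, 2, 2),
                       x2 = c(0, 0, 0, 1, 1),
                       w  = c(100, 1, 1, 1, 1))

# For one treated unit, draw one exact-match control with probability
# proportional to its weight.
draw_match <- function(i) {
  cand <- which(controls$x1 == treated$x1[i] & controls$x2 == treated$x2[i])
  pick <- cand[sample.int(length(cand), size = 1, prob = controls$w[cand])]
  controls$id[pick]
}

matched <- vapply(seq_len(nrow(treated)), draw_match, integer(1))
```

Since the same control can be picked for several treated units, this corresponds to matching with replacement.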

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



[R] Margins in lattice and device resolution

2009-04-17 Thread Gustaf Rydevik
Hi all,

I believe I've run into this before, but I seem to have totally
forgotten. No headway in the last couple of hours either. How do I
make sure that points and margins remain the same absolute size as I
change
the resolution of a device?
(I'm running 2.9.1 patched, on a Win XP-machine)

Many thanks in advance,

Gustaf

Ps: As an afterthought, might it be that this behaviour is related to
the recent grid-bug for text size in lattice when changing resolution?


Example:
.

library(lattice)
library(Cairo)

### This gives a totally squished graph, where the actual plotting area is minimal
CairoPNG("example.png", width = 480, height = 480,
 dpi=600,
 pointsize = 12, bg = "white")
bwplot(decrease ~ treatment, OrchardSprays, groups = rowpos,
   panel = panel.superpose,
   panel.groups = panel.linejoin,
   xlab = "treatment",
   key = list(lines = Rows(trellis.par.get("superpose.line"),
  c(1:7, 1)),
  text = list(lab = as.character(unique(OrchardSprays$rowpos))),
  columns = 4, title = "Row position"))
dev.off()


### This gives a more normal looking graph
CairoPNG("example2.png", width = 480, height = 480,
 dpi=20,
 pointsize = 12, bg = "white")
bwplot(decrease ~ treatment, OrchardSprays, groups = rowpos,
   panel = panel.superpose,
   panel.groups = panel.linejoin,
   xlab = "treatment",
   key = list(lines = Rows(trellis.par.get("superpose.line"),
  c(1:7, 1)),
  text = list(lab = as.character(unique(OrchardSprays$rowpos))),
  columns = 4, title = "Row position"))
dev.off()

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



[R] utils lacking namespace?

2009-04-15 Thread Gustaf Rydevik
Hi all,
A colleague of mine tried to install the package EMV, which had been
removed from CRAN.
She ran into some kind of trouble, R locked up, and she closed the program.
Now when she starts R, utils can't be loaded, which of course creates
an unworkable environment.
Below I've copy-pasted the error message she gets when starting R.
Any ideas on what went wrong, and more importantly, how to fix it?
Many thanks in advance,

Gustaf Rydevik

Ps: She's running R on a WinXP box, if that might be of relevance...




Error : package 'utils' does not have a name space

R version 2.8.1 (2008-12-22)
Copyright (C) 2008 The R Foundation for Statistical Computing ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and 'citation()' on how to
cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

Warning message:
package 'methods' in options("defaultPackages") was not found
Error in library(package, lib.loc = lib.loc, character.only = TRUE,
logical.return = TRUE,  :
  'utils' is not a valid package -- installed < 2.0.0?



-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] utils lacking namespace?

2009-04-15 Thread Gustaf Rydevik
On Wed, Apr 15, 2009 at 12:20 PM, Duncan Murdoch murd...@stats.uwo.ca wrote:
 Gustaf Rydevik wrote:

 Hi all,
 A colleague of mine tried to install the package EMV, which had been
 removed from CRAN.
 she ran into some kind of trouble, R locked up, and she closed the
 program.
 Now when she starts R, utils can't be loaded which of course create
 an unworkable environment.
 Below I've copy-pasted the error message she gets when starting R.
 Any ideas on what went wrong, and more importantly, how to fix it?


 No idea of the details of what went wrong, but it looks as though your
 colleague has some bad startup file (Renviron, Rprofile, etc; see ?Startup
 for the full list) or has actually damaged her R installation.  I'd try
 re-installing it first, because that's easy, then work through ?Startup and
 see if there are some bad files or environment variables messing things up.

 Duncan Murdoch

Hi, and thanks for the help!

It turned out, after a bit of searching through the library's file
structure, that the utils directory had somehow been moved into the
directory belonging to the package NADA. It must have been some
installation script (of the EMV package?) that for some reason moved
it there, but heaven knows why.
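
On an installation that still starts, a quick way to check where R expects a package to live is find.package() together with .libPaths(). This is only a diagnostic sketch; the variable names are made up.

```r
# Where does R think the utils package lives?
utils_path <- find.package("utils")
# An intact package directory contains a DESCRIPTION file.
has_description <- file.exists(file.path(utils_path, "DESCRIPTION"))
# The library trees R searches, in order.
libs <- .libPaths()
```

If the directory reported here is missing or incomplete, comparing it with a fresh installation shows which folders were moved.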

Oh well, things got sorted out in the end anyhow, and all's well now!

regards,

Gustaf




-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] same value in column--delete

2009-03-26 Thread Gustaf Rydevik
On Thu, Mar 26, 2009 at 12:15 PM, Duijvesteijn, Naomi
naomi.duijveste...@ipg.nl wrote:

   Hi Readers,


   I have a question.


   I have a large dataset and want to throw away columns that have the same
   value in the column itself and I want to know which column this was.


   For example

    x <- data.frame(id=c(1,2,3), snp1=c("A","G","G"),
                    snp2=c("G","G","G"), snp3=c("G","G","A"))

    x

     id snp1 snp2 snp3

   1  1    A    G    G

   2  2    G    G    G

   3  3    G    G    A


   Now I want to know that snp2 in monomorphic (the same value for the column)
   and after I know which column it is I want to take these columns out.


   Thanks,

   Naomi



Another, perhaps slightly more intuitive solution than Jim's would be
the following:

x <- data.frame(id=c(1,2,3), snp1=c("A","G","G"),
                snp2=c("G","G","G"), snp3=c("G","G","A"))
is.monovalued <- function(df){
  sapply(df, function(x){
    length(unique(x))==1
  })
}

monovaluedCols <- is.monovalued(x)
which(monovaluedCols)
x[!monovaluedCols]

/Gustaf
-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] modifying a built in function from the stats package (fixing arima)

2009-03-05 Thread Gustaf Rydevik
On Thu, Mar 5, 2009 at 10:00 AM, Marc Vinyes mvin...@aleasoft.com wrote:
If you ***look at the code*** for arima you will see that ``%+%'' is
defined
in terms of a call to ``.Call()'' which calls ``R_TSconv''.  So
apparently
R_TSconv is a C or Fortran function or subroutine in a ``shared
object library''
or dll upon which arima depends.  Hence to do anything with it you'll
need to get
that shared object library and dynamically load it.  (E.g. get the
code, SHLIB it,
and dynamically load the resulting shared object library.)

The code is all available from the R source tarball.

If this is a challenge for you then the best advice would be not to
mess with it.

 Hi Rolf,
 It took me some time to come to the same conclusion (I didn't even know what
 .Call() was) but I've found an easier way to modify the R file without
 having to understand how to link dlls. I just downloaded the full R package,
 Rtools and followed the instructions in
 http://cran.r-project.org/doc/manuals/R-admin.html#Building-the-core-files
 to build it. Then I can modify C:\R\src\library\stats\R\arima.R and run it.
 It is quite exagerated that I have to build R in order to modify an R file
 without messing with dlls, and I think it would be interesting to make this
 process easier, but for now I'm happy to be productive again.

 Thank you all for your help,

 Best,
 MarC



Just a quick note on your original question:
if you use edit(arima), you have to remember that it returns the
modified function, which then must be stored.

I.e., use
arima <- edit(arima)

instead of just

edit(arima)

and the changes should be stored.
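
The underlying point is that functions in R are ordinary values: changing a copy has no effect until the result is assigned to a name. Since edit() is interactive, the sketch below uses body<- on a toy function to show the same behaviour; f and g are made-up names.

```r
f <- function(x) x + 1
g <- f                      # g is a copy of f
body(g) <- quote(x + 2)     # modify the copy only
f_result <- f(1)            # original is unchanged
g_result <- g(1)            # the stored, modified copy
```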

Regards,

Gustaf

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] Have a function like the _n_ in R ? (Automatic count function )

2009-02-25 Thread Gustaf Rydevik
On Wed, Feb 25, 2009 at 3:30 PM, hadley wickham h.wick...@gmail.com wrote:
 And for completeness here's a function that returns the next integer
 on each call.

 n <- (function(){
  i <- 0
  function() {
    i <<- i + 1
    i
  }
 })()

 n()
 [1] 1
 n()
 [1] 2
 n()
 [1] 3
 n()
 [1] 4
 n()
 [1] 5
 n()
 [1] 6


 ;)

 Hadley



*headache*!
I can't wrap my head around this one - too strange code!
Could someone please give a hint on what's going on?
How does i <<- i+1 modify i permanently, seeing as i is defined as 0
to start with?

/Gustaf

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] Have a function like the _n_ in R ? (Automatic count function )

2009-02-25 Thread Gustaf Rydevik
On Wed, Feb 25, 2009 at 4:43 PM, Charles C. Berry cbe...@tajo.ucsd.edu wrote:
 On Wed, 25 Feb 2009, Gustaf Rydevik wrote:

 On Wed, Feb 25, 2009 at 3:30 PM, hadley wickham h.wick...@gmail.com
 wrote:

 And for completeness here's a function that returns the next integer
 on each call.

 n <- (function(){
  i <- 0
  function() {
    i <<- i + 1
    i
  }
 })()

 n()

 [1] 1

 n()

 [1] 2

 n()

 [1] 3

 n()

 [1] 4

 n()

 [1] 5

 n()

 [1] 6


 ;)

 Hadley



 *headache*!
 I can't wrap my head around this one - too strange code!
 Could someone please give a hint on what's going on?
 How does i <<- i+1 modify i permanently, seeing as i is defined as 0
 to start with?


 i is not _defined_ as zero. It is initially _assigned_ the value of zero and
 is subsequently assigned other values.

 As for the details of what goes here, see

        An Introduction to R
        Section 10.7 Scope

 and study the open.account() example there.

 HTH,

 Chuck



Thank you - I think I finally understood how that code got parsed.
Does the text below describe things correctly?
First, Hadley defines a function that returns another function, like this:

function(){
 i <- 0
 function() {
   i <<- i + 1
   i
 }
}

Since the returned function is defined in a local environment, R
returns the function together with that local environment, and lexical
scoping can work its magic.
Finally, Hadley evaluates the above-defined function-returning
function, and stores the returned function in n.

n <- (function(){
 i <- 0
 function() {
   i <<- i + 1
   i
 }
})()
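
The same mechanism is perhaps easier to follow when the outer function gets a name. In this sketch make_counter is a made-up name; the <<- operator assigns to the i found in the enclosing environment rather than creating a new local one, and each call to make_counter() creates a fresh environment with its own i.

```r
make_counter <- function() {
  i <- 0                 # lives in the environment created by this call
  function() {
    i <<- i + 1          # updates i in the enclosing environment
    i
  }
}

n <- make_counter()
first  <- n()
second <- n()
stored <- environment(n)$i   # the counter state kept with the closure

m <- make_counter()          # an independent counter with its own i
m_first <- m()
```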

*Phew*
That wasn't too difficult after all :-)

/Gustaf


-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] an S idiom for ordering matrix by columns?

2009-02-19 Thread Gustaf Rydevik
On Thu, Feb 19, 2009 at 5:40 PM, Aaron Mackey ajmac...@gmail.com wrote:
 There's got to be a better way to use order() on a matrix than this:

 y
     2L-035-3 2L-081-23 2L-143-18 2L-189-1 2R-008-5 2R-068-15 3L-113-4 3L-173-2
 398        1         1         2        2        1         1        2        2
 857        1         1         2        2        1         2        2        2
 911        1         1         2        2        1         2        2        2
 383        1         1         2        2        1         1        2        2
 639        1         2         2        1        2         2        1        2
 756        1         2         2        1        2         2        1        2
     3L-186-1 3R-013-7 3R-032-1 3R-169-10 X-002 X-087
 398        1        2        2         2     1     2
 857        1        2        2         2     1     2
 911        1        2        2         2     1     2
 383        1        2        2         2     1     2
 639        2        2        1         2     1     2
 756        2        2        1         2     1     2


 y[order(y[,1],y[,2],y[,3],y[,4],y[,5],y[,6],y[,7],y[,8],y[,9],y[,10],y[,11],y[,12],y[,13],y[,14]),]
     2L-035-3 2L-081-23 2L-143-18 2L-189-1 2R-008-5 2R-068-15 3L-113-4 3L-173-2
 398        1         1         2        2        1         1        2        2
 383        1         1         2        2        1         1        2        2
 857        1         1         2        2        1         2        2        2
 911        1         1         2        2        1         2        2        2
 639        1         2         2        1        2         2        1        2
 756        1         2         2        1        2         2        1        2
     3L-186-1 3R-013-7 3R-032-1 3R-169-10 X-002 X-087
 398        1        2        2         2     1     2
 383        1        2        2         2     1     2
 857        1        2        2         2     1     2
 911        1        2        2         2     1     2
 639        2        2        1         2     1     2
 756        2        2        1         2     1     2

 Thanks for any suggestions!

 -Aaron



You mean something like this:
 test <- matrix(sample(1:4,100,replace=T),ncol=10)
 test[do.call(order,data.frame(test)),]
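
To see why this works: do.call() passes each column to order() as a separate sort key, so the first column is the primary key, the second breaks ties, and so on. A small deterministic sketch, with a made-up matrix m and the columns collected in a plain list instead of a data frame:

```r
m <- matrix(c(2, 1, 2, 1,
              1, 2, 1, 1), ncol = 2)
# One sort key per column, passed to order() as separate arguments.
keys <- lapply(seq_len(ncol(m)), function(j) m[, j])
sorted <- m[do.call(order, keys), ]
```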

?

Regards,

Gustaf


-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] Alternate to for-loop

2009-02-16 Thread Gustaf Rydevik
On Mon, Feb 16, 2009 at 12:59 PM, megh megh700...@yahoo.com wrote:

 Hi, I am trying to create a vector of length 10 (say), wherein each element
 will be average of random sample of size 100, from a distribution, say
 Normal. Can anyone please tell me without creating a for loop, how I can
 do that?

 Regards,




As a variant of Patrick Burns' code, you can write:

rowMeans(matrix(rnorm(1000),ncol=100))

and substitute another distribution for rnorm if you want.
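
The matrix version works because 1000 draws in 100 columns give 10 rows, each row being one sample of size 100. An equivalent sketch that states the intent more directly uses replicate(); the seed is only there to make the example reproducible.

```r
set.seed(123)
# Ten means, each computed from a fresh sample of 100 normal draws.
means <- replicate(10, mean(rnorm(100)))
```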

/Gustaf

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Generating Numbers With Certain Distribution in R

2009-02-11 Thread Gustaf Rydevik
On Wed, Feb 11, 2009 at 2:15 PM, Ben Bolker bol...@ufl.edu wrote:
 Bernardo Rangel Tura wrote:

 I think your routine need a little fix

 x <- rlnorm(1e6,meanlog=1,sdlog=1) ## pick any parameters you like
 y <- round((x-min(x)/diff(range(x)))*19+1)

 What you think?

  Yes.


No.
Bernardo misplaced the parenthesis around (x-min(x)).
The correct version is:

x <- rlnorm(1e6,meanlog=1,sdlog=1) ## pick any parameters you like
y <- round((x-min(x))/diff(range(x))*19+1)
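
The difference between the two versions shows up in the range of the result. In this sketch (smaller sample, fixed seed, made-up variable names) the correctly parenthesised expression maps the data onto the integers 1 to 20, while the misplaced parenthesis only divides min(x) by the range and leaves x unscaled.

```r
set.seed(42)
x <- rlnorm(1e4, meanlog = 1, sdlog = 1)

# Parenthesis around (x - min(x)): shift to 0, scale to [0, 1], stretch to [1, 20].
y_correct <- round((x - min(x)) / diff(range(x)) * 19 + 1)

# Misplaced parenthesis: only min(x) is divided by the range.
y_wrong <- round((x - min(x) / diff(range(x))) * 19 + 1)

range_correct <- range(y_correct)
```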


/Gustaf


-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] How to comment in R

2009-02-11 Thread Gustaf Rydevik
On Wed, Feb 11, 2009 at 2:15 PM, baptiste auguie ba...@exeter.ac.uk wrote:
 A somewhat twisted approach that has not been mentioned is to consider
 everything a comment unless it is enclosed in special tags, as done in the
 brew package,

 for example,


  brew(textConnection("
 You won't see this R output, but it will run. <% foo <- 'bar' %>
  Now foo is <%=foo%> and today is <%=format(Sys.time(),'%B %d, %Y')%>.
  ") )

 gives,

 You won't see this R output, but it will run.
 Now foo is bar and today is February 11, 2009.


 I'd love to see an editor with a brew mode that acts as a notebook: you
 type in your text in whatever language without worrying about the syntax (R
 syntax, i mean!), and when you want to do a calculation you just enclose it
 in such tags that behave like an inverted block comment.

 Just a thought,

 baptiste



Isn't this almost exactly what ?Sweave does? (and  odfWeave).
Granted, you have to deal with latex code to get nice output, but
latex is a GoodThing (tm).

/Gustaf

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] Beginner-how to down size a large sample

2009-02-11 Thread Gustaf Rydevik
On Wed, Feb 11, 2009 at 3:15 PM, pramil cheriyath drpra...@gmail.com wrote:
 I have this large data set with an outcome variable 0 and 1,  I want
 to randomly pick 100 from each group and create another data set.





dataSet <- data.frame(group=sample(c(1,0),10000,replace=T),data=rnorm(10000))
dataSet.1 <- dataSet[dataSet$group==1,]
dataSet.0 <- dataSet[dataSet$group==0,]
sampled.1 <- dataSet.1[sample(1:nrow(dataSet.1),100),]
sampled.0 <- dataSet.0[sample(1:nrow(dataSet.0),100),]

newdataSet <- rbind(sampled.1,sampled.0)
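
With more than two groups, the same steps can be written once using split() and lapply(). This is a sketch on made-up data; it assumes every group has at least 100 rows, otherwise sample() will fail.

```r
set.seed(1)
dataSet <- data.frame(group = rep(c(0, 1), each = 500), data = rnorm(1000))

# Split by group, draw 100 rows from each piece, and stack the pieces again.
pieces <- lapply(split(dataSet, dataSet$group),
                 function(d) d[sample(nrow(d), 100), ])
newdataSet <- do.call(rbind, pieces)
counts <- table(newdataSet$group)
```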

/Gustaf
(a "please" would have been nice)

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] Counting session days

2009-02-09 Thread Gustaf Rydevik
On Mon, Feb 9, 2009 at 4:57 PM,  stefan.peters...@inizio.se wrote:

 hi,

 I have some session data in a dataframe, where each session is recorded with 
 a start and a stop date. Like this:

 session_start   session_stop
 ===
 2009-01-03  2009-01-04
 2009-01-01  2009-01-05
 2009-01-02  2009-01-09

 A session is at least one day long. Now I want a dataframe with 'active 
 sessions' per date. Like this:

 date        active_sessions
 ===========================
 2009-01-01  1
 2009-01-02  2
 2009-01-03  3
 2009-01-04  3
 2009-01-05  2
 2009-01-06  1
 2009-01-07  1
 2009-01-08  1
 2009-01-09  1

 How do I do that? I've searched the usual sources, but my newbie status and 
 language barrier left me with nothing. So plz, anyone?


Hej Stefan,

The following should do. It's a bit convoluted though, so someone else
might be able to come up with a better solution.

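> test
       start       stop
1 2009-01-03 2009-01-04
2 2009-01-01 2009-01-05
3 2009-01-02 2009-01-09

## One sequence of dates per session (apply passes each row as a character vector)
activeDaysPerSession<-apply(test,MARGIN=1,FUN=function(x)
seq(from=as.Date(x["start"]),
to=as.Date(x["stop"]),by=1
)
)
## unlist() drops the Date class, so an origin is needed to convert back
ActiveDays<-as.Date(unlist(activeDaysPerSession),origin="1970-01-01")
as.data.frame(table(ActiveDays))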



Regards,

Gustaf
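
For reference, here is a self-contained variant of the same approach that can be pasted into a fresh session. It is a sketch under the assumption that the start and stop columns are already of class Date; the names `sessions`, `active_days` and `counts` are made up for the illustration.

```r
sessions <- data.frame(
  start = as.Date(c("2009-01-03", "2009-01-01", "2009-01-02")),
  stop  = as.Date(c("2009-01-04", "2009-01-05", "2009-01-09"))
)

# One vector of "YYYY-MM-DD" strings per session, covering every active day
active_days <- lapply(seq_len(nrow(sessions)), function(i) {
  format(seq(sessions$start[i], sessions$stop[i], by = "day"))
})

# Count how often each day occurs across all sessions
counts <- as.data.frame(table(date = unlist(active_days)),
                        responseName = "active_sessions")
counts$date <- as.Date(as.character(counts$date))
counts
```

Formatting the dates as strings before tabulating avoids the loss of the Date class that unlist() causes. Days on which no session is active do not show up at all; they would have to be merged in from a full seq() of dates.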

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] Selectively Removing objects

2009-02-02 Thread Gustaf Rydevik
On Mon, Feb 2, 2009 at 2:16 PM, Paulo Grahl pgr...@gmail.com wrote:
 Dear list members,

 Does anyone know how to use rm() to remove only variables but not
 declared functions from the environment ?
 I understand I could name all the functions with, let's say
 f_something, make sure that all variables do not start with f_ and
 then remove all BUT objects starting with f_.
 However, I have already defined all the functions and it would be
 troublesome to change all of them to a new name.

 Any hint ?
 Thanks

 Paulo Gustavo Grahl, CFA



[Note to Paulo:I changed the code slightly: defining Nonfunctions
separately messed things up.]


Hi Paulo,

The following should do it.

test<-function(x)x^2
test2<-5
test3<-77
ls()

## Remove every object that is not a function
rm(list=ls()[
sapply(ls(),
   function(x){
 !is.function(get(x))
 })
])
ls()

Regards,

Gustaf
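
To try the idea without touching the objects in your own workspace, the same steps can be run inside a scratch environment. This is a minimal sketch; `env`, `square`, `a` and `b` are names invented for the example.

```r
env <- new.env()
assign("square", function(x) x^2, envir = env)
assign("a", 5, envir = env)
assign("b", 77, envir = env)

# TRUE for every name in env that refers to a function
is_fun <- vapply(ls(env),
                 function(nm) is.function(get(nm, envir = env)),
                 logical(1))

# Remove everything that is not a function
rm(list = names(is_fun)[!is_fun], envir = env)
ls(env)
```

is.function() returns a single TRUE or FALSE for any object, whereas class() can return several strings for one object, so the test is safer inside sapply() or vapply().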

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] ifelse help?

2009-01-20 Thread Gustaf Rydevik
On Mon, Jan 19, 2009 at 9:08 PM,  rkevinbur...@charter.net wrote:
 Sorry I didn't give the proper initialization of j. But you are right j 
 should also be an array of 5. So x[j + 5] would return 5 values.

 So if the array returned from 'ifelse' is the same dimension as test (h),
 then are all the values of h being tested? So since h as you say has no 
 dimensions is the test only testing h[1]? Again it seems that if all of the 
 elements of h are tested (there are 5 elements) and each element produces an 
 array of 5 the resulting array should be 25.

 Kevin


ifelse returns a result of the same length as the test, filled in
element by element. In this case, it will return the vector:
c(x[j+2][1] , x[j+2][2] , x[j+2][3] , x[j+2][4] , x[j+2][5]).

If you instead write:

h<-numeric(5)
j<-1:5
p<-1:5
x<-1:1000
ifelse(h == 0,list(x[j+2]), 1:5)

you get what you expected, since ifelse recycles the second argument
if necessary.

Regards,

Gustaf
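
The difference between the two calls can be checked directly. A small sketch using the values from this thread (`res` and `res_list` are names chosen for the example):

```r
h <- numeric(5)
j <- 1:5
x <- 1:1000

# One value per element of the test: a plain vector of length 5
res <- ifelse(h == 0, x[j + 2], 1:5)

# Wrapping the vector in list() makes it a single element that is recycled,
# so every position of the result holds the complete vector x[j + 2]
res_list <- ifelse(h == 0, list(x[j + 2]), 1:5)

length(res)
length(res_list)
res_list[[1]]
```

Either way the result has as many elements as the test has. The list() version only changes what each of those elements contains.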

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



[R] two-sample test of multinomial proportion

2009-01-20 Thread Gustaf Rydevik
Hi all,

This is perhaps more a statistics question than an R question, but I
hope it's OK anyhow.

I have some data (see below) with the number of tests positive to
subtype H1 of a virus, the number of tests positive to subtype H3, and
the total number of tests. This is for two different groups, and the
two subtypes are mutually exclusive.

What is the best way to test if the proportion of H1 tests to all
positive tests differs between the two groups?
I could run prop.test() on just the H1 and H3 part of the data,
ignoring the total number of tests. But this seems to skip some
information regarding variance of H1/H3 in the two groups, so I don't
think it is correct.

I've tried using a bootstrap approach on the ratio of the two
proportions, but there must be a smarter way.
Any help is much appreciated!

Best regards,

Gustaf Rydevik


### data and bootstrap attempt ###
multi.data<-data.frame(
  group=c("a","b"),
  H1=c(2,12),
  H3=c(21,46),
  tests=c(189,411)
)
multi.ind<-data.frame(Type=
rep(c("H1","H3","Neg"),c(2+12,21+46,189+411-2-12-21-46)),
group=rep(c("a","b","a","b","a","b"),c(2,12,21,46,189-2-21,411-12-46))
)

props1<-vector(mode="numeric",length=1000)
props2<-vector(mode="numeric",length=1000)
for(i in 1:1000){
sub.tab<-t(table(Subtyp.orig[sample(1:nrow(Subtyp.orig),nrow(Subtyp.orig),replace=TRUE),]))
props1[i]<-sub.tab[1,1]/(sub.tab[1,1]+sub.tab[1,2])
props2[i]<-sub.tab[2,1]/(sub.tab[2,1]+sub.tab[2,2])
}
sub.kvot<-props1/props2
sort(sub.kvot)[50]
sort(sub.kvot)[950]



-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] two-sample test of multinomial proportion

2009-01-20 Thread Gustaf Rydevik
Oops, I forgot to change a name in the bootstrap code. Below is a
corrected version.

/Gustaf

### data and bootstrap attempt ###
multi.data<-data.frame(
 group=c("a","b"),
 H1=c(2,12),
 H3=c(21,46),
 tests=c(189,411)
)
multi.ind<-data.frame(Type=
rep(c("H1","H3","Neg"),c(2+12,21+46,189+411-2-12-21-46)),
group=rep(c("a","b","a","b","a","b"),c(2,12,21,46,189-2-21,411-12-46))
)

props1<-vector(mode="numeric",length=1000)
props2<-vector(mode="numeric",length=1000)
for(i in 1:1000){
sub.tab<-t(table(multi.ind[sample(1:nrow(multi.ind),nrow(multi.ind),replace=TRUE),]))
props1[i]<-sub.tab[1,1]/(sub.tab[1,1]+sub.tab[1,2])
props2[i]<-sub.tab[2,1]/(sub.tab[2,1]+sub.tab[2,2])
}
sub.kvot<-props1/props2
sort(sub.kvot)[50]
sort(sub.kvot)[950]
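
For comparison, the prop.test() approach mentioned in the question, restricted to the positive tests, could be written as below. This is a sketch of that one option and not a recommendation: it conditions on a test being positive and so ignores the negative tests, which is exactly the concern raised above. fisher.test() is added as an exact alternative because one of the counts is as small as 2.

```r
# Positive tests by group and subtype, taken from the post
tab <- matrix(c(2, 21,
                12, 46),
              nrow = 2, byrow = TRUE,
              dimnames = list(group = c("a", "b"), subtype = c("H1", "H3")))

# Proportion of H1 among the positive tests, per group
prop_h1 <- tab[, "H1"] / rowSums(tab)

# Two-sample test of equal proportions; with counts this small R may warn
# that the chi-squared approximation is unreliable
pt <- prop.test(x = tab[, "H1"], n = rowSums(tab))

# Exact test on the same 2 x 2 table
ft <- fisher.test(tab)

prop_h1
pt$p.value
ft$p.value
```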



[R] data frames with å, ä, and ö (= non-ASCII characters) from windows to mac os x

2009-01-16 Thread Gustaf Rydevik
Hi,
I ran into this issue previously and managed to solve it, but I've
forgotten how and am getting frustrated...

I have a data frame (see below) with Scandinavian characters in R
(2.7.1) running on a Win XP computer. I save the data frame in an
RData-file on a usb stick, and load() it in R (2.8.0) running on OS X
10.5. Now the name of the data frame and all factor labels with
Scandinavian characters are scrambled. How do I make R in OS X read my
data frame?
From what I've managed to find in the list archives and the FAQ, I should either
1) run
 Sys.setlocale("LC_ALL","en_US.UTF-8") ### Doesn't change anything
or
2) run
  defaults write org.R-project.R force.LANG en_US.UTF-8
in the terminal, which doesn't help either.
I must admit that I couldn't quite follow the documentation I found
on locales, so I might have messed up somewhere along the line.

Many thanks in advance for your help!

Regards,

Gustaf




Länkarta <-
structure(list(LANKOD = structure(c(11L, 19L, 10L, 13L, 21L,
7L, 9L, 18L, 8L, 3L, 16L, 6L, 5L, 4L, 15L, 2L, 20L, 17L, 1L,
14L, 12L), .Label = c("AB", "AC", "BD", "C", "D", "E", "F", "G",
"H", "I", "K", "M", "N", "O", "S", "T", "U", "W", "X", "Y", "Z"
), class = "factor"), Län = structure(c(1L, 4L, 3L, 5L, 6L, 7L,
8L, 2L, 9L, 10L, 20L, 21L, 13L, 14L, 15L, 16L, 17L, 18L, 12L,
19L, 11L), .Label = c("Blekinge län", "Dalarnas län", "Gotlands län",
"Gävleborgs län", "Hallands län", "Jämtlands län", "Jönköpings län",
"Kalmar län", "Kronobergs län", "Norrbottens län", "Skåne län",
"Stockholms län", "Södermanlands län", "Uppsala län", "Värmlands län",
"Västerbottens län", "Västernorrlands län", "Västmanlands län",
"Västra Götalands län", "Örebro län", "Östergötlands län"), class =
"factor")), .Names = c("LANKOD",
"Län"), class = "data.frame", row.names = c("0", "1", "2", "3",
"4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15",
"16", "17", "18", "19", "20"))
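
One thing that may be worth trying after load() is an explicit conversion of the strings with iconv(). The sketch below assumes the Windows session stored the text as latin1 (CP1252 would behave the same for å, ä and ö); that is a guess, not something established in this thread. `recode_df` is a made-up helper name, and it cannot repair the name of the loaded object itself.

```r
# Convert column names, factor levels and character columns to UTF-8
recode_df <- function(df, from = "latin1") {
  conv <- function(s) iconv(s, from = from, to = "UTF-8")
  names(df) <- conv(names(df))
  for (col in seq_along(df)) {
    if (is.factor(df[[col]])) {
      levels(df[[col]]) <- conv(levels(df[[col]]))
    } else if (is.character(df[[col]])) {
      df[[col]] <- conv(df[[col]])
    }
  }
  df
}

# The three bytes of "Län" as a latin1 system would store them
lan_latin1 <- rawToChar(as.raw(c(0x4c, 0xe4, 0x6e)))
lan_utf8 <- iconv(lan_latin1, from = "latin1", to = "UTF-8")
charToRaw(lan_utf8)  # the "ä" now takes two bytes (c3 a4)
```

If the labels look right after `recode_df(Länkarta)`, the problem was the encoding of the stored strings and not the locale setting.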



-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



[R] reshape, direction=long: multiple row names not allowed

2009-01-14 Thread Gustaf Rydevik
Hi all,

for some reason I always get stuck spending hours when trying to use
reshape or the Reshape package. Heaven knows why.
My latest frustration (in 2.7.1, so ignore if this has been fixed):

test<-data.frame(matrix(rnorm(42*4),ncol=4),rep(1:21,2),rep(c("a","b"),each=21))
reshape(test,varying=list(colnames(test)[1:4]),direction="long")

test<-data.frame(matrix(rnorm(42*4),ncol=4),id=rep(1:21,2),rep(c("a","b"),each=21))
reshape(test,varying=list(colnames(test)[1:4]),direction="long")

The first works, but the second does not. The only information on why
is that duplicate row names are not allowed.
It took me a fair time before figuring out that it was the id-column
that caused problems.
Perhaps something to fix, or at least give a more informative error message?

Best regards,

Gustaf
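
A likely explanation is that reshape() uses idvar = "id" by default, so an existing column called id is taken as the row identifier, and its repeated values then produce duplicate row names. Assuming that is the cause, pointing idvar at a name that is not in the data avoids the clash. In this sketch "obs" and "grp" are arbitrary names chosen for the example.

```r
set.seed(42)
test <- data.frame(matrix(rnorm(42 * 4), ncol = 4),
                   id = rep(1:21, 2),
                   grp = rep(c("a", "b"), each = 21))

# "obs" does not exist in test, so reshape() creates it as a per-row identifier
long <- reshape(test,
                varying = list(colnames(test)[1:4]),
                direction = "long",
                idvar = "obs")
dim(long)
head(long)
```

The original id column is carried along unchanged as an ordinary variable.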


-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] reshape, direction=long: multiple row names not allowed

2009-01-14 Thread Gustaf Rydevik
On Wed, Jan 14, 2009 at 3:07 PM, hadley wickham h.wick...@gmail.com wrote:
 Well there isn't any problem with the reshape package:

 test <- data.frame(
  matrix(rnorm(42 * 4), ncol = 4),
  A = rep(1:21,2),
  B = rep(c("a","b"), each = 21)
 )
 library(reshape)
 melt(test, id = c("A", "B"))

 but I'm not sure what you're trying to achieve.

 Hadley

 PS.  Usingwhitespacemakesyourcodeeasiertoread!

 --
 http://had.co.nz/


Hi,

sorry, I didn't mean to imply that the Reshape package fails here.
Just that for some reason I find it difficult to wrap my head around
the syntax of both the reshape command and the Reshape package...
Your code was exactly what I was trying to achieve btw.
Thank you!

regards,
Gustaf
ps: Butitisooeasytojustcodewithoutbotheringaboutmerehumanreadability. :-)



-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



[R] getting ISO week

2008-12-11 Thread Gustaf Rydevik
Hi all,

Is there a simple function already implemented for getting the ISO
weeks of a Date object?
I couldn't find one, and so wrote my own function to do it, but would
appreciate a pointer to the default way. If a function is not yet
implemented, could the code below be of interest to submit to CRAN?

Best Regards,

Gustaf



getweek<-function(Y,M=NULL,D=NULL){

  ## Accept either a Date/POSIXt object or separate year, month and day
  if(!inherits(Y,c("Date","POSIXt"))) {
    date.posix<-strptime(paste(c(Y,M,D),collapse="-"),"%Y-%m-%d")
  }
  if(inherits(Y,c("Date","POSIXt"))){
    date.posix<-as.POSIXlt(Y)
    Y<-as.numeric(format(date.posix,"%Y"))
    M<-as.numeric(format(date.posix,"%m"))
    D<-as.numeric(format(date.posix,"%d"))
  }

  ## Leap year flags for this year and the previous one
  LY<- (Y%%4==0 & !(Y%%100==0))|(Y%%400==0)
  LY.prev<- ((Y-1)%%4==0 & !((Y-1)%%100==0))|((Y-1)%%400==0)
  date.yday<-date.posix$yday+1
  ## Weekdays numbered 1 (Monday) to 7 (Sunday)
  jan1.wday<-strptime(paste(Y,"01-01",sep="-"),"%Y-%m-%d")$wday
  jan1.wday<-ifelse(jan1.wday==0,7,jan1.wday)
  date.wday<-date.posix$wday
  date.wday<-ifelse(date.wday==0,7,date.wday)

  ### If the date is in the beginning, or end of the year,
  ### does it fall into a week of the previous or next year?
  Yn<-ifelse(date.yday<=(8-jan1.wday) & jan1.wday>4,Y-1,Y)
  ## Keep Yn (not Y) here, so that a "previous year" result is not overwritten
  Yn<-ifelse(Yn==Y & ((365+LY-date.yday)<(4-date.wday)),Y+1,Yn)

  ## Set the week differently if the date is in the beginning, middle or
  ## end of the year
  Wn<-ifelse(
  Yn==Y-1,
  ifelse((jan1.wday==5|(jan1.wday==6 & LY.prev)),53,52),
  ifelse(Yn==Y+1,1,(date.yday+(7-date.wday)+(jan1.wday-1))/7-(jan1.wday>4))
  )
return(list(Year=Yn,ISOWeek=Wn))
}
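
As a cross-check, the ISO week can also be derived from the Thursday of the week a date falls in, since by the ISO 8601 definition a week belongs to the year that contains its Thursday. The sketch below uses only base R date arithmetic; `iso_week` is a made-up name and not an existing base function.

```r
iso_week <- function(d) {
  d <- as.Date(d)
  wday <- as.POSIXlt(d)$wday           # 0 = Sunday, ..., 6 = Saturday
  wday <- ifelse(wday == 0, 7, wday)   # ISO numbering: 1 = Monday, ..., 7 = Sunday
  thursday <- d + (4 - wday)           # Thursday of the same ISO week
  year <- as.integer(format(thursday, "%Y"))
  jan1 <- as.Date(paste0(year, "-01-01"))
  # Number of complete weeks between 1 January and that Thursday, plus one
  week <- as.integer(thursday - jan1) %/% 7 + 1
  list(Year = year, ISOWeek = week)
}

iso_week("2008-12-11")   # week 50 of 2008
iso_week("2008-12-29")   # already week 1 of 2009
iso_week("2010-01-03")   # still week 53 of 2009
```

It is vectorised over d, so a whole column of dates can be passed in one call.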


-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] getting ISO week

2008-12-11 Thread Gustaf Rydevik
On Thu, Dec 11, 2008 at 2:10 PM, Prof Brian Ripley
[EMAIL PROTECTED] wrote:
 A slightly simpler version is

 format(Sys.Date(), "%V")


 On Thu, 11 Dec 2008, Prof Brian Ripley wrote:

 strftime(x, "%V")

 E.g.

 strftime(as.POSIXlt(Sys.Date()), "%V")

 is 50, and you might want as.numeric() on it.

 Note that this is OS-dependent, and AFAIR Windows does not have it.


-

On Thu, Dec 11, 2008 at 2:15 PM, Gabor Grothendieck
[EMAIL PROTECTED] wrote:
 format(d, "%U") and format(d, "%W") give week numbers using
 different conventions.  See ?strptime


Thank you both for your replies!

I'm on Windows, so Prof Ripley's solution does not work (why is this
OS-dependent?).
Regarding Gabor's solution, neither convention follows the ISO 8601
standard, which is used in Europe (and Sweden in particular). See
http://en.wikipedia.org/wiki/ISO_8601#Week_dates .
So it seems that my function does fill a hole, however small.

I know that for me, working with week numbers, which are used quite
heavily in Sweden, has always been a major frustration.
Would it be possible to implement something similar to my solution in
base, and how should I go about making it fit in with the rest of the
date functions?

/Gustaf



-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] question involving loops from intro level R programming class

2008-11-28 Thread Gustaf Rydevik
On Fri, Nov 28, 2008 at 8:35 AM, Heidi Wong [EMAIL PROTECTED] wrote:
 a. Write a R function zerdiag.v1(m) using loop to output a square matrix
 whose diagonal elements are zero and the other elements are filled in by
 consecutive integers from 1 to m row-wise.

 For example,
 zerdiag.v1(6) =  [0, 1, 2]
  [3, 0, 4]
  [5, 6, 0]

 This function should have error checking ability. If the input m cannot form
 a square matrix, then the function will return an error message: Input
 number is incorrect.

 b. Write a R function zerdiag.v2(m) to produce the same output as in part
 (a) without using a loop.

 c. Test your functions in part (a) and (b) using m=12 and m=14 respectively.

 I'd appreciate any help with this problem... I've spent a lot of time
 staring at it, and I'm still not sure where to start. Thanks!





Hi,

Nice little brain teaser! Not too difficult, but requires a bit of
creative thinking...
You might wanna have a look at, for example, ?diag, ?uniroot, or ?polyroot.

regards,

Gustaf
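
To make the hint a little more concrete: an n x n matrix has n^2 - n off-diagonal cells, so m must satisfy n^2 - n = m for some whole number n. Below is one possible loop-free sketch under that reading of the exercise. The function name follows the assignment, but the approach is only one of several.

```r
zerdiag.v2 <- function(m) {
  # Positive root of n^2 - n - m = 0
  n <- (1 + sqrt(1 + 4 * m)) / 2
  if (m < 0 || abs(n - round(n)) > 1e-8) {
    stop("Input number is incorrect")
  }
  n <- round(n)
  out <- matrix(0, n, n)
  # R fills column by column, so fill the off-diagonal cells and then
  # transpose to get the integers running row by row
  out[row(out) != col(out)] <- seq_len(m)
  t(out)
}

zerdiag.v2(6)
```

With m = 12 this gives a 4 x 4 matrix, while m = 14 stops with the error message because no square matrix has 14 off-diagonal cells.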




-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] Finding Stopping time

2008-11-27 Thread Gustaf Rydevik
Hi Debanjan,

You would be more likely to get a response if your question were clearer.
Your code is very difficult to read, and it doesn't help that you
don't provide any context, or comment your code with "### This is
calculating the average" kind of statements.
What are you trying to do?

Anyhow, after quite a bit of effort trying to understand what you've
done, I found your (simple!)
mistake:
Since you are resetting the k counter after your first try, you need
to change your k constant in that big quantity you're calculating to
(k-N[j-1]), like this:
T[k] <-
(((k-N[j-1])/2)*log(theta1/theta2))+(((theta2-theta1)/(2*theta1*theta2))*smm[k])-((k-N[j-1])*(theta2-theta1)/2)

As an aside, try not to use variables defined outside a function in
the function code (in this case your x). It makes the code more
difficult to follow, and far more likely to break.
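
As an illustration of that last point, the partial sum of squares can take the data as an argument instead of relying on an x defined elsewhere. This is a sketch of the style only, not a rewrite of the whole program.

```r
# Sum of squares of x[p], ..., x[q]; x is passed in explicitly
psum <- function(x, p, q) {
  sum(x[p:q]^2)
}

x <- c(1, 2, 3, 4)
psum(x, 2, 3)  # 2^2 + 3^2
```

The function now gives the same answer no matter what else is in the workspace, and the loop over l is no longer needed because ^ and sum() work on whole vectors.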

Regards,

Gustaf


On Wed, Nov 26, 2008 at 4:04 PM, Debanjan Bhattacharjee
[EMAIL PROTECTED] wrote:
 Can any one help me to solve problem in my code? I am actually trying to
 find the stopping index N.
 So first I generate random numbers from normals. There is no problem in
 finding the first stopping index.
 Now I want to find the second stopping index using observations starting
 from the one after the first stopping index.
 E.g. If my first stopping index was 5. I want to set 6th observation from
 the generated normal variables as the first random
 number, and I stop at second stopping index.

 This is my code,


 alpha <- 0.05
 beta <- 0.07
 a <- log((1-beta)/alpha)
 b <- log(beta/(1-alpha))
 theta1 <- 2
 theta2 <- 3

 cumsm<-function(n)
  {y<-NULL
   for(i in 1:n)
   {y[i]=x[i]^2}
   s=sum(y)
   return(s)
  }
 psum <- function(p,q)
   {z <- NULL
    for(l in p:q)
   { z[l-p+1] <- x[l]^2}
    ps <- sum(z)
    return(ps)
   }
 smm <- NULL
 sm <- NULL
 N <- NULL
 Nout <- NULL
 T <- NULL
 k<-0
  x <- rnorm(100,theta1,theta1)
  for(i in 1:length(x))
    {
   sm[i] <- psum(1,i)
   T[i] <-
 ((i/2)*log(theta1/theta2))+(((theta2-theta1)/(2*theta1*theta2))*sm[i])-(i*(theta2-theta1)/2)
   if (T[i]<=b | T[i]>=a){N[1]<-i
  break}

    }
 for(j in 2:200)
 {
  for(k in (N[j-1]+1):length(x))
    {  smm[k] <- psum((N[j-1]+1),k)
   T[k] <-
 ((k/2)*log(theta1/theta2))+(((theta2-theta1)/(2*theta1*theta2))*smm[k])-(k*(theta2-theta1)/2)
   if (T[k]<=b | T[k]>=a){N[j]<-k
  break}
 }
 }

 But I cannot get the stopping index after the first one.

 Thanks
 --





-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] Simple rep() question duplicating times and dates.

2008-11-05 Thread Gustaf Rydevik
On Wed, Nov 5, 2008 at 4:02 PM, John Kane [EMAIL PROTECTED] wrote:

 I want to create a data.frame of time and date for a year.  I started with 
 the idea of simply producing two vectors (time and date)

 The first part ( time) is easy.
  rep(1:24, 365)

 But how do I get a series of 24 dates for 01 January 2005 and repeat this to
 31 December 2005?

 It should be easy but I don't see it.

 Thanks


Hi John,

?Date leads you to (among other things) ?seq.Date.

Something like this should work:

time<-rep(1:24, 365)
dates<-seq(as.Date("01012005",format="%d%m%Y"),as.Date("31122005",format="%d%m%Y"),by=1)
TimeFrame<-data.frame(time)
TimeFrame$dates<-rep(dates,each=24)


Regards,
Gustaf
-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] How to get the duplicated elements from a vector?

2008-10-29 Thread Gustaf Rydevik
On Wed, Oct 29, 2008 at 2:47 PM, Leon Yee [EMAIL PROTECTED] wrote:
 Dear all,

How can I get the duplicated elements from a vector? For example,
 x <- c("yes", "no", "yes", "yes", "no", "not sure"), how can I filter out
 all the elements which occurred >= 2 times?

Thanks for any help!

 Regards,
 Leon


Hi Leon,

unique(x)

or

duplicated(x)

should work, depending on what you want.

Best,

Gustaf
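
For the example vector in the question, the two functions can be combined as follows. This is a small sketch; which form is wanted depends on whether the repeated values or their positions are of interest.

```r
x <- c("yes", "no", "yes", "yes", "no", "not sure")

# TRUE at every position whose value has already appeared earlier
duplicated(x)

# The distinct values that occur at least twice
repeated <- unique(x[duplicated(x)])

# The same information from a frequency table (sorted alphabetically)
repeated_tab <- names(which(table(x) >= 2))

# All elements of x whose value occurs at least twice
x[x %in% repeated]
```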

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] How to get the duplicated elements from a vector?

2008-10-29 Thread Gustaf Rydevik
On Wed, Oct 29, 2008 at 3:45 PM, Erik Iverson [EMAIL PROTECTED] wrote:


 Leon Yee wrote:

 Gustaf Rydevik wrote:
 Hi Leon,

 unique(x)

 or

 duplicated(x)

 should work, depending on what you want.

 Best,

 Gustaf


 Hi,
Thank you all. Actually, I have a data frame or matrix, whose first
 column is numerical values, and whose 2nd column is names.

 Then you have a data.frame, as matrices in R are of homogeneous type.

I need those
 whose names repeated 3 times and get the mean of the 3 values for each
 repeated names.

It sounds that I need some programming work.

 Yes, but not much

 ## BEGIN R CODE
 ## guarantees there is at least one level with exactly three elements,
 ## which your problem seems to require
 t1 <- data.frame(a = rnorm(10), b = c("D", "D", "D", sample(LETTERS[1:3], 7,
 replace = TRUE)))

 ## find which names have exactly three elements
 t2 <- subset(t1, b %in% names(which(table(t1$b) == 3)))

 ## note that the elements of the returned value depend on what was
 ## originally in your data set's 'b' column
 tapply(t2$a, t2$b, mean)

 ## END R CODE


I'm always forgetting about the ave function. Using that one, here's
another way:

temp<-data.frame(Num=sample(1:1000,100),Names=sample(letters[1:25],100,replace=TRUE))
temp$count<-ave(rep(1,nrow(temp)),temp$Names,FUN=sum)
temp$MeanOfThree[temp$count==3]<-
ave(temp$Num[temp$count==3],temp$Names[temp$count==3])

/Gustaf

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



[R] Automatically adjust text size in plot

2008-10-24 Thread Gustaf Rydevik
Hi all,

I'm writing a function that will automatically generate a report based
on answers to a questionnaire. The exact questions and answers to the
questionnaire can vary. One of the question types is in a matrix
format, where the agreement to several statements can be indicated on
a scale.

I'm planning to plot this on a multilevel barplot, and only labeling
each bar column once. However, I'm stuck as to how I should adjust
text size and wrapping to fit to each column.

Here's an example of what I mean:

barnames<-c("I agree completely", "I agree", "I partly agree", "I do not agree",
"I really hate this stupid question, don't you?")
answers<-data.frame(question=paste("Q",1:5,sep=""),S1=sample(1:100,5),S2=sample(1:100,5),S3=sample(1:100,5),S4=sample(1:100,5),S5=sample(1:100,5))
Width<-50
Cex<-1.5

par(mfrow=c(nrow(answers)+1,1),mar=c(0,1,1,1))
plot.new()
plot.window(xlim=c(0,1),ylim=c(0,1))
barnames.plot<-do.call(c,lapply(barnames,function(x)paste(strwrap(x,Width),collapse="\n")))
text(barnames.plot,x= (seq.int(0, 1, length.out =
length(barnames)+1)-0.5/length(barnames))[-1],y=0.5,cex=Cex)
for(i in 1:nrow(answers)){
barheight<-rep(0,length(barnames))
## subQ.tables (one table of answers per question) is not defined in this post
barheight[as.numeric(names(subQ.tables[[i]]))]<-subQ.tables[[i]]
barplot(barheight,space=0)
}

The question is, how do I figure out the appropriate Width  and
Cex parameters as a function of barnames?
That is, with varying text lengths of the barnames, varying number of
alternatives etc, and independent on which device type is used?
strwrap uses column as units, and I can't really figure out how to
convert that to graph units. Same goes for cex.

Many thanks in advance,

Gustaf


PS: As an alternative, if someone could come up with a better way do
display this type of data, I'd be all ears. I'm not too happy with my
current solution
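
One possible starting point for the sizing question is strwidth(), which can report the width of a string as a fraction of the figure width. The sketch below picks the largest cex at which the widest wrapped label still fits its column. `fit_cex` is a hypothetical helper, the wrap width of 20 is arbitrary, and the calculation assumes that text width grows roughly in proportion to cex and that the labels share the full figure width (margins are ignored).

```r
# Largest cex (capped at max_cex) at which the widest label fits its column
fit_cex <- function(labels, n_cols = length(labels), max_cex = 1.5) {
  avail  <- 1 / n_cols   # share of the figure width available to each label
  widest <- max(strwidth(labels, units = "figure", cex = 1))
  min(max_cex, avail / widest)
}

pdf(NULL)    # throwaway device so the sketch runs without opening a window
plot.new()   # strwidth() needs an initialised plot
barnames <- c("I agree completely", "I agree", "I partly agree",
              "I do not agree", "I really hate this stupid question, don't you?")
wrapped <- vapply(barnames,
                  function(s) paste(strwrap(s, width = 20), collapse = "\n"),
                  character(1))
cex_fit <- fit_cex(wrapped)
invisible(dev.off())
cex_fit
```

Because the measurement is taken on whatever device is current, the function would need to be called after the real output device has been opened.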


-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] Combining all possible values of variables into a new...

2008-10-20 Thread Gustaf Rydevik
On Mon, Oct 20, 2008 at 4:10 PM,  [EMAIL PROTECTED] wrote:

 I'm trying to create a new column in my data.frame where subjects are 
 categorized depending on values on four other columns. In any other case I 
 would just nest a few ifelse statements, however, in this case i have 
 4*6*2*3=144 combinations and i get weird 'context overflow' errors. So I 
 wonder if there is a more efficient way of doing this.

 For illustrational purposes, let's say i have:

 x<-c(1,0,0,1,0,0,1,0,0,1)
 y<-c(1,3,2,3,2,1,2,3,2,3)
 z<-c(1,2,1,2,1,2,1,2,1,2)
 d<-as.data.frame(cbind(x,y,z))

 and i do:

 d$myvar <- ifelse(d$x == 0 & d$y==1 & d$z==1 , d$myvar <- 1,
 ifelse(d$x == 0 & d$y==1 & d$z==2 , d$myvar <- 2,
 ifelse(d$x == 0 & d$y==2 & d$z==1 , d$myvar <- 3,
 ifelse(d$x == 0 & d$y==2 & d$z==2 , d$myvar <- 4,
 ifelse(d$x == 0 & d$y==3 & d$z==1 , d$myvar <- 5,
 ifelse(d$x == 0 & d$y==3 & d$z==2 , d$myvar <- 6,
 ifelse(d$x == 1 & d$y==1 & d$z==1 , d$myvar <- 7,
 ifelse(d$x == 1 & d$y==1 & d$z==2 , d$myvar <- 8,
 ifelse(d$x == 1 & d$y==2 & d$z==1 , d$myvar <- 9,
 ifelse(d$x == 1 & d$y==2 & d$z==2 , d$myvar <- 10,
 ifelse(d$x == 1 & d$y==3 & d$z==1 , d$myvar <- 11,
 ifelse(d$x == 1 & d$y==3 & d$z==2 , d$myvar <- 12, NA))))))))))))

 Suggestions?

How about the following?

x<-c(1,0,0,1,0,0,1,0,0,1)
y<-c(1,3,2,3,2,1,2,3,2,3)
z<-c(1,2,1,2,1,2,1,2,1,2)
d<-as.data.frame(cbind(x,y,z))

xyz.comb<-interaction(x,y,z,lex.order=TRUE)
d$myvar<-match(xyz.comb,levels(xyz.comb))


/Gustaf
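
To see that this reproduces the numbering of the nested ifelse() calls, the codes can be checked against the example data. A short sketch (`combo` is a name chosen for the example):

```r
x <- c(1, 0, 0, 1, 0, 0, 1, 0, 0, 1)
y <- c(1, 3, 2, 3, 2, 1, 2, 3, 2, 3)
z <- c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2)

# With lex.order = TRUE the first factor varies slowest, so the levels run
# 0.1.1, 0.1.2, 0.2.1, ..., 1.3.2, the same order as the ifelse() chain
combo <- interaction(x, y, z, lex.order = TRUE)
myvar <- as.integer(combo)

nlevels(combo)
myvar
```

Combinations that never occur in the data still count as levels, so the codes depend only on which values each variable takes, not on which combinations happen to be present.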


Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] Reading Data

2008-10-07 Thread Gustaf Rydevik
On Tue, Oct 7, 2008 at 10:36 AM,  [EMAIL PROTECTED] wrote:

 Hi,
 I have a data in which the first row is in date format and the first
 column is in text format and rest all the entries are numeric. Whenever
 I am trying to read the data using read.table, the whole of my data is
 converted in to the text format.

 Please suggest what shall I do because using the numeric data which are
 prices I need to calculate the return but if these prices are not
 numeric then calculating return will be a problem

 regards

 Rahul Agarwal
 Analyst
 Equities Quantitative Research
 UBS_ISC, Hyderabad
 On Net: 19 533 6363


Hi,

A single column in a data frame can't contain mixed formats.
In the absence of example data, I would guess one of the following could work:

1)
read.table("data.txt",skip=1, header=TRUE) ## If you have headers

2)
read.table("data.txt", header=TRUE) ## If the date row is supposed to be
## variable names.

3)
read.table("data.txt",skip=1) ## If there are no headers, and you
## want to ignore the date


regards,

Gustaf
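
To illustrate option 2 on something concrete, here is a sketch with a tiny made-up file in which the first row holds dates and the first column holds names. The layout, the tickers and the prices are all invented, since no example data was posted.

```r
raw_lines <- c("ticker 2008-10-01 2008-10-02 2008-10-03",
               "AAA 10.5 10.7 10.6",
               "BBB 20.1 19.9 20.4")

# header = TRUE turns the date row into column names, row.names = 1 turns the
# text column into row names, check.names = FALSE keeps the dates unmangled
prices <- read.table(text = raw_lines, header = TRUE, row.names = 1,
                     check.names = FALSE)
sapply(prices, is.numeric)

# Simple returns per row: (p[t] - p[t-1]) / p[t-1]
returns <- t(apply(as.matrix(prices), 1, function(p) diff(p) / head(p, -1)))
returns
```

Once the dates and names are out of the body of the table, every remaining column is numeric and the return calculation works directly.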

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] Problem with Grep Under Loop

2008-10-06 Thread Gustaf Rydevik
On Mon, Oct 6, 2008 at 1:37 PM, Gundala Viswanath [EMAIL PROTECTED] wrote:
 Dear all,

 I have no problem with this individual grep command:

 datk <- grep("XM_528056", source$V1)
 dat2 <- source[datk,]
 print(dat2)
 V1  V2   V3 V4  V5  V6 V7
 35995 XM_528056 panTro2 chr8  + 1775569 1896107 Chimpanzee


 BUT, when I run them under the loop it gives this error:


 hm_acc <- c("XM_528056","AB002296")
 for (i in 1:length(hm_acc)){
 +
 +hm_acc_id <- as.character(hm_acc[i])
 +print(hm_acc_id)
 +
 +hm_allk <- grep(hm_acc_id,source$V1)
 +hm_all <- source[hm_allk,]
 +
 +print(hm_all)
 + }

 [1] "XM_528056"
 [1] V1 V2 V3 V4 V5 V6 V7
 <0 rows> (or 0-length row.names)
 [1] "AB002296"
 [1] V1 V2 V3 V4 V5 V6 V7
 <0 rows> (or 0-length row.names)
 .

 What's wrong with my way of using grep?
 Please advice.


 - Gundala Viswanath
 Jakarta - Indonesia



Hi,

Could you give us a small sample of the source data, so that your
example is reproducible?
From looking at your code, it seems as if you copied something wrong.
First you write:
 grep("XM_528056", source$V1)

but when you print dat2 it seems as if your ID code (XM_528056) is
in V2, not V1.
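
For comparison, a minimal sketch on a made-up data frame (src and its
contents are hypothetical stand-ins for the poster's source object) shows
that grep inside a loop-like construct does return the matching rows when
the IDs really are in the column being searched:

```r
# Hypothetical stand-in for the poster's `source` data frame.
src <- data.frame(V1 = c("XM_528056", "AB002296", "NM_000001"),
                  V2 = c("panTro2", "hg18", "hg18"),
                  stringsAsFactors = FALSE)

hm_acc <- c("XM_528056", "AB002296")

# One subset per accession; fixed = TRUE matches the ID literally
# rather than as a regular expression.
hits <- lapply(hm_acc, function(id) src[grep(id, src$V1, fixed = TRUE), ])
names(hits) <- hm_acc
hits

# Checking which column actually holds the IDs is a cheap first step.
str(src)
```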


regards,

Gustaf




Re: [R] function in R

2008-10-02 Thread Gustaf Rydevik
On Thu, Oct 2, 2008 at 1:34 PM, Alphonse Monkamg [EMAIL PROTECTED] wrote:




 Dear ALL,

 Does anyone know how to get the complete code for any built-in
 function
 in R? E.g. when I type mean in the R console, I get the following:

 > mean

 function (x, ...)

 UseMethod("mean")

 <environment: namespace:base>

 but I need the full mean function.



 Thank in advance,



 Alphonse.






Hi Alphonse,

mean is a so-called generic function, which behaves differently
depending on the class of its argument.

Writing:

?UseMethod

explains a bit of this, and points you to:

?methods

So you can write

methods(mean)

and see which methods exist, for example mean.default or
mean.data.frame, whose code you can then look at. An added
complication is that these functions call C code through
.Internal. This C code can be found in the R sources on CRAN, but as
I don't know C, I've never done more than have a quick look. But it's
there if you want it.
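
The lookup described above can be sketched with base R only (the test
vector at the end is just an illustration):

```r
# List the S3 methods registered for the generic mean().
methods(mean)

# Fetch one method as a function object and print its R-level source.
f <- getS3method("mean", "default")
print(f)

# getAnywhere() also finds methods that are not exported from a namespace.
getAnywhere("mean.default")

# The method can be called directly, like any other function.
f(c(1, 2, 3))
```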

Regards,

Gustaf Rydevik





Re: [R] Question about multiple regression

2008-09-09 Thread Gustaf Rydevik
On Mon, Sep 8, 2008 at 7:47 PM, Dimitri Liakhovitski [EMAIL PROTECTED] wrote:
 Thank you everyone for your responses. I'll answer several questions.

 1.   Disclaimer: I have **NO IDEA** of the details of what you want
 to do or why
 -- but I am willing to bet that there are better ways of doing it than 1.8
 mm multiple regressions that take 270 secs each!! (which I find difficult to
 believe in itself -- are you sure you are doing things right? Something
 sounds very fishy here: R's regression code is typically very fast).
 I probably should not bore everyone, but just to explain where the
 large number is coming from. I have an experimental design with 7
 factors. Each factor has between 3 and 5 levels. Once you cross them
 all, you end up with 18,000 cells. For each cell, I want to generate a
 sample of N=100. For each sample I have to analyze the data using 3
 different statistical methods of analysis (the goal of the
 Monte-Carlo) is to compare those methods. One of the methods requires
 running of up to ~32,000 simple multiple regressions - yes just for
 one sample and it's not a mistake. I test-ran one such analysis for a
 sample with N=800 and 15 predictors and it took 270 seconds. R was
 actually very fast - it ran each of the individual regressions in
 about 0.008 seconds. Still I need something faster.

 2. Sorry - what was the formula sum(lm.fit(x,y)$residuals^2) for? For
 example, using it on my data, I got a value of 36,644...

 3. I know that for similarly challenging situations people did used
 Fortran compilers. So, anyone heard of a free Fortran library or an
 efficient piece of code?

 Thank you!
 Dimitri



Have you considered the fact that 32000 regressions simply take a lot of time?
I don't really have anything to go by, but it sounds unlikely that you
will be able to cut computing time by more than, say, a factor of ten, to 27
seconds. That would still leave you with 4 months of running a
computer.
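
On question 2 in the quoted message: sum(lm.fit(x,y)$residuals^2) is the
residual sum of squares of a least-squares fit, computed without the
formula and model-frame overhead of lm(). A minimal sketch with simulated
data (n = 100 and 15 predictors only mirror the sizes mentioned above; the
numbers themselves are made up):

```r
set.seed(1)
n <- 100
p <- 15

# Design matrix with an explicit intercept column, as lm.fit() expects.
x <- cbind(1, matrix(rnorm(n * p), n, p))
y <- rnorm(n)

# Bare-bones least-squares fit and its residual sum of squares.
fit <- lm.fit(x, y)
rss <- sum(fit$residuals^2)

# The same quantity via lm(), for comparison ("- 1" because x already
# contains the intercept column).
rss_lm <- sum(resid(lm(y ~ x - 1))^2)

all.equal(rss, rss_lm)
```

Timing the two calls with system.time() on the real design would show how
much of the per-regression cost is overhead rather than arithmetic.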

Perhaps an alternative approach would be to get access to stronger
(super)computers, either at a university, or buying access. A quick
googling turns up http://www.clusterondemand.com/ for example.

Anyhow, good luck with your project! I'm sure the R list would be very
interested to hear of how you solved your problem.

Regards,

Gustaf




Re: [R] Coefficients of Logistic Regression from bootstrap - how to get them?

2008-07-31 Thread Gustaf Rydevik
On Thu, Jul 31, 2008 at 4:30 PM, Michal Figurski
[EMAIL PROTECTED] wrote:
 Frank and all,

 The point you were looking for was in a page that was linked from the
 referenced page - I apologize for confusion. Please take a look at the two
 last paragraphs here:
 http://people.revoledu.com/kardi/tutorial/Bootstrap/examples.htm

 Though, possibly it's my ignorance, maybe it's yours, but you actually
 missed the important point again. It is that you just don't estimate mean,
 or CI, or variance on PK profile data! It is as if you were trying to
 estimate mean, CI and variance of a Toccata__Fugue_in_D_minor.wav file.
 What for? The point is in the music! Would the mean or CI or variance tell
 you anything about that? Besides, everybody knows the variance (or
 variability?) is there and can estimate it without spending time on
 calculations.
 What I am trying to do is comparable to compressing a wave into mp3 - to
 predict the wave using as few data points as possible. I have a bunch of
 similar waves and I'm trying to find a common equation to predict them all.
 I am *not* looking for the variance of the mean!

 I could be wrong (though it seems less and less likely), but you keep
 talking about the same irrelevant parameters (CI, variance) on and on. Well,
 yes - we are at a standstill, but not because of Davison & Hinkley's book. I
 can try reading it, though as I stated above, it is not even remotely
 related to what I am trying to do. I'll skip it then - life is too short.

 Nevertheless I thank you (all) for relevant criticism on the procedure (in
 the points where it was relevant). I plan to use this methodology further,
 and it was good to find out that it withstood your criticism. I will look
 into the penalized methods, though.

 --
 Michal J. Figurski


I take it you mean the sentence:

"For example, in here, the statistical estimator is the sample mean.
Using bootstrap sampling, you can do beyond your statistical
estimators. You can now get even the distribution of your estimator
and the statistics (such as confidence interval, variance) of your
estimator."

Again you are misinterpreting the text. The phrase about doing "beyond
your statistical estimators" is explained in the next sentence, where
he says that using the bootstrap gives you information about the mean
*estimator* (and not more information about the population mean).
And since you're not interested in this information, in your case
bootstrap/resampling is not useful at all.

As another example of misinterpretation: in your email from a week
ago, it sounds as if you believe that the authors of the original paper
are trying to improve on a fixed model.
Figurski:
"Regarding the multiple stepwise regression - according to the cited
SPSS manual, there are 5 options to select from. I don't think they used
'stepwise selection' option, because their models were already
pre-defined. Variables were pre-selected based on knowledge of
pharmacokinetics of this drug and other factors. I think this part I
understand pretty well."

This paragraph is wrong. Sorry, no way around it.

Quoting from the paper (Pawinski et al.):
"*Twenty-six(!)* 1-, 2-, or 3-sample estimation
models were fit (r2 = 0.341–0.862) to a randomly
selected subset of the profiles using linear regression
and were used to estimate AUC0–12h for the profiles not
included in the regression fit, comparing those estimates
with the corresponding AUC0–12h values, calculated
with the linear trapezoidal rule, including all 12
timed MPA concentrations. The 3-sample models were
constrained to include no samples past 2 h."
(emph. mine)

They clearly state that they are choosing among 26 different models by
using their bootstrap-like procedure, not improving on a single,
predefined model.
This procedure is statistically sound (more or less at least), and not
controversial.

However, (again) what you want to do is *not* what they did in
their paper!
Resampling cannot improve on the performance of a pre-specified
model. This is intuitively obvious, but moreover it's mathematically
provable! That's why we're so certain of our standpoint. If you really
wish, I (or someone else) could write out a proof, but I'm unsure if
you would be able to follow.
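
Short of a formal proof, a small simulation can illustrate the point. The
sketch below uses made-up data and hypothetical variable names (c0, c2,
auc); it is not the data or the code from the paper. It fits one
pre-specified model on the full data set and compares the coefficients
with the median of 50 bootstrap refits:

```r
set.seed(42)
n <- 50

# Simulated data for a fixed, pre-specified model (all names hypothetical).
d <- data.frame(c0 = rnorm(n), c2 = rnorm(n))
d$auc <- 10 + 2 * d$c0 + 3 * d$c2 + rnorm(n)

# One regression on the entire data set.
full <- coef(lm(auc ~ c0 + c2, data = d))

# The same model refitted on 50 bootstrap resamples of the rows.
B <- 50
boot <- replicate(B, {
  idx <- sample(n, replace = TRUE)
  coef(lm(auc ~ c0 + c2, data = d[idx, ]))
})

# Median of the bootstrap coefficients, one value per model term.
boot_median <- apply(boot, 1, median)

rbind(full = full, boot_median = boot_median)
```

The two rows should land close together, which is the sense in which the
extra refits add work without adding information about the coefficients.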

In the end, it doesn't really matter. What you are doing amounts to
doing a regression 50 times, when once would suffice. No big harm
done, just a bit of unnecessary work. And proof to a statistically
competent reviewer that you don't really understand what you're doing.
The better option would be to either study some more statistics
yourself, or find a statistician that can do your analysis for you,
and trust him to do it right.

Anyhow, good luck with your research.

Best regards,

Gustaf


Re: [R] Simple... but...

2008-07-23 Thread Gustaf Rydevik
On Wed, Jul 23, 2008 at 3:23 PM, Doran, Harold [EMAIL PROTECTED] wrote:
 Shubba

 I'm confused. Your first post said the result should be c(1,2,3,4,5,6)
 when x and y are combined. The code I sent does that. But here you say
 your result should be c(4,1,2,5,2,3).

 What do you want your result to actually be?

 -Original Message-
 From: Shubha Vishwanath Karanth [mailto:[EMAIL PROTECTED]
 Sent: Wednesday, July 23, 2008 9:17 AM
 To: Doran, Harold; [EMAIL PROTECTED]
 Subject: RE: [R] Simple... but...

 OK,

 Let x=c(4,2,2)
   y=c(1,5,3)

 My result should be c(4,1,2,5,2,3)

 Thanks, Shubha




There should be nicer ways, but this does it:

x <- c(4,2,2)
y <- c(1,5,3)
c(matrix(c(x,y),byrow=TRUE,nrow=2))
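
A slightly shorter variant of the same idea, as a sketch: rbind() stacks
the two vectors as rows, and c() then reads the matrix column by column,
which interleaves them.

```r
x <- c(4, 2, 2)
y <- c(1, 5, 3)

# rbind() gives a 2 x 3 matrix; c() flattens it in column-major order.
interleaved <- c(rbind(x, y))
interleaved   # 4 1 2 5 2 3
```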


/Gustaf


Re: [R] [Fwd: Re: Coefficients of Logistic Regression from bootstrap - how to get them?]

2008-07-23 Thread Gustaf Rydevik
On Wed, Jul 23, 2008 at 3:14 PM, Michal Figurski
[EMAIL PROTECTED] wrote:
 I think the argument supporting the use of bootstrap to determine
 coefficients, as opposed to just running linear regression on the whole
 dataset, is the comparison of Rsq and prediction errors between these
 two approaches - page 1502. There's a substantial difference in favor of
 the bootstrap approach.

 --
 Michal J. Figurski


Are you talking about this passage?

"A commonly used approach for establishing estimation
models is to perform a multiple stepwise linear
regression on the total set of full AUCs (19). When we
used that approach, we obtained a r2 value of 0.74 and a
prediction error of 7.6% ± 26.7%, (median, 6.5%; 95% CI,
-51.9% to 67.5%), and the model estimated MPA AUC to
within 15% of the full value in 56% of the profiles. Our
estimation model using the repeated cross-validation approach
was significantly better, with a r2 value of 0.862,
prediction error of 6.1% ± 19%, (median, 3.0%; 95% CI,
-33.1% to 32%), and estimation of MPA AUC to within
15% of the value (when all 12 samples are used to
calculate MPA AUC) in 82% of the profiles."

As far as I can tell, they are talking about the disadvantage of using
stepwise regression to determine the optimal variables in the
regression, versus the bootstrap/CV approach. And this might well be
true.

It is the following part of the methods description that seems unmotivated to me:
"Once the general model (of the 26) was
selected, the proposed regression coefficients were
taken as the median of the distribution of regression
coefficient values described in step 2."

I.e., after having decided upon the model that uses C0, C0.5 and C2,
they use a median of the bootstrap estimates (which is what the R code I
wrote does, more or less), instead of fitting that model on the
entire data set. I don't see how this could be better,
since we can't get any more information from the data than
what's there from the beginning. And I believe that this is what all
the other people on the list are trying to tell you: that it's a step
without purpose.

You have to distinguish between finding out which model is best, which
the bootstrap can be useful for, and estimating the parameters for the
final, decided model, where bootstrapping several regressions and
taking the median is most likely no better than standard regression.
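
The distinction can be sketched in code. The example below is an
illustration under stated assumptions, not the procedure from the paper:
the data are simulated, the variable names (c0, c05, c2, auc) and the
three candidate models are hypothetical, and the 2/3 training split and 50
repetitions are arbitrary choices. Repeated cross-validation is used only
to pick a model; the chosen model is then fitted once on all the data.

```r
set.seed(7)
n <- 60

# Simulated concentrations and outcome (hypothetical names and effects).
d <- data.frame(c0 = rnorm(n), c05 = rnorm(n), c2 = rnorm(n))
d$auc <- 5 + 2 * d$c0 + 1.5 * d$c2 + rnorm(n)

# Candidate models to choose among.
candidates <- list(m1 = auc ~ c0,
                   m2 = auc ~ c0 + c2,
                   m3 = auc ~ c05)

# Mean squared prediction error over repeated random train/test splits.
cv_error <- function(form, data, reps = 50) {
  mean(replicate(reps, {
    train <- sample(nrow(data), size = floor(2 * nrow(data) / 3))
    fit <- lm(form, data = data[train, ])
    pred <- predict(fit, newdata = data[-train, ])
    mean((data$auc[-train] - pred)^2)
  }))
}

# Step 1: model selection by cross-validated prediction error.
errs <- sapply(candidates, cv_error, data = d)
best <- names(which.min(errs))

# Step 2: parameter estimation, one ordinary fit on the entire data set.
final <- lm(candidates[[best]], data = d)
coef(final)
```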

best regards,

Gustaf


Re: [R] Coefficients of Logistic Regression from bootstrap - how to get them?

2008-07-23 Thread Gustaf Rydevik
On Wed, Jul 23, 2008 at 4:08 PM, Michal Figurski
[EMAIL PROTECTED] wrote:
 Gustaf,

 I am sorry, but I don't get the point. Let's just focus on predictive
 performance from the cited passage, that is the number of values predicted
 within 15% of the original value.
 So, the predictive performance from the model fit on entire dataset was 56%
 of profiles, while from bootstrapped model it was 82% of profiles. Well - I
 see a stunning purpose in the bootstrap step here: it turns an useless
 equation into a clinically applicable model!

 Honestly, I also can't see how this can be better than fitting on entire
 dataset, but here you have a proof that it is.

 I think that another argument supporting this approach is model validation.
 If you fit model on entire data, you have no data left to validate its
 predictions.

 On the other hand, I agree with you that the passage in methods section
 looks awkward.

 In my work on a similar problem, that is going to appear in August in Ther
 Drug Monit, I used medians since beginning and all the comparisons were done
 based on models with median coefficients. I think this is what the authors
 of that paper did, though they might just have had a problem with describing
 it correctly, and unfortunately it passed through review process unchanged.




Hi,

I believe that you misunderstand the passage. Do you know what
multiple stepwise regression is?

Since they used SPSS, I copied from
http://www.visualstatistics.net/SPSS%20workbook/stepwise_multiple_regression.htm

Stepwise selection is a combination of forward and backward procedures.
Step 1

The first predictor variable is selected in the same way as in forward
selection. If the probability associated with the test of significance
is less than or equal to the default .05, the predictor variable with
the largest correlation with the criterion variable enters the
equation first.


Step 2

The second variable is selected based on the highest partial
correlation. If it can pass the entry requirement (PIN=.05), it also
enters the equation.

Step 3

From this point, stepwise selection differs from forward selection:
the variables already in the equation are examined for removal
according to the removal criterion (POUT=.10) as in backward
elimination.

Step 4

Variables not in the equation are examined for entry. Variable
selection ends when no more variables meet entry and removal criteria.
---
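
For readers who want to try a comparable procedure in R, here is a rough
sketch using step() from the stats package. Note the swap: step() adds and
drops terms by AIC, not by the PIN/POUT p-value thresholds that SPSS uses,
so this is an analogous procedure rather than a reproduction. The data and
variable names are made up.

```r
set.seed(3)
n <- 80

# Simulated data: x1 and x2 matter, x3 is pure noise.
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- 1 + 2 * d$x1 - 1.5 * d$x2 + rnorm(n)

# Start from the intercept-only model and allow terms to enter and leave.
null_model <- lm(y ~ 1, data = d)
sel <- step(null_model, scope = y ~ x1 + x2 + x3,
            direction = "both", trace = 0)

# The formula of the model that the search ended on.
formula(sel)
```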


It is the outcome of this *entire process*, steps 1-4, that they compare
with the outcome of their *entire bootstrap/cross-validation/selection
process*, steps 1-4 in the methods section, and they find that their approach
gives a better result.
What you are doing is only step 4 in the article's methods
section: estimating the parameters of a model *when you already know
which variables to include*. It is the way this step is conducted that
I am sceptical about.

Regards,

Gustaf



Re: [R] Non-normal data issues in PhD software engineering experiment

2008-07-10 Thread Gustaf Rydevik
On Thu, Jul 10, 2008 at 5:15 PM, Andrew Jackson [EMAIL PROTECTED] wrote:
 Hi All,


Hi Andrew,

The main questions here are not R-related, but statistical modelling
questions, and much too broad for the R list. They are things you'd
ask a (paid) statistical consultant. I would suggest contacting
your own university's statistical support unit:

http://www.insightsc.ie/statistics_clinic.htm

and discussing the best approach for the analysis.

Rummaging around in R, looking for tests that you can squeeze your
data into, *really* isn't the best approach (and ?friedman.test clearly
states that it's for unreplicated designs only).

Some things I'd want to know if I were your statistical consultant:

- What are you doing with your research? What's your goal?

- What exactly are sensitivity
(and coverage, execution, infection, propagation) measuring? Looking
at your data, it seems as if sensitivity is making discrete jumps. Why
is this?

- What's your actual hypothesis? That the mean sensitivity values for
the two paradigms differ by a constant no matter the version? Or that
they differ by a fraction?

- If this is measured on test persons, I assume that you used each
person several times. Is that so?

Answers to the above questions might be good to bring to your meeting
with the statistics faculty.


Good luck with your research,

Gustaf


[R] Question: Beginner stuck in a R cycle

2008-07-08 Thread Gustaf Rydevik
On Tue, Jul 8, 2008 at 3:18 PM, Daniela Ottaviani [EMAIL PROTECTED] wrote:
 Dear All,

 I have a database of 200 observations named myD.
 In the dataframe there are a column named code (with codes varying from 1 to 
 77), a column named prevalence with some quantitative measurements are 
 given and an column named Pr_mean, with no values.

 I would like to set a cycle to compute the average of prevalence values for 
 each different code and store the averages under the empty field Pr_mean.

 This is what I wrote:

 # Set a cycle
 for (i in 1:nrow(myD)) {
 mycode = myD$code[i]
 mymean[i] = mean(prevalence)
 myD$Pr_mean[i] = mymean[i]
 }

 With the above cycle I am able to compute the average of all 200 observations 
 which is then written in every cell.
 I understand that a condition is missing, that indicates that the average has 
 to be computed amongst the observations showing  the same codes values.

 Could you please help me ?


 D.



The easiest thing to do is to use ?by:

myD <- data.frame(code=sample(letters[1:5],200,replace=TRUE),value=rnorm(200))
by(myD$value,myD$code,mean)

but that won't get you the group means in the empty column without
some more lines of code. Another way is to use ?lapply and ?unlist:

myD$Pr_mean <- unlist(lapply(as.character(myD$code),function(x)
mean(myD$value[myD$code==x])))
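
Another option, sketched on the same kind of simulated data: ave() from
the stats package returns one value per row, namely the group summary for
that row's group, so the result can be assigned straight into the column.

```r
set.seed(1)
myD <- data.frame(code = sample(letters[1:5], 200, replace = TRUE),
                  value = rnorm(200))

# Group mean of `value`, repeated for every row of the matching `code`.
myD$Pr_mean <- ave(myD$value, myD$code, FUN = mean)

# Cross-check against tapply(), which gives one mean per group.
group_means <- tapply(myD$value, myD$code, mean)
all.equal(myD$Pr_mean, unname(group_means[as.character(myD$code)]))

head(myD)
```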


Regards,

Gustaf



Re: [R] Migrating from S-Plus to R - Exporting Tables

2008-07-03 Thread Gustaf Rydevik
On Thu, Jul 3, 2008 at 2:17 AM, jim holtman [EMAIL PROTECTED] wrote:
 Does something like this get you close:

 x <- list()
 keys <- LETTERS[1:6]
 # create
 for (i in keys){
    x[[i]] <- data.frame(a=1:5, b=1:5, c=1:5)
 }
 # output
 output <- file('tempxx.txt', 'w')
 for (i in keys){
    write.table(i, row.names=FALSE, col.names=FALSE, file=output, quote=FALSE)
    write.table(x[[i]], file=output, quote=FALSE)
 }
 close(output)



In order to get "row.names" written above the row names, I think you
have to cheat a bit
(modifying Jim's code):

x <- list()
keys <- LETTERS[1:6]
# create
for (i in keys){
   x[[i]] <- data.frame(a=1:5, b=1:5, c=1:5)
}
# output
output <- file('tempxx.txt', 'w')
for (i in keys){
   write.table(i, row.names=FALSE, col.names=FALSE, file=output, quote=FALSE)
   ## excluding the actual row names, adding them as a column instead
   write.table(data.frame(RowNames=row.names(x[[i]]),x[[i]]),
               file=output, quote=FALSE, row.names=FALSE)
}
close(output)

-
It seems as if you can't get it to write "row.names" as a header, since that is a
restricted name in a data frame, but hopefully "RowNames" is good
enough.
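
If an empty header cell above the row names is acceptable, write.table()
has a documented shortcut: with row.names = TRUE, col.names = NA writes a
blank column name for the row-name column. A minimal sketch (the data
frame and the temporary file are made up for the example):

```r
x <- data.frame(a = 1:3, b = 4:6, row.names = c("r1", "r2", "r3"))
tmp <- tempfile(fileext = ".txt")

# col.names = NA adds an empty header field above the row names,
# so the header and the data rows have the same number of fields.
write.table(x, file = tmp, quote = FALSE, sep = "\t", col.names = NA)
written <- readLines(tmp)
written

# The file reads back cleanly, with the first column as row names.
back <- read.table(tmp, header = TRUE, sep = "\t", row.names = 1)
unlink(tmp)
back
```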

/Gustaf


[R] tiff()-bug (was re:Preparing high quality figures with tiff as end result)

2008-06-25 Thread Gustaf Rydevik
Hi all,

A while back I sent a message concerning working with tiff files, and
mentioned that I encountered a bug in 2.7.0.
This bug still occurs in 2.7.1, and is reproducible on a separate
computer (both running WinXP Professional):

tiff()
plot(1:1000)
dev.off()

This causes R to show the window "R GUI has encountered a problem and
needs to close".
Can anyone else out there reproduce this, so I can file a bug report?

Best,

Gustaf Rydevik

---
 sessionInfo()
R version 2.7.1 (2008-06-23)
i386-pc-mingw32

locale:
LC_COLLATE=Swedish_Sweden.1252;LC_CTYPE=Swedish_Sweden.1252;LC_MONETARY=Swedish_Sweden.1252;LC_NUMERIC=C;LC_TIME=Swedish_Sweden.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] RWinEdt_1.8-0

loaded via a namespace (and not attached):
[1] tools_2.7.1





Re: [R] tiff()-bug (was re:Preparing high quality figures with tiff as end result)

2008-06-25 Thread Gustaf Rydevik
On Wed, Jun 25, 2008 at 10:16 AM, Uwe Ligges
[EMAIL PROTECTED] wrote:


 Gustaf Rydevik wrote:

 Hi all,

 A while back I sent a message concerning working with tiff-files, and
 mentioned that I encountered a bug in 2.7.0.
 This bug still occurs in 2.7.1, and is reproducable on a separate
 computer (both running WinXP professional):

 tiff()
 plot(1:1000)
 dev.off()

 This causes R to show the window R GUI has encountered a problem and
 needs to close.
 Can anyone else out there reproduce this, so I can file a bug report?

 Yes. Confirmed.

 Uwe Ligges



Thank you. Bug report submitted.
/Gustaf



Re: [R] tiff()-bug (was re:Preparing high quality figures with tiff as end result)

2008-06-25 Thread Gustaf Rydevik
A short update that may be of help:
The snippet of code does not crash R if I run it under --vanilla, nor if I
change R to MDI mode.
It does crash R every time if I set it to SDI mode in the Rprofile
file. Strange...

/Gustaf



On Wed, Jun 25, 2008 at 3:16 PM, Prof Brian Ripley
[EMAIL PROTECTED] wrote:
 On Wed, 25 Jun 2008, Peng Jiang wrote:

 Hi , Gustaf
 i don't know why but it works pretty well on a mac.

 with completely different code.

 Gustaf Rydevik has mentioned this before -- it never fails for me on Windows
 and hence one would not expect there to be a change in 2.7.1. Only if
 someone can reproduce it under a debugger have we a chance of tracking it
 down.

 regards .
 --
 Peng Jiang
 江鹏
 Ph.D. Candidate

 Antai College of Economics  Management
 安泰经济管理学院
 Department of Mathematics
 数学系
 Shanghai Jiaotong University (Minhang Campus)
 800 Dongchuan Road
 200240 Shanghai
 P. R. China


 --
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595


Re: [R] R help

2008-06-24 Thread Gustaf Rydevik
Dear Xu,

Does:
library(urca)
example(ur.ers)
ers.gnp
str(ers.gnp)
[EMAIL PROTECTED]

do what you want?
(This reminds me that I have to learn S4 sometime.)
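
As background for the S4 remark: str() on an S4 object lists its slots,
and a slot is read with @ or slot() rather than with $. A minimal sketch
using a made-up class (toyTest and its slot names are hypothetical; for a
real ur.ers object, slotNames() will show which slot actually holds the
test statistic):

```r
library(methods)

# A toy S4 class standing in for a test-result object.
setClass("toyTest", representation(teststat = "numeric", cval = "numeric"))
obj <- new("toyTest", teststat = -1.23, cval = c(-2.57, -1.94, -1.62))

# List the available slots, then extract one of them in two equivalent ways.
slotNames(obj)
obj@teststat
slot(obj, "teststat")
```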
best,

Gustaf Rydevik



On Tue, Jun 24, 2008 at 3:52 AM, Xu, Ke-Li [EMAIL PROTECTED] wrote:
 Dear Sir/Madam,

 I found your email address and your correspondence with R-users. I hope
 you could help me with this question about the function ur.ers in the
 package of urca. It is an improved unit root test (Elliott et al. 1996
 Econometrica). Do you know how to extract the value of the test
 statistic from the output? The only thing I can get is the print-out of
 all results including the test statistic. But I am wondering whether the
 value is saved somewhere, like g$coef will give you the estimated
 coefficient, where g is a linear model lm object.

 Thank you very much.



 Best regards,
 Keli Xu








Re: [R] ifelse and && vs &

2008-06-18 Thread Gustaf Rydevik
On Wed, Jun 18, 2008 at 3:10 PM, Christos Argyropoulos
[EMAIL PROTECTED] wrote:

 Hi,

 I wondered whether someone could explain why && and & behave differently
 in data frame transformations.

 Consider the following :

 a<-data.frame(r=c(0,0,2,3),g=c(0,2,0,2.1))

 Then:

 transform(a,R=ifelse(r>0 && g>0,log(r/g),NA))

   r   g  R
 1 0 0.0 NA
 2 0 2.0 NA
 3 2 0.0 NA
 4 3 2.1 NA

 but

 transform(a,R=ifelse(r>0 & g>0,log(r/g),NA))
   r   g         R
 1 0 0.0        NA
 2 0 2.0        NA
 3 2 0.0        NA
 4 3 2.1 0.3566749


 If my understanding of the differences between && and & and how
 'transform' works is accurate, both statements should produce the same
 output.


 I got the same behaviour in Windows XP Pro 32-bit (running R v 2.7) and 
 Ubuntu Hardy (running the same version of R).


 Thanks

 Christos Argyropoulos

 University of Pittsburgh Medical Center
 _


From ?"&&": "The shorter form performs elementwise comparisons in
much the same way as arithmetic operators. The longer form evaluates
left to right examining only the first element of each vector."

Thus,

> a$r & a$g
[1] FALSE FALSE FALSE  TRUE
> a$r && a$g
[1] FALSE

ifelse takes a vector as its test argument. Since && only gives a single
value, ifelse(r>0 && g>0,log(r/g),NA) will only return a single NA, which
is then recycled by transform. When using &, ifelse returns a vector,
and this vector is appended to the data frame.
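
The elementwise version can be sketched step by step as follows. Only &
is used here, because from R 4.3.0 onwards calling && with a vector of
length greater than one is an error rather than a silent use of the first
element.

```r
a <- data.frame(r = c(0, 0, 2, 3), g = c(0, 2, 0, 2.1))

# Elementwise test: one TRUE/FALSE per row.
keep <- a$r > 0 & a$g > 0
keep   # FALSE FALSE FALSE TRUE

# ifelse() picks log(r/g) where the test is TRUE and NA elsewhere.
a$R <- ifelse(keep, log(a$r / a$g), NA)
a
```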

/Gustaf

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



[R] Preparing high quality figures with tiff as end result

2008-05-23 Thread Gustaf Rydevik
Hi all,

I'm currently preparing some figures that will be submitted to PloS One.
In their guidelines they state that they will only accept figures in
tiff or eps format, with the warning that eps figures will be
converted to tiff format ( see
http://www.plosone.org/static/figureGuidelines.action ).

Because of this conversion, I figured I'd generate tiff-format figures
from the beginning.
However, a number of issues cropped up:
1) using

library(Cairo)
CairoTIFF("test.tif")

I get "Sorry, this Cairo was compiled without tiff support.". I tried
finding out how to recompile Cairo, but got lost in a lot of confusing
talk about GTK+, downloaded dll files that I didn't know how to use
etc.

so I turned to plain tiff(), and

2) R started crashing on me. The following code

tiff("test.tif")
plot(rnorm(100))
dev.off()

crashes R (i.e. "R GUI has encountered a problem and needs to
close...") every third time or so.
When it does work, the resulting output is not too pretty.

So I turned to using postscript files. However, Plos One requires that
fonts be embedded into the figure.
embedFonts()  works for this, but the result is that text becomes
low-res bitmaps, and I don't know how to solve this.

So basically my question is: How should I go about generating graphics
that will look as nice as possible given the above constraints?

Many thanks in advance,

Gustaf




-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] Preparing high quality figures with tiff as end result

2008-05-23 Thread Gustaf Rydevik
On Fri, May 23, 2008 at 4:40 PM, Gustaf Rydevik
[EMAIL PROTECTED] wrote:
 Hi all,

 I'm currently preparing some figures that will be submitted to PloS One.
 In their guidelines they state that they will only accept figures in
 tiff or eps format, with the warning that eps figures will be
 converted to tiff format ( see
 http://www.plosone.org/static/figureGuidelines.action ).

 Because of this conversion, I figured I'd generate tiff-format figures
 from the beginning.
 However, a number of issues cropped up:
 1) using

 library(Cairo)
 CairoTIFF("test.tif")

 I get "Sorry, this Cairo was compiled without tiff support.". I tried
 finding out how to recompile Cairo, but got lost in a lot of confusing
 talk about GTK+, downloaded dll files that I didn't know how to use
 etc.

 so I turned to plain tiff(), and

 2) R started crashing on me. The following code

 tiff("test.tif")
 plot(rnorm(100))
 dev.off()

 crashes R (i.e. "R GUI has encountered a problem and needs to
 close...") every third time or so.
 When it does work, the resulting output is not too pretty.

 So I turned to using postscript files. However, Plos One requires that
 fonts be embedded into the figure.
 embedFonts()  works for this, but the result is that text becomes
 low-res bitmaps, and I don't know how to solve this.

 So basically my question is: How should I go about generating graphics
 that will look as nice as possible given the above constraints?

 Many thanks in advance,

 Gustaf


Oh, and before anyone bites my head off, I forgot the following:


> sessionInfo()
R version 2.7.0 (2008-04-22)
i386-pc-mingw32

locale:
LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United
Kingdom.1252;LC_MONETARY=English_United
Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] RWinEdt_1.8-0 Cairo_1.4-2


and I'm using Windows XP Professional.


again, many thanks in advance
/Gustaf


-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



[R] Change the position of panel strips in a lattice plot.

2008-04-08 Thread Gustaf Rydevik
Hi all,


In lattice plots, is there any option to position the panel strips
with text below each subgraph, instead of above?
i.e. in:
Depth <- equal.count(quakes$depth, number=8, overlap=.1)
xyplot(lat ~ long | Depth, data = quakes)
is there any way to make "Depth" appear below the subgraphs, instead of above?
I've been looking through the lattice documentation and the list
archive but have not found such a thing.

Many thanks in advance,

Gustaf

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] Loop for in R to generate several variables

2008-04-07 Thread Gustaf Rydevik
On Mon, Apr 7, 2008 at 11:31 AM, arpino [EMAIL PROTECTED] wrote:

  Hi everybody,
  I have to create several variables of this form:

  Yind = L0 + L1*X1 + L2*X2 + L3*X3 + K*Cind + n

  where ind varies in {1,...,10}

  I thought of this for loop, but it does not work:

  for (ind in 1:10) {

   Yind = L0 + L1*X1 + L2*X2 + L3*X3 + K*Cind + n


 }

  Any suggestions?

  Thank you.



Look up ?assign and ?get, i.e.:
for (ind in 1:10) {
  assign(paste("Y",ind,sep=""), L0 + L1*X1 + L2*X2 + L3*X3 +
         K*get(paste("C",ind,sep="")) + n)
}
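
A runnable sketch of the loop above, under made-up inputs. All numeric values here are hypothetical and only exist so the code runs. The vectorized variant at the end is a different technique that was not suggested in the thread; it assumes the C values can be kept in one vector instead of ten separate variables:

```r
# Hypothetical inputs, chosen only so the sketch runs.
L0 <- 1; L1 <- 2; L2 <- 3; L3 <- 4; K <- 0.5; n <- 0
X1 <- 1; X2 <- 1; X3 <- 1
for (ind in 1:10) assign(paste("C", ind, sep = ""), ind)  # C1 = 1, ..., C10 = 10

# The assign()/get() loop, with K included as in the question.
for (ind in 1:10) {
  assign(paste("Y", ind, sep = ""),
         L0 + L1 * X1 + L2 * X2 + L3 * X3 + K * get(paste("C", ind, sep = "")) + n)
}

# Alternative (not from the thread): keep the C values in one vector so
# all ten results come from a single vectorized expression.
C <- 1:10
Y <- L0 + L1 * X1 + L2 * X2 + L3 * X3 + K * C + n
```

With these inputs Y[3] holds the same value as the variable Y3, and the vector form avoids building variable names from strings.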

regards,

Gustaf

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] Make plots with GNUplot. Have anyone tried that?

2008-03-03 Thread Gustaf Rydevik
On Fri, Feb 29, 2008 at 11:12 PM, Louise Hoffman
[EMAIL PROTECTED] wrote:
 [snip]

   Seriously. Be specific if you have a problem (read the posting guide). R can
   also plot. If you don't like R's plots (which I could not understand) you can
   export data and import them to gnuplot. So what?

  Okay, my post was not very good.

  The reason (I think) I need GNUplot, is that I would like to include
  the plots from R in a Latex report, where I would like to have all the
  text and equations in the plots with the same font as used in Latex.

  So when I read about opening and closing dev for making a pdf I
  figured that the plots that R produces are like the ones Matlab makes;
  they show what they ought to, nothing more, nothing less.

  So I was wondering if anyone knows of a GNUplot-friendly format and
  the code that would produce that text file.

  I am new to both R and GNUplot, so I am all ears if someone knows how
  to make such plots in R.


Hi Louise,


In addition to what Paul Murrell linked to regarding latex fonts, take
a look at demo(plotmath).
I really don't think you have to go outside of R to do what you want.
In addition, if you aim to end up with a latex report I strongly
encourage you to try out ?Sweave. It has certainly helped to
streamline my workflow.
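
As a small taste of what demo(plotmath) covers, here is a sketch of mathematical annotation in base graphics. The formula and axis labels are arbitrary illustrations. This shows equations in plot text only and does not by itself make the fonts match LaTeX:

```r
# A formula written as a plotmath expression; the formula is only an example.
lab <- expression(bar(x) == frac(1, n) * sum(x[i], i == 1, n))

pdf(NULL)   # null device, so the sketch writes no file
plot(1:10, main = lab, xlab = expression(alpha), ylab = expression(beta^2))
dev.off()
```

Replacing pdf(NULL) with pdf("figure.pdf") would write the plot to a file that can be included from a LaTeX document.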

Regards,

Gustaf

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] Avoiding overplotting of text.

2008-02-26 Thread Gustaf Rydevik
On Mon, Feb 25, 2008 at 10:36 PM, hadley wickham [EMAIL PROTECTED] wrote:
   I am plotting some data, and use text() to get variable names next to
points on the graph. What is the best way to make sure that these text
labels are readable and not overlapping when two datapoints are close?
I've tried using jitter(), but the effect is random and doesn't always
give a good result.
Any suggestions would be most appreciated.

  Have a look at pointLabel in maptools -
  http://finzi.psych.upenn.edu/R/library/maptools/html/pointLabel.html


  --
  http://had.co.nz/


Thank you, Hadley. That was a very good tool to find!
In conjunction with the regular tricks mentioned by Rickard Cotton, and
thigmophobe by Jim Lemon, the problem turned out to be fairly easy.
This seems like one of those tasks that is needed fairly frequently,
but which is rarely bothered with. Would it be possible to add one of
these algorithms as an option to the regular text()?

Regards,

Gustaf


-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



[R] Avoiding overplotting of text.

2008-02-25 Thread Gustaf Rydevik
Hi all,

I am plotting some data, and use text() to get variable names next to
points on the graph. What is the best way to make sure that these text
labels are readable and not overlapping when two datapoints are close?
I've tried using jitter(), but the effect is random and doesn't always
give a good result.
Any suggestions would be most appreciated.

Best regards,

Gustaf

Example:
--
 x <- rnorm(20)
 x.labels <- vector(length=length(x))
 for(i in 1:length(x)) x.labels[i] <- paste(sample(LETTERS,5,replace=T),collapse="")
 y <- rnorm(length(x))
 plot(x,y)
 text(x,y,x.labels)
### Most of the time some of the labels end up unreadable.
---

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] Cox model

2008-02-13 Thread Gustaf Rydevik
On Feb 13, 2008 2:37 PM, Matthias Gondan [EMAIL PROTECTED] wrote:
 Hi Eleni,

 The problem of this approach is easily explained: Under the Null
 hypothesis, the P values
 of a significance test are random variables, uniformly distributed in
 the interval [0, 1]. It
 is easily seen that the lowest of these P values is not any 'better'
 than the highest of the
 P values.

 Best wishes,

 Matthias


Correct me if I'm wrong, but isn't that the point? I assume that the
hypothesis is that one or more of these genes are true predictors,
i.e. for these genes the p-value should be significant. For all the
other genes, the p-value is uniformly distributed. Using a
significance level of 0.01, and an a priori knowledge that there are
significant genes, you will end up with on the order of 20 genes, some
of which are the true predictors, and the rest being false
positives. This set of 20 genes can then be further analysed. A much
smaller and easier problem to solve, no?


/Gustaf
-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] Cox model

2008-02-13 Thread Gustaf Rydevik
On Feb 13, 2008 3:06 PM, Gustaf Rydevik [EMAIL PROTECTED] wrote:
 On Feb 13, 2008 2:37 PM, Matthias Gondan [EMAIL PROTECTED] wrote:
  Hi Eleni,
 
  The problem of this approach is easily explained: Under the Null
  hypothesis, the P values
  of a significance test are random variables, uniformly distributed in
  the interval [0, 1]. It
  is easily seen that the lowest of these P values is not any 'better'
  than the highest of the
  P values.
 
  Best wishes,
 
  Matthias
 

 Correct me if I'm wrong, but isn't that the point? I assume that the
 hypothesis is that one or more of these genes are true predictors,
 i.e. for these genes the p-value should be significant. For all the
 other genes, the p-value is uniformly distributed. Using a
 significance level of 0.01, and an a priori knowledge that there are
 significant genes, you will end up with on the order of 20 genes, some
 of which are the true predictors, and the rest being false
 positives. this set of 20 genes can then be further analysed. A much
 smaller and easier problem to solve, no?


 /Gustaf

Sorry, it should say 200 genes instead of 20.
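
The arithmetic behind that number can be sketched as follows. The total of 20000 genes is an assumption made here for illustration (it is the count that gives 200 at a 0.01 level) and is not stated in the thread:

```r
set.seed(1)
n.genes <- 20000   # hypothetical number of genes, not stated in the thread
alpha   <- 0.01

# Under the null hypothesis each p-value is uniform on [0, 1].
p.null <- runif(n.genes)

expected.hits <- alpha * n.genes      # expected number of false positives
observed.hits <- sum(p.null < alpha)  # varies around the expected value
```

Any true predictors would come on top of these false positives, which is why the selected set still needs further analysis.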

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] How does do.call() work??

2008-01-25 Thread Gustaf Rydevik
On Jan 25, 2008 11:27 AM, Sergey Goriatchev [EMAIL PROTECTED] wrote:
 Dear members of R forum,

 Say I have a list:

 L <- list(1:3, 1:3, 1:3)

 that I want to turn into a matrix.

 I wonder why if I do:

 do.call(cbind, L)

 I get the matrix I want, but if I do

 cbind(L)

 I get something different from what I want. Why is that? How does
 do.call() actually work?

 I've read in do.call() help file this sentence: "The behavior of some
 functions, such as substitute, will not be the same for functions
 evaluated using do.call as if they were evaluated from the
 interpreter. The precise semantics are currently undefined and subject
 to change."

 Thanks for help!
 Sergey


Try
cbind(L[[1]],L[[2]],L[[3]])
, which is equivalent to do.call(cbind,L).
do.call takes a list of arguments and feeds each element of that list
to the function as a separate argument.
cbind takes two or more vectors or matrices as separate arguments, not a
single list of them.
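
A short sketch comparing the three calls, using the list from the question (the names m1, m2 and m3 are made up for illustration):

```r
L <- list(1:3, 1:3, 1:3)

# do.call() calls cbind with each list element as a separate argument.
m1 <- do.call(cbind, L)
m2 <- cbind(L[[1]], L[[2]], L[[3]])

# cbind(L) receives a single argument: the list itself.
m3 <- cbind(L)
```

Here m1 and m2 are the same 3 x 3 integer matrix, while m3 is a 3 x 1 matrix whose cells are the list elements.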

/Gustaf

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] histogram with NAs

2008-01-18 Thread Gustaf Rydevik
On Jan 18, 2008 4:49 PM,  [EMAIL PROTECTED] wrote:
 Dear list,

 I have a categorical variable in a data.frame that I would like to
 plot using a histogram to show number of events. Values are 0, 1 and
 some NAs. I can´t make the hist() function to
 1) include a column with the number of NAs
 2) have the x axis to be categorical, I always get 0, 0.2, 0.4,... 1
 divisions

 Can anyone help me?

 This is my code. "database" is my data.frame and "Event" is my
 variable.
 attach(database)
 hist(Event, col = 2, main = "Number of Events")

 Thanks in advance,

 David


Please read ?hist, especially the line:

"Typical plots with vertical bars are not histograms. Consider barplot
or plot(*, type = "h") for such bar plots." But no worry, I've mixed
them up myself a number of times.

To get a column of NAs, see the following:
### Example:
sample.data <- as.factor(sample(c(1,0,NA),100,replace=T))
sample.data <- as.character(sample.data)
sample.data[is.na(sample.data)] <- "NA"
sample.data <- factor(sample.data)
plot(sample.data)
###
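
Another way to get the same picture, sketched here as an alternative that was not part of the original reply, is to count the values with table() and pass the counts to barplot(). The Event vector below is made-up data standing in for database$Event:

```r
# Hypothetical data standing in for database$Event.
Event <- c(0, 1, NA, 1, 0, NA, 1)

# table() drops NA by default; useNA = "ifany" adds a count for it.
counts <- table(Event, useNA = "ifany")
bar.labels <- names(counts)
bar.labels[is.na(bar.labels)] <- "NA"   # the NA category has a missing name

pdf(NULL)   # null device, so the sketch writes no file
barplot(as.vector(counts), names.arg = bar.labels, col = 2,
        main = "Number of Events")
dev.off()
```

This gives one bar per category, including the missing values, with a categorical x axis.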

/Gustaf


-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



[R] An R is slow-article

2008-01-09 Thread Gustaf Rydevik
Hi all,

Reading the wikipedia page on R, I stumbled across the following:
http://fluff.info/blog/arch/0172.htm

It does seem interesting that the C execution is that much slower from
R than from a native C program. Could any of the more technically
knowledgeable people explain why this is so?

The author also has some thought-provoking opinions on R being
no-good and that you should write everything in C instead (mainly
because R is slow and too good at graphics, encouraging data
snooping). See  http://fluff.info/blog/arch/0041.htm
 While I don't agree (granted, I can't really write C), it was
interesting to read something from a very different perspective than
I'm used to.

Best regards,

Gustaf

_
Department of Epidemiology,
Swedish Institute for Infectious Disease Control
work email: gustaf.rydevik at smi dot ki dot se
skype:gustaf_rydevik



Re: [R] R help

2008-01-09 Thread Gustaf Rydevik
On Jan 9, 2008 5:47 PM,  [EMAIL PROTECTED] wrote:
 Is there a number I can call to get started with R?  I have some really
 basic questions that won't take more than 10 minutes.

 Sitadri


Try writing your questions to this mailing list, and you're
bound to get answers.

/Gustaf

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] Tutorial for Basic Stats

2007-12-10 Thread Gustaf Rydevik
On Dec 10, 2007 5:43 AM, Kapoor, Bharat [EMAIL PROTECTED] wrote:
 Thanks in advance - am looking for  a Tutorial for doing basic stats. I have 
 already looked/looking at the R-intro.pdf at the R site.

 Regards
 BK

 [[alternative HTML version deleted]]


Google "introductory statistics R", and you'll find a nice pdf by J. Verzani.

/Gustaf

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



Re: [R] Extracting clusters from Data Frame

2007-12-10 Thread Gustaf Rydevik
On Dec 10, 2007 2:28 PM, Johannes Graumann [EMAIL PROTECTED] wrote:
 Hello,

 I have a large data frame (1006222 rows), which I subject to a crude
 clustering attempt that results in a vector stating whether the datapoint
 represented by a row belongs to a cluster or not. Conceptually this looks
 something like this:
 Value   Cluster?
 0.01    FALSE
 0.03    TRUE
 0.04    TRUE
 0.05    TRUE
 0.07    FALSE
 ...
 What I'm looking for is an efficient strategy to extract all consecutive
 rows associated with TRUE as a single cluster (data.frame
 representation?) without cluttering memory with thousands of data.frames.
 I was thinking of an independent data.frame that would contain a column of
 lists that reference all indexes from the big one which are contained in
 one cluster ...
 Can anyone kindly nudge me and let me know how to deal with this
 efficiently?

 Joh


How about:
orig.data <- sample(c(TRUE,FALSE),100,replace=T)
Cluster <- data.frame(c.ndx=cumsum(rle(orig.data)$lengths),c.size=rle(orig.data)$lengths,c.type=rle(orig.data)$values)
Cluster <- Cluster[Cluster$c.type==TRUE,]

## Then, to get all original data belonging to cluster three:
orig.data[rev(Cluster[3,"c.ndx"]-seq(length.out=Cluster[3,"c.size"])+1)]


Not the neatest solution, but I'm sure someone here can improve on it.
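
One possible variant of the same rle() idea, sketched on a small made-up vector of cluster flags, stores the first and last row index of each cluster so that no per-cluster data frames are needed. The names in.cluster, runs, clusters and idx are illustrative:

```r
in.cluster <- c(FALSE, TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE)  # made-up flags

runs  <- rle(in.cluster)
end   <- cumsum(runs$lengths)       # last index of every run
start <- end - runs$lengths + 1     # first index of every run

# One row per TRUE run: first and last row index in the original data.
clusters <- data.frame(start = start, end = end)[runs$values, ]

# Row indices of the second cluster.
idx <- seq(clusters$start[2], clusters$end[2])
```

The big data frame can then be subset with idx whenever a particular cluster is needed.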
/Gustaf

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik


