Re: [R] a very simple question

2012-03-19 Thread David Winsemius


On Mar 18, 2012, at 4:43 PM, Dajiang Liu wrote:



Dear All,
I have a seemingly very simple question, but I just cannot figure
out the answer. I attempted to run the following:
a=0.1*(1:9); which(a==0.3)
It returns integer(0). But obviously, the third element of a is equal to 0.3.
I must have missed something. Can someone kindly explain why? Thanks
a lot.


It has already been explained on this list ... frequently in FAQt.

Locate the FAQ and search for a question about why R doesn't think two
numbers are equal. The FAQ should be part of a standard install, linked from
the main help page.




Regards,Dajiang

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] Problem reading mixed CSV file

2012-03-19 Thread Ashish Agarwal
This is quite a CPU-consuming process. My system hung on the
big file I have.

Within the for loop that you have suggested, can't I have a case
statement for the different values of nFields, specifying the format
in which the variables need to be read?
Something like:
case
# input format for 6 fields
when nFields == 6
read.csv as string, string, string, numeric, numeric, numeric into dataframe1
# input format for 7 fields
when nFields == 7
read.csv as string, string, string, string, numeric, numeric, numeric
into dataframe2
end case
# Output the two data frames with some way of tracking the original line
numbers of the input file, similar to _N_ in SAS.
Dataframe1 is to be output as it is, while in dataframe2
the 3rd and the 4th strings are to be concatenated.

Could you please help with the format for the above?
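
R has no direct equivalent of the case block sketched above, but the same effect can probably be had by splitting the lines on their field count and reading each group with its own colClasses. The following is only a sketch under assumptions: the toy lines, the helper name read_group and the extra "line" column are made up for illustration, and count.fields() skips blank lines by default, so the line numbers only match readLines() when the file has none.

```r
# Toy input standing in for the real file (contents are made up).
path <- tempfile(fileext = ".csv")
writeLines(c("a,b,Boston,1968,13,0",
             "a,b,Raleigh, North Carol,1968,25,0",
             "c,d,Chicago,1967,44,0"), path)

input   <- readLines(path)
nFields <- count.fields(path, sep = ",")

# Hypothetical helper: read only the lines that have n fields.
read_group <- function(n, classes) {
  idx <- which(nFields == n)
  df  <- read.csv(text = input[idx], header = FALSE,
                  colClasses = classes, strip.white = TRUE)
  df$line <- idx   # original line number, in the spirit of SAS _N_
  df
}

df6 <- read_group(6, c(rep("character", 3), rep("numeric", 3)))
df7 <- read_group(7, c(rep("character", 4), rep("numeric", 3)))

# Concatenate the 3rd and 4th strings of the 7-field rows, then drop the
# surplus column so both data frames have the same layout.
df7$V3 <- paste(df7$V3, df7$V4, sep = ", ")
df7 <- df7[, c("V1", "V2", "V3", "V5", "V6", "V7", "line")]
names(df7) <- names(df6)

# Put the rows back into the order of the input file.
result <- rbind(df6, df7)
result <- result[order(result$line), ]
print(result)
```

Each group is parsed once, so this avoids the per-line loop; whether it is fast enough for the real file has not been tested here.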



On Sat, Mar 17, 2012 at 4:54 AM, jim holtman jholt...@gmail.com wrote:
 Here is a solution that looks for the line with 7 elements and inserts
 the quotes:


 fileName <- '/temp/text.txt'
 input <- readLines(fileName)
 # count the fields to find 7
 nFields <- count.fields(fileName, sep = ',')
 # now fix the data
 for (i in which(nFields == 7)){
     # split on comma
     z <- strsplit(input[i], ',')[[1]]
     input[i] <- paste(z[1], z[2]
         , paste('"', z[3], ',', z[4], '"', sep = '') # put on quotes
         , z[5], z[6], z[7], sep = ','
         )
 }

 # now read in the data
 result <- read.table(textConnection(input), sep = ',')

         result
                         V1       V2                   V3   V4 V5 V6
 1                                                         1968 21  0
 2                                                  Boston 1968 13  0
 3                                                  Boston 1968 18  0
 4                                                 Chicago 1967 44  0
 5                                              Providence 1968 17  0
 6                                              Providence 1969 48  0
 7                                                   Binky 1968 24  0
 8                                                 Chicago 1968 23  0
 9                                                   Dally 1968  7  0
 10                                   Raleigh, North Carol 1968 25  0
 11 Addy ABC-Dogs Stars-W8.1                    Providence 1968 38  0
 12              DEF_REQPRF/                     Dartmouth 1967 31  1
 13                       PL                               1967 38  1
 14                       XY PopatLal                      1967  5  1
 15                       XY PopatLal                      1967  6  8
 16                       XY PopatLal                      1967  7  7
 17                       XY PopatLal                      1967  9  1
 18                       XY PopatLal                      1967 10  1
 19                       XY PopatLal                      1967 13  1
 20                       XY PopatLal               Boston 1967  6  1
 21                       XY PopatLal               Boston 1967  7 11
 22                       XY PopatLal               Boston 1967  9  2
 23                       XY PopatLal               Boston 1967 10  3
 24                       XY PopatLal               Boston 1967  7  2



 On Fri, Mar 16, 2012 at 2:17 PM, Ashish Agarwal
 ashish.agarw...@gmail.com wrote:
 I have a file with 5000 records, and editing that file is not easy.
 Is there any way to read line 10 differently to account for changes in the
 third field?

 On Fri, Mar 16, 2012 at 11:35 PM, Peter Ehlers ehl...@ucalgary.ca wrote:
 On 2012-03-16 10:48, Ashish Agarwal wrote:

 Line 10 has City and State, which are also separated by a comma. How can I
 read line 10 differently from the other lines?


 Edit the file and put quotes around the city-state combination:
  "Raleigh, North Carol"





 --
 Jim Holtman
 Data Munger Guru

 What is the problem that you are trying to solve?
 Tell me what you want to do, not how you want to do it.



[R] Sankey Diagrams in R

2012-03-19 Thread Eric Fail
Dear R-list,

I am trying to visualize where the dropout happens in our patient flow. We are
currently using traditional flowcharts and it bothers me that I can't visualize
both the percentage and the flow in one diagram.

The other day I came across some interesting diagrams doing exactly what I
wanted: they had both flow and percentages visualized in one diagram. Here are
some nice examples, apparently made with ‘sankeypython’:
http://www.sankey-diagrams.com/tag/software/

It didn't take long to find a blog where an R user (thanks!) had posted an R
script that actually produces a Sankey diagram in R:
http://biologicalposteriors.blogspot.com/2010/07/sankey-diagrams-in-r.html

See below for a working example.

My questions are: is this the most up-to-date Sankey diagram script we have in the
R community? Is there a better way to visualize flow and percentages in one
diagram in R?

Thanks,
Eric

## the working example

## see
## https://tonybreyal.wordpress.com/2011/11/24/source_https-sourcing-an-r-script-from-github/
sourc.https <- function(url, ...) {
  # load package
  require(RCurl)
  # install.packages(c("RCurl"), dependencies = TRUE)

  # parse and evaluate each .R script
  sapply(c(url, ...), function(u) {
    eval(parse(text = getURL(u, followlocation = TRUE, cainfo =
      system.file("CurlSSL", "cacert.pem", package = "RCurl"))), envir = .GlobalEnv)
  })
}

# Example from https://gist.github.com/1423501
sourc.https("https://raw.github.com/gist/1423501/55b3c6f11e4918cb6264492528b1ad01c429e581/Sankey.R")

# My example (there is another example inside Sankey.R):
inputs = c(6, 144)
losses = c(6,47,14,7, 7, 35, 34)
unit = "n ="
labels = c("Transfers",
   "Referrals\n",
   "Unable to Engage",
   "Consultation only",
   "Did not complete the intake",
   "Did not engage in Treatment",
   "Discontinued Mid-Treatment",
   "Completed Treatment",
   "Active in \nTreatment")
SankeyR(inputs,losses,unit,labels)

# Clean up my mess
rm(inputs, labels, losses, SankeyR, sourc.https, unit)



Re: [R] randomly subsample rows from subsets

2012-03-19 Thread Rui Barradas
Hello,

Try

text="
fish fam length
1 a 71.46
2 a 71.06
3 a 62.94
4 b 79.46
5 b 52.38
6 b 56.78
7 b 92.08
8 c 96.86
9 d 98.09
10 d 17.23
11 d 98.35
12 d 82.43
13 e 83.85
14 e 33.92
15 e 23.16
16 e 31.39
17 e 57.08
18 e 27.05
19 f 62.38
20 f 83.21
21 f 18.72
22 f 84.32
23 g 15.99
24 h 40.33
25 h 92.73
26 h 59.08
27 i 29.05
"
fish <- read.table(textConnection(text), header=TRUE)
head(fish)

set.seed(1)
# take 2 random rows from every family that has more than one member
select <- lapply(split(fish, fish$fam),
function(x) if(NROW(x) > 1) x[sample(NROW(x), 2), ])
# drop the families that were skipped (NULL entries)
select <- select[!sapply(select, is.null)]

# result as a list
select
# result as a data.frame
do.call(rbind, select)

Hope this helps,

Rui Barradas




[R] How to use R script in VB?

2012-03-19 Thread Dong-Joon Lim
Hello R friends,

I want to use my R script in VB to make a macro in Excel.
I tried RExcel, but it seems to me that this package is just a GUI API
and I still have to run (connect to) R to use the script.
Google tells me there are some ways to turn an R script into an independent
library/module/header so that I can call it from VB, C or Java.
Where can I get a detailed tutorial or manual for that?

Thanks in advance,
Dong-Joon



[R] Output formatting in Latex and R

2012-03-19 Thread Manish Gupta
I am working with LaTeX and R, using the following code.

<<echo=FALSE>>=
infile <- read.table("test.txt", sep="\t")
Col3 <- unique(infile[,3])
LCol3 <- length(Col3)
for (i in 1:LCol3) {
print(paste("Column", Col3[i]))
print(infile[infile[,3]==Col3[i],-3])
}
@

I am getting the following output.

[1] "Column C"
  V1 V2 V4
1  A  B  D
2  X  T  K
[1] "Column Z"
  V1 V2 V4
3  Z  U  M
4  E  V  R
5  Z  U  M
[1] "Column P"
  V1 V2 V4
6  E  V  R

I want to avoid the numbering and the column names. I want my output as follows.

Column C
A B D
X T K

Column Z
Z U M
E V R
Z U M

Column P
E V R

How can I implement it?




Re: [R] How to Group Categorical data in R?

2012-03-19 Thread Manish Gupta
 It is working fine. 

Thanks



Re: [R] Output formatting in Latex and R

2012-03-19 Thread priyank
Col3 <- unique(Msg17$V3)
LCol3 <- length(Col3)
for (i in 1:LCol3) {
  print(paste("Column", Col3[i]))
  write.table(Msg17[Msg17$V3==Col3[i],-3], row.names=F, col.names=F, quote=F)
# If your R implementation does not accept 'F', use 'FALSE'
}
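
A possible variation on the loop above, offered only as a sketch: cat() writes its arguments without the [1] index or quotes that print() adds, and an extra cat("\n") leaves a blank line between groups. The toy data frame is rebuilt from the output shown in the thread, and the helper name print_groups and its sep argument are made up for illustration.

```r
# Toy data reconstructed from the output posted in the thread.
Msg17 <- data.frame(
  V1 = c("A", "X", "Z", "E", "Z", "E"),
  V2 = c("B", "T", "U", "V", "U", "V"),
  V3 = c("C", "C", "Z", "Z", "Z", "P"),
  V4 = c("D", "K", "M", "R", "M", "R"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: print one block per distinct value of column `by`.
print_groups <- function(df, by = 3, sep = "\t") {
  for (g in unique(df[[by]])) {
    cat("Column ", g, "\n", sep = "")   # no [1], no quotes
    write.table(df[df[[by]] == g, -by, drop = FALSE], sep = sep,
                row.names = FALSE, col.names = FALSE, quote = FALSE)
    cat("\n")                           # blank line between groups
  }
}

print_groups(Msg17)
```

Passing sep = " " instead of the tab gives space-separated fields.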



Re: [R] Output formatting in Latex and R

2012-03-19 Thread Manish Gupta
Great, it works! But how can I put a space or tab between two records?



Re: [R] multiple density plot

2012-03-19 Thread K. Elo

Hi,

some sample data would be *very* helpful...

Kind regards,
Kimmo

16.03.2012 15:44, statquant2 wrote:

Hello, I am looking for a special plot.

Let's suppose I have
* 100 days, and
* each day I have a (1D) distribution of the same variable.

I would like to plot
* dates on the x axis, and
* one distribution per date on the y axis.
Do you know a way of doing it?
Cheers




Re: [R] a very simple question

2012-03-19 Thread Rainer Schuermann
As to the reasons, David has given you the necessary hints.

In order to get around the issue, here is what I do:

 a <- round( 0.1 * ( 1:9 ), 1 )
 a
[1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
 which( a == 0.3 )
[1] 3
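
Rounding works when the number of decimals is known in advance. An alternative, sketched here with an assumed tolerance of 1e-8, is to compare within a tolerance rather than exactly; all.equal() does something similar for a single comparison.

```r
a <- 0.1 * (1:9)

# Exact comparison finds nothing: 0.1 * 3 is not the same double as 0.3.
which(a == 0.3)

# Compare within a tolerance instead (the value of tol is an assumption).
tol <- 1e-8
which(abs(a - 0.3) < tol)

# all.equal() applies its own default tolerance to a single comparison.
isTRUE(all.equal(a[3], 0.3))
```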

Rgds,
Rainer


 Original Message 
 Date: Sun, 18 Mar 2012 21:43:54 +
 From: Dajiang Liu ldjst...@hotmail.com
 To: r-help@r-project.org
 Subject: [R] a very simple question

 
 Dear All,
 I have a seemingly very simple question, but I just cannot figure out the
 answer. I attempted to run the following:a=0.1*(1:9);which(a==0.3);it
 returns integer(0). But obviously, the third element of a is equal to 0.3. 
 I must have missed something. Can someone kindly explain why? Thanks a
 lot.
 Regards,Dajiang
 

-- 
---

Gentoo Linux with KDE



Re: [R] Singleton pattern

2012-03-19 Thread David Cassany
Thanks for all of your answers and advice! They shed some light!

I'll have a look at the memoise package and at the tracemem function in order to
check if they can help me somehow, at least to understand and trace in
detail how memory gets consumed.

Thank you all!
David

2012/3/16 Jan T. Kim jtt...@googlemail.com

 Using the singleton pattern in R has never occurred to me so far, as
 I think it applies to languages that support multiple references to
 one instance. R doesn't do that, at least not in ways that would be
 required for applying the singleton pattern as described in the GoF book,
 anyway. One would have to use closures and / or environments to
 approximate references, I suppose.

 When passed around as parameters, R objects don't get copied unless
 the called function starts modifying them, so if the primary concern
 is to prevent unnecessary / costly copying of bulky objects, creating
 the thing once and then passing it around as necessary, taking care
 that called functions don't change it, is perhaps good enough.
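
 The closure-and-environment approximation mentioned above might look roughly like the following. It is a sketch, not an established R idiom for singletons: make_big_object is a hypothetical name and numeric(1e6) merely stands in for the bulky payload.

```r
# A closure that builds the object at most once and returns the same
# environment on every later call. make_big_object is a hypothetical name.
make_big_object <- local({
  instance <- NULL
  function() {
    if (is.null(instance)) {
      instance <<- new.env()
      instance$data <- numeric(1e6)   # stand-in for the bulky payload
    }
    instance
  }
})

obj1 <- make_big_object()
obj2 <- make_big_object()
identical(obj1, obj2)   # both names refer to one environment

# Environments are not copied on modification, so a change made through
# one reference is visible through the other.
obj1$data[1] <- 42
obj2$data[1]
```

 Nothing here stops a caller from copying obj1$data into an ordinary variable, so this only narrows the problem rather than enforcing it.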

 Best regards, Jan

 On Fri, Mar 16, 2012 at 12:15:27PM -0400, Bryan Hanson wrote:
  Since no one else has bit, I'll take a stab.  I'm an experienced R
 person, but I've recently been teaching myself objective-c and I've been
 using singletons quite a bit (and mis-using them quite a bit!).  Not a
 computer scientist at all.  You've been warned.
 
  I don't think there is a comparable concept in R.  You do have a choice
 of S3 or S4 classes for your object orientation in R.  S3 is very loose in
 that you can add to S3 objects readily and abuse them a lot.  There really
 is no checking of them unless you implement it manually.  S4 objects are
 much tighter and they are less readily modified and are self-checking (I
 know some will complain about this characterization but  it's approximately
 correct).  So perhaps you want an S4 object so it's less likely to get
 mangled, but I doubt there is a way to prevent users from copying it, which
 would be more along the lines of a singleton.
 
  You can google the archives for some great discussions of S3 vs S4 if
 that sounds interesting.
 
  Bryan
 
  ***
  Bryan Hanson
  Professor of Chemistry & Biochemistry
  DePauw University
 
  On Mar 16, 2012, at 7:47 AM, David Cassany wrote:
 
   Hi all,
  
   I know it may not make much sense to think about a Singleton pattern in an
   R application which doesn't use any OOP facilities; however, I'm curious to
   know if anybody has faced the same issue. I've been googling, but using
   "singleton pattern" as a keyword leads to typical OOP languages like Java
   or C++, among others.
  
   So my problem is that I'd like to ensure some very big objects aren't
   copied again and again into other variables. In the worst case I'll
   check all the code myself to ensure it, but then the application
   won't force programmers to take it into consideration, which is what I am
   really looking for.
  
   Any advice will be highly appreciated :P
  
   Thanks!
   --
   *David Cassany Viladomat
   Software Developer
   Transmural Biote**ch S.L*
  

 --
  +- Jan T. Kim ---+
  | email: jtt...@gmail.com|
  | WWW:   http://www.jtkim.dreamhosters.com/  |
  *-=  hierarchical systems are for files, not for humans  =-*





-- 
*David Cassany Viladomat
Software Developer
Transmural Biote**ch S.L*
**



Re: [R] a very simple question

2012-03-19 Thread Petr Savicky
On Sun, Mar 18, 2012 at 09:43:54PM +, Dajiang Liu wrote:
 
 Dear All,
 I have a seemingly very simple question, but I just cannot figure out the 
 answer. I attempted to run the following:a=0.1*(1:9);which(a==0.3);it returns 
 integer(0). But obviously, the third element of a is equal to 0.3. 
 I must have missed something. Can someone kindly explain why? Thanks a lot.

Hi.

A simple way to detect rounding problems is subtracting
the numbers.

  a = 0.1*(1:4)
  a - 0.3

  [1] -2.000000e-01 -1.000000e-01  5.551115e-17  1.000000e-01

Use rounding to avoid it as suggested by others.
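
Another way to look at the same rounding problem, as a small sketch: print the stored values with more digits than R shows by default, and compare the gap with the machine epsilon for doubles.

```r
a <- 0.1 * (1:4)

# Show 17 decimal places instead of the default 7 significant digits.
sprintf("%.17f", a[3])
sprintf("%.17f", 0.3)

# The gap between the two values is below the double-precision epsilon.
a[3] - 0.3
.Machine$double.eps
```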

Petr Savicky.



Re: [R] How to use R script in VB?

2012-03-19 Thread Jeff Newmiller
Actually, RExcel and the StatConn DCOM connector are what you want, and this is 
not the right place to discuss it. Go to http://www.statconn.com/, and read the 
license carefully.
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Dong-Joon Lim tgno3@gmail.com wrote:

Hello R friends,

I want to use my R script in VB to make macro in Excel.
I tried with RExcel but it seems to me that this package is just GUI
API
and I still have to run(connect) R to use the script.
Google tells me there are some ways to make R script as an independent
library/module/header so that I can call it in VB, C or JAVA.
Where can I get detailed tutorial or manual for that?

Thanks in advance,
Dong-Joon



Re: [R] install R package on Unix cluster

2012-03-19 Thread Rainer M Krug

On 18/03/12 14:40, Uwe Ligges wrote:
 
 
 On 18.03.2012 05:47, Lorenzo Cattarino wrote:
 Hi R users,
 
 Working from a PC, I am trying to install the spatstat package on a Unix 
 cluster. I created 
 the following PBS file to send a job array:
 
 #!/bin/bash -ue
 
 #PBS -m ae
 #PBS -M my email
 #PBS -J 1-45
 #PBS -A my username
 #PBS -N job name
 #PBS -l resources
 #PBS -l walltime
 
 cd $PBS_O_WORKDIR
 
 module load R/2.14.1
 
 R CMD INSTALL -l /path/to/library spatstat
 
 This command installs *a* source package from the current subdirectory 
 spatstat.
 
 If there is no such directory containing the sources, it won't work. Either
 provide the gzipped tarball and give its name, or use
 install.packages("spatstat") within an R script.
 
 It makes sense to install it outside the parallel processing into a common 
 directory and just 
 use it in parallel

I can second that. My approach is to install the package in my home directory,
which is then accessible from all nodes.

If you are the admin of the cluster, you can install the package in the normal 
location and share
this location so that it is accessible to all nodes.
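
One way the install-once-then-share advice could be put into an R script, as a sketch only: the helper names, the "~/R/library" path in the comments and the CRAN mirror URL are assumptions, not something taken from this thread or from the cluster in question.

```r
# Hypothetical helper: make a shared directory the first library R searches.
use_shared_library <- function(lib_dir) {
  dir.create(lib_dir, recursive = TRUE, showWarnings = FALSE)
  .libPaths(c(lib_dir, .libPaths()))
  invisible(normalizePath(lib_dir))
}

# Hypothetical helper: install a package into lib_dir unless it is already there.
install_once <- function(pkg, lib_dir, repos = "https://cloud.r-project.org") {
  if (!pkg %in% rownames(installed.packages(lib.loc = lib_dir))) {
    install.packages(pkg, lib = lib_dir, repos = repos)
  }
}

# Run once, outside the job array (needs network access):
#   lib <- use_shared_library("~/R/library")
#   install_once("spatstat", lib)
#
# Then at the top of each Script_*.R run by the array:
#   use_shared_library("~/R/library")
#   library(spatstat)
```

Keeping the install step out of the job array avoids 45 jobs writing to the same library at once.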

Cheers,

Rainer


 (It is not entirely clear to me if you are really running the installation on 
 all nodes).
 
 Uwe Ligges
 
 
 
 
 R CMD BATCH /path/to/folder/Script_$PBS_ARRAY_INDEX.R
 
 Obviously I failed to understand page 19 of the R admin manual because I keep
 getting the following error message:
 
 Warning: invalid package ‘spatstat’ Error: ERROR: no packages specified
 
 I'd appreciate if you can point me in the right direction
 
 Thanks Lorenzo
 
 


- -- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, 
UCT), Dipl. Phys.
(Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel :   +33 - (0)9 53 10 27 44
Cell:   +33 - (0)6 85 62 59 98
Fax :   +33 - (0)9 58 10 27 44

Fax (D):+49 - (0)3 21 21 25 22 44

email:  rai...@krugs.de

Skype:  RMkrug



Re: [R] Output formatting in Latex and R

2012-03-19 Thread Manish Gupta
Hi,

I am using the following code and getting the output below.
<<echo=FALSE>>=
infile <- read.table("/home/manish/Desktop/test.txt", sep="\t", header=TRUE)
Col3 <- unique(infile[,3])
LCol3 <- length(Col3)
for (i in 1:LCol3) {
print(paste("Disease Risk:", Col3[i]), row.names=FALSE,
col.names=FALSE, quote=FALSE)
print(infile[infile[,3]==Col3[i],-3], row.names=FALSE,
col.names=FALSE, quote=FALSE, width=10, justify = c("right", "right",
"centre"))
}
@
http://r.789695.n4.nabble.com/file/n4484027/Screenshot.png

The [1] is still printed there. How can I avoid it? I also need to add a tab and
a new line between records. How can I implement that? Thanks in advance.



Re: [R] Problem reading mixed CSV file

2012-03-19 Thread Jim Holtman
How big is the file? In the example I sent I was using 'textConnection' to
reread the input.  If the file is large, this can be slow.  You will have
better luck writing the converted data out to a temporary file and reading it
right back in.

I am not sure exactly what you are asking.  You can create output file names
based on the input file name.  What is it you want to do with the 'case'
statement?
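
A minimal sketch of the temporary-file round trip, assuming `input` already holds the corrected lines; the two toy lines below are made up for illustration.

```r
# Toy lines standing in for the already-fixed `input` vector.
input <- c('a,b,Boston,1968,13,0',
           'a,b,"Raleigh, North Carol",1968,25,0')

tmp <- tempfile(fileext = ".csv")
writeLines(input, tmp)        # write the converted lines out
result <- read.table(tmp, sep = ",", stringsAsFactors = FALSE)
unlink(tmp)                   # remove the temporary file again

str(result)
```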

Sent from my iPad


Re: [R] Help with dlply, loop and column names

2012-03-19 Thread Peter Meilstrup
I'm not sure I follow exactly what group of regression models you want to
create, but a good first step might be to use reshape so that each party's
vote share goes on a different row and the vote shares are all in the same
column. Then you can use plyr grouping on tipo and party to make your
models...

library(reshape2)
library(plyr)

ast <- melt(asturias.gen2011, id=c("municipio", "total", "tipo"),
variable.name="party", value.name="vote")

dlply(ast, .(party, tipo), lm, formula=vote~total)

or along those lines. This way you don't have to mess around with pasting
together expressions to eval and so on...
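
For comparison, the same one-model-per-party-and-tipo idea can be sketched in base R without reshape2 or plyr, keeping the fits in a named list instead of creating variables with eval(parse()). This swaps in split() and lapply() for dlply(); the toy data frame is made up and only borrows the column names from the thread.

```r
# Toy data standing in for asturias.gen2011 (all values are made up).
set.seed(42)
toy <- data.frame(
  total = round(runif(12, 800, 90000)),
  tipo  = rep(c("small", "medium", "large"), each = 4),
  psoe  = runif(12, 25, 40),
  pp    = runif(12, 25, 40)
)

parties <- c("psoe", "pp")

# One list element per party; each element holds one lm per level of tipo.
models <- lapply(setNames(parties, parties), function(party) {
  lapply(split(toy, toy$tipo), function(d) {
    lm(reformulate("total", response = party), data = d)
  })
})

# Access a single fit by party and tipo.
coef(models$psoe$small)
```

Because the party names are ordinary strings, a loop or lapply() over them works without building code as text.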

Peter

On Sun, Mar 18, 2012 at 12:59 PM, Igor Sosa Mayor 
joseleopoldo1...@gmail.com wrote:

 Hi,

 I have a dataframe basically like this:

  head(asturias.gen2011[,c(1,4,9:14)])
                municipio total upyd  psoe    pp    iu   fac   tipo
 440              Allande  2031 1.44 31.10 39.75  4.01 21.62 1000-1
 443                Aller 12582 1.37 33.30 37.09 15.53 10.35 1-5
 567               Amieva   805 1.48 32.69 37.36  6.15 20.16 1000
 849               Avilés 84202 4.15 30.26 35.49 14.37 11.80 5
 1087 Belmonte de Miranda  1751 1.66 38.42 35.74  7.22 14.81 1000-1
 1260             Bimenes  1894 0.98 34.28 26.87 23.30 10.98 1000-1

 I want to do the following:
 1. for every party (psoe, pp, etc.) I want to create a variable like
 this: upyd.lm.tipos, psoe.lm.tipos, etc.

 2. I want to store in this variable a regression (psoe~total), but
 split up by tipo.

 I have the main idea of using dlply from the plyr vignette. But when I
 try to put all this in a loop I'm coming into trouble and I'm at the
 moment really confused how to solve this problem:

 I have the following function:

 elecregtipos <- function(y){
   z <- dlply(asturias.gen2011, .(tipo), function(x) lm(x[,y]~x$edad.media))
   # rsq <- function(x) summary(x)$r.squared
   # bcoefs <- ldply(z, function(x) c(coef(x), rsquare=rsq(x)))
   # return (bcoefs)
   return(z)
 }

 And I try to call it with:
 for (y in c("upyd", "psoe", "pp", "fac", "iu")) {
   eval(parse(text=paste(y,'.lm.tipos', '<- elecregtipos(',y,')',sep='')))
 }

 At the moment I'm getting the error:
 Error in `[.data.frame`(x, , y) : object 'upyd' not found

 If I call simply:
 elecregtipos("upyd")

 it works perfectly. The problem is the loop, column names, etc., but I'm
 really confused what I still could try, because I have already tried any
 possibility.

 Any hint?

 Thanks in advance.


 --
 :: Igor Sosa Mayor :: joseleopoldo1...@gmail.com ::
 :: GnuPG: 0x1C1E2890   :: http://www.gnupg.org/  ::
 :: jabberid: rogorido  ::::



Re: [R] Problem reading mixed CSV file

2012-03-19 Thread Petr PIKAL
Hi
 
 This is quite a CPU-intensive process. My system hung on the
 big file I have.
 
 Within the for loop that you have suggested, can't I have a case
 statement for the different values of nFields, specifying the
 format in which each variable needs to be read?
 Something like:
 case
 # input format for 6 fields
 when nFields == 6
 read.csv as string, string, string, numeric, numeric, numeric into dataframe1
 # input format for 7 fields
 when nFields == 7
 read.csv as string, string, string, string, numeric, numeric, numeric
 into dataframe2
 end case
 # Output the two dataframes via some way of tracking the original line
 numbers of the input file, similar to _N_ in SAS.
 Dataframe1 is to be output as it is, while in dataframe2 the 3rd and
 the 4th strings are concatenated.
 
 Could you please help with the format for the above?

I would follow Jim's suggestion: count the fields with
nFields <- count.fields(fileName, sep = ',')
and read the chunks separately by using scan while adjusting the
skip and nlines parameters. However, if only a few lines
differ, it would be better to correct those few lines manually in
a suitable editor.

Writing an all-purpose function for reading any kind of
corrupted/nonstandard file seems to me worthwhile only if you expect to read
such files many times.

Regards
Petr
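
The "case" idea from the question could be sketched roughly as follows: split the lines by their field count and read each group with its own column classes. This is only an illustration on three made-up lines written to a temporary file; the sample data and the column layout are assumptions, and trimws() needs a reasonably recent R.

```r
# Hypothetical sample: the third line has an extra comma inside the city/state field
lines <- c("A,B,Boston,1968,13,0",
           "A,B,Chicago,1967,44,0",
           "A,B,Raleigh, North Carol,1968,25,0")
tmp <- tempfile(fileext = ".csv")
writeLines(lines, tmp)

input   <- readLines(tmp)
nFields <- count.fields(tmp, sep = ",")

# Read each group of lines with its own column layout
six   <- read.csv(text = input[nFields == 6], header = FALSE,
                  colClasses = c(rep("character", 3), rep("numeric", 3)))
seven <- read.csv(text = input[nFields == 7], header = FALSE,
                  colClasses = c(rep("character", 4), rep("numeric", 3)))

# Keep the original line numbers (similar in spirit to _N_ in SAS)
six$line   <- which(nFields == 6)
seven$line <- which(nFields == 7)

# Merge the 3rd and 4th strings of the 7-field rows back into one field
seven$V3 <- paste(seven$V3, trimws(seven$V4), sep = ", ")
seven$V4 <- NULL
```

Because whole groups of lines are read at once, this avoids looping over individual records, which may help with the speed problem on a large file. Note that count.fields() skips blank lines by default, so the line numbers only match readLines() if the file has none.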


 
 
 
 On Sat, Mar 17, 2012 at 4:54 AM, jim holtman jholt...@gmail.com wrote:
  Here is a solution that looks for the line with 7 elements and inserts
  the quotes:
 
 
  fileName <- '/temp/text.txt'
  input <- readLines(fileName)
  # count the fields to find 7
  nFields <- count.fields(fileName, sep = ',')
  # now fix the data
  for (i in which(nFields == 7)){
      # split on comma
      z <- strsplit(input[i], ',')[[1]]
      input[i] <- paste(z[1], z[2]
          , paste('"', z[3], ',', z[4], '"', sep = '') # put on quotes
          , z[5], z[6], z[7], sep = ','
          )
  }

  # now read in the data
  result <- read.table(textConnection(input), sep = ',')
 
  result
  V1   V2   V3   V4 V5 V6
  1 1968 21  0
  2  Boston 1968 13  0
  3  Boston 1968 18  0
  4 Chicago 1967 44  0
  5  Providence 1968 17  0
  6  Providence 1969 48  0
  7   Binky 1968 24  0
  8 Chicago 1968 23  0
  9   Dally 1968  7  0
  10   Raleigh, North Carol 1968 25  0
  11 Addy ABC-Dogs Stars-W8.1Providence 1968 38  0
  12  DEF_REQPRF/ Dartmouth 1967 31  1
  13   PL   1967 38  1
  14   XY PopatLal  1967  5  1
  15   XY PopatLal  1967  6  8
  16   XY PopatLal  1967  7  7
  17   XY PopatLal  1967  9  1
  18   XY PopatLal  1967 10  1
  19   XY PopatLal  1967 13  1
  20   XY PopatLal   Boston 1967  6  1
  21   XY PopatLal   Boston 1967  7 11
  22   XY PopatLal   Boston 1967  9  2
  23   XY PopatLal   Boston 1967 10  3
  24   XY PopatLal   Boston 1967  7  2
 
 
 
  On Fri, Mar 16, 2012 at 2:17 PM, Ashish Agarwal
  ashish.agarw...@gmail.com wrote:
  I have a file of 5000 records, and editing that file is not easy.
  Is there any way to read line 10 differently to account for changes in the
  third field?
 
  On Fri, Mar 16, 2012 at 11:35 PM, Peter Ehlers ehl...@ucalgary.ca 
wrote:
  On 2012-03-16 10:48, Ashish Agarwal wrote:
 
  Line 10 has City and State, and those are also separated by a comma. How
  can I read line 10 differently from the other lines?
 
 
  Edit the file and put quotes around the city-state combination:
   "Raleigh, North Carol"
 
 
 
 
 
  --
  Jim Holtman
  Data Munger Guru
 
  What is the problem that you are trying to solve?
  Tell me what you want to do, not how you want to do it.
 

Re: [R] Help with dlply, loop and column names

2012-03-19 Thread Igor Sosa Mayor
Peter:

many thanks for your help. This is basically what I wanted to do and in
a much more elegant way.



On Mon, Mar 19, 2012 at 03:13:40AM -0700, Peter Meilstrup wrote:
 I'm not sure I follow exactly what group of regression models you want to
 create, but a good first step might be to use reshape so that each party's
 vote share goes on a different row and the vote shares are all in the same
 column. Then you can use plyr grouping on tipo and party to make your
 models...
 
 library(reshape2)
 library(plyr)
 
 ast <- melt(asturias.gen2011, id=c("municipio", "total", "tipo"),
 variable.name="party", value.name="vote")
 
 dlply(ast, .(party, tipo), lm, formula=vote~total)
 
 or along those lines. This way you don't have to mess around with pasting
 together expressions to eval and so on...
 
 Peter
 

-- 
:: Igor Sosa Mayor :: joseleopoldo1...@gmail.com ::
:: GnuPG: 0x1C1E2890   :: http://www.gnupg.org/  ::
:: jabberid: rogorido  ::::




[R] plot method for rasters and layout

2012-03-19 Thread Olivier Eterradossi
Hi list,

I thought I was used to layouts, but today I am facing a problem I cannot
overcome:

On my R installation (Windows 7 Pro, SP1, R version 2.13.0, daily update of
packages), I am not able to put raster plots in user-defined layouts:

 layout.matrix <- matrix(c(1,2,3,4,5,5),2,3)
 layout(mat=layout.matrix)
 layout.show(5)

works fine, I get the correct frames in the correct place. But using 5
graphs (that all plot OK if plotted alone):

 plot(raster1)
 plot(raster2)
 plot(raster3)
 plot(raster4)
 plot(any.other.graph.meant.to.be.in.frame.5)

gives the same layout as:

 par(mfrow=c(2,3))
 plot(raster1)
 plot(raster2)
 plot(raster3)
 plot(raster4)
 plot(any.other.graph.supposed.to.fall.in.frame.5)

i.e. 3 raster plots on the first row followed by the fourth raster and the
fifth graph, all of the same size, the [2,3] frame being empty.

I suppose this is due to a conflict between layout and the bigplot/smallplot
approach used by the imageplot() function, from which the plot method for
rasters is said to be inspired. But I am not sure and I cannot work it out.

Am I missing something, and can anybody help?

All the best to all of you, thanks as always for all the work done here!
Olivier

 

--
Olivier ETERRADOSSI
Maître-Assistant, HDR
Ecole des Mines d’Alès (CMGD, site de Pau)
Pôle Matériaux Polymères Avancés (MPA)
Hélioparc, 2 av. P. Angot, F-64053 PAU CEDEX 9
Tel : 05 59 30 90 35 (direct) - 05 59 30  54 25 (std)
Fax : 05 59 30 63 68




[R] Call for chapters: Data Mining Applications with R

2012-03-19 Thread Yanchang Zhao
Book title: Data Mining Applications with R

URL: http://www.rdatamining.com/books/book2.

Publisher: Elsevier

Chapter proposal due date: 30 April 2012


Introduction
------------
R is one of the most widely used data mining tools in scientific and
business applications, among dozens of commercial and open-source data
mining software. It is free and expandable with over 3,600 packages.
However, it is not easy for beginners to find appropriate packages or
functions to use for their data mining tasks. It is more difficult,
even for experienced users, to work out the optimal combination of
multiple packages or functions to solve their business problems and
the best way to use them in the data mining process of their
applications. This book aims to facilitate using R in data mining
applications by presenting real-world applications in various areas.


Objective
---------
This book will present around 20 applications of data mining with R.
Each application is to be presented as one chapter, covering its
background, business problems, data extraction and exploration, data
preprocessing, modeling, model evaluation, findings and model
deployment. In this way, it will help readers learn to solve
real-world problems with a set of data mining techniques and then
apply the techniques and methodologies in their own data mining
projects. Code examples and sample data will be provided, so that
readers can easily learn the techniques by running the code
themselves.


Target audience
---------------
The audience includes data miners, analysts and R users from industry,
and university students and researchers who are interested in data
mining with R.


Topics
------
Data mining applications with R in, but not limited to, the following areas:
* Finance
* Retail
* Insurance
* Telecommunications
* Government
* Crime & Homeland Security
* Stock Market
* Social Welfare
* Social Media
* Sports
* Medicine and Health
* Education
* Patent
* Transport
* Real Estate
* Meteorology
* Bioinformatics
* Sentiment Analysis
* Spatial Data Analysis
* Scientific Computing


Submission procedure
--------------------
Data miners and analysts are invited to submit by April 30, 2012, a
1-2 page manuscript proposal clearly explaining the mission and
concerns of the proposed chapter. Authors of accepted proposals will
be notified by May 15, 2012 about the status of their proposals. Full
chapters are due by July 31, 2012. All submitted chapters will be
reviewed by 2 or 3 reviewers. Please submit your chapter proposals and
full chapters at
https://www.easychair.org/account/signin.cgi?conf=dmar2013.

Details about the book are available at http://www.rdatamining.com/books/book2.


Book editors and contacts
-------------------------
Dr. Yanchang Zhao
RDataMining.com, Australia
yanchangzhao at gmail dot com

Mr. Yonghua Cen
Univ. of Technology, Sydney, Australia
justin.cen at gmail dot com



Re: [R] plot method for rasters and layout

2012-03-19 Thread Michael Sumner
Is this with SDI in Windows? I'd update to a recent version of R, and
please provide reproducible code next time.

It could be the same as this issue, now long ago fixed:
https://stat.ethz.ch/pipermail/r-devel/2011-February/059906.html

Cheers, Mike.


-- 
Michael Sumner
Institute for Marine and Antarctic Studies, University of Tasmania
Hobart, Australia
e-mail: mdsum...@gmail.com



Re: [R] plot method for rasters and layout

2012-03-19 Thread Olivier Eterradossi
Mike,
It is with SDI in Windows.

Here is reproducible code (by the way, I just added the opening of a raster and
plot it four times).

 library(raster)
 b <- brick(system.file("external/rlogo.grd", package="raster"))
 layout.matrix <- matrix(c(1,2,3,4,5,5),2,3,byrow=TRUE)
 layout(mat=layout.matrix)
 layout.show(5)
 plot(b[[1]])
 plot(b[[1]])
 plot(b[[1]])
 plot(b[[1]])

Running this I get the four R logos in quadrants [1,1:3] and [2,1], not in
[1:2,1:2].

I've read the thread you suggest, but I'm afraid I don't fully understand it.
The last time I updated R was in December 2011 from CRAN, so it seems that my
version is posterior to the fix.
I'll update to 2.14 asap.

Thank you. Olivier
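
One thing that may be worth checking independently of the graphics device: the layout matrix in this example is built with byrow=TRUE, while the one in the first message was not, and the two give different frame positions. The sketch below only compares the two matrices (pos_col and pos_row are hypothetical helper names); it does not rule out the device issue mentioned in the earlier reply.

```r
m_col <- matrix(c(1, 2, 3, 4, 5, 5), 2, 3)                # filled column by column (default)
m_row <- matrix(c(1, 2, 3, 4, 5, 5), 2, 3, byrow = TRUE)  # filled row by row

m_col   # rows: 1 3 5 / 2 4 5
m_row   # rows: 1 2 3 / 4 5 5

# Cell (row, col) that figures 1 to 4 occupy under each layout
pos_col <- t(sapply(1:4, function(k) which(m_col == k, arr.ind = TRUE)[1, ]))
pos_row <- t(sapply(1:4, function(k) which(m_row == k, arr.ind = TRUE)[1, ]))
pos_col
pos_row
```

With the row-wise matrix, figures 1 to 3 go to the first row and figure 4 to cell [2,1], which is the placement reported above; with the column-wise matrix they fill the 2 x 2 block [1:2,1:2].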



Re: [R] assign a value to an element

2012-03-19 Thread John Kane
I am not sure that I understand, but does something like this do what you want?

vec <- 1:10
vec[vec==4] <- 100

vec <- 1:10
vec[ vec==4 | vec==8] <- 100

vec <- 1:10
aa <- 50
vec[vec==4] <- aa


John Kane
Kingston ON Canada
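
If the object name really has to come from a string, as in the original question, one possible alternative to eval(parse()) is to fetch the object with get(), modify the copy, and write it back with assign(). This is a sketch under that assumption; name, idx and tmp are illustrative names.

```r
vec  <- 1:10
name <- "vec"     # name of the object, held as a string
idx  <- 4         # element to change

tmp <- get(name)      # fetch a copy of the object
tmp[idx] <- 100       # modify the element on the copy
assign(name, tmp)     # write the modified copy back under the same name
vec
```

The element index is kept separate from the object name here, rather than encoding both in a string such as "vec[4]".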


 -Original Message-
 From: marc_...@yahoo.fr
 Sent: Sun, 18 Mar 2012 18:24:34 + (GMT)
 To: r-help@r-project.org
 Subject: [R] assign a value to an element
 
 Assign can be used to set a value to a variable whose name is the value
 of another variable. Example:
 
 name <- "essai"
 assign(name, "plouf")
 essai
 [1] "plouf"
 
 OK.
 But how to do the same when it is only an element of a vector, data frame
 and so on that must be changed?
 
 vec <- 1:10
 vec
  [1]  1  2  3  4  5  6  7  8  9 10
 vec[4]
 [1] 4
 name <- "vec[4]"
 assign(name, 100)
 vec
  [1]  1  2  3  4  5  6  7  8  9 10
 
 The reason is probably here (from help of assign):
 "assign does not dispatch assignment methods, so it cannot be used to set
 elements of vectors, names, attributes, etc."
 
 
 I have found this solution:
 eval(parse(text=paste(name, "<-100", sep="")))
 vec
  [1]   1   2   3 100   5   6   7   8   9  10
 
 Is it the only way? It is not very elegant!
 
 Thanks a lot
 
 Marc
 
 __
 Marc Girondot, Pr
 
 Laboratoire Ecologie, Systématique et Evolution
 Equipe de Conservation des Populations et des Communautés
 CNRS, AgroParisTech et Université Paris-Sud 11 , UMR 8079
 Bâtiment 362
 91405 Orsay Cedex, France
 
 Tel:  33 1 (0)1.69.15.72.30   Fax: 33 1 (0)1.69.15.73.53
 e-mail: marc.giron...@u-psud.fr
 Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html




[R] Reshape data frame with dcast and melt

2012-03-19 Thread mails
Hello,

I implemented two functions, reshape_long and reshape_wide (see the full working
example below), to reshape data frames.
I created several small examples and the two functions seemed to work
properly. However, using the reshape_wide function
on my real data sets (about 200,000 to 300,000 rows) failed: all
values for X, Y and Z were set to 1.
The structure of my real data looks exactly the same as the small example
below. After working on it for 2 days I think the
problem is that the primary key (test_name, group_name and id) is only
unique in the wide form. After applying the
reshape_long function the primary key is no longer unique. I was wondering
if anyone can tell me whether the step
from d1 -> reshape_wide -> d2 can work at all because of the non-uniqueness
of d1.



library(reshape2)
library(taRifx)

reshape_long <- function(data, ids) {
    # Bring data into long form
    data_long <- melt(data, id.vars = ids, variable.name="Data_Points",
                      value.name="value")
    data_long$value <- as.numeric(data_long$value)
    # Remove rows where analyte value is NA
    data_long <- data_long[!is.na(data_long$value), ]
    # Resort data
    formula_sort <- as.formula(paste("~", paste(ids, collapse="+")))
    data_long <- sort(data_long, f = formula_sort)
    return(data_long)
}

reshape_wide <- function(data, ids) {
    # Bring data into wide form
    formula_wide <- as.formula(paste(paste(ids, collapse="+"), "~ Data_Points"))
    data_wide <- dcast(data, formula_wide)
    # Resort data
    formula_sort <- as.formula(paste("~", paste(ids, collapse="+")))
    data_wide <- sort(data_wide, f = formula_sort)
    return(data_wide)
}

d <- data.frame(
    test_name = c(rep("Test_A", 6), rep("Test_B", 6)),
    group_name = c(rep("Group_C", 3), rep("Group_D", 3), rep("Group_C", 3),
                   rep("Group_D", 3)),
    id = c("I1", "I2", "I3", "I4", "I5", "I6",
           "I1", "I2", "I3", "I7", "I8", "I9"),
    X = c(NA,NA,1,2,3,4,5,6,NA,7,8,9),
    Y = as.numeric(10:21),
    Z = c(NA,22,23,NA,24,NA,25,26,NA,27,28,29)
)

d

d1 <- reshape_long(d, ids=c("test_name", "group_name", "id"))

d1

d2 <- reshape_wide(d1, ids=c("test_name", "group_name", "id"))

d2

identical(d,d2)
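
A possible explanation for the values of 1: when the id columns plus Data_Points do not identify each row of the long data uniquely, dcast() has to aggregate and, with no aggregation function given, falls back to length, so each cell holds a row count rather than a value. The base-R sketch below shows one way to look for such duplicated keys before casting; the small data frame is made up and the object names long, key and dup are illustrative.

```r
# Hypothetical long-format data with one duplicated key (Test_A / Group_C / I1 / X)
long <- data.frame(
  test_name   = c("Test_A", "Test_A", "Test_A"),
  group_name  = c("Group_C", "Group_C", "Group_C"),
  id          = c("I1", "I1", "I2"),
  Data_Points = c("X", "X", "X"),
  value       = c(1, 2, 3)
)

key <- c("test_name", "group_name", "id", "Data_Points")

# Flag every row whose key occurs more than once (first and later occurrences)
dup <- duplicated(long[, key]) | duplicated(long[, key], fromLast = TRUE)
long[dup, ]
```

If this returns any rows on the real data, the wide form cannot be recovered without either an extra id column that separates the duplicates or an explicit aggregation function.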




[R] Issue with asin()

2012-03-19 Thread Letnichev
Hello everyone,

I have been working for a few days on a basic algorithm, very common in
applied agronomy, that aims to determine the degree-days necessary for a
given individual to reach a given growth stage. The algorithm (and context)
is explained here:  http://www.oardc.ohio-state.edu/gdd/glossary.htm , and
so I implemented my function in R as follows:

DD <- function(Tmin, Tmax, Tseuil, meanT, method = "DDsin")
### function that calculates the degree-days based on
### minimum and maximum recorded temperatures and the
### minimal threshold temperature (lower growth temperature)
{
    ### method arcsin
    if(method == "DDsin"){
        cond1 <- (Tmax <= Tseuil)
        cond2 <- (Tmin >= Tseuil)
        amp <- ((Tmax - Tmin) / 2)
        print((Tseuil-meanT)/amp)
        alpha <- asin((Tseuil - meanT) / amp)
        DD_ifelse3 <- ((1 / pi) * ((meanT - Tseuil) * ((pi/2) - alpha)) +
                       amp*cos(alpha))

        DD <- ifelse(cond1, 0, ifelse(cond2, (meanT - Tseuil), DD_ifelse3))
    }

    ### method (Tmin + Tmax) / 2
    else if(method == "DDt2"){
        cond1 <- (meanT > Tseuil)
        DD <- ifelse(cond1,(meanT - Tseuil),0)
    }

    else{
        stop("\nMethod name is invalid.\nMethods available = DDsin (sinus) or DDt2 (mean)\n")
    }
    return(DD)
}

BUT! When I try to process random data:

library(reshape2)
library(plyr)

station <- rep(c("station1","station2","station3"), 20)
values_min <- sample(-5:20, size = 60, replace = T)
values_max <- sample(20:40, size = 60, replace = T)
meanT <- ((values_min+values_max)/2)
d <- data.frame(station,values_min,values_max,meanT)
names(d) <- c("station", "values_min","values_max","meanT")

x <- ddply(d, .(station), transform, t1 =
cumsum(DD(values_min,values_max,0,meanT)))

I get a warning on my alpha calculation (NaNs produced); indeed, the values I
give as argument to asin() are outside the range [-1, 1], as the print()
reveals. I can't figure out how to solve this issue, because the same
algorithm works in Excel (Visual Basic).
It is very annoying, especially because no occurrence of such an
error with that algorithm seems to be reported on the Internet.
Any help is welcome :) Thanks for your time

P.
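
A likely source of the warning: alpha is computed for every day before ifelse() selects a branch, including days on which the threshold lies outside [Tmin, Tmax], where (Tseuil - meanT) / amp falls outside [-1, 1]. Those NaN values are then discarded by the ifelse() conditions. One way to silence the warning is to clamp the ratio before calling asin(), as in the sketch below. dd_sine is a hypothetical helper, and it places the amplitude term inside the 1/pi factor, as in the usual single-sine formulation; that differs from the posted code and is an assumption about the intended formula.

```r
dd_sine <- function(Tmin, Tmax, Tseuil) {
  meanT <- (Tmin + Tmax) / 2
  amp   <- (Tmax - Tmin) / 2
  # Clamp to [-1, 1] so that asin() is never called outside its domain;
  # the clamped values belong to days handled by the other two branches
  ratio <- pmax(-1, pmin(1, (Tseuil - meanT) / amp))
  alpha <- asin(ratio)
  sine_part <- (1 / pi) * ((meanT - Tseuil) * (pi / 2 - alpha) + amp * cos(alpha))
  ifelse(Tmax <= Tseuil, 0,
         ifelse(Tmin >= Tseuil, meanT - Tseuil, sine_part))
}

# Threshold below the range, above the range, and inside the range
dd_sine(Tmin = c(10, -5, -10), Tmax = c(20, -1, 10), Tseuil = 0)
```

A spreadsheet implementation typically evaluates the formula only in the cell where the third case applies, which may be why the same algorithm raises no error there.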



[R] Problem with package tensor

2012-03-19 Thread Peppe Ricci
Hi,

I'm using R to create multidimensional data, i.e. tensors. For my
work, R is very good at importing the data, and I have seen that there are
packages to manage tensors and to factorize them.
I would like to ask for help regarding the packages called tensor and tensorA.
Unfortunately, the supporting material is quite sparse
and it did not help me much.
Let me briefly explain my situation. I have some data arrays of different
sizes; they are matrices of large dimensions. From these I would like to create
a tensor. Could someone tell me how to
do this? Can you give me an example that shows how to build it?
Thank you.
giuseppe.
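
As a starting point that does not depend on either package, a three-way array can be built in base R by stacking matrices as slices. The sketch below assumes two made-up matrices of the same size; matrices of different sizes would first have to be brought to a common size (for example by padding with NA), which is not shown here.

```r
# Two hypothetical matrices of the same size (made-up values)
A <- matrix(1:6,  nrow = 2, ncol = 3)
B <- matrix(7:12, nrow = 2, ncol = 3)

# Stack them as slices of a 2 x 3 x 2 array
tens <- array(c(A, B), dim = c(nrow(A), ncol(A), 2))

dim(tens)
tens[, , 2]   # second slice, i.e. B
```

An ordinary array like this is a reasonable base object to explore before turning to the tensor packages.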



Re: [R] plotting border over map

2012-03-19 Thread uday
Hi Ray,
Thanks for the reply.
/R/PlotGridded2DMap.R
/R/image.plot.fix.R
/R/image.plot.plt.fix.r
are functions that I wrote for plotting, and they work with other
data; I only have an issue with the code that I provided
before.
And what do you mean by saying that I am redefining the map() function in there?




[R] what is p,d q in arima() function of time series

2012-03-19 Thread sagarnikam123
I am new to time series.
In the help for arima I found:
arima(x = data, order = c(p, d, q))


What exactly are p, d and q? If I do not change them, what effect does that have?
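
In stats::arima(), order = c(p, d, q) gives the autoregressive order p, the degree of differencing d and the moving-average order q; the default is c(0, 0, 0), which fits only a mean. The sketch below illustrates this on lh, a small example series shipped with R; the particular orders are arbitrary choices for illustration, not a recommendation for any data set.

```r
# lh is a small example series from the datasets package
fit_default <- arima(lh)                      # order = c(0, 0, 0): mean only
fit_ar1     <- arima(lh, order = c(1, 0, 0))  # p = 1: one autoregressive term
fit_diff    <- arima(lh, order = c(0, 1, 1))  # d = 1, q = 1: difference once, one MA term

coef(fit_default)
coef(fit_ar1)
coef(fit_diff)
```

The names of the fitted coefficients (intercept, ar1, ma1) show which terms each choice of order adds to the model.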



[R] How to cluster/classify following time series?

2012-03-19 Thread sagarnikam123
How can I cluster/classify the attached time series? (Each column is one time
series and should be treated as a single unit when clustering/classifying.)
If my concept is wrong, tell me how to extract the time series with the highest
information content.


The data file is here:
http://r.789695.n4.nabble.com/file/n4484173/rasta.txt rasta.txt 



[R] Save File after order

2012-03-19 Thread MSousa
Hello,

    I'm trying to write the sorted data of a data.frame to a file. My
problem is that when I write the file, a new column of row names is added,
which apparently is the original position in the data frame.
    I want to write the file without this column.

x <- data.frame(name="x1",Time=20)
x <- rbind(x,data.frame(name="x2",Time=25))
x <- rbind(x,data.frame(name="x3",Time=23))
x <- rbind(x,data.frame(name="x2",Time=45))
x <- rbind(x,data.frame(name="x1",Time=25))
x <- rbind(x,data.frame(name="x1",Time=55))

x <- x[order(x$name),]
View(x)
write.csv(data.frame(x$name,x$Time), file = "~/Desktop/DatasetOrder.csv")
At the moment this saves:
  name Time
1   x1   20
5   x1   25
6   x1   55
2   x2   25
4   x2   45
3   x3   23

The idea is to save:
name Time
x1   20
x1   25
x1   55
x2   25
x2   45
x3   23

Thanks
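
write.csv() has a row.names argument that controls whether the row-name column is written. A minimal sketch, using a temporary file instead of the Desktop path from the question:

```r
x <- data.frame(name = c("x1", "x2", "x3", "x2", "x1", "x1"),
                Time = c(20, 25, 23, 45, 25, 55))
x <- x[order(x$name), ]

out <- tempfile(fileext = ".csv")   # stand-in for the real output path
write.csv(x, file = out, row.names = FALSE)

readLines(out)   # header line followed by six data lines, no row-name column
```

Passing x directly also keeps the original column names, which data.frame(x$name, x$Time) would change to x.name and x.Time.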



[R] fitted values with locfit

2012-03-19 Thread Soberon Velez, Alexandra Pilar
Dear list members,

I'm trying to estimate the following multivariate local regression model using
the locfit package:

BMI=m1(RCC)+m2(WCC)

where (m1) and (m2) are unknown smooth functions.

My problem is that once the regression is done I cannot get the fitted
values of each of these smooth functions (m1) and (m2). What I write is the
following:

library(locfit)

data(ais)
fit2 <- locfit.raw(x=lp(ais$RCC,h=0.5,deg=1)+lp(ais$WCC,deg=1,h=0.75),y=ais$BMI,ev=dat(),kt="prod",kern="gauss")
g21 <- predict(fit2,type="terms")

When I run this, the result (g21) is a vector, whereas I expected
a matrix with 2 columns (one for each fitted smooth function).

Does anybody know how I can get the estimated fitted values of both smooth
functions (m1) and (m2) using a local linear regression with kernel weights, as in
this example?

Thanks a lot in advance; I'm getting quite desperate.

Alexandra




Re: [R] a very simple question

2012-03-19 Thread Dajiang Liu

Thanks a lot for the clarification. I just find it very bizarre that if you run 
a=0.1*(1:9);which(a==0.4) 
it returns the right answer. Anyway, I will pay attention next time. Thanks a 
lot. 
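
The difference between the two cases comes from binary floating-point rounding: 0.1 * 3 is not stored as exactly the same double as 0.3, while 0.1 * 4 happens to coincide with 0.4. A short sketch of a tolerance-based comparison, where the tolerance of 1e-8 is an arbitrary choice for illustration:

```r
a <- 0.1 * (1:9)

a[3] == 0.3   # exact comparison of doubles
a[4] == 0.4

# Compare within a small tolerance instead of testing exact equality
which(abs(a - 0.3) < 1e-8)

# all.equal() also applies a tolerance; wrap it in isTRUE() to get a plain logical
isTRUE(all.equal(a[3], 0.3))
```

Whether an exact comparison succeeds depends on the particular values, so it is safer not to rely on == for non-integer doubles.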

 Date: Mon, 19 Mar 2012 08:59:59 +0100
 From: rainer.schuerm...@gmx.net
 Subject: Re: [R] a very simple question
 To: ldjst...@hotmail.com; r-help@r-project.org
 
 As to the reasons, David has given you the necessary hints.
 
 In order to get around the issue, here is what I do:
 
  a <- round( 0.1 * ( 1:9 ), 1 )
  a
 [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
  which( a == 0.3 )
 [1] 3
 
 Rgds,
 Rainer
 
 
  Original Message
  Date: Sun, 18 Mar 2012 21:43:54 +
  From: Dajiang Liu ldjst...@hotmail.com
  To: r-help@r-project.org
  Subject: [R] a very simple question
 
  
  Dear All,
  I have a seemingly very simple question, but I just cannot figure out the
  answer. I attempted to run the following:a=0.1*(1:9);which(a==0.3);it
  returns integer(0). But obviously, the third element of a is equal to 0.3. 
  I must have missed something. Can someone kindly explain why? Thanks a
  lot.
  Regards,Dajiang

  [[alternative HTML version deleted]]
  
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 -- 
 ---
 
 Gentoo Linux with KDE
 
  


[R] regression with proportion data

2012-03-19 Thread Georgiana May
Hello,
I want to determine the regression relationship between a proportion (y)
and a continuous variable (x).
Reading a number of sources (e.g. The R Book, Quick R,help), I believe I
should be able to designate the model as:

model <- glm(formula = proportion ~ x, family = binomial(link = "logit"))

This runs but gives me a warning:
Warning message:
In eval(expr, envir, enclos) : non-integer #successes in a binomial glm!

If I transform the proportion variable with log, it doesn't like that
either (values not: 0 <= y <= 1)

I understand that the binomial function concerns successes vs. failures and
can use those raw data, but the R Book and other sources seem to suggest
that proportion data are usable as well.  Not so?

Thank you,
Georgiana May



[R] Dotplot: how to change size in the y lab ?

2012-03-19 Thread Jose Bustos Melo
Hi everyone,

I'm trying to reduce the font size on the y axis in this plot:

http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=150

Does anyone know how to do it?
I have checked the arguments lab.cex and cex, but neither of these works!

If you want to check, use this code:



### read the data
d <- read.csv(file("http://addictedtor.free.fr/graphiques/data/150/data.txt"))
### workaround so that lattice does not order bank names alphabetically
d$bank <- ordered(d$bank, levels = d$bank)
### load lattice and grid
require(lattice)
require(grid)
### setup the key
k <- simpleKey(c("Q2 2007", "January 20th 2009"))
k$points$fill <- c("lightblue", "lightgreen")
k$points$pch <- 21
k$points$col <- "black"
k$points$cex <- 1
### create the plot
dotplot(bank ~ MV2007 + MV2009, data = d, horiz = T,
        par.settings = list(
            superpose.symbol = list(pch = 21, fill = c("lightblue", "lightgreen"),
                                    cex = 4, col = "black")),
        xlab = "Market value ($Bn)", key = k,
        panel = function(x, y, ...) {
            panel.dotplot(x, y, ...)
            grid.text(unit(x, "native"), unit(y, "native"),
                      label = x, gp = gpar(cex = .7))
        })
Thank you in advance!
José
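
One possible way to shrink the y-axis tick labels is lattice's scales
argument. The sketch below is only an illustration under that assumption:
the data frame toy is made up and stands in for the bank data, and the
value 0.6 is an arbitrary choice.

```r
library(lattice)

# Hypothetical stand-in data; the real example reads bank names and market values
toy <- data.frame(
  bank   = factor(c("Bank A", "Bank B", "Bank C")),
  MV2007 = c(120, 80, 60),
  MV2009 = c(35, 20, 10)
)

# scales = list(y = list(cex = ...)) sets the size of the y-axis tick labels
p <- dotplot(bank ~ MV2007 + MV2009, data = toy,
             scales = list(y = list(cex = 0.6)),
             xlab = "Market value ($Bn)")
print(p)
```

The same scales component should be usable in the full call with key and
panel arguments, since it only affects how the axis is annotated.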



Re: [R] Issue with asin()

2012-03-19 Thread Duncan Murdoch

On 12-03-19 7:42 AM, Letnichev wrote:

Hello everyone,

I have been working for a few days on a basic algorithm, very common in
applied agronomy, that aims to determine the degree-days necessary for a
given individual to reach a given growth stage. The algorithm (and context)
is explained here:  http://www.oardc.ohio-state.edu/gdd/glossary.htm , and
so I implemented my function in R as follows:

DD <- function(Tmin, Tmax, Tseuil, meanT, method = "DDsin")
### function that calculates the degree-days based on
### minimum and maximum recorded temperatures and the
### minimal threshold temperature (lower growth temperature)
{
    ### method arcsin
    if(method == "DDsin"){
        cond1 <- (Tmax <= Tseuil)
        cond2 <- (Tmin >= Tseuil)


These look like useful diagnostics of out-of-range values, but you don't 
use them before the arcsin transformation.



        amp <- ((Tmax - Tmin) / 2)
        print((Tseuil - meanT) / amp)
        alpha <- asin((Tseuil - meanT) / amp)




        DD_ifelse3 <- ((1 / pi) * ((meanT - Tseuil) * ((pi/2) - alpha)) +
                       amp*cos(alpha))

        DD <- ifelse(cond1, 0, ifelse(cond2, (meanT - Tseuil),
                                      DD_ifelse3))
    }

    ### method (Tmin + Tmax) / 2
    else if(method == "DDt2"){
        cond1 <- (meanT > Tseuil)
        DD <- ifelse(cond1, (meanT - Tseuil), 0)
    }

    else{
        stop("\nMethod name is invalid.\nMethods available = DDsin (sinus) or DDt2 (mean)\n")
    }
    return(DD)
}

BUT! When I try to process random data:


It's a good idea to use set.seed when trying to debug problems like 
this.  Then you can construct a reproducible example.  I'd also suggest 
getting rid of ddply at least for debugging; it makes it harder to see 
what's going on.


Duncan Murdoch




library(reshape2)
library(plyr)

station <- rep(c("station1", "station2", "station3"), 20)
values_min <- sample(-5:20, size = 60, replace = T)
values_max <- sample(20:40, size = 60, replace = T)
meanT <- ((values_min + values_max) / 2)
d <- data.frame(station, values_min, values_max, meanT)
names(d) <- c("station", "values_min", "values_max", "meanT")

x <- ddply(d, .(station), transform, t1 =
           cumsum(DD(values_min, values_max, 0, meanT)))

I get a warning on my alpha calculation (NaNs produced); indeed, the values I
give as argument to asin() are out of the range [-1, 1], as the print()
reveals. I can't figure out how to solve this issue, because the same
algorithm works in Excel (Visual Basic).
It is very annoying, especially because it seems that no occurrence of such
an error using that algorithm can be found on the Internet.
Any help is welcome :) Thanks for your time

P.

--
View this message in context: 
http://r.789695.n4.nabble.com/Issue-with-asin-tp4484462p4484462.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] regression with proportion data

2012-03-19 Thread Doran, Harold
The logit link requires a binary response variable, not a proportion. Better 
bet is a beta regression. You can also do some stuff with linear regression if 
you do some transformations, but linear regression assumes the outcome is any 
number on the real number line bounded between -Inf and Inf. 

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
 Behalf Of Georgiana May
 Sent: Monday, March 19, 2012 10:06 AM
 To: r-help@r-project.org
 Subject: [R] regression with proportion data
 
 Hello,
 I want to determine the regression relationship between a proportion (y)
 and a continuous variable (x).
 Reading a number of sources (e.g. The R Book, Quick R,help), I believe I
 should be able to designate the model as:
 
 model <- glm(formula = proportion ~ x, family = binomial(link = "logit"))
 
 This runs but gives me a warning:
 Warning message:
 In eval(expr, envir, enclos) : non-integer #successes in a binomial glm!
 
 If I transform the proportion variable with log, it doesn't like that
 either (values not: 0 <= y <= 1)
 
 I understand that the binomial function concerns successes vs. failures and
 can use those raw data, but the R Book and other sources seem to suggest
 that proportion data are usable as well.  Not so?
 
 Thank you,
 Georgiana May
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



Re: [R] Issue with asin()

2012-03-19 Thread Berend Hasselman

On 19-03-2012, at 12:42, Letnichev wrote:

 Hello everyone,
 
 I am working for a few days already on a basic algorithm, very common in
 applied agronomy, that aims to determine the degree-days necessary for a
 given individual to reach a given growth stade. The algorithm (and context)
 is explained here:  http://www.oardc.ohio-state.edu/gdd/glossary.htm , and
 so I implemented my function in R as follows:
 
 DD <- function(Tmin, Tmax, Tseuil, meanT, method = "DDsin")
 ### function that calculates the degree-days based on
 ### minimum and maximum recorded temperatures and the
 ### minimal threshold temperature (lower growth temperature)
 {
     ### method arcsin
     if(method == "DDsin"){
         cond1 <- (Tmax <= Tseuil)
         cond2 <- (Tmin >= Tseuil)
         amp <- ((Tmax - Tmin) / 2)
         print((Tseuil - meanT) / amp)
         alpha <- asin((Tseuil - meanT) / amp)
         DD_ifelse3 <- ((1 / pi) * ((meanT - Tseuil) * ((pi/2) - alpha)) +
                        amp*cos(alpha))

         DD <- ifelse(cond1, 0, ifelse(cond2, (meanT - Tseuil),
                                       DD_ifelse3))
     }

     ### method (Tmin + Tmax) / 2
     else if(method == "DDt2"){
         cond1 <- (meanT > Tseuil)
         DD <- ifelse(cond1, (meanT - Tseuil), 0)
     }

     else{
         stop("\nMethod name is invalid.\nMethods available = DDsin (sinus) or DDt2 (mean)\n")
     }
     return(DD)
 }
 
 BUT! When I try to process random data:
 
 library(reshape2)
 library(plyr)
 
 station <- rep(c("station1", "station2", "station3"), 20)
 values_min <- sample(-5:20, size = 60, replace = T)
 values_max <- sample(20:40, size = 60, replace = T)
 meanT <- ((values_min + values_max) / 2)
 d <- data.frame(station, values_min, values_max, meanT)
 names(d) <- c("station", "values_min", "values_max", "meanT")
 
 x <- ddply(d, .(station), transform, t1 =
            cumsum(DD(values_min, values_max, 0, meanT)))
 
 I get a warning on my alpha calculation (NaN produced); indeed, the values I
 give as argument to asin() are out of the range [-1:1], as the print()
 reveals. I can't figure out how to solve this issue, because the same
 algorithm works in Excel (visual basic).

That doesn't mean that Excel and/or Visual Basic gives correct answers.

With the same input?
Then what does Excel say that asin(-7.4) evaluates to?
I tried asin(-1.2) and asin(-7.4)  in LibreOffice Calc (3.5.0) and got #VALUE! 
(Error: wrong data type) twice.

You'll have to present correct input to asin()  if you want to avoid the NaN's.
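
A small sketch of what "correct input" can look like in practice: compute
asin() only for ratios inside [-1, 1] and leave the rest undefined. The
vector theta below is made-up example input, not the poster's data.

```r
# Ratios outside [-1, 1] have no real arcsine, so asin() would return NaN
# with a warning for the first two values
theta <- c(-7.4, -1.2, -0.5, 0.25)
in_range <- abs(theta) <= 1

# Compute asin() only where it is defined; leave the rest as NA
alpha <- rep(NA_real_, length(theta))
alpha[in_range] <- asin(theta[in_range])
alpha
```

Subsetting first, then calling asin(), avoids the warning entirely, whereas
ifelse() evaluates its branches for all elements before choosing among them.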

Berend

 It is very annoying, especially because it seems that no occurence of such
 error using that algorithm can be found on Internet.
 Any help is welcome :) Thanks for your time
 
 P.
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Issue-with-asin-tp4484462p4484462.html
 Sent from the R help mailing list archive at Nabble.com.
 


Re: [R] a very simple question

2012-03-19 Thread Berend Hasselman

On 19-03-2012, at 13:47, Dajiang Liu wrote:

 
 Thanks a lot for the clarification. I just find it very bizarre that if you 
 run a=0.1*(1:9);which(a==0.4) 
 it returns the right answer. Anyway, I will pay attention next time. Thanks a 
 lot. 
 


Look at

 a = 0.1*(1:4)
  a - 0.4
[1] -0.3 -0.2 -0.1  0.0
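
The default printing hides the difference because only about 7 significant
digits are shown. A quick sketch for looking at the stored values with more
digits (the choice of 17 decimals is just an assumption that is enough to
tell two doubles apart):

```r
a <- 0.1 * (1:9)

# Print more digits than the default to see the stored values
sprintf("%.17f", a[3])   # the product 0.1 * 3
sprintf("%.17f", 0.3)    # the literal 0.3
sprintf("%.17f", a[4])   # the product 0.1 * 4
sprintf("%.17f", 0.4)    # the literal 0.4

a[3] == 0.3   # FALSE: product and literal round to different doubles
a[4] == 0.4   # TRUE: here they happen to round to the same double
```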
 


Berend

 Date: Mon, 19 Mar 2012 08:59:59 +0100
 From: rainer.schuerm...@gmx.net
 Subject: Re: [R] a very simple question
 To: ldjst...@hotmail.com; r-help@r-project.org
 
 As to the reasons, David has given you the necessary hints.
 
 In order to get around the issue, here is what I do:
 
 a <- round( 0.1 * ( 1:9 ), 1 )
 a
 [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
 which( a == 0.3 )
 [1] 3
 
 Rgds,
 Rainer
 
 
  Original Message
 Date: Sun, 18 Mar 2012 21:43:54 +
 From: Dajiang Liu ldjst...@hotmail.com
 To: r-help@r-project.org
 Subject: [R] a very simple question
 
 
 Dear All,
 I have a seemingly very simple question, but I just cannot figure out the
 answer. I attempted to run the following:a=0.1*(1:9);which(a==0.3);it
 returns integer(0). But obviously, the third element of a is equal to 0.3. 
 I must have missed something. Can someone kindly explain why? Thanks a
 lot.
 Regards,Dajiang
   
 [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 -- 
 ---
 
 Gentoo Linux with KDE
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



Re: [R] regression with proportion data

2012-03-19 Thread Rubén Roa
Your response variable is not binomial, it's a proportion.
Try the betareg function in the betareg package, which more correctly assumes 
that your response variable is Beta distributed (but beware that 1 and 0 are 
not allowed). The syntax is the same as in a glm.

HTH

Ruben

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Georgiana May
Sent: Monday, 19 March 2012 15:06
To: r-help@r-project.org
Subject: [R] regression with proportion data

Hello,
I want to determine the regression relationship between a proportion (y) and a 
continuous variable (x).
Reading a number of sources (e.g. The R Book, Quick R,help), I believe I should 
be able to designate the model as:

model <- glm(formula = proportion ~ x, family = binomial(link = "logit"))

This runs but gives me a warning:
Warning message:
In eval(expr, envir, enclos) : non-integer #successes in a binomial glm!

If I transform the proportion variable with log, it doesn't like that either 
(values not: 0 <= y <= 1)

I understand that the binomial function concerns successes vs. failures and can 
use those raw data, but the R Book and other sources seem to suggest that 
proportion data are usable as well.  Not so?

Thank you,
Georgiana May



Re: [R] regression with proportion data

2012-03-19 Thread Jorge I Velez
Hi Georgiana,

Take a look at the betareg package at
http://cran.r-project.org/web/packages/betareg/index.html

HTH,
Jorge.-


On Mon, Mar 19, 2012 at 10:05 AM, Georgiana May  wrote:

 Hello,
 I want to determine the regression relationship between a proportion (y)
 and a continuous variable (x).
 Reading a number of sources (e.g. The R Book, Quick R,help), I believe I
 should be able to designate the model as:

 model <- glm(formula = proportion ~ x, family = binomial(link = "logit"))

 This runs but gives me a warning:
 Warning message:
 In eval(expr, envir, enclos) : non-integer #successes in a binomial glm!

 If I transform the proportion variable with log, it doesn't like that
 either (values not: 0 <= y <= 1)

 I understand that the binomial function concerns successes vs. failures and
 can use those raw data, but the R Book and other sources seem to suggest
 that proportion data are usable as well.  Not so?

 Thank you,
 Georgiana May



Re: [R] Issue with asin()

2012-03-19 Thread Sarah Goslee
Hi,

You're not following the algorithm as given. The asin step shouldn't
be done for all values, but only for the ones that don't meet the
previous conditions. You're trying to calculate that step for ALL
values, then only use certain ones. You must instead subset the
values, THEN calculate that step. I would guess that your working
Excel version does follow the correct algorithm, but it's hard to know
for certain.

Here's a version that more closely follows the given reference:

MaxDailyTemp <- values_max
MinDailyTemp <- values_min
k <- 0

GDD <- rep(0, length(MinDailyTemp))
AvgDailyTemp <- (MaxDailyTemp + MinDailyTemp)/2

# if MaxDailyTemp < k
# GDD = GDD + 0
# - add 0

# if MaxDailyTemp > k & MinDailyTemp > k
# GDD = GDD + AvgDailyTemp - k
above <- MaxDailyTemp > k & MinDailyTemp > k
GDD[above] <- AvgDailyTemp[above] - k

# if MaxDailyTemp > k & MinDailyTemp < k
# GDD = GDD + (1/pi) * [ (AvgDailyTemp - k) * ( ( pi/2 ) - arcsine(
# theta ) ) + ( a * cos( arcsine( theta ) ) ) ]
a <- (MaxDailyTemp - MinDailyTemp)/2
theta <- ((k - AvgDailyTemp)/a)
straddle <- MaxDailyTemp > k & MinDailyTemp < k
GDD[straddle] <- (1/pi) * (
    (AvgDailyTemp[straddle] - k) * ( ( pi/2 ) - asin( theta[straddle] ) ) +
    ( a[straddle] * cos( asin( theta[straddle] ) ) ) )

sum(GDD)

Sarah

On Mon, Mar 19, 2012 at 7:42 AM, Letnichev chatelain.p...@gmail.com wrote:
 Hello everyone,

 I am working for a few days already on a basic algorithm, very common in
 applied agronomy, that aims to determine the degree-days necessary for a
 given individual to reach a given growth stade. The algorithm (and context)
 is explained here:  http://www.oardc.ohio-state.edu/gdd/glossary.htm , and
 so I implemented my function in R as follows:

 DD <- function(Tmin, Tmax, Tseuil, meanT, method = "DDsin")
 ### function that calculates the degree-days based on
 ### minimum and maximum recorded temperatures and the
 ### minimal threshold temperature (lower growth temperature)
 {
     ### method arcsin
     if(method == "DDsin"){
         cond1 <- (Tmax <= Tseuil)
         cond2 <- (Tmin >= Tseuil)
         amp <- ((Tmax - Tmin) / 2)
         print((Tseuil - meanT) / amp)
         alpha <- asin((Tseuil - meanT) / amp)
         DD_ifelse3 <- ((1 / pi) * ((meanT - Tseuil) * ((pi/2) - alpha)) +
                        amp*cos(alpha))

         DD <- ifelse(cond1, 0, ifelse(cond2, (meanT - Tseuil),
                                       DD_ifelse3))
     }

     ### method (Tmin + Tmax) / 2
     else if(method == "DDt2"){
         cond1 <- (meanT > Tseuil)
         DD <- ifelse(cond1, (meanT - Tseuil), 0)
     }

     else{
         stop("\nMethod name is invalid.\nMethods available = DDsin (sinus) or DDt2 (mean)\n")
     }
     return(DD)
 }

 BUT! When I try to process random data:

 library(reshape2)
 library(plyr)

 station <- rep(c("station1", "station2", "station3"), 20)
 values_min <- sample(-5:20, size = 60, replace = T)
 values_max <- sample(20:40, size = 60, replace = T)
 meanT <- ((values_min + values_max) / 2)
 d <- data.frame(station, values_min, values_max, meanT)
 names(d) <- c("station", "values_min", "values_max", "meanT")

 x <- ddply(d, .(station), transform, t1 =
            cumsum(DD(values_min, values_max, 0, meanT)))

 I get a warning on my alpha calculation (NaN produced); indeed, the values I
 give as argument to asin() are out of the range [-1:1], as the print()
 reveals. I can't figure out how to solve this issue, because the same
 algorithm works in Excel (visual basic).
 It is very annoying, especially because it seems that no occurence of such
 error using that algorithm can be found on Internet.
 Any help is welcome :) Thanks for your time

 P.


-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] Save File after order

2012-03-19 Thread Sarah Goslee
It doesn't have anything to do with your use of order(). Those are the
row names of your data frame. You can disable writing them with the
row.names=FALSE argument to write.table().
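
A minimal sketch of that suggestion, rebuilding the example data in one call.
The tempfile() path is an assumption made here so the sketch does not write
to the Desktop; the same argument works with write.csv() and write.table().

```r
x <- data.frame(name = c("x1", "x2", "x3", "x2", "x1", "x1"),
                Time = c(20, 25, 23, 45, 25, 55))
x <- x[order(x$name), ]

# Write to a temporary file; row.names = FALSE drops the leading row-name column
out <- tempfile(fileext = ".csv")
write.csv(x, file = out, row.names = FALSE)

# Inspect what was written: a header line followed by six data lines
readLines(out)
```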

Sarah

On Mon, Mar 19, 2012 at 8:16 AM, MSousa ricardosousa2...@clix.pt wrote:
 Hello,

     I'm trying to write the sorted data in a file of a data.frame, My
 question and my problem is that when I record in file adds a new column
 row.name, which apparently is the original position in the file.
     I wanted to write to the file without this column

 x <- data.frame(name = "x1", Time = 20)
 x <- rbind(x, data.frame(name = "x2", Time = 25))
 x <- rbind(x, data.frame(name = "x3", Time = 23))
 x <- rbind(x, data.frame(name = "x2", Time = 45))
 x <- rbind(x, data.frame(name = "x1", Time = 25))
 x <- rbind(x, data.frame(name = "x1", Time = 55))

 x <- x[order(x$name), ]
 View(x)
 write.csv(data.frame(x$name, x$Time), file = "~/Desktop/DatasetOrder.csv")
 At the moment this saves
   name Time
 1   x1   20
 5   x1   25
 6   x1   55
 2   x2   25
 4   x2   45
 3   x3   23

 The idea is to save
 name Time
 x1   20
 x1   25
 x1   55
 x2   25
 x2   45
 x3   23

 Thanks


-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] regression with proportion data

2012-03-19 Thread S Ellison
 

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Georgiana May
 Sent: 19 March 2012 14:06
 To: r-help@r-project.org
 Subject: [R] regression with proportion data
 
 I understand that the binomial function concerns successes 
 vs. failures and can use those raw data, but the R Book and 
 other sources seem to suggest that proportion data are usable 
 as well.  Not so?

You _can_ use a two-column matrix with counts of successes and failures in the 
two columns

And if you know what the number n of observations was (which you would need 
anyway for using proportions in a logistic regression) you can calculate that 
matrix from the proportions and n, as long as you're reasonably careful about 
rounding.
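
A sketch of that two-column approach. The data frame dat and all of its
values are hypothetical, made up only to show the mechanics of turning
proportions and n back into counts.

```r
# Hypothetical data: a predictor, the number of trials n, and observed proportions
dat <- data.frame(
  x          = c(1, 2, 3, 4, 5, 6),
  n          = c(20, 20, 25, 25, 30, 30),
  proportion = c(0.10, 0.25, 0.40, 0.48, 0.70, 0.90)
)

# Recover integer counts; round() guards against values like 11.9999999
dat$successes <- round(dat$proportion * dat$n)
dat$failures  <- dat$n - dat$successes

# Two-column response: successes in the first column, failures in the second
fit <- glm(cbind(successes, failures) ~ x,
           family = binomial(link = "logit"), data = dat)
coef(fit)
```

With integer counts on the left-hand side, the "non-integer #successes"
warning should no longer be triggered.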

S Ellison


Re: [R] what is p,d q in arima() function of time series

2012-03-19 Thread R. Michael Weylandt
https://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average

You may also be interested in forecast:::auto.arima
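
In short, order = c(p, d, q) gives the autoregressive order p, the degree of
differencing d, and the moving-average order q. The sketch below only shows
how changing them changes the fitted model; the series lh ships with R, and
the particular orders are arbitrary illustrations, not recommendations.

```r
# lh is a small time series from R's datasets package
fit_ar1  <- arima(lh, order = c(1, 0, 0))  # p = 1: one autoregressive term
fit_arma <- arima(lh, order = c(1, 0, 1))  # p = 1, q = 1: adds a moving-average term
fit_diff <- arima(lh, order = c(0, 1, 1))  # d = 1: models the first differences

# The estimated coefficients reflect the chosen order
names(coef(fit_ar1))
names(coef(fit_arma))
names(coef(fit_diff))

# Models fitted to the same (undifferenced) data can be compared by AIC
AIC(fit_ar1, fit_arma)
```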

Michael

On Mon, Mar 19, 2012 at 6:54 AM, sagarnikam123 sagarnikam...@gmail.com wrote:
 I am new to time series.
 I found this in the help about arima:
 arima(x = data, order = c(p, d, q))


 What exactly are p, d, q? If I do not change them, what effect will that have?

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/what-is-p-d-q-in-arima-function-of-time-series-tp4484368p4484368.html
 Sent from the R help mailing list archive at Nabble.com.



Re: [R] hypergeometric function in ‘ mvtnorm’

2012-03-19 Thread R. Michael Weylandt
To view the source of (most) functions, simply type funcname without
parentheses: here, you get

 dmvt

function (x, delta, sigma, df = 1, log = TRUE, type = "shifted")
{
    if (df == 0)
        return(dmvnorm(x, mean = delta, sigma = sigma, log = log))
    if (is.vector(x)) {
        x <- matrix(x, ncol = length(x))
    }
    if (missing(delta)) {
        delta <- rep(0, length = ncol(x))
    }
    if (missing(sigma)) {
        sigma <- diag(ncol(x))
    }
    if (NCOL(x) != NCOL(sigma)) {
        stop("x and sigma have non-conforming size")
    }
    if (!isSymmetric(sigma, tol = sqrt(.Machine$double.eps),
        check.attributes = FALSE)) {
        stop("sigma must be a symmetric matrix")
    }
    if (length(delta) != NROW(sigma)) {
        stop("mean and sigma have non-conforming size")
    }
    m <- NCOL(sigma)
    distval <- mahalanobis(x, center = delta, cov = sigma)
    logdet <- sum(log(eigen(sigma, symmetric = TRUE, only.values =
        TRUE)$values))
    logretval <- lgamma((m + df)/2) - (lgamma(df/2) + 0.5 * (logdet +
        m * logb(pi * df))) - 0.5 * (df + m) * logb(1 + distval/df)
    if (log)
        return(logretval)
    return(exp(logretval))
}


Most of the functions in here you can see code for the same way: the
only ones you won't be able to are eigen, lgamma, log, exp, but these
methods are pretty well-documented and you shouldn't need to find code
for them. If you do, you'll need to read the underlying C.
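
As a rough illustration of the difference, using only base R and the stats
package: a function written in R carries a body that can be printed, while a
primitive has none because it is implemented in C.

```r
# A closure: its R source can be inspected
is.primitive(mahalanobis)          # FALSE
head(deparse(mahalanobis), 3)      # first lines of its source

# Primitives are implemented in C, so there is no R body to print
is.primitive(exp)                  # TRUE
is.primitive(lgamma)               # TRUE
body(exp)                          # NULL
```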

Michael

On Sun, Mar 18, 2012 at 11:12 PM, statfan irene_vr...@hotmail.com wrote:
 Is there any way to know how the dmvt function computes the hypergeometric
 function needed in the calculation for the density of multivariate t
 distribution?

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/hypergeometric-function-in-mvtnorm-tp4483730p4483730.html
 Sent from the R help mailing list archive at Nabble.com.



[R] Google Summer of Code

2012-03-19 Thread John C Nash
Once again, R has been accepted as an organization for the Google Summer of
Code (2012). We invite students interested in this program to learn more about
it. A good starting point is
http://rwiki.sciviews.org/doku.php?id=developers:projects:gsoc2012. The Google
GSOC home page is http://www.google-melange.com/gsoc/homepage/google/gsoc2012

Workers who could mentor projects are also needed. We aim to have at least two
mentors per student project, based on experiences reported in an article
(starting page 64) in the recent issue of the R-Journal
http://journal.r-project.org/archive/2011-2/RJournal_2011-2.pdf

Those interested in either student or mentor participation should join our
Google list gso...@googlegroups.com as this is how we are communicating. Please
provide a 1 sentence intro to yourself as we have had attempts by spammers to
join the group.

Note that GSOC is about CODING. It is not intended to fund research, but many
activities with R require code to advance our work, so the program can be very
helpful to improving R.

For information, the admins this year are Toby Dylan Hocking and John Nash,
with backups Brian Peterson and Virgilio Gomez.

Happy coding,

John Nash



Re: [R] a very simple question

2012-03-19 Thread Ted Harding
On 19-Mar-2012 Dajiang Liu wrote:
 Thanks a lot for the clarification. I just find it very bizarre
 that if you run
 a=0.1*(1:9);which(a==0.4) 
 it returns the right answer. Anyway, I will pay attention next time.
 Thanks a  lot.

The basic explanation is that, for an integer r (0 < r < 10), what is
stored in binary representation by R for 0.1*r or for 0.r or
for r/10 is always an approximation to the exact value (with the
possible exception of r=5).

The exact detail of the binary representation may depend on how
it was obtained, by any of several different methods of calculation
which, mathematically, are exactly equivalent but, in the binary
representations stored in the computer, may be slightly different.

Examples:

  0.1*(1:9) - (1:9)/10
  # [1] 0.000000e+00 0.000000e+00 5.551115e-17 0.000000e+00
  # [5] 0.000000e+00 1.110223e-16 1.110223e-16 0.000000e+00
  # [9] 0.000000e+00

  0.1*(1:9) - c(0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9)
  # [1] 0.000000e+00 0.000000e+00 5.551115e-17 0.000000e+00
  # [5] 0.000000e+00 1.110223e-16 1.110223e-16 0.000000e+00
  # [9] 0.000000e+00

  (1:9)/10 - c(0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9)
  # [1] 0 0 0 0 0 0 0 0 0

  cumsum(rep(0.3,9))/3 - (1:9)/10
  # [1] -1.387779e-17 -2.775558e-17  0.000000e+00 -5.551115e-17
  # [5]  0.000000e+00  0.000000e+00  1.110223e-16 -1.110223e-16
  # [9] -1.110223e-16

and so on ...

The third example suggests that when R is given a decimal
fraction 0.r it recognises that this is equivalent to r/10
and calculates it accordingly, hence the agreement between
(1:9)/10 and c(0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9). (I would
need to check the source code to verify that statement, however).

The short answer (as has been pointed out) is that you cannot
count on exact agreement, within R (or most other numerical
software), between a value calculated by one numerical method
and the value calculated by another numerical method which is
mathematically equivalent.
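
In practice this means comparing within a tolerance rather than testing for
exact equality. A minimal sketch, where the tolerance tol is an assumption
(the square root of machine epsilon, which is also the default tolerance of
all.equal()):

```r
a <- 0.1 * (1:9)

# Compare within a small tolerance instead of demanding bit-for-bit equality
tol <- sqrt(.Machine$double.eps)   # about 1.5e-8
which(abs(a - 0.3) < tol)

# For a single pair of numbers, all.equal() wrapped in isTRUE() does the same job
isTRUE(all.equal(a[3], 0.3))
```

Wrapping all.equal() in isTRUE() matters because, when the values differ,
all.equal() returns a character description rather than FALSE.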

Some numerical software will work by storing the expression
given to it not as a number but as a sequence of operations
performed on given digits, only evaluating this at the last
moment along with other similar expressions, working within
the scale (e.g. decimal scale for numbers given like 123.456)
thus obtaining maximum accuracy within the allocated storage.
An example is the arbitrary-precision calculator 'bc'.

Many (most?) hand-held digital calculators work to an internal
decimal representation such as BCD (binary-coded decimal)
where each byte is split into two half-bytes of 4 binary
digits, each capable of storing a number from 0 to 9; then
they can perform exact decimal arithmetic (to within the
precision of storage) for decimal numbers, avoiding the
imprecision resulting from conversion to binary (but may
exhibit similar problems to the above for binary input).

Ted.

-
E-Mail: (Ted Harding) ted.hard...@wlandres.net
Date: 19-Mar-2012  Time: 15:02:03
This message was sent by XFMail

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Identifying a change in events between bins

2012-03-19 Thread Robert Baer
Your description was too general for me to know exactly what you want but 
perhaps this will help you solve your own problem

set.seed(123)
evtlist = sample(c('fwd','rev'),100,replace=TRUE)
evtlist
 [1] "fwd" "rev" "fwd" "rev" "rev" "fwd" "rev" "rev" "rev" "fwd" "rev" "fwd" "rev"
[14] "rev" "fwd" "rev" "fwd" "fwd" "fwd" "rev" "rev" "rev" "rev" "rev" "rev" "rev"
[27] "rev" "rev" "fwd" "fwd" "rev" "rev" "rev" "rev" "fwd" "fwd" "rev" "fwd" "fwd"
[40] "fwd" "fwd" "fwd" "fwd" "fwd" "fwd" "fwd" "fwd" "fwd" "fwd" "rev" "fwd" "fwd"
[53] "rev" "fwd" "rev" "fwd" "fwd" "rev" "rev" "fwd" "rev" "fwd" "fwd" "fwd" "rev"
[66] "fwd" "rev" "rev" "rev" "fwd" "rev" "rev" "rev" "fwd" "fwd" "fwd" "fwd" "rev"
[79] "fwd" "fwd" "fwd" "rev" "fwd" "rev" "fwd" "fwd" "rev" "rev" "rev" "fwd" "fwd"
[92] "rev" "fwd" "rev" "fwd" "fwd" "rev" "fwd" "fwd" "rev"

rle(evtlist)

Run Length Encoding
  lengths: int [1:50] 1 1 1 2 1 3 1 1 1 2 ...
  values : chr [1:50] "fwd" "rev" "fwd" "rev" "fwd" "rev" "fwd" "rev" "fwd" ...

rle(evtlist)$lengths
 [1]  1  1  1  2  1  3  1  1  1  2  1  1  3  9  2  4  2  1 12  1  2  1  1  1  2  2  1  1
[29]  3  1  1  3  1  3  4  1  3  1  1  1  2  3  2  1  1  1  2  1  2  1




see ?rle
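
As a rough sketch of how rle() might be applied to binned counts like those in the original post: each bin is labelled by its fraction of forward events and runs of identical labels are then collapsed into segments. The cut points 0.2 and 0.8 and the names d, state and segments are assumptions made for this illustration, not part of the original question.

```r
# Bin counts copied from the table in the original post (bins 50 to 80)
d <- data.frame(
  bin = 50:80,
  fwd = c(484, 366, 527, 635, 573, 506, 600, 560, 504, 545, 501,
          419, 252, 259, 355, 218, 140, 45, 276, 263, 330, 439, 347,
          10, 6, 2, 7, 6, 4, 8, 2),
  rev = c(2, 4, 6, 2, 6, 4, 6, 2, 2, 0, 68,
          223, 109, 138, 189, 125, 57, 31, 144, 152, 193, 204, 207,
          611, 619, 578, 372, 436, 373, 417, 276)
)

# Fraction of forward events in each bin
frac <- d$fwd / (d$fwd + d$rev)

# Hypothetical cut points: above 0.8 is fwd/fwd, at or below 0.2 is rev/rev
state <- cut(frac, breaks = c(-Inf, 0.2, 0.8, Inf),
             labels = c("rev/rev", "fwd/rev", "fwd/fwd"))

# Collapse consecutive bins that share a state into segments
runs   <- rle(as.character(state))
ends   <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1
segments <- data.frame(state     = runs$values,
                       first_bin = d$bin[starts],
                       last_bin  = d$bin[ends])
print(segments)
```

The switch points are the boundaries between consecutive rows of segments. Noisy data would probably need smoothing, or a minimum run length, before the rle() step.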

Rob
-Original Message- 
From: Mark Hills

Sent: Friday, March 16, 2012 3:55 PM
To: r-help@r-project.org
Subject: [R] Identifying a change in events between bins

Hi there,

First off, despite this being my first post here, I have scanned the R help 
forums a lot in the past few months to help with some questions, so a big 
thank you to the community as a whole for being so helpful!


I'm somewhat of an R newbie, and have run up against a problem that I can't 
seem to solve.  If anyone is able to help I would really appreciate it!


I'm looking at a number of events across a chromosome, and have written a 
program that collects them into different bins, based on a specified 
binsize.  The events are directional, either forward or reverse, and a 
chromosome can either be fwd/fwd (all the events fall into the fwd bins), 
rev/rev (all the events fall into the rev bins) or fwd/rev (events are 
evenly split).  In some cases, chromosomes switch from one state to another 
(eg fwd/fwd to fwd/rev).  There are a number of rules that dictate my data. 
First, while there is stochastic variation, the sum of fwd and rev in each 
bin should have approximately the same value. If I were to take the total 
number of events and divide them by the number of bins to get an average 
count per bin, I would expect approximately that value in each bin; in the 
case of fwd/fwd it would be about average number in the fwd column and close 
to zero in the rev column, in rev/rev it would be about the average number 
in the rev column and close to zero in the fwd column, and in fwd/rev it 
would be about half the average number in both.


Hopefully my png attachment worked and you can see an example.  The top plot 
shows fwd reads, the 2nd shows rev reads and the third shows fwd minus rev 
reads.


What I would like to be able to do is to automatically assign regions in 
which the chromosome switches from one state to another. From the graphs 
(and from the read.table output below) you can see that this particular 
chromosome is fwd/fwd from bin 1 to 59, fwd/rev from bin 61 to 73, and 
rev/rev for the remainder of the chromosomes.


These are generated from a read.table that looks like this:

bin  fwd  rev
50  484   2
51  366   4
52  527   6
53  635   2
54  573   6
55  506   4
56  600   6
57  560   2
58  504   2
59  545   0
60  501  68
61  419 223
62  252 109
63  259 138
64  355 189
65  218 125
66  140  57
67   45  31
68  276 144
69  263 152
70  330 193
71  439 204
72  347 207
73   10 611
74    6 619
75    2 578
76    7 372
77    6 436
78    4 373
79    8 417
80    2 276

My question is this:

1. Is there an obvious way to automatically identify these regions?

I am not sure how I can go about scanning previous lines within a read.table 
to find a point at which the values change.  In the above example, I would 
like the program to identify that the fwd graph shifts from ~1x the average 
to ~0.5x the average between bin 61 and 62, and from ~0.5x the average to 
~0x the average between bin 72 and 73. Conversely I'd like to identify the 
rev graph shifting from ~0x average to ~0.5x average between bins 59 and 60, 
and from 0.5x average to 1x average from bin 72 to 73.  Finally, I'd like to 
cross-reference the output from fwd and rev to  only pull out reciprocal 
switches (ie those that occur within 3 bins of each other in both fwd and 
rev data sets).


What I've been trying to get to work is to generate values based on 0, 0.5 
and 1x the average events, and trying to pull out the range of bins that 
fall into each of those categories (possibly 1 SD higher or lower to account 
for the stochastic variation), but I'm not really sure how to go about that.


2. If I can find a way to identify a shift between bins, is there any way to 
then look in smaller bin sizes across those regions.  The bins shown above 
are for 200,000 bases of DNA.  If my program automatically found an event 
between bin 72 and 73 (14,400,000 bases to 14,600,000), is it possible to 
feed that 

Re: [R] Dotplot: how to change size in the y lab ?

2012-03-19 Thread ilai
On Mon, Mar 19, 2012 at 7:56 AM, Jose Bustos Melo jbustosm...@yahoo.es wrote:
 Hi everyone,

 I'm trying to reduce the font size on the Y axis in this plot:

 dotplot( bank ~ MV2007 + MV2009 , data = d, horiz = T,
   par.settings = list( superpose.symbol = list( pch = 21,
     fill = c("lightblue", "lightgreen"), cex = 4, col = "black" ) ),
   xlab = "Market value ($Bn)", key = k,
   panel = function(x, y, ...){
     panel.dotplot( x, y, ... )
     grid.text( unit( x, "native"), unit( y, "native"),
                label = x, gp = gpar( cex = .7 ) )
   }
### add this
   , scales=list(y=list(cex=.5))

)

Cheers


 Thank you in advance!
 José

        [[alternative HTML version deleted]]




Re: [R] a very simple question

2012-03-19 Thread Petr Savicky
On Mon, Mar 19, 2012 at 12:47:12PM +, Dajiang Liu wrote:
 
 Thanks a lot for the clarification. I just find it very bizarre that if you 
 run a=0.1*(1:9);which(a==0.4) 
 it returns the right answer. Anyway, I will pay attention next time. Thanks a 
 lot. 

Hi.

Yes, these things are bizarre sometimes. Compare

  print(0.1, digits=20)   # [1] 0.10000000000000000555
  print(4*0.1, digits=20) # [1] 0.4000000000000000222
  print(0.4, digits=20)   # [1] 0.4000000000000000222

Equality of the last two is the reason for

  which(0.1*(1:9) == 0.4)

  [1] 4

while for 0.3, we get

  print(3*0.1, digits=20) # [1] 0.30000000000000004441
  print(0.3, digits=20)   # [1] 0.2999999999999999889

See

  http://rwiki.sciviews.org/doku.php?id=misc:r_accuracy

for further hints.
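
One way to see why 0.4 matches while 0.3 does not, sketched under the assumption of IEEE 754 double arithmetic (which R uses on essentially all current platforms): multiplying a double by a power of two only changes its exponent, so 2*0.1, 4*0.1 and 8*0.1 involve no rounding, whereas other multipliers may round away from the double closest to the decimal value.

```r
k <- 1:9

# Multipliers for which k * 0.1 is not the double closest to k/10
mismatch <- which(0.1 * k != k / 10)

# Powers of two scale the stored value of 0.1 exactly
pow2_ok <- all(0.1 * c(1, 2, 4, 8) == c(0.1, 0.2, 0.4, 0.8))

print(mismatch)
print(pow2_ok)
```

That 5*0.1 and 9*0.1 also compare equal is a matter of how the rounding happens to fall, not something to rely on.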

Petr Savicky.



[R] plm function

2012-03-19 Thread Millo Giovanni
Dear Ieva,

plm(.., model="within") (which is the default for plm()) estimates a
within model on time-demeaned data, which is equivalent to using the
LSDV estimator. Therefore any time-constant dummy variable you add by
hand will be discarded because of perfect collinearity.
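
The collinearity can be illustrated with base R alone. The following is only a sketch that mimics the within transformation by hand on simulated data; it does not call plm(), and all variable names and the helper demean() are made up for the illustration.

```r
set.seed(1)
id <- rep(1:4, each = 5)            # 4 individuals observed over 5 periods
x  <- rnorm(20)                     # time-varying regressor
d  <- as.numeric(id <= 2)           # time-constant dummy (hypothetical)
y  <- 1 + 2 * x + 0.5 * d + rnorm(20)

# Within transformation: subtract each individual's own mean
demean <- function(v, g) v - ave(v, g)
y_w <- demean(y, id)
x_w <- demean(x, id)
d_w <- demean(d, id)                # identically zero after demeaning

fit <- lm(y_w ~ 0 + x_w + d_w)
print(coef(fit))                    # the coefficient of d_w cannot be estimated
```

Because the dummy does not vary within an individual, demeaning turns it into a column of zeros, and lm() reports NA for its coefficient.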

What kind of dummies are you trying to include? If they are
time-constant they will be incompatible with the within (FE) estimator,
but not with other uses of plm() like random effects ('model="random"')
or pooling ('model="pooling"').

A reproducible example, as requested by the posting guide, would have
clarified things.
Best wishes,
Giovanni

Giovanni Millo, PhD
Research Dept.,
Assicurazioni Generali SpA
Via Machiavelli 4,
34132 Trieste (Italy)
tel. +39 040 671184
fax  +39 040 671160

- original message -

Message: 15
Date: Wed, 14 Mar 2012 13:46:03 +0200
From: Ieva Sriubait? ieva.sriuba...@gmail.com
To: r-help@R-project.org
Subject: [R] plm function
Message-ID:

CAOCxseKEvj5uevHCNm-Or_E-yj=bacpb524tazq-9su+f+k...@mail.gmail.com
Content-Type: text/plain

Dear Sir/ Madam,

I am writing about panel data for my bachelor's degree.
I would really appreciate it if you could help me with some R functions.
I am trying to estimate a panel data linear model with the plm function.
When I include 3 dummy variables in the regression, they do not appear in
the summary of the model, but when I estimate a simple lm model they do.
Why is that? What should I do to estimate the statistics for those dummy
variables?

Thank You.
Ieva

- end original message -

 


Re: [R] glm: getting the confidence interval for an Odds Ratio, when using predict()

2012-03-19 Thread peter dalgaard

On Mar 19, 2012, at 03:32 , Dominic Comtois wrote:

 Say I fit a logistic model and want to calculate an odds ratio between 2
 sets of predictors. It is easy to obtain the difference in the predicted
 logodds using the predict() function, and thus get a point-estimate OR. But
 I can't see how to obtain the confidence interval for such an OR.
 
 
 
 For example:
 
 model <- glm(chd ~ age.cat + male + lowed, family=binomial(logit))
 
 pred1 <- predict(model, newdata=data.frame(age.cat=1,male=1,lowed=1))
 
 pred2 <- predict(model, newdata=data.frame(age.cat=2,male=0,lowed=0))
 
 OR <- exp(pred2-pred1) 


There's no trivial way since you need the covariance of pred2 and pred1 to 
calculate the variance of the difference.

I think you can proceed somewhat like as follows (I can't be bothered to test 
it without a reproducible example to start from. You may need to throw in a few 
explicit t() and as.vector() here and there.) 

newd  <- data.frame(age.cat=c(1,2),male=c(1,0),lowed=c(1,0))
M <- model.matrix(model, data=newd)
V <- vcov(model)
contr <- c(-1,1) %*% M
# standard error = square root of the variance of the difference
se <- sqrt(as.vector(contr %*% V %*% t(contr)))

OR.ci <- exp(pred2 - pred1 + qnorm(c(.025,.50,.975))*se)

(Sanity check:  contr %*% coef(model)  should be same as  pred2 - pred1 )

I'm not sure how general the model.matrix trick is. It works in cases like

 mm <- glm(ff, data=trees)
 model.matrix(mm, data=trees[1,])
  (Intercept) log(Height) log(Girth)
1           1    4.248495   2.116256
attr(,"assign")
[1] 0 1 2

but I see that there are cases where a data argument may be ignored. If that 
is the case, then you may have to construct the contr vector by hand.
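
A self-contained sketch of the same idea on built-in data, since no reproducible example was posted. The model am ~ wt + hp on mtcars and the two covariate sets in newd are stand-ins chosen purely for illustration, and the interval is a Wald interval on the log-odds scale.

```r
# Stand-in logistic model on built-in data (not the poster's model)
model <- glm(am ~ wt + hp, data = mtcars, family = binomial("logit"))

# Two covariate sets to compare; the values are hypothetical
newd <- data.frame(wt = c(3.0, 2.5), hp = c(110, 150))

# Design-matrix rows for both sets, built from the model's own terms
M <- model.matrix(delete.response(terms(model)), data = newd)
contr <- M[2, ] - M[1, ]

# Log odds ratio and its standard error from the coefficient covariance
log_or <- sum(contr * coef(model))
se <- sqrt(as.vector(t(contr) %*% vcov(model) %*% contr))

# Lower limit, point estimate and upper limit of the odds ratio
or_ci <- exp(log_or + qnorm(c(0.025, 0.5, 0.975)) * se)
print(or_ci)

# Sanity check: the contrast reproduces the difference of predictions
pred <- predict(model, newdata = newd)
print(all.equal(log_or, unname(pred[2] - pred[1])))
```

Building the design matrix from delete.response(terms(model)) avoids relying on whether model.matrix() honours a data argument for a fitted model object.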

 
 
 
 Thanks
 
 
   [[alternative HTML version deleted]]
 

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



[R] fitting a histogram to a Gaussian curve

2012-03-19 Thread Vihan Pandey
Hello,

I am trying to fit my histogram to a smooth Gaussian curve(the data
closely resembles one except a few bars).

This is my code :

#!/usr/bin/Rscript

out_file = "irc_20M_opencl_test.png"
png(out_file)

scan("my.csv") -> myvals

hist(myvals, breaks = 50, main = "My Distribution", xlab = "My Values")

pdens <- density(myvals, na.rm=T)
plot(pdens, col="black", lwd=3, xlab="My values", main="Default KDE")

dev.off()

print(paste("Plot was saved in:", getwd()))

The problem here is that I get a jagged distribution; you can see the result:

http://s15.postimage.org/9ucmkx3bf/foobar.png

this is the original histogram :

http://s12.postimage.org/e0lfp7d5p/foobar2.png

any ideas on how I can smoothen it to a Gaussian curve?
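
One possible approach, sketched on simulated data because my.csv is not available: estimate the mean and standard deviation from the sample and overlay the corresponding normal density on a histogram drawn on the density scale. The simulated myvals and the temporary PDF file are assumptions of this sketch.

```r
set.seed(42)
# Simulated stand-in for scan("my.csv")
myvals <- rnorm(1000, mean = 5, sd = 2)

# A Gaussian is determined by its mean and standard deviation
mu    <- mean(myvals)
sigma <- sd(myvals)          # sd() uses the n - 1 denominator

# Draw to a temporary PDF so the sketch also runs non-interactively
pdf(file = tempfile(fileext = ".pdf"))
hist(myvals, breaks = 50, freq = FALSE,
     main = "My Distribution", xlab = "My Values")
curve(dnorm(x, mean = mu, sd = sigma), add = TRUE, lwd = 3)
invisible(dev.off())

print(c(mu = mu, sigma = sigma))
```

The freq = FALSE argument matters: it puts the histogram on the density scale so that the bars and the fitted curve are comparable.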

Thanks,

- vihan



Re: [R] Save File after order

2012-03-19 Thread MSousa
Thanks

--
View this message in context: 
http://r.789695.n4.nabble.com/Save-File-after-order-tp4484539p4485370.html
Sent from the R help mailing list archive at Nabble.com.



[R] 'Unexpected numeric constant'

2012-03-19 Thread HJ YAN
Dear R-help,

I am trying to rename the variables in a dataframe, called 'T1A' here.
Seems renaming was successful, but when I call one of the variable I got
error message and I wanted to know why.


The data frame contains 365 rows and 49 columns. I would like to name the
first column `DATE` and the others T0.5, T1, T1.5,...,T24 (as this is a set
of data collected every half hour for a whole year).

Original data is saved as csv file and column 2-49 are named in format
'00:30,01:00,01:30,...,23:30,00:00'. When I read them into R by using
read.csv, the column names are changed automatically as 'X0.30.00,
X1.00.00,...,X23.30.00,X0.00.00' , which dont look great (i mean I would
prefer it in a format as 'hh:mm', NOT using 'dot' between numbers that used
to indicate time, but I have not found a solution...). So I decided to use
a simplified version as above, e.g. T0.5, T1, T1.5,...,T24 and my code is:


TIME<-paste(rep("T",48),as.character(seq(0.5,24,by=0.5)))
names(T1A)<-c("DATE",TIME)

 class(T1A$T0.5)  ## without a space between 'T' and '0.5'
[1] "NULL"
 class(T1A$T 0.5)  ## with a space between 'T' and '0.5'
Error: unexpected numeric constant in "class(T1A$T 0.5"


I also tried the code below, but got same error message...

 TIME<-paste(rep("T",48),seq(0.5,24,by=0.5))
names(T1A)<-c("DATE",TIME)


However, if I do not change the columns' name then everything works
fine, e.g. I can call the variables with no problem.

class(T1A$X00.30.00)
[1] "numeric"

Any thoughts??


Many thanks!!!
HJ



Re: [R] Issue with asin()

2012-03-19 Thread Letnichev
Hello,

you're totally right, I tried first to control the flow with if
(MaxDailyTemp  k  MinDailyTemp  k){statement} but it was a bit messy.
Then ifelse() was supposed to help me out, but it didn't.

Thank you for your time, your code works exactly as I want :)

P.

--
View this message in context: 
http://r.789695.n4.nabble.com/Issue-with-asin-tp4484462p4485206.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Issue with asin()

2012-03-19 Thread Letnichev
Yes, with the same input I had two different outputs with Excel and R. When
printing a debug report of Excel, it showed no anomalies and I am certain it
didn't calculate odd values (such as NaNs).
The way I coded was wrong, as Sarah said, I didn't follow completely the
algorithm. The solution she suggested works perfectly, so I am out of
trouble (for now :p ).

Thanks for your time,


P.

--
View this message in context: 
http://r.789695.n4.nabble.com/Issue-with-asin-tp4484462p4485185.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Output formatting in Latex and R

2012-03-19 Thread priyank
Use the eol="\n\n" option. The records should have a 2 line space.

--
View this message in context: 
http://r.789695.n4.nabble.com/Output-formatting-in-Latex-and-R-tp4483631p4485457.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] hypergeometric function in ‘ mvtnorm’

2012-03-19 Thread statfan
Thanks for your advice.  I actually meant to ask about the pmvt for the
distribution function.  Viewing the source code pmvt uses the function
mvt which uses the function probval which sources the fortran code:

.Fortran("mvtdst", N = as.integer(n), NU = as.integer(df), 
    LOWER = as.double(lower), UPPER = as.double(upper),
    INFIN = as.integer(infin), 
    CORREL = as.double(corrF), DELTA = as.double(delta), 
    MAXPTS = as.integer(x$maxpts), ABSEPS = as.double(x$abseps), 
    RELEPS = as.double(x$releps), error = as.double(error), 
    value = as.double(value), inform = as.integer(inform), 
    PACKAGE = "mvtnorm")

I wish to look at how this mvtdst calculates the hypergeometric function
(2_F_1).  Is there any way that I can see that?
Thanks 

--
View this message in context: 
http://r.789695.n4.nabble.com/hypergeometric-function-in-mvtnorm-tp4483730p4485277.html
Sent from the R help mailing list archive at Nabble.com.



[R] Coverage Probability

2012-03-19 Thread hubinho
Hello.

I'm already this far. I have a function which calculates the lower (l)
and upper (u) limit for a confidence interval for the odds ratio.

For example for 5 simulated 2x2 tables the upper and lower limits are:

 u
[1] 2.496141 7.436524 8.209161 4.313587 3.318612
 l
[1] -0.9718608  1.1000713  1.5715373  0.1135158 -0.2700517

With (l[1]; u[1]) being the confidence interval for the odds ratio for the
first simulated table and so on.

Now I want to compute the coverage probability. For that I've created a
function which returns 1 if the odds ratio is in the interval and 0 if it
isn't.

cover <- function(theta, u, l){
if(theta >= l && theta <= u){z=1}
if(theta < l || theta > u){z=0}; return(z)
}

This works but unfortunately not if I want to summarize the function and
divide it with the sample size to get the coverage probability.

I tried it this way

for(for(x in 1:5) {a = (sum(cover(theta, u[x], l[x]))/5; return(a)}

Maybe someone can help me. Thank you



--
View this message in context: 
http://r.789695.n4.nabble.com/Coverage-Probability-tp4485511p4485511.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Coverage Probability

2012-03-19 Thread Jorge I Velez
Hi hubinho,

You are almost there.  Try this slight modification of your function:

# theta, u and l are vectors of the same length
foo <- function(theta, u, l) mean(theta >= l & theta <= u, na.rm = TRUE)
foo(theta, u, l)
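
A short worked sketch using the five intervals from the question. The value theta = 1 is a hypothetical true parameter chosen only to show the arithmetic; the real value depends on the simulation setup.

```r
# Limits copied from the question
u <- c(2.496141, 7.436524, 8.209161, 4.313587, 3.318612)
l <- c(-0.9718608, 1.1000713, 1.5715373, 0.1135158, -0.2700517)

theta <- 1   # hypothetical true value, recycled against all intervals

# TRUE where the interval contains theta; & is vectorised, unlike &&
covered <- theta >= l & theta <= u

# The mean of a logical vector is the proportion of TRUE values
coverage <- mean(covered)

print(covered)
print(coverage)
```

The vectorised & operator is what removes the need for an explicit loop over the simulated tables.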

HTH,
Jorge.-




[R] car/MANOVA question

2012-03-19 Thread Ranjan Maitra
Dear colleagues,

I had a question wrt the car package. How do I evaluate whether a
simpler multivariate regression model is adequate?

For instance, I do the following:

ami <- read.table(file =
"http://www.public.iastate.edu/~maitra/stat501/datasets/amitriptyline.dat",
col.names=c("TCAD", "drug", "gender", "antidepressant", "PR", "dBP",
"QRS"))

ami$gender <- as.factor(ami$gender)
ami$TCAD <- ami$TCAD/1000
ami$drug <- ami$drug/1000


library(car)

fit.lm <- lm(cbind(TCAD, drug) ~ gender + antidepressant + PR + dBP +
QRS, data = ami)

fit.manova <- Manova(fit.lm)

fit1.lm <- update(fit.lm, .~ . - PR - dBP - QRS)

fit1.manova <- Manova(fit1.lm)



Is there an easy way to find out whether the reduced model is adequate?

I am thinking of something similar to the anova() function, I guess?

Many thanks and best wishes,
Ranjan



Re: [R] 'Unexpected numeric constant'

2012-03-19 Thread Berend Hasselman


Have you done ?paste
The default separator character is a single space.

Use paste(..., sep="")

Berend



Re: [R] hypergeometric function in ‘ mvtnorm’

2012-03-19 Thread Berend Hasselman

On 19-03-2012, at 16:54, statfan wrote:

 Thanks for your advice.  I actually meant to ask about the pmvt for the
 distribution function.  Viewing the source code pmvt uses the function
 mvt which uses the function probval which sources the fortran code:
 

No, it doesn't source. It calls a compiled Fortran subroutine.

 .Fortran("mvtdst", N = as.integer(n), NU = as.integer(df), 
     LOWER = as.double(lower), UPPER = as.double(upper),
     INFIN = as.integer(infin), 
     CORREL = as.double(corrF), DELTA = as.double(delta), 
     MAXPTS = as.integer(x$maxpts), ABSEPS = as.double(x$abseps), 
     RELEPS = as.double(x$releps), error = as.double(error), 
     value = as.double(value), inform = as.integer(inform), 
     PACKAGE = "mvtnorm")
 
 I wish to look at how this mvtdst calculates the hypergeometric function
 (2_F_1).  Is there any way that I can see that?

Yes. Download the source code of the package.
Obtainable from CRAN.
Unpack and browse.

Berend



Re: [R] Rolling regressions with sample extended one period at a time

2012-03-19 Thread pie'
hey, 

thnks a lot. I got exactly what I wanted.

--
View this message in context: 
http://r.789695.n4.nabble.com/Rolling-regressions-with-sample-extended-one-period-at-a-time-tp4470316p4485815.html
Sent from the R help mailing list archive at Nabble.com.



[R] Linear regression

2012-03-19 Thread Diviya Smith
Hello there,

I am new to using regression in R. I wanted to solve a simple regression
problem where I have 2 equations and 2 unknowns.

So lets say -
y1 = alpha1*A + beta1*B
y2 = alpha2*A + beta2*B

y1 <- runif(10, 0,1)
y2 <- runif(10,0,1)

alpha1 <- 0.6
alpha2 <- 0.75

beta1 <- 1-alpha1
beta2 <- 1-alpha2

I now want this equation to estimate the values of A and B. Both A and B
are constrained to be between (0,1). I would like to use lm with these
constraints and I am having a little trouble in defining the equations
correctly. Any help would be most appreciated.
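
One way the problem could be set up, offered as a sketch under stated assumptions: the two equations are stacked into a single least-squares problem and the bounds are imposed with optim() and the L-BFGS-B method, since lm() itself does not handle box constraints. The true values A_true and B_true, the noise level and the function rss() are invented here so that the example has a known answer.

```r
set.seed(1)
alpha <- c(0.6, 0.75)
beta  <- 1 - alpha

# Hypothetical true values, used only to simulate data that follow the model
A_true <- 0.3
B_true <- 0.8
n <- 10
y1 <- alpha[1] * A_true + beta[1] * B_true + rnorm(n, sd = 0.01)
y2 <- alpha[2] * A_true + beta[2] * B_true + rnorm(n, sd = 0.01)

# Stack both equations: y = a * A + b * B
y <- c(y1, y2)
a <- rep(alpha, each = n)
b <- rep(beta,  each = n)

# Residual sum of squares as a function of p = c(A, B)
rss <- function(p) sum((y - a * p[1] - b * p[2])^2)

# Box-constrained least squares, keeping A and B within [0, 1]
fit <- optim(c(0.5, 0.5), rss, method = "L-BFGS-B",
             lower = c(0, 0), upper = c(1, 1))
est <- fit$par
print(est)
```

If the bounds are not active at the solution, lm(y ~ 0 + a + b) gives essentially the same estimates without the constraints.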

Thank you,
Diviya



Re: [R] 'Unexpected numeric constant'

2012-03-19 Thread Peter Ehlers



Berend has shown you the problem with your use of paste().
If you want the original (illegal in R) names, then you can
set the argument 'check.names' to FALSE in your read.csv() call.
You will then have to remember to always put quotes around any
use of these names in your code. But since it's generally
better to use T1A[[name]] rather than T1A$name anyway,
the need for quotes should not be a problem.
Still, I wouldn't use illegal names.
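
A small sketch of both points, using a made-up data frame in place of the poster's T1A (3 rows instead of 365; the contents are zeros):

```r
# Names without a separator: "T0.5", "T1", ..., "T24"
TIME <- paste("T", seq(0.5, 24, by = 0.5), sep = "")

# Hypothetical stand-in for the poster's data: 49 numeric columns
T1A <- as.data.frame(matrix(0, nrow = 3, ncol = 49))
names(T1A) <- c("DATE", TIME)

# Both access styles work once the names are syntactically valid
cls_dollar  <- class(T1A$T0.5)
cls_bracket <- class(T1A[["T0.5"]])

# With the default separator the name contains a space, so it would
# need [[ ]] with quotes, or backticks, to be usable
spaced <- paste("T", 0.5)

print(TIME[c(1, 2, 48)])
print(cls_dollar)
print(spaced)
```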

Peter Ehlers



[R] Automaticall adjust axis scales

2012-03-19 Thread Alaios
Dear all,

I have made a function that given a number of list elements plot them to the 
same window.

The first element is plotted by using plot and all the rest are plotted under 
the 

same window by using lines.

I have below a small and simple reproducible example.


x1<-c(1:10)
plot(x1)

x2<-c(11:20)
lines(x2)

x3<-c(31:40)
lines(x3)




As you might notice, the two subsequent lines fail to be plotted because the 
axes were set by the first plot.
Would it be possible, after the last call to lines(), to rescale the axes to 
the minimum and the maximum of all data sets so that everything is visible?

Any idea how I can do that?
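
Base graphics do not rescale an existing plot, so one common approach is to compute the overall range first and pass it to the initial plot() call. A minimal sketch, assuming the series are held in a list called series (a name made up here) and drawing to a temporary PDF so that it also runs non-interactively:

```r
x1 <- 1:10
x2 <- 11:20
x3 <- 31:40
series <- list(x1, x2, x3)

# Overall y range across every series, computed before plotting
ylim <- range(unlist(series))

pdf(file = tempfile(fileext = ".pdf"))
plot(series[[1]], type = "l", ylim = ylim, ylab = "value")
for (s in series[-1]) lines(s)
invisible(dev.off())

print(ylim)
```

When all series have the same length, matplot(do.call(cbind, series), type = "l") is an alternative that picks the limits automatically.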

I would like to thank you for your help

B.R
Alex



Re: [R] car/MANOVA question

2012-03-19 Thread John Fox
Dear Ranjan,

As you no doubt noticed, the Manova() function in the car package, or the 
Anova() function for which Manova() is an alias, produces type II or III tests 
for a multivariate linear model. To compare two nested multivariate linear 
models, as you wish to do, you can use the standard R anova() function -- see 
?anova.mlm.
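
A brief sketch of such a comparison on built-in data, in case the data file from the question is not reachable. The responses and predictors taken from mtcars are stand-ins with no subject-matter meaning.

```r
# Full multivariate linear model with two responses
full <- lm(cbind(mpg, disp) ~ wt + hp + qsec, data = mtcars)

# Reduced model: drop two predictors
reduced <- update(full, . ~ . - hp - qsec)

# Multivariate test comparing the nested models (Pillai trace by default)
cmp <- anova(full, reduced)
print(cmp)
```

Other statistics can be requested through the test argument, for example test = "Wilks".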

I hope this helps,
 John


John Fox
Sen. William McMaster Prof. of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/

On Mon, 19 Mar 2012 12:31:48 -0500
 Ranjan Maitra mai...@iastate.edu wrote:
 Dear colleagues,
 
 I had a question wrt the car package. How do I evaluate whether a
 simpler multivariate regression model is adequate?
 
 For instance, I do the following:
 
 ami <- read.table(file =
 "http://www.public.iastate.edu/~maitra/stat501/datasets/amitriptyline.dat",
 col.names=c("TCAD", "drug", "gender", "antidepressant", "PR", "dBP",
 "QRS"))
 
 ami$gender <- as.factor(ami$gender)
 ami$TCAD <- ami$TCAD/1000
 ami$drug <- ami$drug/1000
 
 
 library(car)
 
 fit.lm <- lm(cbind(TCAD, drug) ~ gender + antidepressant + PR + dBP +
 QRS, data = ami)
 
 fit.manova <- Manova(fit.lm)
 
 fit1.lm <- update(fit.lm, .~ . - PR - dBP - QRS)
 
 fit1.manova <- Manova(fit1.lm)
 
 
 
 Is there an easy way to find out whether the reduced model is adequate?
 
 I am thinking of something similar to the anova() function, I guess?
 
 Many thanks and best wishes,
 Ranjan
 


[R] hgu133plus2hsentrezgprobe library

2012-03-19 Thread Eleni Christodoulou
Hello R community,

I am processing raw Affymetrix CEL files using the Michigan custom
CDF library hgu133plus2hsentrezgprobe. I have been looking for
documentation on the functions it contains. I am specifically
interested in converting probe names to gene symbols. Does anybody know
where I can find it?

Thanks a lot!
Eleni



Re: [R] fitting a histogram to a Gaussian curve

2012-03-19 Thread R. Michael Weylandt
If I understand you correctly, a univariate Gaussian distribution is
uniquely determined by its first two moments, so you can just fit those
directly (using the sample mean for the population mean and the sample
variance with Bessel's correction for the population variance) and get the
best Gaussian (in an ML sense, up to the n versus n - 1 divisor in the
variance).

E.g.,

x <- rnorm(500, 3, 2)

hist(x, freq = FALSE)
lines(seq(min(x), max(x), length.out = 300) -> y, dnorm(y, mean(x),
sd(x)), col = 2)

Hope this helps,
Michael

On Mon, Mar 19, 2012 at 12:47 PM, Vihan Pandey vihanpan...@gmail.com wrote:
 Hello,

 I am trying to fit a smooth Gaussian curve to my histogram (the data
 closely resemble one except for a few bars).

 This is my code:

 #!/usr/bin/Rscript

 out_file = "irc_20M_opencl_test.png"
 png(out_file)

 scan("my.csv") -> myvals

 hist(myvals, breaks = 50, main = "My Distribution", xlab = "My Values")

 pdens <- density(myvals, na.rm=T)
 plot(pdens, col="black", lwd=3, xlab="My values", main="Default KDE")

 dev.off()

 print(paste("Plot was saved in:", getwd()))

 The problem here is that I get a jagged distribution; you can see the result:

 http://s15.postimage.org/9ucmkx3bf/foobar.png

 This is the original histogram:

 http://s12.postimage.org/e0lfp7d5p/foobar2.png

 Any ideas on how I can smooth it to a Gaussian curve?

 Thanks,

 - vihan



Re: [R] Automaticall adjust axis scales

2012-03-19 Thread R. Michael Weylandt
I don't believe this is possible in base graphics: you need to plan
your graphics ahead with something like plot(, ylim = range(x1, x2,
x3)). Base graphics follow a pen-and-paper model, which means that once
something is drawn, it's on the device permanently (unless you draw over it).
Perhaps an interactive graphics package would allow it -- but I'll
happily be corrected (and informed) by others.
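
A minimal sketch of planning the limits ahead for a whole list of series; common_limits() and plot_all() are hypothetical helper names made up for this illustration, and each series is assumed to be plotted against its index:

```r
# Hypothetical helper: common axis limits for a list of numeric vectors
# that are each plotted against their index.
common_limits <- function(series) {
  list(xlim = c(1, max(lengths(series))),
       ylim = range(unlist(series)))
}

# Hypothetical helper: first element with plot(), the rest with lines(),
# using limits computed from all elements before anything is drawn.
plot_all <- function(series) {
  lims <- common_limits(series)
  plot(series[[1]], type = "l", xlim = lims$xlim, ylim = lims$ylim,
       xlab = "Index", ylab = "Value")
  for (s in series[-1]) lines(s)
  invisible(lims)
}

series <- list(x1 = 1:10, x2 = 11:20, x3 = 31:40)
lims <- common_limits(series)
```

Calling plot_all(series) then draws all three lines in one window, since the y axis is set up to run from 1 to 40 before the first plot() call.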

As a style thing, your use of c() is unnecessary and confusing.

identical(1:10, c(1:10))

Michael



Re: [R] fitting a histogram to a Gaussian curve

2012-03-19 Thread Vihan Pandey
I see, that could be an option. However, isn't there a fitting function
that would do that for given data?



Re: [R] fitting a histogram to a Gaussian curve

2012-03-19 Thread R. Michael Weylandt
Take a look at fitdistr in the MASS package.

fitdistr(x, "normal")

I don't think you need to supply start values for the normal since its
log-likelihood function is nicely behaved. You may need to for harder
distributions.
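
A minimal sketch on simulated data (the sample below is a stand-in for real values, and it assumes the MASS package is installed). It also shows how the fitted values relate to the sample moments: the fitted sd uses the n divisor rather than n - 1.

```r
library(MASS)   # provides fitdistr()

set.seed(1)
x <- rnorm(500, mean = 3, sd = 2)      # simulated stand-in for real data

fit <- fitdistr(x, "normal")           # maximum-likelihood fit
est <- fit$estimate                    # named vector with "mean" and "sd"

n <- length(x)
# Compare with the sample moments; the ML sd rescales sd(x) by sqrt((n - 1) / n).
comparison <- rbind(fitdistr = est,
                    moments  = c(mean(x), sqrt((n - 1) / n) * sd(x)))
print(comparison)
```

The fitted values can then be passed to dnorm() to overlay the curve on a density-scaled histogram, e.g. dnorm(y, est["mean"], est["sd"]).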

Michael

On Mon, Mar 19, 2012 at 2:54 PM, Vihan Pandey vihanpan...@gmail.com wrote:
 I see, that could be an option, however isn't there a fitting function
 which would do that on given data?

 On 19 March 2012 19:49, R. Michael Weylandt michael.weyla...@gmail.com 
 wrote:
 If I understand you correctly, a univariate Gaussian distribution is
 uniquely determined by its first two moments so you can just fit those
 directly (using sample mean for population mean and sample variance
 with Besel's correction for population variance) and get the best
 Gaussian (in a ML sense).

 E.g.,

 x - rnorm(500, 3, 2)

 hist(x, freq = FALSE)
 lines(seq(min(x), max(x), length.out = 300) - y, dnorm(y, mean(x),
 sd(x)), col = 2)

 Hope this helps,
 Michael

 On Mon, Mar 19, 2012 at 12:47 PM, Vihan Pandey vihanpan...@gmail.com wrote:
 Hello,

 I am trying to fit my histogram to a smooth Gaussian curve(the data
 closely resembles one except a few bars).

 This is my code :

 #!/usr/bin/Rscript

 out_file = irc_20M_opencl_test.png
 png(out_file)

 scan(my.csv) - myvals

 hist(myvals, breaks = 50, main = My Distribution,xlab = My Values)

 pdens - density(myvals, na.rm=T)
 plot(pdens, col=black, lwd=3, xlab=My values, main=Default KDE)

 dev.off()

 print(paste(Plot was saved in:, getwd()))

 the problem here is that I a jagged distribution, you can see the result :

 http://s15.postimage.org/9ucmkx3bf/foobar.png

 this is the original histogram :

 http://s12.postimage.org/e0lfp7d5p/foobar2.png

 any ideas on how I can smoothen it to a Gaussian curve?

 Thanks,

 - vihan

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Linear regression

2012-03-19 Thread Bert Gunter
1. Homework assignment? We don't do homework here.

2. If not, a mixture model of some sort?  I suggest you state the
context of the problem more fully. R has several packages to do
mixture modeling, if that's what you're trying to do.

3. In any case, this cannot be done with lm() (at least without tricks).

4. In your notation below, the separate regressions can be stacked
into a single constrained regression model.

5. You might do better to find local statistical help, as you may have
bitten off more than you can chew.
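
One possible reading of point 4, as a rough sketch only: stack the equations into one design matrix and minimise the residual sum of squares under box constraints with optim(). The third population, the "true" values and the noise level are made-up assumptions used to simulate data, and scalar A and B are assumed for simplicity.

```r
set.seed(1)
alpha <- c(0.6, 0.75, 0.3)        # mixing proportions; the third is hypothetical
M <- cbind(alpha, 1 - alpha)      # stacked design: columns multiply A and B

A_true <- 0.2                     # made-up values, used only to simulate y
B_true <- 0.7
y <- as.vector(M %*% c(A_true, B_true)) + rnorm(3, sd = 0.01)

# Residual sum of squares of the stacked model for par = c(A, B).
rss <- function(par) sum((y - M %*% par)^2)

# Box-constrained least squares: both A and B are kept within [0, 1].
fit <- optim(c(0.5, 0.5), rss, method = "L-BFGS-B", lower = 0, upper = 1)
fit$par
```

With exactly two equations and two unknowns the system can be solved directly, so the constraints only matter when that exact solution falls outside the allowed range or when there are more equations than unknowns.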

-- Bert

On Mon, Mar 19, 2012 at 11:29 AM, Diviya Smith diviya.sm...@gmail.com wrote:
 Hello there,

 I am new to using regression in R. I wanted to solve a simple regression
 problem where I have 2 equations and 2 unknowns.

 So let's say -
 y1 = alpha1*A + beta1*B
 y2 = alpha2*A + beta2*B

 y1 <- runif(10, 0,1)
 y2 <- runif(10,0,1)

 alpha1 <- 0.6
 alpha2 <- 0.75

 beta1 <- 1-alpha1
 beta2 <- 1-alpha2

 I now want to use these equations to estimate the values of A and B. Both A
 and B are constrained to lie in (0,1). I would like to use lm with these
 constraints and I am having a little trouble defining the equations
 correctly. Any help would be most appreciated.

 Thank you,
 Diviya




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] fitting a histogram to a Gaussian curve

2012-03-19 Thread Vihan Pandey
I'll check it out, thanks a million Michael!



Re: [R] Automaticall adjust axis scales

2012-03-19 Thread Jorge I Velez
Perhaps matplot()?

matplot(cbind(x1, x2, x3), type = 'l')

See ?matplot for more information.

HTH,
Jorge.-




Re: [R] Automaticall adjust axis scales

2012-03-19 Thread Alaios
Thanks for the immediate answer.
Is there any alternative to matplot? There might be a few limitations with
matplot in my case. I will post again if needed when I am at the office
tomorrow.

Regards
Alex






Re: [R] regression with proportion data

2012-03-19 Thread Peter Ehlers

On 2012-03-19 07:35, S Ellison wrote:




-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org] On Behalf Of Georgiana May
Sent: 19 March 2012 14:06
To: r-help@r-project.org
Subject: [R] regression with proportion data

I understand that the binomial function concerns successes
vs. failures and can use those raw data, but the R Book and
other sources seem to suggest that proportion data are usable
as well.  Not so?


You _can_ use a two-column matrix with counts of successes and failures in the
two columns.

And if you know the number n of observations (which you would need anyway to
use proportions in a logistic regression), you can calculate that matrix from
the proportions and n, as long as you're reasonably careful about rounding.



Yes, and you can also use the proportions directly; just specify
the corresponding vector of number of trials as the 'weights'
argument in the glm() call. See the Details section of ?glm.
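
A minimal sketch of the two equivalent specifications; the counts and the dose variable below are made up for illustration:

```r
# Made-up grouped data: number of trials, number of successes, one predictor.
n         <- c(20, 25, 30, 22, 28)
successes <- c(3, 8, 15, 16, 25)
dose      <- 1:5
prop      <- successes / n

# Two-column response: successes and failures.
fit_counts <- glm(cbind(successes, n - successes) ~ dose, family = binomial)

# Proportions as the response, with the number of trials as weights.
fit_props <- glm(prop ~ dose, family = binomial, weights = n)

# The two fits give the same coefficients.
same_coef <- isTRUE(all.equal(coef(fit_counts), coef(fit_props),
                              tolerance = 1e-6))
print(same_coef)
```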

Peter Ehlers



S Ellison


[R] where this Error comes from?

2012-03-19 Thread Alaios
Dear all,
While I am executing my code I receive the error below

Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) : 
  'x' must be atomic


The weird thing is that I am not calling the sort function anywhere, nor do I
rely on any sorting.
How can I discover where this comes from (i.e., inside which function)?

I would like to thank you in advance for your help

B.R
Alex



Re: [R] where this Error comes from?

2012-03-19 Thread R. Michael Weylandt
I call upon the great and mighty Google (hallowed be its name) to discover:

traceback()

and its more powerful cousin

options(error = recover)
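
Both of those are meant for interactive use. As a sketch of the same idea in a non-interactive script, the call stack can be captured at the moment the error is signalled; helper() and analysis() below are hypothetical stand-ins for user code that ends up sorting a list:

```r
helper   <- function(x) sort(x)        # hypothetical function that sorts
analysis <- function(x) helper(x)      # hypothetical caller

stack <- NULL
msg <- tryCatch(
  withCallingHandlers(
    analysis(list(1, 2:3)),
    # Runs while the failing frames are still on the call stack.
    error = function(e) stack <<- sys.calls()
  ),
  error = function(e) conditionMessage(e)
)

# First line of each recorded call, outermost first.
calls <- vapply(as.list(stack), function(cl) deparse(cl)[1], character(1))
print(msg)
print(calls)
```

Reading the printed calls from the top shows which user function led to the sort call that failed.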

Michael



Re: [R] where this Error comes from?

2012-03-19 Thread William Dunlap
Call traceback() after seeing the error message.
E.g.,
  > factor(list(1, 2:3, 4:6))
  Error in sort.list(y) : 'x' must be atomic for 'sort.list'
  Have you called 'sort' on a list?
  > traceback()
  3: stop("'x' must be atomic for 'sort.list'\nHave you called 'sort' on a list?")
  2: sort.list(y)
  1: factor(list(1, 2:3, 4:6))
  

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com




Re: [R] where this Error comes from?

2012-03-19 Thread Bert Gunter
?traceback
?options  ## consider changing error option to recover
?debug

Search on debugging in R to find more possibilities. R is a
programming language. You need to learn how to debug code if you wish
to program in R.

-- Bert



Re: [R] Automaticall adjust axis scales

2012-03-19 Thread William Dunlap
Or look at the xlim and ylim arguments to plot.  E.g.,

 x1 <- 1:10 ; x2 <- 11:17 ; x3 <- 21:23
 plot(NA, NA, xlim=range(1, length(x1), length(x2), length(x3)),
 ylim=range(x1, x2, x3), type="n", xlab="", ylab="")
 points(x1, type="b")
 lines(x2)
 points(x3)
 title(xlab="The X Values", ylab="The Y Values")

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com




Re: [R] Linear regression

2012-03-19 Thread Diviya Smith
Hello Bert,

This is definitely not for a homework problem. I am trying to estimate
frequencies of mutations in different groups. The mutation frequencies can
be modeled as a linear relation in cases of mixtures. So I have a lot of
populations that follow the relationship -

y = alpha*A + beta*B and I want to estimate A and B; given y, alpha and
beta. A and B are both vectors of the same size as y.

Can you suggest where I can find some information about your suggestion
#4...that is exactly what I was hoping to do.

Thanks,
Diviya





Re: [R] Linear regression

2012-03-19 Thread Niloofar Javanrouh
Hi,
I think you can use the solve() function to solve the equations.
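
A minimal sketch of that idea for a single pair of equations; the y values below are made up so that the answer is easy to check. Note that solve() returns the exact solution and does not enforce the (0,1) constraint.

```r
alpha1 <- 0.6;  beta1 <- 1 - alpha1
alpha2 <- 0.75; beta2 <- 1 - alpha2

# Coefficient matrix of the two equations, one row per equation.
M <- rbind(c(alpha1, beta1),
           c(alpha2, beta2))

y <- c(0.42, 0.375)     # hypothetical observed y1 and y2

AB <- solve(M, y)       # AB[1] is A, AB[2] is B
print(AB)
```

If y1 and y2 are vectors, solve(M, rbind(y1, y2)) solves all the systems at once and returns one column per element.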
 

___

 Niloofar.Javanrouh
MSc Student Of BioStatistics
Mashad University Of Medical Sciences
  




Re: [R] Coverage Probability

2012-03-19 Thread hubinho
Thank you very much. This was what I needed. Unfortunately, I have one further
problem with this code. I need the coverage probability not just for one odds
ratio but for a range of different odds ratios (for example [1;30]). I tried it
with a loop but I get an error. I think that I'm almost there again but am
making a small mistake. The complete code is:

#setting values

n1 <- 10
n2 <- 10
y <- 100
alpha <- 1
z <- 1.96

# creating 2x2 table

for (i in 1:30)

{

theta <- i
x1 <- exp(alpha +theta)/ (1+  exp(alpha +theta))
x2 <- exp(alpha)/ (1+  exp(alpha))


n11 <- rbinom(y, 10, x1)
n12 <- n1 - n11
n21 <- rbinom(y, 10, x2)
n22 <- n2 - n21

# upper and lower limit gart interval

gartu <- function(z,d,e, f, g){log(((d+.5)*(g+.5))/((e+.5)*(f+.5)))+
z*sqrt(1/(d+.5)+1/(e+.5)+1/(f+.5)+1/(g+.5))}
gartl <- function(z,d,e, f, g){log(((d+.5)*(g+.5))/((e+.5)*(f+.5)))-
z*sqrt(1/(d+.5)+1/(e+.5)+1/(f+.5)+1/(g+.5))}


u <- gartu(z, n11[i],n22[i],n12[i],n21[i])
l <- gartl(z, n11[i],n22[i],n12[i],n21[i])

foo <- function(theta, u, l) mean(theta >= l & theta <= u, na.rm = TRUE)
 foo(theta, u, l)
}

--
View this message in context: 
http://r.789695.n4.nabble.com/Coverage-Probability-tp4485511p4485865.html
Sent from the R help mailing list archive at Nabble.com.



[R] [R-pkgs] acs package: analyze data from the U.S. American Community Survey

2012-03-19 Thread Ezra Haber Glenn

We are pleased to announce version 0.8 of the acs package for R, now
available on CRAN
(http://cran.r-project.org/web/packages/acs/index.html).

The package provides a general toolkit for managing, analyzing, and
presenting data from the U.S. Census American Community Survey
(ACS). Confidence intervals provided with the data are converted to
standard errors and bundled with estimates in complex acs-class
objects. The package provides new methods to conduct standard
operations, plots, and tests on acs objects in statistically
appropriate ways.

In addition to improved documentation and bug-fixes, highlights include:

* An improved read.acs function for importing data downloaded
  from the Census American FactFinder site.

* rbind and cbind functions to help create larger acs objects
  from smaller ones.

* A sum method to aggregate rows or columns of ACS data, dealing
  correctly with both estimates and standard errors.

* A new apply method to allow users to apply virtually any
  function to each row or column of an acs data object.

* A snazzy new plot method capable of plotting both density
  plots (for estimates of a single geography and variable) and
  multiple estimates with error bars (for estimates of the same
  variable over multiple geographies, or vice versa).

* New functions to deal with adjusting the nominal values of
  currency from different years for the purpose of comparing
  between one survey and another.

* A new prompt method to serve as a helper function when changing
  geographic rownames or variable column names.

For more info, examples, and demo plots, see the package documentation
and/or
http://eglenn.scripts.mit.edu/citystate/2012/03/acs-package-updated-version-0-8-now-on-cran/.

--
Ezra Haber Glenn, AICP
Lecturer in Community Development
Department of Urban Studies and Planning
Massachusetts Institute of Technology
77 Massachusetts Ave., Room 7-337
Cambridge, MA 02139
egl...@mit.edu 
http://dusp.mit.edu/faculty/eglenn | http://eglenn.scripts.mit.edu/citystate/
617.253.2024 (w)
617.721.7131 (c)

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R] Linear regression

2012-03-19 Thread Bert Gunter
Note that your equations can be written:

y = alpha*A + (1-alpha)*B,  which is equivalent to

y = (A-B) * alpha + B   , i.e. of form

y = C*alpha + B   a simple linear equation in alpha

You have two different values of alpha at which y was measured, so
just stack up all your results into a single regression setup with
these two different alphas. Except for your constraints.
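
A minimal sketch of the stacking described above, on simulated data. It
assumes scalar A and B; the true values (A_true, B_true) and the small noise
term are invented for illustration and are not from the original post. The
(0,1) constraints are not enforced here.

```r
# Simulated example: the true values and noise level are assumptions.
set.seed(1)
A_true <- 0.3
B_true <- 0.8

# Stack the 10 observations taken at each alpha into one data set.
alpha <- rep(c(0.6, 0.75), each = 10)
y <- alpha * A_true + (1 - alpha) * B_true + rnorm(20, sd = 0.005)

# y = C*alpha + B with C = A - B, so an ordinary lm() estimates B and C.
fit <- lm(y ~ alpha)
B_hat <- unname(coef(fit)["(Intercept)"])
A_hat <- B_hat + unname(coef(fit)["alpha"])

c(A = A_hat, B = B_hat)
```

With only two distinct alpha values the line is determined by the two group
means, so the estimates are only as good as those means.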

But if I understand you correctly, the problem is not that C and B
must be between 0 and 1, it is that the response, y, must be (it is a
frequency). If so, this suggests that you need to set this up as a
glm, probably with a binomial link. Trivial to do, but I suspect you
don't know about glm's, which is why I said that you may be out of
your depth and seek local help.

If I'm wrong, my apologies for misunderstanding. If I'm not, I'm
sorry, but I don't wish to teach you about basic statistics on this
list. Read up on generalized linear models, for which there are
undoubtedly a host of good web tutorials available.

Cheers,
Bert

On Mon, Mar 19, 2012 at 12:48 PM, Diviya Smith diviya.sm...@gmail.com wrote:
 Hello Bert,

 This is definitely not for a homework problem. I am trying to estimate
 frequencies of mutations in different groups. The mutation frequencies can
 be modeled as a linear relation in cases of mixtures. So I have a lot of
 populations that follow the relationship -

 y = alpha*A + beta*B and I want to estimate A and B; given y, alpha and
 beta. A and B are both vectors of the same size as y.

 Can you suggest where I can find some information about your suggestion
 #4...that is exactly what I was hoping to do.

 Thanks,
 Diviya



 On Mon, Mar 19, 2012 at 3:02 PM, Bert Gunter gunter.ber...@gene.com wrote:

 1. Homework assignment? We don't do homework here.

 2. If not, a mixture model of some sort?  I suggest you state the
 context of the problem more fully. R has several packages to do
 mixture modeling, if that's what you're trying to do.

 3. In any case, this cannot be done with lm() (at least without tricks).

 4. In your notation below, the separate regressions can be stacked
 into a single constrained regression model.

 5. You might do better to find local statistical help, as you may have
 bitten off more than you can chew.

 -- Bert

 On Mon, Mar 19, 2012 at 11:29 AM, Diviya Smith diviya.sm...@gmail.com
 wrote:
  Hello there,
 
  I am new to using regression in R. I wanted to solve a simple regression
  problem where I have 2 equations and 2 unknowns.
 
  So lets say -
  y1 = alpha1*A + beta1*B
  y2 = alpha2*A + beta2*B
 
  y1 <- runif(10, 0,1)
  y2 <- runif(10,0,1)
 
  alpha1 <- 0.6
  alpha2 <- 0.75
 
  beta1 <- 1-alpha1
  beta2 <- 1-alpha2
 
  I now want this equation to estimate the values of A and B. Both A and B
  are constrained to be between (0,1). I would like to use lm with these
  constraints and I am having a little trouble in defining the equations
  correctly. Any help would be most appreciated.
 
  Thank you,
  Diviya
 



 --

 Bert Gunter
 Genentech Nonclinical Biostatistics

 Internal Contact Info:
 Phone: 467-7374
 Website:

 http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm





-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] Coverage Probability

2012-03-19 Thread Jorge I Velez
Hi hubinho,

You need to initialize the for() loop and then store the u and l values
properly:

# parameters
n1 <- 10
n2 <- 10
y <- 100
alpha <- 1
z <- 1.96

# creating B 2x2 tables
B <- 50
u <- l <- vector('numeric', B)
for (i in 1:B){
theta <- i
x1 <- exp(alpha +theta)/ (1+  exp(alpha +theta))
x2 <- exp(alpha)/ (1+  exp(alpha))

n11 <- rbinom(y, 10, x1)
n12 <- n1 - n11
n21 <- rbinom(y, 10, x2)
n22 <- n2 - n21

# upper and lower limit gart interval
gartu <- function(z,d,e, f, g){log(((d+.5)*(g+.5))/((e+.5)*(f+.5)))+
z*sqrt(1/(d+.5)+1/(e+.5)+1/(f+.5)+1/(g+.5))}
gartl <- function(z,d,e, f, g){log(((d+.5)*(g+.5))/((e+.5)*(f+.5)))-
z*sqrt(1/(d+.5)+1/(e+.5)+1/(f+.5)+1/(g+.5))}

# store results
u[i] <- gartu(z, n11[i],n22[i],n12[i],n21[i])
l[i] <- gartl(z, n11[i],n22[i],n12[i],n21[i])
}

# coverage
theta <- 1:B
foo <- function(theta, u, l) mean(theta >= l & theta <= u, na.rm = TRUE)
foo(theta, u, l)
#  [1] 0.14

HTH,
Jorge.-


On Mon, Mar 19, 2012 at 2:25 PM, hubinho  wrote:

 Thank you very much. This was what I needed. Unfortunately I have one
 further
 problem with this code. I need the coverage probability not just for one
 odds ratio but for a range of different odds ratios (for example [1;30]).
 I tried it with a loop but I get an error. I think, again, that I'm almost
 there but have made a small mistake. The complete code is:

 #setting values

 n1 <- 10
 n2 <- 10
 y <- 100
 alpha <- 1
 z <- 1.96

 # creating 2x2 table

 for (i in 1:30)

 {

 theta <- i
 x1 <- exp(alpha +theta)/ (1+  exp(alpha +theta))
 x2 <- exp(alpha)/ (1+  exp(alpha))


 n11 <- rbinom(y, 10, x1)
 n12 <- n1 - n11
 n21 <- rbinom(y, 10, x2)
 n22 <- n2 - n21

 # upper and lower limit gart interval

 gartu <- function(z,d,e, f, g){log(((d+.5)*(g+.5))/((e+.5)*(f+.5)))+
 z*sqrt(1/(d+.5)+1/(e+.5)+1/(f+.5)+1/(g+.5))}
 gartl <- function(z,d,e, f, g){log(((d+.5)*(g+.5))/((e+.5)*(f+.5)))-
 z*sqrt(1/(d+.5)+1/(e+.5)+1/(f+.5)+1/(g+.5))}


 u <- gartu(z, n11[i],n22[i],n12[i],n21[i])
 l <- gartl(z, n11[i],n22[i],n12[i],n21[i])

 foo <- function(theta, u, l) mean(theta >= l & theta <= u, na.rm = TRUE)
  foo(theta, u, l)
 }

 --
 View this message in context:
 http://r.789695.n4.nabble.com/Coverage-Probability-tp4485511p4485865.html
 Sent from the R help mailing list archive at Nabble.com.



[R] Lag based on Date objects with non-consecutive values

2012-03-19 Thread Sam Albers
Hello all,

I need to figure out a way to lag a variable in by a number of days
without using the zoo package. I need to use a remote R connection
that doesn't have the zoo package installed and is unwilling to do so.
So that is, I want a function where I can specify the number of days
to lag a variable against a Date formatted column. That is relatively
easy to do. The problem arises when I don't have consecutive dates. I
can't seem to figure out a way to insert an NA when there is
non-consecutive date. So for example:


## A dataframe with non-consecutive dates
set.seed(32)
df1 <- data.frame(
   Date=seq(as.Date("1967-06-05", "%Y-%m-%d"), by="day", length=5),
   Dis1=rnorm(5, 1,10)
   )
df2 <- data.frame(
  Date=seq(as.Date("1967-06-15", "%Y-%m-%d"), by="day", length=5),
  Dis1=rnorm(5, 1,10)
  )

df <- rbind(df1,df2); df

## A function to lag the variable by a specified number of days
lag.day <- function (lag.by, data) {
  c(rep(NA,lag.by), head(data$Dis1, -lag.by))
}

## Using the function
df$lag1 <- lag.day(lag.by=1, data=df); df
## returns this data frame

         Date      Dis1      lag1
1  1967-06-05  1.146405        NA
2  1967-06-06  9.732887  1.146405
3  1967-06-07 -9.279462  9.732887
4  1967-06-08  7.856646 -9.279462
5  1967-06-09  5.494370  7.856646
6  1967-06-15  5.070176  5.494370
7  1967-06-16  3.847314  5.070176
8  1967-06-17 -5.243094  3.847314
9  1967-06-18  9.396560 -5.243094
10 1967-06-19  4.112792  9.396560


## When really what I would like is something like this:

         Date      Dis1      lag1
1  1967-06-05  1.146405        NA
2  1967-06-06  9.732887  1.146405
3  1967-06-07 -9.279462  9.732887
4  1967-06-08  7.856646 -9.279462
5  1967-06-09  5.494370  7.856646
6  1967-06-15  5.070176        NA
7  1967-06-16  3.847314  5.070176
8  1967-06-17 -5.243094  3.847314
9  1967-06-18  9.396560 -5.243094
10 1967-06-19  4.112792  9.396560

So can anyone recommend a way (either using my function or any other
approaches) that I might be able to consistently lag values based on a
lag.by value and consecutive dates?

Thanks so much in advance!

Sam
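
One possible base-R approach to the question above, sketched under
assumptions: instead of shifting by row position, look up the value recorded
exactly lag.by days earlier with match(), which returns NA when that date is
absent. The helper name lag_by_date and the small example data are
hypothetical, not from the original post.

```r
# Hypothetical helper: lag a column by calendar days, not by row position.
lag_by_date <- function(data, lag.by, date.col = "Date", value.col = "Dis1") {
  dates <- as.numeric(data[[date.col]])  # days since 1970-01-01
  idx <- match(dates - lag.by, dates)    # NA where the earlier date is missing
  data[[value.col]][idx]
}

# Small made-up data set with a gap between 1967-06-07 and 1967-06-15.
df <- data.frame(
  Date = as.Date(c("1967-06-05", "1967-06-06", "1967-06-07",
                   "1967-06-15", "1967-06-16")),
  Dis1 = c(1.1, 9.7, -9.3, 5.1, 3.8)
)
df$lag1 <- lag_by_date(df, lag.by = 1)
df
```

Because the lookup is by date, the first row after a gap gets NA rather than
the last value before the gap, and no add-on package is needed.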



Re: [R] Coverage Probability

2012-03-19 Thread hubinho
Thank you very much again. But in this case I get the coverage probability as
an average over all values for the odds ratio.

I need a coverage probability for every value for the odds ratio.

So the coverage probability for odds ratio = 1, then for odds ratio = 2 and
so on.

Sorry to bother you again but I have some problems with loops.

--
View this message in context: 
http://r.789695.n4.nabble.com/Coverage-Probability-tp4485511p4486264.html
Sent from the R help mailing list archive at Nabble.com.



[R] Unexpected input in function

2012-03-19 Thread Schryver, Jack C.
Hi,

Although the following statements work individually in R, they produce an error 
if placed inside a function as below:

fsubt <- function(a) {
b <- 1:length(a)
b-a
}

The error message is:

Error: unexpected input in:
"b <- 1:length(a)
b-"

Any insight would be greatly appreciated.

Thanks,
Jack



Re: [R] Unexpected input in function

2012-03-19 Thread Sarah Goslee
I think you'll need to provide a reproducible example, because your
code works for me:

> fsubt <- function(a) {
+ b <- 1:length(a)
+ b-a
+ }


> fsubt(1:5)
[1] 0 0 0 0 0

> fsubt(sample(1:10))
 [1] -8 -6  1  1 -1  5  3  1  4  0

> fsubt(2)
[1] -1


On Mon, Mar 19, 2012 at 4:01 PM, Schryver, Jack C. schryve...@ornl.gov wrote:
 Hi,

 Although the following statements work individually in R, they produce an 
 error if placed inside a function as below:

 fsubt <- function(a) {
 b <- 1:length(a)
 b-a
 }

 The error message is:

 Error: unexpected input in:
 "b <- 1:length(a)
 b-"

 Any insight would be greatly appreciated.

 Thanks,
 Jack

-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] hgu133plus2hsentrezgprobe library

2012-03-19 Thread Iain Gallagher
Hi Eleni

Questions like this are better served on the Bioconductor mailing list.

Nonetheless try this

ALL <- topTable(fit2, coef=1, number=Inf)
ALL$SYMBOL <- unlist(mget(ALL$ID, hgu133plus2hsentrezgSYMBOL, ifnotfound=NA))

Here ALL is the output from limma for differential expression (ALL$ID is the 
probe on ENTREZ centric cdf from brainarray).

Best

Iain



- Original Message -
From: Eleni Christodoulou elenic...@gmail.com
To: r-help@r-project.org
Cc: 
Sent: Monday, 19 March 2012, 18:47
Subject: [R] hgu133plus2hsentrezgprobe library

Hello R community,

I am processing raw Affymetrix CEL files and I am using the Michigan custom
CDF library hgu133plus2hsentrezgprobe. I have been looking for
documentation on the function that it contains...I am specifically
interested in converting probe names to gene symbols. Does anybody know
where I can find it?

Thanks a lot!
Eleni



Re: [R] Unexpected input in function

2012-03-19 Thread R. Michael Weylandt
The OP's error suggests (to me) that there's a line break error
somewhere so it may be a funny quirk of encoding/OS incompatibility if
it's from a source()'d script.

Incidentally, the OP could also write the body of his function as a
one liner with:

seq_along(a) - a
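
A short sketch comparing the two forms. The names fsubt_original and
fsubt_oneliner are made up here for illustration; the bodies are the OP's
function and the one-liner above.

```r
# The OP's version and the one-liner; the function names are hypothetical.
fsubt_original <- function(a) {
  b <- 1:length(a)
  b - a
}
fsubt_oneliner <- function(a) seq_along(a) - a

x <- c(3, 1, 2)
fsubt_original(x)
fsubt_oneliner(x)

# The index vectors differ for an empty input:
1:length(numeric(0))    # counts down from 1 to 0
seq_along(numeric(0))   # zero-length integer vector
```

For non-empty input the two return the same values; seq_along() is the safer
habit because 1:length(a) counts backwards when a is empty.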

Michael

On Mon, Mar 19, 2012 at 4:33 PM, Sarah Goslee sarah.gos...@gmail.com wrote:
 I think you'll need to provide a reproducible example, because your
 code works for me:

 fsubt <- function(a) {
 + b <- 1:length(a)
 + b-a
 + }


 fsubt(1:5)
 [1] 0 0 0 0 0

 fsubt(sample(1:10))
  [1] -8 -6  1  1 -1  5  3  1  4  0

 fsubt(2)
 [1] -1


 On Mon, Mar 19, 2012 at 4:01 PM, Schryver, Jack C. schryve...@ornl.gov 
 wrote:
 Hi,

 Although the following statements work individually in R, they produce an 
 error if placed inside a function as below:

 fsubt <- function(a) {
 b <- 1:length(a)
 b-a
 }

 The error message is:

 Error: unexpected input in:
 "b <- 1:length(a)
 b-"

 Any insight would be greatly appreciated.

 Thanks,
 Jack

 --
 Sarah Goslee
 http://www.functionaldiversity.org



Re: [R] Coverage Probability

2012-03-19 Thread Jorge I Velez
Hi hubinho,

This is starting to look like homework to me, so this will be my last try at
helping you.

The general strategy would be along the lines of (1) write a function that
does what you want for a value of theta and (2) sapply() that function to
the vector of theta values you would like to evaluate:

# function
# -- B is the number of tables
foo2 <- function(theta, n1, n2, B = 1000, alpha = 1, z = 1.96){
# 2x2 tables
x1 <- exp(alpha +theta)/ (1+  exp(alpha +theta))
x2 <- exp(alpha)/ (1+  exp(alpha))
n11 <- rbinom(B, n1, x1)
n12 <- n1 - n11
n21 <- rbinom(B, n2, x2)
n22 <- n2 - n21

# upper and lower limit gart interval
gartu <- function(z,d,e, f, g) log(((d+.5)*(g+.5))/((e+.5)*(f+.5)))+
z*sqrt(1/(d+.5)+1/(e+.5)+1/(f+.5)+1/(g+.5))
gartl <- function(z,d,e, f, g) log(((d+.5)*(g+.5))/((e+.5)*(f+.5)))-
z*sqrt(1/(d+.5)+1/(e+.5)+1/(f+.5)+1/(g+.5))

# calculations and results
u <- gartu(z, n11, n22, n12, n21)
l <- gartl(z, n11, n22, n12, n21)
theta >= l & theta <= u  # TRUE if theta is in (l, u)
}

# example
# -- B is the number of tables
res <- foo2(theta = 1, n1 = 10, n2 = 10, B = 1000)
res

# coverage
mean(res)

# different values of theta
Theta - 1:30
colMeans(sapply(Theta, foo2, n1 = 10, n2 = 10, B = 1000))

HTH,
Jorge.-


On Mon, Mar 19, 2012 at 4:24 PM, hubinho  wrote:

 Thank you very much again. But in this case I get the coverage probability
 as
 an average over all values for the odds ratio.

 I need a coverage probability for every value for the odds ratio.

 So the coverage probability for odds ratio = 1, then for odds ratio = 2 and
 so on.

 Sorry to bother you again but I have some problems with loops.

 --
 View this message in context:
 http://r.789695.n4.nabble.com/Coverage-Probability-tp4485511p4486264.html
 Sent from the R help mailing list archive at Nabble.com.



Re: [R] Unexpected input in function

2012-03-19 Thread Ted Harding
I think the most likely explanation is that something in
the input string has had the effect of inserting an invisible
character between the "-" and the "a" in "b-a", and a
possible suspect is pollution by UTF-8: see the discussion at

http://r.789695.n4.nabble.com/unexpected-input-in-rpart-td3168363.html

Or a character copy-pasted from an editor that uses a
non-ASCII encoding for its characters. See e.g.:

http://support.rstudio.org/help/discussions/problems/386-error-unexpected-input-in

and:

http://www.mail-archive.com/r-help@r-project.org/msg71798.html
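
One way to check for such a stray character is to scan the source lines for
anything outside plain ASCII. This is a sketch only: the helper name
find_non_ascii is made up, and the no-break space in the example is an
assumed culprit, not something confirmed from the OP's file.

```r
# Hypothetical helper: report lines containing characters outside plain ASCII.
find_non_ascii <- function(lines) {
  has_non_ascii <- vapply(
    lines,
    function(s) any(utf8ToInt(enc2utf8(s)) > 127L),
    logical(1),
    USE.NAMES = FALSE
  )
  which(has_non_ascii)
}

# "\u00a0" is a no-break space, which looks like an ordinary space on screen.
src <- c("fsubt <- function(a) {",
         "b <- 1:length(a)",
         "b\u00a0-a",
         "}")
find_non_ascii(src)
```

For a script on disk, the lines could be read with readLines() first and
passed to the same helper.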


On 19-Mar-2012 Sarah Goslee wrote:
 I think you'll need to provide a reproducible example, because your
 code works for me:
 
 fsubt <- function(a) {
 + b <- 1:length(a)
 + b-a
 + }


 fsubt(1:5)
 [1] 0 0 0 0 0

 fsubt(sample(1:10))
  [1] -8 -6  1  1 -1  5  3  1  4  0

 fsubt(2)
 [1] -1
 
 
 On Mon, Mar 19, 2012 at 4:01 PM, Schryver, Jack C. schryve...@ornl.gov
 wrote:
 Hi,

 Although the following statements work individually in R, they produce an
 error if placed inside a function as below:

 fsubt <- function(a) {
 b <- 1:length(a)
 b-a
 }

 The error message is:

 Error: unexpected input in:
 "b <- 1:length(a)
 b-"

 Any insight would be greatly appreciated.

 Thanks,
 Jack
 
 -- 
 Sarah Goslee
 http://www.functionaldiversity.org

-
E-Mail: (Ted Harding) ted.hard...@wlandres.net
Date: 19-Mar-2012  Time: 20:56:04
This message was sent by XFMail


